Callaghan, M. and Hand, C. "Presentation and Representation of Implicit Knowledge in the World Wide Web". Workshop on Knowledge Media for Improving Organizational Expertise - Impacts of new methods and enabling technologies, at International Conference on Practical Aspects of Knowledge Management, Basel, Switzerland, 30-31 October 1996.




Presentation and Representation of Implicit Knowledge in the World Wide Web

Mike Callaghan and Chris Hand

Department of Computer Science,
De Montfort University,
Leicester, UK. LE1 9BH.


Abstract
Despite the vast amounts of information available on the World Wide Web today, users are provided with very little in the way of tools to allow them to work with the knowledge which is buried in this information. If the Web is to become a Knowledge Medium then browsers and other tools need to evolve. This paper suggests that a user-centered approach to Knowledge Media is required, drawing on principles and techniques from Human-Computer Interaction (HCI) instead of, or in addition to, Artificial Intelligence (AI). A further claim of the paper is that Knowledge Media tools should reflect and support the natural human tendency towards fuzzy and imprecise working with knowledge, rather than imposing formal structure.

1. Introduction

Stefik's early vision of Knowledge Media reflected on the lack of richness in computer-mediated communication:

"The networks carry mostly data, not knowledge; low-level facts, not high-level memes. [...] very little of what the computers are transmitting is akin to what people talk about in serious conversation". (Stefik, 1988; p329)

To what extent does this still hold? It is certainly true that there is now a vast amount of information stored as text (or hypertext), both on the Internet (specifically, the World Wide Web) and on corporate intranets. However, the answer to the question of whether this represents data or knowledge depends very much on whether "knowledge" can only be said to exist on a computer if it is in some structured form which enables a degree of automatic inferencing. At the moment, by far the greater proportion of the knowledge which the Internet's information represents is embedded within the text and is thus present only implicitly, requiring parsing and understanding by the human reader before it can fulfil its function as usable knowledge. One could say that the only explicit representations of relations within the documents are the hypertext links, and even these tend only to represent the most generic form of association between related texts. The "use" of this crude form of machine-supported knowledge is limited to the browser reducing the effort required on the part of the human reader to access related information.

So, at one level, Stefik's criticism still holds; despite the implicit richness of some of the material available on the Internet and accessible through intranets (reports, research papers, standards documents, course material etc.) the World Wide Web as it stands today is not yet being used as a Knowledge Medium. The central thesis of this paper is that there are two key approaches, not one, to progressing towards this goal. What is perhaps the mainstream approach is to work from an artificial intelligence (AI) perspective, and to propose conventions and methods by which knowledge bases of formally represented material can be fruitfully interlinked, allowing rich, knowledge-sharing collaboration between their authors and users. While systems such as Ontolingua (Gruber, 1993; Farquhar, 1995) will doubtless play an important role in exploring the interchange and re-use of formally-represented knowledge, it is clear that the success of this kind of project depends on the resolution of problems which have dogged knowledge-based systems from their conception: those of knowledge elicitation and usability.

An alternative emphasis, central to this paper, is to work from a more human-centred perspective and to shift the focus towards supporting the relevant cognitive processes in the human users of Knowledge Media by an appropriate combination of representation and presentation.

In fact, this was also part of Stefik's original vision of Knowledge Media:

"the most commonly understood goal of [AI is] to build intelligent, autonomous thinking machines. [...] In contrast, the goal of building a knowledge medium draws attention to the main source and store of knowledge in the world today: people." (Stefik, 1988; p327)

This paper first takes a brief look at the limitations of the World Wide Web's current hypermedia model, before examining ways in which visualisation techniques could be applied to improving the presentation of knowledge structures using hypermedia tools. We then look at some of the problems of representing and communicating knowledge, before summarising our current approach to developing knowledge media tools.

2. The limitations of Web-based hypermedia

If one accepts the importance of making use of the existing texts on the Internet as source material for the support of Knowledge Media working, then one faces the immediate problem of the absence of explicitly represented knowledge structures in documents composed in Hypertext Mark-up Language (HTML). This may be partly due to ignorance of the hypertext structuring features of HTML, such as the REL and REV attributes, which specify the relationship between the two ends of a hypertext link (Maloney, 1995), and partly due to a lack of knowledge structuring tools that work within the WWW hypermedia model. For some time there have been hypermedia systems such as Aquanet (Marshall et al, 1991) and SEPIA (Streitz et al, 1992) which allow users to work directly with knowledge structures, but the critical mass of the Web seems to have somewhat deflected the path of hypermedia research and development by providing a widely-used but over-simplified hypertext model. Clearly no-one will go to the trouble of including more explicit structure in their documents if this "added value" information is invisible to most readers.
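To make the point about REL and REV concrete, the following is a minimal sketch of how a tool might surface the typed links that browsers leave invisible. It uses the HTML parser from the Python standard library; the class name TypedLinkExtractor, the function typed_links and the sample page fragment are all invented for this illustration and do not come from any existing browser or from Maloney's discussion paper.

```python
from html.parser import HTMLParser


class TypedLinkExtractor(HTMLParser):
    """Collect (href, direction, relation) triples from A and LINK elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names for us.
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # REL types the link from this document to the target;
        # REV types the link in the reverse direction.
        for direction in ("rel", "rev"):
            for relation in (attr.get(direction) or "").split():
                self.links.append((href, direction, relation.lower()))


def typed_links(html_text):
    """Return the typed links found in html_text, in document order."""
    parser = TypedLinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical page fragment, invented for this example.
SAMPLE = """
<link rev="made" href="mailto:author@example.org">
<a href="ch2.html" rel="next">Next chapter</a>
<a href="glossary.html#meme" rel="glossary">meme</a>
<a href="other.html">an untyped link</a>
"""

for href, direction, relation in typed_links(SAMPLE):
    print(f"{direction.upper()}={relation} -> {href}")
```

The untyped link is silently dropped, which mirrors the situation described above: most links on the Web carry no relationship information for a tool to extract.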

So, despite the size and potential value of this knowledge resource, the tools which we have at our disposal for working with this textual information are still surprisingly limited in function and sophistication. At the authoring stage, techniques such as outlining enable a degree of freedom in working with the intrinsic structure of rich textual information. For the reader of Web documents, however, common browsers tend to support a very passive mode of assimilation, offering only system-supported "copy and paste" functions for creating summary documents; there is a distinct lack of tools designed to help readers (and potential understanders) of the text to extract, visualise and manipulate the knowledge structures implicit in the text.

3. Harnessing our perceptual strengths

If Knowledge Media are to communicate knowledge between human users, then what is required is a user-centered approach to developing the tools and infrastructure which will allow us to harness the perceptual and cognitive strengths of humans. The full potential of Knowledge Media will not be achieved purely by applying AI techniques, since ultimately we will always have human users "in the loop".

The science of Visualisation has given us tools which take unmanageable quantities of figures and statistics and present them to us in forms which engage the perceptual mechanisms which we use well (and which machines still lack, despite AI's best attempts). But why should these kinds of tools only be applied to numerical data, when so much of the information we must process every day is textual? Knowledge Media needs to produce tools which, although not automating the process, will help users to identify, extract and manipulate the knowledge structures implicit in the increasing number of web documents and legacy text files with which they work on-line.

Compared to many other hypertext systems, the current incarnation of the Web concentrates almost exclusively on navigating hypertext at the node level and includes little support for working with implicit knowledge. Bookmarks or hotlists and sequential history mechanisms do nothing to aid the user in annotating text, extracting or linking related concepts or working at the structural level (as might be provided using graphical overviews for example).

Information visualisation techniques such as Cone Trees and Perspective Walls (Robertson, Card and Mackinlay, 1993) use visual abstraction to increase the speed of pattern detection, while mapping tree structures onto hyperbolic space (Lamping, Rao and Pirolli, 1995) or using 3D representations allows the density of displayed information to be increased. By concentrating on the presentation of the information we can engage the user's perceptual capabilities more efficiently. We hope to see these kinds of technique in next-generation web tools (perhaps via Java extensions), not simply for visualisation but as a way to implement direct manipulation knowledge structuring tools for the Web.
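The gain in density offered by a hyperbolic layout can be illustrated numerically. The sketch below is not taken from Lamping, Rao and Pirolli; it assumes only the standard Poincaré disk model, in which a point at hyperbolic distance d from the centre is drawn at Euclidean radius tanh(d/2) inside the unit disk. The function name disk_radius and the choice of one unit of hyperbolic distance per tree level are assumptions made for the example.

```python
from math import tanh


def disk_radius(depth, step=1.0):
    """Euclidean radius, inside the unit disk, of a node placed at
    hyperbolic distance depth * step from the centre (Poincaré model)."""
    return tanh(depth * step / 2.0)


# Each tree level is the same hyperbolic distance from the previous one,
# yet on screen the levels crowd ever closer to the rim of the disk.
for depth in range(6):
    r = disk_radius(depth)
    print(f"depth {depth}: radius {r:.3f}, distance to rim {1 - r:.3f}")
```

Nodes near the centre (the focus) receive most of the screen area, while arbitrarily deep levels still fit within the disk as context.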

4. Communicating Knowledge

When reading or studying a piece of text we often take notes, make annotations and place bookmarks to help us to assimilate the knowledge which is implicit in the text. Concept maps, which represent concepts and relationships in a graphical form, are a popular and flexible way of taking notes "visually". Many of the advocates of concept maps stress the importance of their flexibility; unlike users of the many notations commonly used in software design, creators of concept maps are encouraged to use any appropriate ad hoc symbols or sketches to enhance the result. There is certainly some contradiction between the requirement (in order to allow automation and inferencing) for a formal description of the allowable concepts and relationships (i.e. an ontology) and the natural human tendency towards free and varied expression.

However, most attempts to provide computer support for concept mapping require that concept maps should have a rigid "visual syntax" which makes them more amenable to computerisation. Gaines and Shaw's (1995) approach to using concept maps as a technique for sharing knowledge structures over the Web would be suitable for representing formal entities such as Toulmin Structures (Toulmin, 1958), but the flexibility of unconstrained sketching with a pen and paper is lost.

Furthermore, the inherent subjectivity of a concept map makes it less than ideal as a way of communicating knowledge to others (one might suggest that, were it otherwise, we would all have abandoned natural language long ago and would now write all scientific papers in some graphical Esperanto). This recognition of the "contingency" of graphical representations of an individual's knowledge structures is symptomatic of the view of language and the limitations of all formal representations which is most cogently expressed by Rorty (Rorty, 1989).

The requirement for computer-supported, "fuzzy", informal working with knowledge suggests that approaches which aim to formally and explicitly represent the knowledge are inappropriate when the knowledge processing will be performed by users themselves. However, the informal fuzziness of concept maps does not mean they are totally incompatible with computer use. Spatial user interfaces, such as that used in the VIKI "spatial hypertext" system (Marshall et al, 1994), allow relationships between entities to be represented even though their meaning is as-yet undefined. Simply grouping items near each other indicates that they share something in common (the Gestalt Law of Proximity). This ability to use space in representing relationships and concepts in an imprecise way is indicative of the approach we must take if we are to progress in our design of cognitive tools; the alternative is to constrain ourselves to working within the confines of an impoverished ontology.
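As a minimal sketch of how a tool might recover such implicit groupings, the code below clusters items on a workspace purely by how close together the user has placed them, without asking what the grouping means. This is not VIKI's algorithm; it is a simple distance-threshold grouping chosen for illustration, and the function name proximity_groups, the threshold value and the sample notes are all assumptions.

```python
from math import hypot


def proximity_groups(items, threshold):
    """Group items whose positions are chained together by gaps <= threshold.

    items maps a label to an (x, y) position on the workspace.
    Returns a sorted list of groups, each a sorted list of labels.
    """
    labels = sorted(items)
    parent = {label: label for label in labels}

    def find(label):
        # Follow parent pointers to the group representative,
        # shortening the path as we go.
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    # Merge the groups of every pair of items that lie close together.
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            (ax, ay), (bx, by) = items[a], items[b]
            if hypot(ax - bx, ay - by) <= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for label in labels:
        groups.setdefault(find(label), []).append(label)
    return sorted(groups.values())


# Hypothetical notes dropped onto a 2D workspace by a reader.
NOTES = {
    "stefik quote": (10, 12),
    "knowledge vs data": (14, 15),
    "memes": (18, 11),
    "cone trees": (90, 80),
    "hyperbolic browser": (95, 84),
    "annotation": (50, 140),
}

for group in proximity_groups(NOTES, threshold=10):
    print(group)
```

The groups carry no labels and no relationship types; the user remains free to name them later, or never, which is the kind of imprecision argued for above.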

Finally, the problem which we find most interesting is almost the inverse of providing computerised tools for creating concept maps. Rather than starting with a blank page on which the concepts are drawn, the process of annotation involves taking the initial text and adding extra markup to highlight ideas and to point at the implicit knowledge. It is extremely interesting that the early Mosaic browser from NCSA supported collaborative annotation of Web documents, and it is equally bewildering that this was somehow lost during Mosaic's evolution. It is likely that as the Web moves more towards becoming a Knowledge Medium we will see increased efforts to re-instate collaborative annotation (e.g. Röscheisen et al, 1995) as well as combining techniques and principles from HCI and CSCW with WWW technology.
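One way to picture annotation as markup layered over an existing text is to hold the annotations separately from the document and refer to it by character offsets, so that several readers can annotate the same page without altering it. The sketch below illustrates that idea only; it is not the mechanism used by Mosaic or by Röscheisen et al, and the names Annotation, AnnotatedText, annotate and notes_at, as well as the reader identifiers, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    start: int   # offset of the first annotated character
    end: int     # offset one past the last annotated character
    author: str
    note: str


@dataclass
class AnnotatedText:
    text: str
    annotations: list = field(default_factory=list)

    def annotate(self, phrase, author, note):
        """Attach a note to the first occurrence of phrase.

        The underlying text is never modified.
        """
        start = self.text.find(phrase)
        if start < 0:
            raise ValueError(f"phrase not found: {phrase!r}")
        annotation = Annotation(start, start + len(phrase), author, note)
        self.annotations.append(annotation)
        return annotation

    def notes_at(self, offset):
        """Return every annotation whose span covers the given offset."""
        return [a for a in self.annotations if a.start <= offset < a.end]

    def excerpt(self, annotation):
        """Return the stretch of text an annotation points at."""
        return self.text[annotation.start:annotation.end]


page = AnnotatedText(
    "The networks carry mostly data, not knowledge; "
    "low-level facts, not high-level memes."
)
first = page.annotate("data, not knowledge", "reader-a", "the central contrast")
second = page.annotate("knowledge", "reader-b", "only implicit in the text?")

for a in page.notes_at(page.text.find("knowledge")):
    print(f"{a.author} on {page.excerpt(a)!r}: {a.note}")
```

Because the spans may overlap and belong to different authors, the same structure can serve both private note-making and the collaborative case.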

5. Current Work

We have in the past experimented with the use of free spatial layout techniques in the authoring process (Ehret, 1995) and flexible spatially-oriented note-making tools as extensions to web browsers (Aithal, 1995). We are currently incorporating some of the ideas expressed in this paper into a Java-based prototype environment which makes use of the Habañero classes from NCSA (NCSA, 1996). The latter enable the incorporation of groupware facilities in interactive Java applications. The focus of our work is in providing flexible spatially-oriented tools which enable the shared representation and manipulation of loosely-structured concepts and relationships, extracted using direct manipulation techniques from displayed web pages. In parallel with this work on interactive prototypes, we are exploring a range of possible data models to support these kinds of representations, with the aim of making maximum use of accepted standards such as SGML.

Acknowledgements

We would like to thank Andreas Ehret and Vittal Aithal for their work on the hypertext tools mentioned above, and Dr. Thomas W. Routen for his comments on early drafts of this paper.

References

Aithal, V. (1995) "An Examination and Implementation of Possible Extensions to the Current Generation of WWW Browsing and Authoring Tools". Unpublished MSc Thesis, Department of Computer Science, De Montfort University Leicester, UK. September 1995.

Ehret, A. (1995) "An Examination of Intuitive and Informal Hyperdocument Authoring Techniques and the Implementation of a Graphical Authoring Environment". Unpublished MSc Thesis, Department of Computer Science, De Montfort University Leicester, UK. September 1995.

Farquhar, A. (1995) "KR and the Web". Position statement for Panel "Building Global Knowledge Webs", Fourth International Conference on the World Wide Web, Boston, December 11-14, 1995.
<URL: http://www.w3.org/pub/Conferences/WWW4/Panels/krp/farquhar.html>

Gaines, B. R. and Shaw, M. L. G. (1995) "Concept Maps as Hypermedia Components". International Journal of Human-Computer Studies, 43(3): 323-362.
<URL: http://ksi.cpsc.ucalgary.ca/articles/ConceptMaps/CM.html>

Gruber, T. R. (1993) "A Translation Approach to Portable Ontology Specifications". Knowledge Acquisition, 5(2):199-220, 1993.
<URL: http://www-ksl.stanford.edu/knowledge-sharing/papers/ontolingua-intro.ps>

Lamping, J., Rao, R. and Pirolli, P. (1995) "A Focus+Context Technique Based on Hyperbolic Geometry for Visualizing Large Hierarchies". Proceedings of CHI'95. Denver, Colorado, USA. May, 1995.
<URL: http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/jl_bdy.htm>

Maloney, M. (1995) "Hypertext Links in HTML". HTML Working Group Discussion Paper, 19th July 1995.
<URL: http://www.sq.com/papers/Relationships.html>

Marshall, C. C., Halasz, F. G., Rogers, R. A. and Janssen Jr, W. C. (1991) "Aquanet: a hypertext tool to hold your knowledge in place". Proceedings of Hypertext '91, ACM Press, 1991. pp261-275.

Marshall, C. C., Shipman, F. M. and Coombs, J. H. (1994) "VIKI: Spatial Hypertext Supporting Emergent Structure". Proceedings of the ACM European Conference on Hypermedia Technologies, Edinburgh, Scotland, Sept. 18-23, 1994. pp13-23.

NCSA (1996) "Habañero Project Overview". Software Development Group, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, USA.
<URL: http://www.ncsa.uiuc.edu/SDG/Software/Habanero/>

Robertson, G., Card, S. and Mackinlay, J. (1993) "Information Visualization using 3D Interactive Animation". Communications of the ACM, 36(6): 56-71, June 1993.

Rorty, R. (1989) "Contingency, Irony and Solidarity". Cambridge University Press, 1989.

Röscheisen, M., Mogensen, C. and Winograd, T. (1995) "Beyond Browsing: Shared Comments, SOAPs, Trails and On-line Communities". Proceedings of WWW'95, Darmstadt, 1995.
<URL: http://www-diglib.stanford.edu/diglib/pub/reports/brio_www95.html>

Stefik, M. J. (1988) "The Next Knowledge Medium". In The Ecology of Computation, B. H. Huberman (Ed). Elsevier, 1988. pp315-342.

Streitz, N. A., Haake, J., Hannemann, J., Lemke, A., Schuler, W., Schutt, H. and Thuring, M. (1992) "SEPIA: A cooperative hypermedia authoring environment". Proceedings of the European Conference on Hypertext (ECHT'92), Milan, Italy, Nov 30-Dec 4 1992. pp11-22.
<URL: ftp://ftp.darmstadt.gmd.de/pub/wibas/SEPIApaper.ps.Z>

Toulmin, S. (1958). The Uses of Argument. Cambridge, UK, Cambridge University Press.