The Invisible Layers
To see where representation comes into play, it helps to distinguish several layers at work: "thing", "subject", "expression", "interpretation", "perspective", and "view".
Thing
The first step is to recognize the existence of things. Things exist per se. A thing is considered to exist in the universe, independently of any observer. It has no name and no description; it simply is. We can't even talk about it. There is nothing anybody can do with it other than acknowledge its mere existence.
Subject
A subject is an understanding of a thing. There are multiple ways of understanding the same thing. A subject is the process of conceptualizing a thing. This is different from the definition that the current Topic Maps standard gives of a subject: "In the most generic sense, a subject is any thing whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever." In the current presentation, a subject is no longer a thing; it is a way to understand a thing. A subject is a way to make sense of a thing. It is pure meaning. This is where the semantic universe begins. The difference between a subject and a thing is that a thing becomes a subject when we start talking about it.
Examples of subjects are concepts, ideas, and persons we know or are talking about. Subjects can also be relationships. Even documents discussing subjects can themselves be considered subjects. The rationale for their creation, the context in which they were created, their creators, etc., are important for understanding what they stand for. The understanding of a thing as a subject requires complex knowledge, based on one's experience, and it involves senses and feelings as well as appropriate intellectual skills. The procedure by which knowledge of a subject is acquired is very complex, and not always known, even to ourselves. We don't always know what motivates us or why we react as we do. Sometimes it's important to learn more about it; sometimes we can't. And there is always a limit to how far we can go in decomposing the processes that lead us to the point where we understand something. In brief, subjects are "subjective"!
It all depends on what "about" means.
Things become subjects as soon as we start speaking about them. They are like quantum particles that can't be observed without being deeply transformed. In other words, we can't speak about things, because they just are. But we do speak about subjects.
Can a subject declare what it is about? Can it be self-documented? Well, not really. Because the definition of a subject is something that can be spoken about, it is a subject in itself. Very often there is more than one definition of the same subject, and it is still the same subject even though different people use different definitions to describe it. There can be a lot of talk about a particular definition of a given subject, and when that happens, that definition becomes a subject itself. The same is true for the name of a subject. The same subject can be designated by a variety of names, and each of those names can in turn become a subject of conversation.
In this given universe, all subjects are subjects in their own right. There is no such thing as a subject that contains other subjects. Even when a subject is considered a class for other subjects, as in a taxonomy ("an elephant is a mammal"), from the "subject-ness" point of view there is no difference between the instance and the class. In the example above, the subject named "mammal" is as potentially controversial (i.e., subject to multiple definitions) as the subject "elephant". In other words, no subjects are privileged over others.
Expression
Each subject is expressed via a name, a text, an oral phrase, a picture, a file, a Web address (Uniform Resource Identifier), etc. Expressions occupy locations in the semantic space. An expression is therefore an information object that can be analyzed and processed according to rules.
To communicate meaning, we have to use a language. A language is a set of rules that govern how signs (such as the twenty-six letters of the Latin alphabet) can be combined. The relationship between a subject and the term(s) used to represent it in a given language, or in a set of languages, is far from simple. It is the classical dichotomy that linguistics introduces by differentiating the "signifier" from the "signified".
Linguistic analysis can be used to evaluate textual expressions, and pattern recognition techniques can be used to analyze graphics. More generally, computers can be used to compute and store expressions. The name given to a subject is one of its expressions. The expression is an object that has named properties and a value for each of them.
Interpretation
Interpretation is the process of creating an expression for a subject. In other words, the mere fact of expressing a subject implies that the subject is being interpreted. Since there are unlimited ways to express subjects, there can be many different expressions for the same subject. Two different people will generally describe the same subject in slightly different ways, even when they agree that they are describing the same subject. They can also represent the subject using different languages. There is never a single way to express a subject. The world of interpretation is unbounded, and it is the human world, marked by the curse of the Tower of Babel. The ability of everybody to express subjects in his or her own way is also the basis for freedom of speech, and it is a prerequisite for democracy.
A name doesn't characterize the subject per se. In addition, any subject can be given multiple names. In other words, the name of a subject does not uniquely identify the subject. Otherwise, there would be no homonyms, no "false friends" between languages, etc.
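The point that names and subjects do not map one-to-one can be sketched in a few lines of Python. This is only an illustration: the subject identifiers (such as `subject:river-bank`) and the name assignments are invented for the example and do not come from any standard.

```python
from collections import defaultdict

# Hypothetical (subject identifier, name) pairs, invented for illustration.
NAME_ASSIGNMENTS = [
    ("subject:river-bank", "bank"),
    ("subject:financial-institution", "bank"),
    ("subject:financial-institution", "banque"),
    ("subject:elephant", "elephant"),
    ("subject:elephant", "éléphant"),
]


def index_by_name(assignments):
    """Map each name to the set of subjects it may designate."""
    index = defaultdict(set)
    for subject, name in assignments:
        index[name].add(subject)
    return index


def index_by_subject(assignments):
    """Map each subject to the set of names used for it."""
    index = defaultdict(set)
    for subject, name in assignments:
        index[subject].add(name)
    return index


names = index_by_name(NAME_ASSIGNMENTS)
subjects = index_by_subject(NAME_ASSIGNMENTS)

# A homonym: one name, several subjects.
print(sorted(names["bank"]))
# Several names: one subject, designated differently in two languages.
print(sorted(subjects["subject:elephant"]))
```

Both indexes are many-valued, which is why a name alone cannot serve as the identity of a subject.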
We can't communicate without using expressions.
There are many ways to express things and communicate them to others. For example, painters are able to express emotions and share them in their masterpieces. Photographers, filmmakers, and musicians use other means of expression, and they can reach our emotions so powerfully that we feel we deeply understand what they have expressed. Another way to communicate knowledge of things to others is to talk about them. When a thing is talked about, it is by means of a "subject of discourse" or, more briefly, a "subject". Unlike a thing, which simply is, a subject has a meaning. And when we communicate, we expect to transmit that meaning.
But we need to understand that there is a difference between the subject we talk about and what the thing was in the first place. Its meaning can be profoundly distorted, or we may not even understand it if it is expressed in a language we can't decipher. Furthermore, we may think we understand a subject, and yet the original author, on seeing what we made of it, may disagree because of the gap he or she perceives between what was originally meant and what was actually understood. This often happens when someone is interviewed or quoted "out of context". It also happens when a literary work is translated into another language. In brief, distortion is a fact of life; it is more the rule than the exception. It's called "interpretation".
The reason misunderstandings exist is that language is ambiguous, and the reason language is ambiguous is that we ourselves are ambiguous, by nature, not by choice. We are so complex that there is no way for us to tell anyone else everything we know, or to transmit all the context that made us what we are and makes us think what we think. We are lucky if we are able, sometimes, to transmit a very specialized item of our knowledge to someone else. Anyone who has faced the task of educating his or her own children knows that this is not as easy as it may seem.
There is also the fact that we may want to know what others say or think, even if they are not talking to us, and even when they choose to hide what they are doing. Intelligence gathering is about trying to figure out what is going on in a universe where people are not willing to talk and, on the contrary, are trying to hide information. Connecting the dots in this context requires other skills. But compared with the situation in other contexts, it is not entirely different, because we need to use information even if we don't know or understand the whole context in which it was produced.
A relationship between subjects is just as much a subject as the subjects it relates. The fact that a subject is assigned a given name is a "name assignment" subject. There are plenty of different ways to name a subject, so the name assignment can itself be a multiplicity of subjects.
Subjects are subject to multiple interpretations. A particular interpretation may depend on the context, which can come down to the individual observer. The same subject may be interpreted in a variety of ways, sometimes even contradictory ways, by each person who interprets it. The context in which a subject is viewed may influence the ways it is understood and used. In other words, nothing prevents anyone from having a different point of view on a subject from anyone else. In certain cases agreement can be found, which is what makes communication possible at all, but this does not hold for all subjects, and ignoring that fact may cause unexpected side effects. In brief, it is not advisable to assume that everybody will always interpret a given subject the same way as anybody else. Documenting the subject may help reduce misinterpretations, but it is not an absolute guarantee, because the definitions themselves may carry some ambiguity.
Everything above had to do with human beings: understanding or misunderstanding, agreeing or disagreeing, etc. Humans use natural languages as a means to communicate, and the languages contain mechanisms to deal with ambiguities: homonyms, synonyms, etc. are well known. Linguists have written extensively about these issues.
What do computers understand?
Now computers come into the picture. Computers don't understand anything: they store data and process it with algorithms that can be immensely sophisticated but eventually resolve to a set of elementary logical operations. Since computers are used to store, communicate, and access information, it is interesting to use them to deal with subjects. Since subjects are not represented by words, they can't be stored in a computer in the form of strings of characters. They need their own storage: an "expression". A computer uses an expression to represent a subject.
The relation between a subject and an expression can be anything at all. Search engines, for example, yield occurrences based on string matching, and their automatic algorithms are not able to determine whether the occurrences are actually relevant to the subject under consideration. The Web doesn't allocate specific space to specific subjects. Information relevant to a given subject can be present on a variety of web sites that offer no connection between them, and search engines may entirely ignore those occurrences. There is an unbounded number of expressions (web addresses, words used in a search engine) for a given subject.
However, a semantically integrated environment aims at reaching a one-to-one relationship between a subject and an expression.
Expressions are a set of properties (label-value pairs) and nothing else. For example, a useful label for an expression is the "name" of a subject. The name can itself be considered a subject.
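As a minimal sketch of this idea, an expression can be modeled as a plain Python dictionary of label-value pairs. The labels used here (`name`, `language`, `address`, `names-subject-at`) and the `example.org` address are assumptions made for the illustration; the text above prescribes only the label "name".

```python
# An expression modeled as label-value pairs and nothing else.
# The labels and the example.org address are placeholders.
elephant_expression = {
    "name": "elephant",
    "language": "en",
    "address": "http://example.org/subjects/elephant",
}


def labels(expression):
    """Return the labels of an expression in alphabetical order."""
    return sorted(expression)


# The name can itself be treated as a subject, with its own expression
# that points back to the expression it names.
name_expression = {
    "name": 'the name "elephant"',
    "names-subject-at": elephant_expression["address"],
}

print(labels(elephant_expression))
print(name_expression["names-subject-at"])
```

Because an expression is only data, a computer can store, compare, and combine expressions without any claim to understanding the subject behind them.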
Perspective
Each interpretation of any subject is made within a certain perspective. A perspective is a bias through which the universe is seen, and it determines the expressions that are uttered to describe it. In a given perspective, subjects are expressed using a given classification and by applying a given set of rules. The classification schema and the set of rules are either implicit or explicit. If computerized, they can be made explicit when they follow a model, such as the entity-relationship model. Sometimes they are explicit but remain hidden from the end user. The rules and algorithms used by search engines to gather information relevant to a given subject are explicit for those who created the engine, but they are not exposed.
Each taxonomy, which is a hierarchical organization of expressions, uses a given perspective. Any ontology is no more and no less than a perspective, because it is made of a set of rules by which expressions are related and computed against each other. It seems that using the word "perspective" is clearer and more precise than using the word "ontology". The word "perspective" conveys the idea that the expression of subjects is somewhat "subjective". A perspective can also be described as a filter or a style sheet for knowledge.
Perspectives are created by stating the patterns that subject expressions must comply with. Subject patterns express rules for determining when two subjects should be considered the same (identification rules) and rules for merging subject expressions once subjects have been recognized as identical.
A perspective is equivalent to an ontology. In a given perspective, processes are triggered that result in reorganizing the subject expressions (usually by reducing their number).
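The two kinds of rules in a perspective (identification and merging) can be sketched as two small Python functions applied to a list of expressions. The rules chosen here are assumptions made for the example, not rules taken from any standard: two expressions are treated as the same subject when they share an `address` value, and merging keeps every distinct value found under each label.

```python
def same_subject(a, b):
    """Hypothetical identification rule: an equal 'address' means same subject."""
    return a.get("address") is not None and a.get("address") == b.get("address")


def merge(a, b):
    """Hypothetical merging rule: union of labels, keeping all distinct values."""
    merged = {}
    for label in sorted(set(a) | set(b)):
        values = []
        for expression in (a, b):
            if label in expression:
                value = expression[label]
                for item in value if isinstance(value, list) else [value]:
                    if item not in values:
                        values.append(item)
        # A single value stays a scalar; several values become a list.
        merged[label] = values[0] if len(values) == 1 else values
    return merged


def apply_perspective(expressions, identify, merge_rule):
    """Merge expressions that the identification rule says are one subject."""
    result = []
    for expression in expressions:
        for i, existing in enumerate(result):
            if identify(existing, expression):
                result[i] = merge_rule(existing, expression)
                break
        else:
            result.append(dict(expression))
    return result


expressions = [
    {"name": "elephant", "address": "http://example.org/elephant"},
    {"name": "éléphant", "address": "http://example.org/elephant"},
    {"name": "mammal", "address": "http://example.org/mammal"},
]
merged = apply_perspective(expressions, same_subject, merge)
print(len(expressions), "expressions reduced to", len(merged))
```

A different perspective would simply pass different `identify` and `merge_rule` functions over the same expressions, and could end up with a different number of merged subjects.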
View
The reason for creating perspectives is to provide views. A view is a set of expressions treated as a unit. It is the result of applying a subject pattern. A view is a particular output that fits particular user needs. The same subjects can be filtered through different perspectives and will end up being presented in completely different views. A parallel can be drawn with the markup principle: "one source, several outputs". One set of subjects, several views.
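"One set of subjects, several views" can be illustrated with a small filter, assuming expressions are label-value dictionaries and a subject pattern is simply a set of label-value pairs that an expression must match. The function name `make_view`, the labels, and the sample data are all invented for this sketch.

```python
# One shared set of expressions (sample data invented for illustration).
EXPRESSIONS = [
    {"name": "elephant", "kind": "animal", "language": "en"},
    {"name": "éléphant", "kind": "animal", "language": "fr"},
    {"name": "Monadology", "kind": "book", "language": "en"},
]


def make_view(expressions, pattern):
    """Return the expressions that match every label-value pair in pattern."""
    return [
        expression
        for expression in expressions
        if all(expression.get(label) == value for label, value in pattern.items())
    ]


# Two patterns applied to the same source give two different views.
animal_view = make_view(EXPRESSIONS, {"kind": "animal"})
english_view = make_view(EXPRESSIONS, {"language": "en"})

print([e["name"] for e in animal_view])
print([e["name"] for e in english_view])
```

The source list is never modified; each view is a separate output derived from it, which mirrors the markup principle quoted above.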
Leonardo da Vinci and Luca Pacioli both rejected the claim by Aristotle that space extends infinitely in three linear dimensions ([http://en.wikipedia.org/wiki/Leonardo_da_Vinci||Wikipedia]).[1] Two centuries later, they were joined by Gottfried Wilhelm Leibniz (1646-1716), a German philosopher who invented calculus, promoted a universal language, and also created an "atomic" theory called the "Monadology".[2] According to the theory, things only exist in relation to others, and the most elementary form of relation is one that connects one thing to another: "Now this interlinkage or accommodation of all created things to each other, brings it about that each simple substance has relations that express all the others, and is in consequence a perpetual living mirror of the universe."[3] Leibniz's Monadology provides a framework to represent information and its meaning at a very elementary level. Leibniz's vision enables us to "flatten" information and to represent it as a set of binary relations: "And as one and the same town viewed from different sides looks altogether different and is, as it were, perspectively multiplied, it similarly happens that, through the infinite multitude of simple substances, there are, as it were, just as many different universes, which however are only the perspectives of a single one according to the different points of view of each monad."[4] Leibniz expresses the fact that once things have been flattened, they can be viewed according to multiple perspectives.
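One way to picture "flattened" information is a set of binary relations, each connecting one thing to another, which can then be read from the point of view of any of the things involved. The sketch below is an interpretation made for illustration, not Leibniz's own formalism; the relation names and the function `seen_from` are assumptions.

```python
# Information flattened into binary relations: (source, relation, target).
RELATIONS = {
    ("Leibniz", "wrote", "Monadology"),
    ("Leibniz", "born-in", "Leipzig"),
    ("Monadology", "describes", "monad"),
}


def seen_from(thing, relations):
    """Collect the relations a thing takes part in, from its own point of view."""
    outgoing = {(relation, target) for source, relation, target in relations
                if source == thing}
    incoming = {(relation, source) for source, relation, target in relations
                if target == thing}
    return {"outgoing": outgoing, "incoming": incoming}


# The same flat set yields a different picture from each vantage point.
print(seen_from("Leibniz", RELATIONS))
print(seen_from("Monadology", RELATIONS))
```

The set of relations is the single "universe"; each call to `seen_from` is one perspective on it.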
Leibniz also proposed a principle of the identity of indiscernibles: "if and only if (two or more) [http://en.wikipedia.org/wiki/Object||object](s), or [http://en.wikipedia.org/wiki/Object||entity/ies] have all their/its [http://en.wikipedia.org/wiki/Property||property/ies] in common then they (it) are identical (are one and the same entity)".[5] See also [http://www.bun.kyoto-u.ac.jp/~suchii/Leib-Clk/indiscernible.html||http://www.bun.kyoto-u.ac.jp/~suchii/Leib-Clk/indiscernible.html] and Dr. Frank Linhard.[6] The principle of indiscernibility is significant in modern physics, especially in quantum statistics, where it distinguishes the Fermi-Dirac statistics from others. Paul Adrien Maurice Dirac (1902-1984) played a major role in the inception of quantum physics and invented a notation to represent quantum states.
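Applied to expressions made of label-value pairs, the principle suggests a simple identification rule: two entities whose recorded properties are all the same are treated as one. The sketch below assumes that property values are hashable and that only the recorded properties count, which is a much weaker condition than "all properties" in the philosophical sense. The entity labels and properties are invented.

```python
def group_indiscernibles(entities):
    """Group entity labels whose recorded properties are exactly the same."""
    groups = {}
    for label, properties in entities.items():
        # A frozenset of pairs ignores the order in which properties are listed.
        key = frozenset(properties.items())
        groups.setdefault(key, []).append(label)
    return list(groups.values())


entities = {
    "a": {"mass": 1, "spin": "1/2"},
    "b": {"spin": "1/2", "mass": 1},
    "c": {"mass": 2, "spin": "1/2"},
}

# "a" and "b" share every recorded property, so they fall into one group.
print(group_indiscernibles(entities))
```

Adding a single distinguishing property to "b" would place it in its own group, which shows how much the outcome depends on which properties a given perspective chooses to record.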
Doug Engelbart
1. A hypothesis developed by Nicolaus of Cusa, http://www.schillerinstitute.org/fid_97-01/012_cusa_moved.html
2. See Nicholas Rescher, G.W. Leibniz's Monadology, An Edition for Students, University of Pittsburgh Press, 1991.
3. Leibniz, Monad 56, Monadology, op. cit.
4. Leibniz, Monad 57, Monadology, op. cit.
5. See the article on the "identity of indiscernibles" in Wikipedia.
6. Frank Linhard, Ununterscheidbarkeit bei Leibniz und Dirac (Indiscernibility in Leibniz and Dirac), http://www.linhard.com/privacy/cv/cvtalk.html, Lecture Series of the Institute for History of Science, University of Hamburg, 2001, Universität Frankfurt am Main.