Thursday, September 09, 2004
On the requirement for a human in the loop
Critical question regarding conjectures
Reply, critical question regarding conjectures
Paul
This bead [106] contains a passage that affords me the opportunity to illustrate one aspect of my hunch that the technology you describe will not enable you to "wring the sense out of any text or message, no matter what the language."
Here's the passage:
"The tutorial that we need to develop should take a small text, like one of the Aesop fables, and show what the string of significant words are, given a Go-list, then show the linear list of substructural categories, and then the screen shots for the SLIP visualization of the related Orb “measurement” of the co-occurrence patterns (see figure 1 [87]).

"The iteration that is possible involves the modification of the Go-list, a type of controlled vocabulary used in “text-understanding” technology, modification of any thesaurus/ontology services that might act on the list of significant words in the fable, and modification due to stemming or not. This iteration can involve other people in a “knowledge management” methodology that uses the Actionable Intelligence Process Model."
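The first step of that tutorial, reducing a fable to its string of significant words via a Go-list, can be sketched in a few lines. This is a minimal illustration, not the Readware implementation; the fable excerpt and the Go-list entries below are my own invented examples.

```python
# Minimal sketch of Go-list filtering: only words on the Go-list
# (a controlled vocabulary, the opposite of a stop-list) survive.
# The Go-list entries and the fable text are illustrative.

def significant_words(text, go_list):
    """Return the string of significant words, in order of occurrence."""
    tokens = [w.strip('.,;:"!?').lower() for w in text.split()]
    return [w for w in tokens if w in go_list]

fable = ("The Ant was laying up food for the winter "
         "while the Grasshopper sang and played.")
go_list = {"ant", "food", "winter", "grasshopper", "sang", "played"}

print(significant_words(fable, go_list))
# ['ant', 'food', 'winter', 'grasshopper', 'sang', 'played']
```

The iteration the passage describes then amounts to editing `go_list` (or interposing a thesaurus/stemming step before the membership test) and observing how the surviving word string changes.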
The issue is: What's involved in the meaning of an Aesop fable? The cultural value of these fables is that they can be "applied" to situations involving actors and entities other than those in the fables as told. The story about the ant and the grasshopper is not about ants and grasshoppers at all. It's about two different attitudes toward the use of one's time and energy; about two different kinds of economics.
Those attitudes are to be identified in the PATTERN over the entities in the fable, not in the entities themselves.
It is not enough to identify the "semantic primes" or the elements in the "substructural ontology" for the story. You must also identify the relations among those elements. Neither an unordered set nor a list will do the job. You're likely to need a directed graph with the appropriate LABELS (for the relations) on the edges. Where is the graph structure going to come from and how are the labels going to be placed on the edges?
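The point that the pattern lives in labeled relations among elements, not in the elements themselves, can be made concrete with a small labeled digraph expressed as <a, r, b> triples. The entity names and relation labels below are my own illustrative choices, not anything from the fable-processing technology under discussion.

```python
# A labeled directed graph as a set of <a, r, b> triples.
# Entities and relation labels are illustrative.
fable_graph = {
    ("ant", "stores", "food"),
    ("grasshopper", "spends", "summer"),
    ("winter", "starves", "grasshopper"),
}

# The "same story" with human actors: identical relation labels,
# entirely different entities.
human_graph = {
    ("saver", "stores", "money"),
    ("spender", "spends", "income"),
    ("recession", "starves", "spender"),
}

def relation_signature(graph):
    """The pattern over the entities: the sorted list of edge labels."""
    return sorted(r for (_, r, _) in graph)

# No entity is shared, yet the pattern-level signature matches.
print(relation_signature(fable_graph) == relation_signature(human_graph))
# True
```

An unordered set or flat list of the entities alone would make the two texts look unrelated; only the labeled edges expose the shared pattern, which is exactly where the labels would have to come from in the first place.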
Will your proposed technology be able, upon having encountered a text of "The ant and the grasshopper" to identify the same pattern in a text where all of the actors are human beings? If a human user tells the system that THIS text here, involving human actors, displays the same pattern as THAT text there, Aesop's fable about the ant and the grasshopper, just what will the system do in consequence of that identification? Will it be able to learn the pattern-level similarity? Given several such examples, will the system be able automatically and accurately to flag other texts displaying the same pattern?
I think this is a very difficult problem, one that humans handle rather well, and one that has been attacked computationally without much success.
I really don't know what you meant by talking of being able to "wring the sense out of any text or message, no matter what the language." But if the technology you are proposing cannot deal with Aesop's fables in the way I have been suggesting, then it utterly misses the sense in those fables.
Now, it may still be very useful technology, it may well be better at whatever it does than existing technology is at the same task. But that is far short of being able to wring the sense out of an Aesop's fable.
Additional comments about this communication should be sent to portal@ontologystream.com
Bill,
Are you missing the importance of the HIP (Human-centric Information Production) part of the Anticipatory Web?
Your note, and the observations you make in it, allow us to address how what has been said can easily be misunderstood. In fact, the phenomenon of characteristic misunderstandings lets us see part of what influences the interpretive act, as individual humans make an interpretation and then a communicative act.
Respectfully, I suggest that you are missing part of the argument for the National Project, and have not seen why the Readware Provenance™ work might be a really good next step. The community that is planning the National Project is looking at the big picture.
In our view of things, there is no dependency on classical deduction, ever. There is the measurement of real structure, the viewing of that structure by human eyes, and the annotation of the functions that structures commonly have. “Semantics” is thus fully separated from “syntax,” and one introduces the additional notion that pragmatics exists only in real time; the complete “semantic-syntactic-pragmatic” model is therefore never encoded into a notational system. The notational system is an abstraction that depends on regularity and the formation of categories. On the pragmatics axis, the meaning of “apple” may become “a weapon to throw” in a specific situation.
Formative ontology may be said to have a late binding of abstractions into a real-time experience, where the measurement of invariance, however defined, is seen objectively and used as a cognitive prime. So we need the human to experience something as part of the act of observing that there is a computed pattern of invariance.
In-memory Ontology referential base home page.
The misunderstanding is very similar to that which does not allow the RDF folks to see why human reification is vital in the development and use of the Topic Map standard. The common misunderstanding, about why human-in-the-loop is vital, is an important cultural barrier that keeps a new type of information science from being deployed. We are not simply being polite to the humans.
The human is rooted in real-time experience, and can use abstractions and concepts to assist in triggering awareness about things seen, read, or heard. Computer processes executing computer programs have no such capability.
Using visualization software, the human will experience a sense of the “meaning” of a compound of atoms.
The Orb and SLIP software allow data mining to occur on sets of ordered triples of the form:
{ < a, r, b> }
where metadata may be present. Metadata may provide structural information about the types of the atoms and the potential relationships between them. The data mining is transparent: anyone can define a convolution over the set, and anyone can experiment without a software programmer, or the software industry, holding one’s hand.
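A minimal sketch of such a set of ordered triples, with a user-defined convolution over it, might look like the following. The triples and the rule are invented for illustration; a real Orb would presumably be backed by an ordered on-disk structure rather than an in-memory Python set.

```python
# Orb-style data mining over a set of ordered triples { <a, r, b> }.
# The triples and the rule are illustrative.

orb = {
    ("ant", "co-occurs", "food"),
    ("ant", "co-occurs", "winter"),
    ("grasshopper", "co-occurs", "summer"),
}

def convolve(triples, rule):
    """A convolution 'looks' at each element of the set and applies a
    rule; triples the rule rejects (returns None for) are dropped, and
    triples the rule modifies are emitted in modified form."""
    out = set()
    for t in triples:
        result = rule(t)
        if result is not None:
            out.add(result)
    return out

# Example rule: retrieve only the triples involving 'ant'.
ants = convolve(orb, lambda t: t if "ant" in (t[0], t[2]) else None)
print(sorted(ants))
# [('ant', 'co-occurs', 'food'), ('ant', 'co-occurs', 'winter')]
```

The transparency claim amounts to this: the rule is an ordinary function anyone can write and swap out, with no programmer mediating between the analyst and the data.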
The substructural ontology, say the one “discovered” by Ewell and Adi, is used to measure for patterns of co-occurrence. This “measurement” is merely a structural measurement, one that our technology performs in an optimal fashion.
A human makes sense of the co-occurrence patterns, and if the human wishes to place metadata into the set of triples, then this metadata can be used in future convolutions over the data set. A convolution is a mathematical operator that “looks” at each element of a set and executes a rule. The convolution can be used as a retrieval of some (small) part of the set of triples, and this set can be modified during the convolution so that the small n-aries in the form < r, a(1), a(2), . . . , a(n) > are visualized in the interface to the human.
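One plausible reading of how retrieved triples collapse into small n-aries of the form <r, a(1), a(2), . . . , a(n)> for visualization is sketched below. This is an assumption about the aggregation step, not the actual Orb implementation, and the triples are again illustrative.

```python
from collections import defaultdict

# Collapse a retrieved set of <a, r, b> triples into small n-aries
# of the form <r, a(1), a(2), ..., a(n)>, one per relation.
# The triples are illustrative.

triples = [
    ("ant", "co-occurs", "food"),
    ("ant", "co-occurs", "winter"),
    ("grasshopper", "co-occurs", "summer"),
]

def to_naries(triples):
    by_relation = defaultdict(set)
    for a, r, b in triples:
        by_relation[r].update((a, b))
    # Each n-ary leads with the relation, followed by its sorted atoms.
    return [(r, *sorted(atoms)) for r, atoms in sorted(by_relation.items())]

print(to_naries(triples))
# [('co-occurs', 'ant', 'food', 'grasshopper', 'summer', 'winter')]
```

The human then reads these n-aries in the visual interface, and any metadata the human adds flows back into the triple set for future convolutions.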
One of the early prototypes for Orb-based suggestion of retrieval words
The Conjecture on Stratification posits stability to substructural ontology (as we see in the physical periodic table of atoms). The conjecture would imply that all natural languages may have a common substructural ontology, something that has been suggested in the linguistic literature in the past. We hold the conjecture to be a conjecture, but the Readware letter semantics can be examined in the context of this conjecture.
The existence of a substructural ontology for a category of phenomena is a separate question from the conjecture that that specific substructural ontology can be understood and used. The periodic table of chemical elements answers both conjectures in the affirmative. The Conjecture on Stratification, however, makes a broader claim.
I am hoping that others will speak to this issue, and that we all realize that, as things stand, we are not using language that separates us from common misunderstandings.