Saturday, November 20, 2004
Topic map standard, the RDF and Readware constructs
Communication from Ken Ewell (Readware)
I communicate here some similarities and differences among the Topic Map standard, RDF, and the Readware ontology constructs. They are similar, of course, because each is a framework for representing information resources.
They each employ the elemental notion in graph theory, a triple:

< a, r, b >

where a and b are nodes and r is a connector.
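As a minimal sketch (the names here are illustrative, drawn from none of the three frameworks), such a triple is just a labeled edge in a directed graph, for instance in Python:

    # A triple < a, r, b >: nodes a and b connected by relation r.
    from collections import defaultdict

    class TripleGraph:
        def __init__(self):
            self.edges = []               # all (a, r, b) triples
            self.out = defaultdict(list)  # node -> outgoing (r, b) pairs

        def add(self, a, r, b):
            self.edges.append((a, r, b))
            self.out[a].append((r, b))

    g = TripleGraph()
    g.add("document-17", "mentions", "flood")  # a = node, r = connector, b = node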
As you know, though, there is a big difference in how graphs are used.
Where both the Topic Map and the Resource Description Framework require the author to create an additional and "independent" definition of the structure and relations of the resource -- usually a document -- Readware does not require XML mark-up or further definition by the author.
Instead, Readware parses the text elements in a document or other resource (message). Readware identifies each parsed item (by finding it to be a name, number, word, concept or knowledge-type) in the independent Readware ConceptBase (a network of vocabulary and taxonomy with weighted associations calculated from the elements of the Adi substructural ontology).
Let me start by offering my characterization of the Resource Description Framework in comparison to the Topic Map. RDF is a language for representing information about resources on the web -- intended mainly for exchanging information between applications.
I see the Topic Map standard as being more about reifying the
topics and subjects of any given resource in such a way as to clarify their
identity, instances of their occurrence and their associations (by
relationships and role). It is claimed that, when everyone follows the rules in creating a topic map of a particular resource, the topic map can be merged with others, whereby people would be able to navigate the web by topics, for example, and applications could read a topic map to get meta-information about any resource.
This view of a need to reify the meaning of the graph constructions is consistent with the Readware processing model. As the BCNGroup founding committee has suggested in defining HIP philosophy, meaning has a late binding that best occurs in the present moment, as a human interprets information.
The key concepts of the topic map are topics, associations and
occurrences. A tree or network model applies. In addition, topic maps utilize XML processing methods, and they work with URI-based namespaces.
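To make those three concepts concrete, here is a minimal sketch in Python (the identifiers and URIs are invented for illustration; this is not the XTM syntax itself):

    # Topic map core concepts: topics, occurrences, associations.
    topic_map = {
        "topics": {
            "t-flood": {"name": "Flood",
                        "subject": "http://example.org/subjects/flood"},
            "t-levee": {"name": "Levee",
                        "subject": "http://example.org/subjects/levee"},
        },
        "occurrences": [
            # a topic occurring in a resource, typed by a role
            {"topic": "t-flood", "type": "mention",
             "resource": "http://example.org/report.html"},
        ],
        "associations": [
            # an association whose typed roles are played by member topics
            {"type": "threatens",
             "roles": {"threat": "t-flood", "threatened": "t-levee"}},
        ],
    }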
Many types of simple facts can be expressed through a topic map, but topic maps are not as capable for reasoning purposes as RDF
triples. As far as I am aware, having
formal semantics and provable inference was never intended as a design goal of
topic maps. HIP philosophy suggests that this absence
of theorem proving mechanisms is actually an important positive aspect of topic
maps.
The key concepts of RDF are subject -- predicate -- object. As in the Readware constructions, the Orb constructions and topic maps, the mathematical construction of graphs is used. RDF adds the concept of a URI-based vocabulary, and it has datatype and literal concepts.
Simple facts can be expressed in the RDF model and there is a
serialization concept.
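As an equally minimal sketch (plain Python standing in for an actual RDF serialization; the Dublin Core predicate is just a familiar example):

    # An RDF statement: subject -- predicate -- object, with a URI-based
    # vocabulary; objects may be URIs or typed literals.
    triples = [
        ("http://example.org/doc/17",               # subject (URI)
         "http://purl.org/dc/elements/1.1/title",   # predicate (Dublin Core URI)
         ("Flood Report", "xsd:string")),           # object: a typed literal
    ]

    # A trivial serialization concept: one statement per line.
    for s, p, o in triples:
        print(s, p, o)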
While having formal semantics and provable inference was intended
as a design goal of the RDF standard, the most debatable concept is that of entailment.
According to the W3C Recommendation of 10 February 2004 on RDF Semantics:
"Entailment is the key
idea which connects model-theoretic semantics to real-world applications. As
noted earlier, making an assertion amounts to claiming that the world is an
interpretation that assigns the value true to the assertion. If A entails B,
then any interpretation that makes A true also makes B true, so that an
assertion of A already contains the same "meaning" as an assertion of
B; one could say that the meaning of B is somehow contained in, or subsumed by,
that of A. If A and B entail each other, then they both "mean" the
same thing, in the sense that asserting either of them makes the same claim
about the world.
The interest of this observation arises
most vividly when A and B are different expressions, since then the relation of
entailment is exactly the appropriate semantic license to justify an
application inferring or generating one of them from the other."
Of exceeding interest here is that last sentence, where a semantic license is given to the computer program [1].
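A toy illustration of that license (my own sketch in Python, not the W3C's formal machinery): given assertion A and a subsumption rule, a program is licensed to generate B mechanically.

    # Toy entailment: from (x, subClassOf, y) and (i, type, x), infer (i, type, y).
    # This is the kind of mechanical "semantic license" the Recommendation grants.
    facts = {("levee-breach", "type", "Breach"),
             ("Breach", "subClassOf", "Incident")}

    def entail(facts):
        inferred = set(facts)
        for (i, p1, x) in facts:
            for (x2, p2, y) in facts:
                if p1 == "type" and p2 == "subClassOf" and x == x2:
                    inferred.add((i, "type", y))  # the program asserts a new "truth"
        return inferred

    print(entail(facts))  # now also contains ("levee-breach", "type", "Incident")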
Humans exercise semantic licenses, and sometimes quite poorly. But I am reminded that most 1980s computer applications could not spell my name right. Even decades later, I get advertising under that wrong name. This demonstrates how long errors can live.
Allowing computer programs to “take semantic license” can mean
giving longevity to what otherwise should remain ephemeral.
In general terms, allowing the computer to take semantic license makes the computer the creator of meaning and the creator of error. Who or what will resolve it all?
On the subject of true meaning, the RDF standard gets even murkier:
"Exactly what is considered to be the
'meaning' of an assertion in RDF or RDFS in some broad sense may depend on many
factors, including social conventions, comments in natural language or links to
other content-bearing documents. Much of this meaning will be inaccessible to
machine processing and is mentioned here only to emphasize that the formal
semantics described in this document is not intended to provide a full analysis
of 'meaning' in this broad sense; that would be a large research topic. The
semantics given here restricts itself to a formal notion of meaning which could
be characterized as the part that is common to all other accounts of meaning,
and can be captured in mechanical inference rules."
Of course, whether one's semantics restrict themselves to the formal notion of meaning is just another distinction left up to independently thinking authors. Though, as it turns out, it is the single most important distinction of all. So statements like this one create a polemic that makes it hard to bring the statement into some objective light.
The important thing to remember, when comparing the Topic Map and RDF standards with Readware, is that both Topic Maps and RDF are formal and standardized methods for framing and representing knowledge, while Readware is software that embodies human knowledge (represented by a language used in social discourse).
The embodied knowledge has three layers: the Q-stem system-function level, a complete vocabulary, and a precise environment ontology.
The Q-stem system-function level is akin to topic maps, since in this middle ontology the functions are those that are needed in expressive social discourse. Simple compositions of Q-stems, declaring and specifying topics and other subjects of inquiry, are called Readware knowledge types.
For Readware, the four knowledge types are as follows (a small code sketch appears after the list):
1. Category (described by weighted inquiries that serve to assign one or more classifiers to an entire resource),
2. Topic (described by inquiries that unambiguously identify the occurrence of a specific subject),
3. Issue (described by inquiries that identify discourse addressing subjective or objective judgments),
4. Probe (described by inquiries that identify discourse addressing the interrogatories: who, where, why, how much, how often, and other subjective questions as to performance, failure, success, etc.)
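As a minimal sketch of how these four types might be held in code (the field names and lookup are my own, hypothetical, not Readware's format):

    # The four Readware knowledge types, each described by inquiries
    # (illustrative only).
    knowledge_types = {
        "Category": "weighted inquiries assigning classifiers to an entire resource",
        "Topic": "inquiries that unambiguously identify occurrence of a subject",
        "Issue": "inquiries identifying subjective or objective judgments",
        "Probe": "inquiries answering who, where, why, how much, how often",
    }

    def describe(kind):
        return f"{kind}: described by {knowledge_types[kind]}"

    print(describe("Probe"))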
Our use of the word “probe” takes a while to understand; there is a technical dimension to the use of probe in Readware to ask questions.
The key concepts of the Readware information processing infrastructures (Readware Ip Servers) are items of discourse, their occurrences in electronic texts, and their references or real-world entailments. Readware also sets up names, unknowns, numbers and concepts, along with the knowledge types, as items and elements of any resource. Each and any of these items may play an active role in the associations and relevance that humans find in these resources.
Items of discourse are parsed from text by string processors that recognize words, do stemming and perform other sophisticated word analysis. There are different string parsers for different languages. These parsers, written by Tom Adi (1986), pre-date the Porter stemmer and other public-domain software, and have proved to be effective.
The details of the occurrences of items are recorded: e.g., the document URL (via a document number) and the byte offset are processed.
This may not be as formal, by RDF standards, though it is clearly the same operation. Readware does this without the aid of human intervention, so the formalization is in the algorithm employed by the program. The program stores the items and their occurrences in a compressed record of the signature database. Some of the items, the classifiers and topics, for example, are stored in ranked order for the purpose of obtaining information about them later.
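A sketch of the kind of record this implies (the field names and the rank weight are hypothetical; the actual compressed signature database format is Readware's own):

    # Hypothetical posting record for an item's occurrence (illustration only).
    from dataclasses import dataclass

    @dataclass
    class Occurrence:
        item: str            # a name, number, word, concept or knowledge-type
        kind: str            # e.g. "word", "name", "topic", "classifier"
        doc_number: int      # stands in for the document URL
        byte_offset: int     # position of the occurrence within the document
        weight: float = 1.0  # hypothetical rank weight for topics/classifiers

    postings = [
        Occurrence("flood", "word", doc_number=17, byte_offset=1042),
        Occurrence("Flood Risk", "topic", doc_number=17, byte_offset=0, weight=0.8),
    ]

    # Classifiers and topics are kept in ranked order for later retrieval.
    ranked = sorted((o for o in postings if o.kind in ("topic", "classifier")),
                    key=lambda o: -o.weight)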
You may ask: where are the associations, and where are the predicates that tie those subjects and objects together? In Readware, search concepts take over at query time:

<object of inquiry> <has enough fidelity with> <subject of inquiry>
So encoding knowledge of social discourse, as required in the founding committee’s recommended ontology for crisis management, is straightforward.
The entailments that swarm around the objects and subjects of inquiry are determined (computed) at run-time; they are not stored as part of the triple, as they are in RDF and in the <association> of the Topic Map. The fidelity measured from associating the entailments to the subjects and objects of inquiry plays the role of the predicate that ties together the subject and object of inquiry.
Given the Readware ConceptBase and the resources, all the
entailments indicated in the inquiry are computed in real-time.
To determine entailment and association, the network of vocabulary in the Readware ConceptBase is tapped, using the items found in the query. Some parameters passed with the query also set the size of the neighborhood, or scope of context, in which the inquiry (and its entailments) is acceptable. Traditional and advanced retrieval logic may also be passed with the query. This sets up the framework and conditions of a computation, including the number of words included in the analysis, whereupon, if the literal occurrence or the (proved) occurrence of any entailment of any name, word or concept of inquiry is found, a document or resource is reported to be associated or relevant.
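A rough sketch of the shape of this run-time computation (the ConceptBase contents, weights and window parameter below are all invented for illustration; Readware's actual algorithm is not shown here):

    # Hypothetical run-time relevance check: expand query items through a
    # weighted vocabulary network, then test documents within a context window.
    concept_base = {  # item -> weighted entailments (toy stand-in for ConceptBase)
        "flood": {"inundation": 0.9, "levee": 0.6},
    }

    def entailments(item, threshold=0.5):
        """Items entailed by `item` whose weight clears the threshold."""
        return {e for e, w in concept_base.get(item, {}).items() if w >= threshold}

    def relevant(doc_words, query_items, window=50):
        """True if a query item or one of its entailments occurs in the document;
        the window parameter bounds the scope of context examined."""
        scope = set(doc_words[:window])
        for item in query_items:
            if item in scope or entailments(item) & scope:
                return True
        return False

    print(relevant(["the", "levee", "failed"], ["flood"]))  # True, via entailment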
As I have said, the matters of relevance and association are computed at
run-time and are not pre-determined as they are in other systems of
representation.
To approximate the behavior of other systems, Readware employs
both compile-time and query-run-time methods for gathering inquiry statistics
on relevant resources.
Within Readware, the deduction of relationship or association (we call it fidelity) proceeds from the (pre-computed) weights obtained from the substructural ontology of the stems (known terms) and the weights assigned to the constants (names and unknown terms), according to the implicit or explicit logic of the inquiry. Any of these query elements can be self-defined so as to unambiguously refer to a resource or a set of related resources, according to (enforced by) the world-view and knowledge embodied in the specific instance of the Readware Ip Server. Readware recognizes that some items, words in particular, can refer to different items. These differences can be captured in the normal processing of resources.
There is no truth involved, as there is in RDF and OWL and other implementations that rely on RDF. Readware flags ambiguity and depends on human judgment, either at run-time or in the processing of knowledge-types at compile time.
After that, it is only computation. Where others find no structure and place no reliance on the prima facie evidence (i.e., a text), Readware finds and computes aspects of the natural structure and its intrinsic systematic, i.e., ontological, interrelations.
The following is extracted from the W3C definition of formal semantics and its explanation of the use of a semantic model.
“The idea is to provide an abstract,
mathematical account of the properties that any such interpretation must have,
making as few assumptions as possible about its actual nature or intrinsic
structure, thereby retaining as much generality as possible. The chief utility
of a formal semantic theory is not to provide any deep analysis of the nature
of the things being described by the language or to suggest any particular
processing model, but rather to provide a technical way to determine when
inference processes are valid, i.e. when they preserve truth. This provides the
maximal freedom for implementations while preserving a globally coherent notion
of meaning.”
While that may be the idea, in practice everything is specified
and all inferences are explicitly defined e.g.: <currency=dollars>. While
such semantics are intended to be local and ephemeral, the RDF somehow makes
them global, formal and concrete.
Truth, in itself, is ephemeral. The processes (inferences) built on the foundations of supposed truth must be on very shaky ground.
You know we cannot possibly agree with that kind of semantics or formal semantic theory. As I think of it, Adi's semantic theory can probably be characterized as the antithesis of RDF's formal semantic theory.
For the record: the chief utility of Readware's semantics is to provide a computable framework accounting for the properties and nature of the interactions between the things being described by the language. This framework has no inference rules; there is no notion of truth, other than as a concept of human inquiry. It thereby retains as much utility as possible.
This provides the maximal freedom for implementations while preserving a
globally coherent notion of meaning.
I suppose that, to make this complete, I should state the design goals of Readware:
· to provide an automatic means to index resources on the web or any network according to the Readware ConceptSpace [this equates to RDF semantics and to model theory as noted above]; this would include the capability to separate text from formatting information, to read database fields and records, etc.
· to provide a means of equating information with the entailments of an inquiry into a subject as well as its literal components; this is enforced by the program, programmatically and through feature settings.
· to provide a means of classifying related information with the clear and unambiguous entailments of an inquiry.
· to provide a means for cataloging the notions and indicators that unambiguously classify specified types of knowledge.
· to provide recursion through the list of knowledge types (a topic, for example), so that a primitive topic can be a condition or component of a subsequent topic.
· to provide means to reuse the knowledge-types defined and cataloged in Readware cultures; the cataloging means must be clearly readable and writable with a text editor, requiring no programming skill.
· knowledge types must be re-usable and be able to be easily and readily interpreted by any application or further automation.
Of course, the means indicated above are embodied in the Readware ConceptBase for a language, together with the verbal processors and parsing algorithms embodied in the Readware processing engine and services (Readware Ip Servers). The design goals for the Readware Ip Servers were:
· to provide stateless, multi-threaded, standards-based services usable over the Internet, using direct API or CGI modes of operation to support the widest variety of applications.
· to be straightforward in terms of software installation, footprint and configuration, using standard conventional language.
· shall be able to work on all standards-based file, data and data mark-up (e.g., XML) and record standards.
· shall work within established security models (NTLM, HTTPS, cookies, etc.).
· the system must offer flexible indexing and query parameters; these should be defined in plain language and be straightforward in their usage.
· the design of the query command language shall be formal, complete and concise.
· the system will catalog and re-use topical and classificatory knowledge expressed in plain language.
· the catalog (culture) of topics and classifiers shall be easy and straightforward to create.
· the information resources can be distributed and stored locally or remotely to the system.
· Information Collections shall be easy to create, shall be automatically populated given appropriate addressing, and must be easily configured for automatic analysis, indexing and query services.
Ken
Please direct comments to portal@ontologystream.com.
[1]
Comment by the founding committee: Many individuals have commented on the inappropriateness of this feature of RDF, and consequently of OWL. Perhaps, given cultural history, it is understandable why the notion of entailment, e.g. “cause”, is regarded as having a complete fulfillment in classical logic, as seen in the subsumption assertion. The stratified theory developed by Prueitt suggests that, in addition to classical “logical” entailment, there are both localized and holonomic (field or global) structural entailments. In complex systems, these entailments are brought into the moment by (a) memory mechanisms or (b) field constraints associated with environmental affordances. In the structural ontology developed by Adi, the three types of entailment are acknowledged. In future work, we hope to show that classical theorem-proving logical entailment does have a place in a stratified and complex model of reality, but only within the scope defined by temporal stability, i.e., only within the confines of process compartments. (See Chapter 2: Foundations of Knowledge Science.) The logical entailment arises out of the physical need for phase coherence, as inherited by complex systems such as living systems in the form of coherence. The non-universality of fixed formal logics upsets the subsumption assertion made by most Semantic Web advocates.