Saturday, November 20, 2004
Topic map standard, the RDF and Readware constructs
Communication from Ken Ewell (Readware)
I communicate here some similarities and differences among the Topic Map standard, RDF, and the Readware ontology constructs. They are similar, of course, because each is a framework for representing information resources.
They each employ the elemental notion in graph theory, a triple:

< a, r, b >

where a and b are nodes and r is a connector.
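As a minimal sketch (the names here are illustrative, drawn from none of the three frameworks), such a triple is just a labeled edge in a directed graph, for instance in Python:

    # A triple < a, r, b >: nodes a and b connected by relation r.
    from collections import defaultdict

    class TripleGraph:
        def __init__(self):
            self.edges = []               # all (a, r, b) triples
            self.out = defaultdict(list)  # node -> outgoing (r, b) pairs

        def add(self, a, r, b):
            self.edges.append((a, r, b))
            self.out[a].append((r, b))

    g = TripleGraph()
    g.add("document-17", "mentions", "flood")  # a = node, r = connector, b = node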
As you know, though, there is a big difference in how graphs are used.
Where both the Topic Map and the Resource Description Framework require the author to create an additional and "independent" definition of the structure and relations of the resource -- usually a document -- Readware does not require XML mark-up or further definition by the author.
Instead, Readware parses the text elements in a document or other resource (message). Readware identifies each parsed item (by finding it to be a name, number, word, concept or knowledge-type) in the independent Readware ConceptBase (a network of vocabulary and taxonomy with weighted associations calculated from the elements of the Adi substructural ontology).
Let me start by offering my characterization of the Resource Description Framework in comparison to the Topic Map. RDF is a language for representing information about resources on the web -- intended mainly for exchanging information between applications.
I see the Topic Map standard as being more about reifying the
topics and subjects of any given resource in such a way as to clarify their
identity, instances of their occurrence and their associations (by
relationships and role). It is claimed that, when everyone follows the rules in creating a topic map of a particular resource, the topic map can be merged with others, whereby people would be able to navigate the web by topics, for example, and applications could read a topic map to get meta-information about any resource.
This view of a need to reify the meaning of the graph constructions is consistent with the Readware processing model. As the BCNGroup founding committee has suggested in defining HIP philosophy, meaning has a late binding that best occurs in the present moment, as a human interprets information.
The key concepts of the topic map are topics, associations and
occurrences. A tree or network model applies. In addition, topic maps utilize XML processing methods, and they work with URI-based namespaces.
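To make those three concepts concrete, here is a minimal sketch in Python (the identifiers and URIs are invented for illustration; this is not the XTM syntax itself):

    # Topic map core concepts: topics, occurrences, associations.
    topic_map = {
        "topics": {
            "t-flood": {"name": "Flood",
                        "subject": "http://example.org/subjects/flood"},
            "t-levee": {"name": "Levee",
                        "subject": "http://example.org/subjects/levee"},
        },
        "occurrences": [
            # a topic occurring in a resource, typed by a role
            {"topic": "t-flood", "type": "mention",
             "resource": "http://example.org/report.html"},
        ],
        "associations": [
            # an association whose typed roles are played by member topics
            {"type": "threatens",
             "roles": {"threat": "t-flood", "threatened": "t-levee"}},
        ],
    }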
Many types of simple facts can be expressed through a topic map, but topic maps are not as capable for reasoning purposes as RDF
triples. As far as I am aware, having
formal semantics and provable inference was never intended as a design goal of
topic maps. HIP philosophy suggests that this absence
of theorem proving mechanisms is actually an important positive aspect of topic
maps.
The key concepts of RDF are subject -- predicate -- object. As in the Readware constructions, the Orb constructions and topic maps, the mathematical construction of graphs is used. RDF adds the concept of a URI-based vocabulary, and it has datatype and literal concepts.
Simple facts can be expressed in the RDF model and there is a
serialization concept.
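As an equally minimal sketch (plain Python standing in for an actual RDF serialization; the Dublin Core predicate is just a familiar example):

    # An RDF statement: subject -- predicate -- object, with a URI-based
    # vocabulary; objects may be URIs or typed literals.
    triples = [
        ("http://example.org/doc/17",               # subject (URI)
         "http://purl.org/dc/elements/1.1/title",   # predicate (Dublin Core URI)
         ("Flood Report", "xsd:string")),           # object: a typed literal
    ]

    # A trivial serialization concept: one statement per line.
    for s, p, o in triples:
        print(s, p, o)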
While having formal semantics and provable inference was intended
as a design goal of the RDF standard, the most debatable concept is that of entailment.
According to the W3C Recommendation of 10 February 2004 on RDF Semantics:
"Entailment is the key
idea which connects model-theoretic semantics to real-world applications. As
noted earlier, making an assertion amounts to claiming that the world is an
interpretation that assigns the value true to the assertion. If A entails B,
then any interpretation that makes A true also makes B true, so that an
assertion of A already contains the same "meaning" as an assertion of
B; one could say that the meaning of B is somehow contained in, or subsumed by,
that of A. If A and B entail each other, then they both "mean" the
same thing, in the sense that asserting either of them makes the same claim
about the world.
The interest of this observation arises
most vividly when A and B are different expressions, since then the relation of
entailment is exactly the appropriate semantic license to justify an
application inferring or generating one of them from the other."
Of exceeding interest here is that last sentence, where a semantic license is given to the computer program [1].
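A toy illustration of that license (my own sketch in Python, not the W3C's formal machinery): given assertion A and a subsumption rule, a program is licensed to generate B mechanically.

    # Toy entailment: from (x, subClassOf, y) and (i, type, x), infer (i, type, y).
    # This is the kind of mechanical "semantic license" the Recommendation grants.
    facts = {("levee-breach", "type", "Breach"),
             ("Breach", "subClassOf", "Incident")}

    def entail(facts):
        inferred = set(facts)
        for (i, p1, x) in facts:
            for (x2, p2, y) in facts:
                if p1 == "type" and p2 == "subClassOf" and x == x2:
                    inferred.add((i, "type", y))  # the program asserts a new "truth"
        return inferred

    print(entail(facts))  # now also contains ("levee-breach", "type", "Incident")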
Humans exercise semantic licenses, and sometimes quite poorly. But I am reminded that most 1980s computer applications could not spell my name right. Even decades later, I get advertising under that wrong name. This demonstrates how long errors can live.
Allowing computer programs to “take semantic license” can mean
giving longevity to what otherwise should remain ephemeral.
In general terms, allowing the computer to take semantic license makes the computer the creator of meaning and the creator of error. Who or what will resolve it all?
On the subject of true meaning, the RDF standard gets even murkier:
"Exactly what is considered to be the
'meaning' of an assertion in RDF or RDFS in some broad sense may depend on many
factors, including social conventions, comments in natural language or links to
other content-bearing documents. Much of this meaning will be inaccessible to
machine processing and is mentioned here only to emphasize that the formal
semantics described in this document is not intended to provide a full analysis
of 'meaning' in this broad sense; that would be a large research topic. The
semantics given here restricts itself to a formal notion of meaning which could
be characterized as the part that is common to all other accounts of meaning,
and can be captured in mechanical inference rules."
Of course, whether one's semantics restrict themselves to the formal notion of meaning is just another distinction left up to independently thinking authors. Though, as it turns out, it is the single most important distinction of all. So statements like this one create a polemic that makes it hard to bring the statement into some objective light.
The important thing to remember, when comparing the Topic Map and RDF standards with Readware, is that both Topic Maps and RDF are formal and standardized methods for framing and representing knowledge, while Readware is software that embodies human knowledge (represented by a language used in social discourse).
The embodied knowledge has three layers: the Q-stem system-function level, a complete vocabulary, and a precise environment ontology.
The Q-stem system-function level is akin to topic maps, since in this middle ontology the functions are those that are needed in expressive social discourse. Simple compositions of Q-stems, declaring and specifying topics and other subjects of inquiry, are called Readware knowledge types.
For Readware, the four knowledge types are as follows (a small code sketch appears after the list):
1. Category (described by weighted inquiries that serve to assign one or more classifiers to an entire resource),
2. Topic (described by inquiries that unambiguously identify the occurrence of a specific subject),
3. Issue (described by inquiries that identify discourse addressing subjective or objective judgments),
4. Probe (described by inquiries that identify discourse addressing the interrogatories: who, where, why, how much, how often, and other subjective questions as to performance, failure, success, etc.)
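As a minimal sketch of how these four types might be held in code (the field names and lookup are my own, hypothetical, not Readware's format):

    # The four Readware knowledge types, each described by inquiries
    # (illustrative only).
    knowledge_types = {
        "Category": "weighted inquiries assigning classifiers to an entire resource",
        "Topic": "inquiries that unambiguously identify occurrence of a subject",
        "Issue": "inquiries identifying subjective or objective judgments",
        "Probe": "inquiries answering who, where, why, how much, how often",
    }

    def describe(kind):
        return f"{kind}: described by {knowledge_types[kind]}"

    print(describe("Probe"))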
Our use of the word “probe” takes a while to understand; there is a technical dimension to the use of probe in Readware to ask questions.
The key concepts of the Readware information processing infrastructures (Readware Ip Servers) are items of discourse, their occurrences in electronic texts, and their references or real-world entailments. Readware also sets up names, unknowns, numbers and concepts, along with the knowledge types, as items and elements of any resource. Each and any of these items may play an active role in the associations and relevance that humans find in these resources.
Items of discourse are parsed from text by string processors that recognize words, do stemming and perform other sophisticated word analysis. There are different string parsers for different languages. These parsers, written by Tom Adi (1986), pre-date the Porter stemmer and other public-domain software, and have proved to be effective.
The details of the occurrences of items are recorded: e.g., the document URL (via a document number) and the byte offset are processed.
This may not be as formal, by RDF standards, though it is clearly the same operation. Readware does this without the aid of human intervention, so the formalization is in the algorithm employed by the program. The program stores the items and their occurrences in a compressed record of the signature database. Some of the items, the classifiers and topics, for example, are stored in ranked order for the purpose of obtaining information about them later.
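A sketch of the kind of record this implies (the field names and the rank weight are hypothetical; the actual compressed signature database format is Readware's own):

    # Hypothetical posting record for an item's occurrence (illustration only).
    from dataclasses import dataclass

    @dataclass
    class Occurrence:
        item: str            # a name, number, word, concept or knowledge-type
        kind: str            # e.g. "word", "name", "topic", "classifier"
        doc_number: int      # stands in for the document URL
        byte_offset: int     # position of the occurrence within the document
        weight: float = 1.0  # hypothetical rank weight for topics/classifiers

    postings = [
        Occurrence("flood", "word", doc_number=17, byte_offset=1042),
        Occurrence("Flood Risk", "topic", doc_number=17, byte_offset=0, weight=0.8),
    ]

    # Classifiers and topics are kept in ranked order for later retrieval.
    ranked = sorted((o for o in postings if o.kind in ("topic", "classifier")),
                    key=lambda o: -o.weight)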
You may ask: where are the associations, and where are the predicates that tie those subjects and objects together? In Readware, search concepts take over at query time:

<object of inquiry> <has enough fidelity with> <subject of inquiry>
So encoding knowledge of social discourse, as required in the founding committee’s recommended ontology for crisis management, is straightforward.
The entailments that swarm around the objects and subjects of inquiry are determined (computed) at run-time; they are not stored as part of the triple, as they are in RDF and in the <association> of the Topic Map. The fidelity measured from associating the entailments to the subjects and objects of inquiry plays the role of the predicate that ties together the subject and object of inquiry.
Given the Readware ConceptBase and the resources, all the
entailments indicated in the inquiry are computed in real-time.
To determine entailment and association, the network of vocabulary in the Readware ConceptBase is tapped, using the items found in the query. Some parameters passed with the query also set the size of the neighborhood, or scope of context, in which the inquiry (and its entailments) is acceptable. Traditional and advanced retrieval logic may also be passed with the query. This sets up the framework and conditions of a computation, including the number of words included in the analysis, whereupon, if the literal occurrence or the (proved) occurrence of any entailment of any name, word or concept of inquiry is found, a document or resource is reported to be associated or relevant.
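A rough sketch of the shape of this run-time computation (the ConceptBase contents, weights and window parameter below are all invented for illustration; Readware's actual algorithm is not shown here):

    # Hypothetical run-time relevance check: expand query items through a
    # weighted vocabulary network, then test documents within a context window.
    concept_base = {  # item -> weighted entailments (toy stand-in for ConceptBase)
        "flood": {"inundation": 0.9, "levee": 0.6},
    }

    def entailments(item, threshold=0.5):
        """Items entailed by `item` whose weight clears the threshold."""
        return {e for e, w in concept_base.get(item, {}).items() if w >= threshold}

    def relevant(doc_words, query_items, window=50):
        """True if a query item or one of its entailments occurs in the document;
        the window parameter bounds the scope of context examined."""
        scope = set(doc_words[:window])
        for item in query_items:
            if item in scope or entailments(item) & scope:
                return True
        return False

    print(relevant(["the", "levee", "failed"], ["flood"]))  # True, via entailment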
As I have said, the matters of relevance and association are computed at
run-time and are not pre-determined as they are in other systems of
representation.
To approximate the behavior of other systems, Readware employs
both compile-time and query-run-time methods for gathering inquiry statistics
on relevant resources.
Within Readware, the deduction of relationship or association (we call it fidelity) proceeds from the (pre-computed) weights obtained from the substructural ontology of the stems (known terms) and the weights assigned to the constants (names and unknown terms), according to the implicit or explicit logic of the inquiry. Any of these query elements can be self-defined so as to unambiguously refer to a resource or a set of related resources, according to (enforced by) the world-view and knowledge embodied in the specific instance of the Readware Ip Server. Readware recognizes that some items, words in particular, can refer to different items. These differences can be captured in the normal processing of resources.
There is no truth involved, as there is in RDF and OWL and other implementations that rely on RDF. Readware flags ambiguity and depends on human judgment, either at run-time or in the processing of knowledge-types at compile time.
After that, it is only computation. Where others find no structure and place no reliance on the prima facie evidence (i.e., a text), Readware finds and computes aspects of the natural structure and its intrinsic systematic, i.e., ontological, interrelations.
The following is extracted from the W3C definition of formal semantics and its explanation of the use of a semantic model.
“The idea is to provide an abstract,
mathematical account of the properties that any such interpretation must have,
making as few assumptions as possible about its actual nature or intrinsic
structure, thereby retaining as much generality as possible. The chief utility
of a formal semantic theory is not to provide any deep analysis of the nature
of the things being described by the language or to suggest any particular
processing model, but rather to provide a technical way to determine when
inference processes are valid, i.e. when they preserve truth. This provides the
maximal freedom for implementations while preserving a globally coherent notion
of meaning.”
While that may be the idea, in practice everything is specified
and all inferences are explicitly defined e.g.: <currency=dollars>. While
such semantics are intended to be local and ephemeral, the RDF somehow makes
them global, formal and concrete.
Truth, in itself, is ephemeral. The processes (inferences) built on the foundations of supposed truth must be on very shaky ground.
You know we cannot possibly agree with that kind of semantics or formal semantic theory. As I think of it, Adi's semantic theory can probably be characterized as the antithesis of RDF's formal semantic theory.
For the record: the chief utility of Readware's semantics is to provide a computable framework accounting for the properties and nature of the interactions between the things being described by the language. This framework has no inference rules; there is no notion of truth, other than as a concept of human inquiry. It thereby retains as much utility as possible.
This provides the maximal freedom for implementations while preserving a
globally coherent notion of meaning.
I suppose that, to make this complete, I should state the design goals of Readware:
· to provide an automatic means to index resources on the web or any network according to the Readware ConceptSpace [this equates to RDF semantics and to model theory as noted above]; this would include the capability to separate text from formatting information, to read database fields and records, etc.
· to provide a means of equating information with the entailments of an inquiry into a subject as well as its literal components; this is enforced by the program, programmatically and through feature settings.
· to provide a means of classifying related information with the clear and unambiguous entailments of an inquiry.
· to provide a means for cataloging the notions and indicators that unambiguously classify specified types of knowledge.
· to provide recursion through the list of knowledge types (a topic, for example), so that a primitive topic can be a condition or component of a subsequent topic.
· to provide means to reuse the knowledge-types defined and cataloged in Readware cultures; the cataloging means must be clearly readable and writable with a text editor, requiring no programming skill.
· knowledge types must be re-usable and be able to be easily and readily interpreted by any application or further automation.
Of course, the means indicated above are embodied in the Readware ConceptBase for a language, together with the verbal processors and parsing algorithms embodied in the Readware processing engine and services (Readware Ip Servers). The design goals for the Readware Ip Servers were:
· to provide stateless, multi-threaded, standards-based services usable over the Internet, using direct API or CGI modes of operation to support the widest variety of applications.
· to be straightforward in terms of software installation, footprint and configuration, using standard conventional language.
· shall be able to work on all standards-based file, data and data mark-up (e.g., XML) and record standards.
· shall work within established security models (NTLM, HTTPS, cookies, etc.).
· the system must offer flexible indexing and query parameters; these should be defined in plain language and be straightforward in their usage.
· the design of the query command language shall be formal, complete and concise.
· the system will catalog and re-use topical and classificatory knowledge expressed in plain language.
· the catalog (culture) of topics and classifiers shall be easy and straightforward to create.
· the information resources can be distributed and stored locally or remotely to the system.
· Information Collections shall be easy to create, shall be automatically populated given appropriate addressing, and must be easily configured for automatic analysis, indexing and query services.
Ken
Please direct comments to portal@ontologystream.com.
[1]
Comment by the founding committee: Many individuals have commented on the inappropriateness of this feature of RDF, and consequently of OWL. Perhaps, given cultural history, it is understandable why the notion of entailment, e.g. “cause”, is regarded as having a complete fulfillment in classical logic, as seen in the subsumption assertion. The stratified theory developed by Prueitt suggests that, in addition to classical “logical” entailment, there are both localized and holonomic (field or global) structural entailments. In complex systems, these entailments are brought into the moment by (a) memory mechanisms or (b) field constraints associated with environmental affordances. In the structural ontology developed by Adi, the three types of entailment are acknowledged. In future work, we hope to show that classical theorem-proving logical entailment does have a place in a stratified and complex model of reality, but only within the scope defined by temporal stability, i.e., only within the confines of process compartments. (See Chapter 2: Foundations of Knowledge Science.) The logical entailment arises out of the physical need for phase coherence, as inherited by complex systems such as living systems in the form of coherence. The non-universality of fixed formal logics upsets the subsumption assertion made by most Semantic Web advocates.