Thursday, November 17, 2005
Center of Excellence Proposal
Posted 11/17/05 by Gary Berg-Cross at http://colab.cim3.net/forum/ontac-forum/
I wanted to follow up on Eric P's earlier message about a hub approach to building our common ontology. I think that his questions and issues got sidetracked.
Editorial request (definition of a hub in this context): send email
a) the relationship between a modular ontology and a hub
b) the relationship between the concept of a common or “upper” ontology and a modular ontology
Eric was curious about “how pervasive those anti-hub feelings really are.”
I’m not of one mind on this issue, since I think it is complex, and I would welcome some discussion of it. Eric had particular ideas on Dr. Sowa’s subsumption lattice idea, but I haven’t heard responses to those; perhaps others can respond.
Editorial request (outline of Eric’s ideas on the lattice): send email
For myself, I could imagine either a hub or a modular approach, depending on the quality of the hub. I’d have to be convinced that it was doable, and would want to know the “seed” for it and the process of development. I don’t see how this could be done by merging some existing ontologies.
Various people have talked about using UMLS, DOLCE/BFO, SUMO, OpenCyc, ISO 15926, FEA-RMO and the DoD Core Taxonomy. The FEA-RMO is an ontology of a reference model and not of an actual domain such as health. It seems quite hard to connect it to the others.
Also, the DoD taxonomy, in my opinion, has the kind of problems that Barry and John pointed out in the "general ontology", so it may not be easy to assimilate. We might start without trying to merge these in, and might instead start with the best two or three as candidates to seed an effort.
Another point or question concerns leveraging the experience of past efforts. Seven or more years ago, the ANSI Ad Hoc group made an effort to construct a standard called the Reference Ontology. They had the following five-step approach:
1. Upper levels (approx. 100,000 terms): Bring into correspondence (to align) the terms of a small number of selected large-scale ontologies (eventual size approx. 100,000 items). Do so inclusively; that is, create a result in which users can choose which of the component ontologies' terms they wish to see and use.
2. Domain models (under 2,000 terms each): Link into this Ontology selected domain-specific ontologies, developed to support reasoning about time, space, physics, geography, etc. Do so inclusively; allow the linkage of various different models of time, space, etc.
3. Access tools: Create easy-to-use tools for Ontology access and extension.
4. Dissemination: Place the resulting Reference Ontology on the Web, freely available.
5. Theoretical basis: In ongoing work, have a team of highly qualified individuals comb through the Ontology to find powerful generalizations, to weed out unnecessary and inconsistent items, and to create a maximal factoring of the upper levels of the Ontology.
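To make the "inclusive" alignment in step 1 concrete, here is a minimal sketch in Python of what such a result might look like. It is only an illustration under my own assumptions: the source names (OntologyA, OntologyB, OntologyC), the shared concept ids, and the terms are all hypothetical and do not come from the ANSI Ad Hoc group's work.

```python
# Hypothetical alignment table: each shared concept id maps to the
# (source ontology, term) pairs that were judged to correspond.
# All names here are made up for illustration.
ALIGNMENT = {
    "concept-1": [("OntologyA", "entity"), ("OntologyB", "Thing")],
    "concept-2": [
        ("OntologyA", "time period"),
        ("OntologyB", "TimeInterval"),
        ("OntologyC", "interval"),
    ],
}


def view(alignment, chosen_sources):
    """Return, per shared concept, only the terms from the chosen sources.

    Concepts with no term from any chosen source are left out of the view.
    """
    result = {}
    for concept_id, terms in alignment.items():
        kept = [term for source, term in terms if source in chosen_sources]
        if kept:
            result[concept_id] = kept
    return result


if __name__ == "__main__":
    # A user who only wants to see OntologyB's vocabulary.
    print(view(ALIGNMENT, {"OntologyB"}))
```

The point of the sketch is that nothing is thrown away by the alignment itself: every component ontology keeps its own terms, and the choice of which terms to see is made per user at view time.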
This seems quite similar to what we are talking about. Whatever happened to it? Did it fail because it didn’t have an upper ontology?
They listed the following as candidate sources for terms to be included in their “merged Reference Ontology”, and a few of these (UMLS, Cyc) have been mentioned as a base for us too:
· USC/ISI: Pangloss Ontology SENSUS, approx. 70,000 terms, general coverage, little detail, taxonomization supports Natural Language applications.
· Princeton: WordNet, approx. 70,000 terms, general coverage, little detail, taxonomized on Naive Semantics / Cognitive Science principles.
· CYCorp: Upper portion of CYC ontology, approx. 2,500 terms, general coverage, little detail, taxonomized on Naive Semantics / AI principles. Later additions may include more of the 40,000-odd terms currently in CYC.
· EDR: Upper portion of EDR concept ontology, approx. 1,000 terms, general coverage, medium detail, taxonomized for Natural Language applications. Later additions may include more of the approx. 400,000 terms in the EDR concept lexicon.
· New Mexico State University: MIKROKOSMOS, approx. 4,000 terms, general coverage, detailed, taxonomized for Natural Language applications.
· European Union: EuroWordNet (under construction), probably approx. 50,000 terms, little detail, taxonomized on Naive Semantics / Cognitive Science principles.
· LXT Inc.: UMLS medical ontology, exceeds 50,000 terms, medium detail, taxonomized for medical reasoning applications.
Perhaps some of these should also be on our list if they have “matured”.
A last point concerns alignment between our starting sources and how to start on this. Martin Doerr and others reported some work in “Towards a Core Ontology for Information Integration” that described the comparison and convergence of two ontologies using the OntoClean approach (Guarino, N. and Welty, C., “Evaluating ontological decisions with OntoClean,” Communications of the ACM, 45 (2), pp. 61-65, 2002).
This uses analyses of top-level ontological distinctions related to:
1. instantiation versus membership
2. part-of and mereological axioms
3. extensionality
4. connection
5. location and extension
6. co-extension, co-connection
7. unity, singularity and plurality
8. dependence/independence
The claim is that the OntoClean approach “enables: the detection of concept definitions that are lacking in clarity or rigidity; the justification of valid sub-sumption relations; and the detection of invalid sub-sumption declarations.”
Would it be useful to start looking at the match up of some of our “seed” ontologies in this way?
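As a rough illustration of what "detection of invalid subsumption declarations" could mean in practice, here is a minimal Python sketch of one well-known OntoClean constraint: an anti-rigid property cannot subsume a rigid one. The tags, the function name, and the toy data are my own assumptions for the example. It covers rigidity only, not the full list of distinctions above, and it is not Doerr's or Guarino and Welty's implementation.

```python
# Rigidity tags in OntoClean notation:
#   "+R" rigid, "~R" anti-rigid, "-R" non-rigid.
# The tagging below is a hypothetical toy example.
RIGIDITY = {
    "Person": "+R",   # something that is a person is necessarily a person
    "Student": "~R",  # something that is a student can stop being one
}

# Declared subsumptions as (parent, child), read "child is-a parent".
SUBSUMPTIONS = [
    ("Person", "Student"),  # Student is-a Person
    ("Student", "Person"),  # Person is-a Student
]


def invalid_subsumptions(rigidity, subsumptions):
    """Return declared (parent, child) pairs that break the rigidity rule.

    Rule checked: an anti-rigid parent must not subsume a rigid child.
    Pairs with an untagged class are not flagged.
    """
    return [
        (parent, child)
        for parent, child in subsumptions
        if rigidity.get(parent) == "~R" and rigidity.get(child) == "+R"
    ]


if __name__ == "__main__":
    for parent, child in invalid_subsumptions(RIGIDITY, SUBSUMPTIONS):
        print(f"Suspicious: {child} is-a {parent}")
```

Running the same kind of check over two or three tagged seed ontologies would give us a concrete list of links to argue about, rather than a general impression of quality.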
Regards,
Gary Berg-Cross