Thursday, January 19, 2006
Discussion about the future of Protege
First, I would like to thank Mark Musen for his remark, and Oliver Dameron for his discussion. I have not read the other replies, but I will. My remarks below are specific to Mark and Oliver's discussion.
The funding issue is one that can be altered if there is a community understanding of why a radical increase in the Stanford grant is really needed by the governmental and business processes that (1) benefit from the theory of ontological modeling and (2) are held back by the entanglement of ontological modeling with computer programming. BUT part of this need is that the ontology editing and visualization tool NOT be a research tool that changes often and lacks a single instructional path to understanding how to use something "fixed", even if this something is not perfect.
In reading Rinke Hoekstra's note, I am reminded how careful I have to be, and we all have to be, to understand other points of view. I certainly feel that Protege, the team at Stanford, and the worldwide community are doing dedicated work. The issues of assumptions about the nature of the natural world, and computing, should be the core of the discussion. This requires maturity, and it requires that the other position not be misunderstood. These are complex positions and require reflection.
For example, the failure of the electronic Customs Partnership (eCP) subproject to produce ontological mediation of commodity transactions worldwide was, in my opinion, based on management incompetence (as many of the large IT failures are). But the choice of Protege was conditioned on the belief by management that this was a tool that could be used (not an academic project having experimental elements). There was use, but not use that addressed the actual needs of the eCP project.
The Roadmap I developed during my tenure with this eCP project can be read at:
http://www.bcngroup.org/area1/2005beads/GIF/RoadMap.htm
The Roadmap uses Protege in a very limited fashion ... to provide a tool assisting in interoperability between KIF-type n-ary representations and other proprietary "semantic extraction" internal representations of information.
The hub architecture suggested in the Roadmap will benefit from the new developments in Ontology Repository, which is being supported by Protege 3.2.
As most here know, SQL databases suffer from inflexibility, because it is hard to change the data schema once data has populated the database and programs have been written to use that data.
The current paradigms used by Protege (frames and OWL) have a similar limitation: not as inflexible in some senses, but certainly reflecting specific views about ontological modeling and not reflecting other views about ontological modeling.
So the issue is not merely getting additional funding; it is making some changes that simplify the paradigm. I would suggest actually splitting the two paradigms and creating a frames knowledge editing system that has more elements of KIF (Knowledge Interchange Format) and which handles inference quite differently, using the concept of an object with slots and fillers. The purpose would be to realize the frames paradigm purely and in a non-confusing fashion. The abstract-concrete distinction in frames could be made by simply talking about trees and leaves. But making this abstract-concrete distinction part of a class-subclass paradigm IN FRAMES diminishes the core use of a frame as the representation of context (as Roger Schank developed this notion). I hope I stated that clearly.
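To make the "object with slots and fillers" idea concrete, here is a minimal sketch in Python. It is an illustration only, not how Protege implements frames; the `Frame` class, its method names, and the shipment example are all hypothetical. It assumes that a frame is a named context, that frames nest as a tree, and that a leaf is simply a frame with nothing nested under it, so that no class-subclass machinery is needed. Looking a slot up in the enclosing context is also an assumption of this sketch.

```python
class Frame:
    """A hypothetical frame: a named context holding slots and their fillers."""

    def __init__(self, name, slots=None):
        self.name = name
        self.slots = dict(slots or {})  # slot name -> filler
        self.parent = None              # enclosing context, if any
        self.children = []              # frames nested inside this context

    def add_child(self, child):
        """Nest another frame inside this one and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def is_leaf(self):
        """A leaf is a frame with nothing nested under it."""
        return not self.children

    def get(self, slot):
        """Return the filler for a slot, searching enclosing contexts (assumed rule)."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)


# Hypothetical example: a shipment context containing one line item.
shipment = Frame("shipment", {"currency": "USD"})
item = shipment.add_child(Frame("line_item", {"commodity": "coffee"}))

print(item.get("commodity"))  # filler found on the leaf itself
print(item.get("currency"))   # filler supplied by the enclosing context
```

In this sketch the abstract-concrete distinction is nothing more than position in the tree: `shipment` is an inner node and `item` is a leaf.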
The OWL paradigm is a mess, excuse the language, not only because of the great variety of "logics" and "axiom systems" but because of the URI dependencies (which rarely can be questioned without polemic responses) and because of the "ontological assumption" that human knowledge is best organized into class-subclass subsumption. These are huge weaknesses, even if one gets massive ontological interoperability with theorem provers living happily inside this "Semantic Web". This system will be divorced from the world in specific ways, as discussed at:
http://www.bcngroup.org/area3/pprueitt/kmbook/Chapter2.htm
because of the limitations on formalism.
The frames paradigm does not have to make these particular assumptions. The organization of "frames" can have a great richness in the relationships - even to the point of becoming relational-oriented.
In the responses, there are elements of us-them where "us" is defined and the "them" is misrepresented by polemic (even if this is not acknowledged). (conjecture)
We see this us-them phenomenon in international relationships, and most of us cannot figure out how to hold a principled position that argues the imperfection of the "us". Fox News should be watched this morning to see the extreme nature of polemics.
Oliver writes:
Exactly.
If not, your ontology is probably incomplete:
- an abstract class cannot have direct instances
- a leaf does not have subclasses by definition
- therefore, an abstract leaf is a class that cannot have direct
instances, but that does not have (concrete) subclasses for holding
these instances. Until you add at least a concrete (direct or indirect)
subclass to your leaf, it cannot have any instance.
<end quote>
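The rule in the quoted passage can be checked mechanically. The following is a minimal sketch in Python, not Protege's own validation code; `OntClass`, `can_hold_instances`, and `abstract_leaves` are hypothetical names, and the small hierarchy is invented for illustration. It assumes a class can hold instances only if it is concrete or has at least one concrete direct or indirect subclass.

```python
class OntClass:
    """A hypothetical ontology class that is either abstract or concrete."""

    def __init__(self, name, abstract=False):
        self.name = name
        self.abstract = abstract
        self.subclasses = []

    def add_subclass(self, sub):
        """Attach a direct subclass and return it."""
        self.subclasses.append(sub)
        return sub

    def is_leaf(self):
        """A leaf has no subclasses by definition."""
        return not self.subclasses

    def can_hold_instances(self):
        """True if this class or any direct or indirect subclass is concrete."""
        if not self.abstract:
            return True
        return any(sub.can_hold_instances() for sub in self.subclasses)


def abstract_leaves(root):
    """Return the sorted names of abstract classes that have no subclasses."""
    found = []
    stack = [root]
    while stack:
        cls = stack.pop()
        if cls.abstract and cls.is_leaf():
            found.append(cls.name)
        stack.extend(cls.subclasses)
    return sorted(found)


# Invented hierarchy: Vehicle is an abstract leaf, Animal has a concrete subclass.
thing = OntClass("Thing", abstract=True)
vehicle = thing.add_subclass(OntClass("Vehicle", abstract=True))
animal = thing.add_subclass(OntClass("Animal", abstract=True))
dog = animal.add_subclass(OntClass("Dog"))

incomplete = abstract_leaves(thing)          # classes flagged as incomplete
vehicle_before = vehicle.can_hold_instances()

# Adding a concrete subclass removes the abstract leaf.
car = vehicle.add_subclass(OntClass("Car"))
vehicle_after = vehicle.can_hold_instances()
```

Note that this check presupposes the tree-shaped class-subclass organization; it says nothing about models built primarily on n-ary relations.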
My work on "formative ontologies", which is grounded in specific literatures, would have "ontology" form from a process of producing abstract categories and pointing to concrete instances. The tree concept will not support this type of theory, or any software that uses the n-ary as the primary construction. True ontological models exist in great numbers using n-ary (frames but without this abstract/concrete distinction), but these are not Semantic Web (W3C) compliant, and are not what one finds in the Protege implementation of "frames".
The way that OWL is set up is also limiting in specific ways.
The problem is that OWL and frames are as they are, and someone like myself, who has the knowledge and background to see the issues, has very little he can do when the "contractor" says we have to use OWL.
One more thing, and an apology for questioning the paradigms: The integers make sense and do not have an alternative because the ontology of integers is correct. So teaching arithmetic is very straightforward. What has occurred over the past 50 or so years is the development of a strong reductionist paradigm in science, one that is now taking the form of a religion, with people being placed outside of "science" if they feel that "intelligent design" has nothing to do with religion but everything to do with opening up science to the ontology of emergence and the natural stratification of physical processes into organizational layers. Both paradigms of Protege err on the side of reductionism, because of the orientation to engineering and computer science.
Definitions like the one Oliver has offered:
"A primitive class is a class for which you have *not* provided a
necessary and sufficient definition."
simply block off any possibility of talking about Peirce's notion of a primitive, a notion sometimes talked about by John Sowa (but not often anymore). The primitives are considered by many in the Second School of Semantic Science
http://www.ontologystream.com/home2004.htm
as "things that do not exist (in a normal sense)" but which are aggregated into a whole having properties NOT found in the sum of the parts. We look to biological signal pathway phenomena, gene expression phenomena, linguistic "double articulation", and the phenomenon of social expression (memetics) as scientific domains requiring non-reductionist theory and tools.
The primitives are also part of the Zachman framework, and are being developed as part of new OASIS standards for Service Oriented Architecture.
Dr Paul Prueitt
The Taos Research Institute
Taos, New Mexico