January 2, 2006
Additional reading:
Cory Casanave's paper on Data Access
work on ontology for biological signal pathways
EU’s program to model complexity using ontology
In “A Strategy for Improving and Integrating Biomedical Ontologies” (2005), a group of bioinformatics researchers develops the thesis that ontological models can improve communication between scientists and assist in integrating databases of various sorts. Genetic expression and signal pathway data are two examples. This is a useful and productive activity.
When one looks at the BioPAX group’s work on an OWL ontology, recommended as a means of standardizing bioinformatics information, one can see that progress is being made toward clearer ways of collecting and using data. The OWL ontology serves as a data schema for integrating a number of existing databases and for accumulating additional data in a specific, standard fashion.
In the BioPAX case, the group has developed an OWL ontology that describes how various kinds of data can be marked with metadata. I believe that the greatest value from this work will come once harvesting programs are developed that use the data structure specified by the BioPAX ontology. A type of data warehousing with data mining operations is possible because there is an ontology describing how the data is stored. But it should be noted that data warehousing has limitations that depend on how the data is acquired and on the schema into which the data is entered.
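A minimal sketch may make the point about harvesting and its limits concrete. The field names, database labels, and records below are hypothetical illustrations, not the actual BioPAX vocabulary or data; the sketch only assumes that a shared schema exists and that a harvesting program keeps what fits it.

```python
from dataclasses import dataclass

# Hypothetical shared schema: these field names are illustrative assumptions,
# not the actual BioPAX vocabulary.
REQUIRED_FIELDS = ("source_db", "entity", "interaction_type", "partner")


@dataclass(frozen=True)
class Record:
    source_db: str
    entity: str
    interaction_type: str
    partner: str


def harvest(raw_rows):
    """Keep rows that fit the schema and count the rows that do not."""
    accepted, rejected = [], 0
    for row in raw_rows:
        if all(row.get(name) for name in REQUIRED_FIELDS):
            accepted.append(Record(**{name: row[name] for name in REQUIRED_FIELDS}))
        else:
            rejected += 1
    return accepted, rejected


def partners_of(records, entity):
    """A simple mining query that works across every source database."""
    return sorted({r.partner for r in records if r.entity == entity})


rows = [
    {"source_db": "db_a", "entity": "protein_x",
     "interaction_type": "binds", "partner": "protein_y"},
    {"source_db": "db_b", "entity": "protein_x",
     "interaction_type": "activates", "partner": "protein_z"},
    # An observation the schema has no slot for: it cannot be warehoused.
    {"source_db": "db_c", "entity": "protein_x",
     "note": "context-dependent behavior"},
]

records, dropped = harvest(rows)
print(partners_of(records, "protein_x"))
print(dropped)
```

The query across sources is possible only because every accepted record has the same shape. The third row shows the limitation: whatever the schema does not anticipate is simply not in the warehouse.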
The BioPAX ontology is not a scientifically validating model of the biological process. In fact, an “ontological” model of signal pathways is not what the BioPAX project developed. What was developed was a data model for encoding data in a way that was deemed interesting by the BioPAX group.
The problem is that some things are oversold by a community that has oversold things for some time. I do not think that this is happening with the BioPAX group as a whole. However, there are indications that one part of the group, the part more closely aligned with Protégé, would like to make unwarranted claims about the value (regarding basic science) that can be inferred by applying standard reasoning tools like Racer to the data organized in the BioPAX data ontology. This is misguided. What is not misguided is to use the data schema to apply analytic tools.
See: http://www.ontologystream.com/beads/nationalDebate/337.htm
Because the current state of mainstream knowledge engineering and the “ontological sciences” is highly technical and almost incomprehensible to those outside the field, there needs to be a group that can offer criticism of the approach when that approach drifts into overly speculative waters.
An illustrative example is in order.
The second section of the paper discusses the “Basic Formal Ontology”. Clearly there are some interesting ideas in the two primary “ontological categories”. SPAN and SNAP are declared to be ontological categories, with SPAN being processes that have a duration and SNAP being the constituents of processes. The language is nicely nuanced and may have uses in some cases. Clearly it is often convenient to talk about a process having constituent “things”. What this division of reality into two categories does is allow one to talk about certain kinds of things in certain ways.
But there are ontological commitments that are made when one accepts this division.
At the end of the long first paragraph of the second section we find the sentence: “Since processes cannot exist without their participants, occurents are entities that depend on corresponding continuants.”
There are two interpretations of this sentence. One is that for a specific “SPAN process” there is only one set of continuants; the other is that more than one set of continuants may be observed. The first interpretation is fine in many instances, but it does not hold in general where compositional aggregation involves true complexity (natural complexity).
The first interpretation is not consistent with the fact that many processes are defined not by a specific set of participating sub-entities, but by the function that the process fulfills. The process of eating an apple to fulfill one’s need to eat and the process of eating an orange are the same process functionally. The research literature offers other examples in which substructure can vary while fulfilling the same function, i.e., while being the same process.
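The difference between the two interpretations can be sketched in a few lines. The class and function names here are hypothetical and are not drawn from the Basic Formal Ontology; the sketch only contrasts identifying a process by its set of continuants with identifying it by the function it fulfills.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessInstance:
    function: str            # what the process accomplishes
    participants: frozenset  # the continuants taking part


def same_by_participants(a, b):
    """First interpretation: a process is fixed by its one set of continuants."""
    return a.participants == b.participants


def same_by_function(a, b):
    """Functional view: a process is identified by the function it fulfills."""
    return a.function == b.function


eat_apple = ProcessInstance("satisfy hunger", frozenset({"person", "apple"}))
eat_orange = ProcessInstance("satisfy hunger", frozenset({"person", "orange"}))

print(same_by_participants(eat_apple, eat_orange))  # False
print(same_by_function(eat_apple, eat_orange))      # True
```

Under the first test the two acts of eating are different processes; under the second they are the same process realized by different substructure.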
The Pope yesterday made the point that the world is beset with fundamentalism, and I felt that His speech was also directed at scientific materialism and scientific reductionism.