Friday, October 22, 2004
Background discussions on a proposed
Anticipatory Technology Challenge Problem
White Paper on Incident Information Orb Architecture (IIOA)
The interaction between information science innovation
and markets
Wojciech (Jaworski)
You sent an InXight Inc market White Paper, "From Documents to Information, New Model for Information Retrieval":
"Search" for 60 years has focused on helping users retrieve documents, based on the
traditional model of retrieval in libraries. The human is left to skim a
mountain of documents to glean the information he or she needs. This white
paper explores the promise of new technologies to deliver information directly
to users, such as Information Extraction and Timeline, Trend, and Relationship
Visualization, and illustrates these technologies with two real-world examples:
a counter-terrorism example and a corporate intelligence application. It then
proceeds to review the components needed when evaluating a successful
information retrieval solution.
You said:
This might be of interest. The tree visualizations look like your models.
I am familiar with both InXight Inc and this type of product. Many people
at InXight Inc would like to see a true paradigm shift, but recognize that
there is no market support for a true shift.
Purchases and contracts come by fulfilling the requisitions of users. Users, on the other hand, know that straying outside specific, well-known language will create pushback from industry and from other parts of the “system” that drives the typical procurement decision.
The InXight software is a product and has almost no innovations except those that are well
known, and these do not address stratification or the other issues that
Sowa, Ballard, Adi and I, to name a few, are working on. The missing ingredients can be listed:
1) No theory of information: There is no framework that helps to define the possible compositions of information.
2) No delineated substructural ontology: There is no substructural instrumentation that can be used to create a layered instrumentation of information identification.
3) No real-time user manipulation of ontology: There is likely some use of taxonomy or machine ontology (Cyc Corp, OWL, Protégé), but the taxonomy is not available for easy manipulation by users in contexts that are real time and that need to be expressed as anticipatory contexts. Moreover, the taxonomies are generally equipped with first-order logics that compute non-reified data that is not related to what is real.
The 2005 ARDA Challenge Problem proposal from our group, and I include you in that group, would demonstrate all three by building on the existing Readware software and generalizing it to incorporate innovations from other innovators. The foundation of the anticipatory technology is HIP, Human-centric Information Production. HIP is grounded in the notion that knowledge is something experienced by a human in real time, and that “information” is something that acts to cognitively prime mental experiences.
One of our claims is that Readware has a generalizable theory of language, and thus of information, which accomplishes #1 and #2. This is perhaps the only software that currently exists that does this. Ballard’s Mark 3 has design elements that, when realized, will address all three. Sowa’s VivoMind software has foundational work on frameworks, and on transformations over small cognitive graphs, but his work has not combined instrumentation (i.e., web harvesting) and a substructural theory of information into a product that can harvest text and produce a conceptual representation allowing mutual induction of mental experiences. See Orb notational paper.
Applied Technical Systems (ATS) NdCore™ conceptual roll-up has instrumentation that feeds into a stratified architecture, having basic elements, aggregations of basic elements into “concepts”, and aggregations of concepts into metaconcepts. But much of this work is not made public in a way that lets the user manipulate the underlying rules and artifacts; the procurement model really precludes this, or at least does not imagine it. Thus NdCore™ is not HIP.
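To make the idea of a stratified roll-up concrete, here is a minimal sketch in Python. It is not the NdCore process, whose rules are not public; the function name roll_up, the rule dictionaries, and the sample words are all hypothetical, chosen only to illustrate three strata (basic elements, concepts, metaconcepts) whose aggregation rules stay in the user's hands.

```python
from collections import Counter

def roll_up(elements, concept_rules, metaconcept_rules):
    """Aggregate basic elements into concepts, then concepts into metaconcepts.

    Hypothetical illustration only: each rule maps a name at one stratum
    to the set of member names at the stratum below it.
    """
    # Stratum 1: count occurrences of each basic element.
    element_counts = Counter(elements)

    # Stratum 2: a concept's weight is the total count of its member elements.
    concept_counts = Counter()
    for concept, members in concept_rules.items():
        total = sum(element_counts[m] for m in members)
        if total:
            concept_counts[concept] = total

    # Stratum 3: a metaconcept's weight is the total weight of its member concepts.
    metaconcept_counts = Counter()
    for metaconcept, members in metaconcept_rules.items():
        total = sum(concept_counts[c] for c in members)
        if total:
            metaconcept_counts[metaconcept] = total

    return element_counts, concept_counts, metaconcept_counts

# Assumed sample data; the rules are plain dictionaries the user can edit.
concept_rules = {
    "vehicle": {"truck", "convoy"},
    "location": {"bridge", "checkpoint"},
}
metaconcept_rules = {"movement": {"vehicle", "location"}}
elements = ["truck", "bridge", "convoy", "truck", "radio"]

_, concepts, metaconcepts = roll_up(elements, concept_rules, metaconcept_rules)

# The user changes a rule in real time and re-runs the roll-up.
concept_rules["communication"] = {"radio"}
_, concepts_after_edit, _ = roll_up(elements, concept_rules, metaconcept_rules)
```

The point of the sketch is the last two lines: because the rules are ordinary data the user can see and change, the aggregation can be redone at once under a new context, which is the property the paragraph above says the procurement model does not imagine.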
Much more can be done with the CCM (Contiguous Connection Model) patents (1993, 1996) than ATS has done so far. The generalization made in the Orb notational paper has far greater capability, while also being able to exactly reproduce the NdCore processes and results. The BCNGroup science committee has a problem with both the principals who own the patent, and who thus do not want to see an innovation that sets aside the patent, and the clients in Army intelligence who have pushed NdCore to the cutting edge of the available deployed technology. See note on the mapping of the patent space.
Software patents
Figure 1 indicates that at the heart of computer science there are a small number of mathematical concepts.
The AS-IS model, developed by Prueitt in 2002, shows how information technology procurement inhibits both the principals at ATS and the Army intelligence clients. It is a general model of which the ATS-to-government relationship is but one example. One can talk about the Primentia patent in similar terms. Currently some new “evaluation” contracts have been made to “test” the Primentia Orb-type software in exercises that do not show the full value of the Hilbert encoding. These exercises, each costing around $100,000, are vehicles in which the prime contractor, Hick & Associates Inc in this case, makes perhaps 40% as a reward for getting the contract. The work done is minimal and is designed to try to align the Primentia flagship product, HILBERT™, for additional tests.
There is no question in my mind that the innovator, Gruenwald, could develop a data cleaning and integration process that would in fact acquire relational data into small data constructions, similar to the Orb construction. The resulting system would allow a paradigm shift in information science. A proper economic reward would return to the innovator and those who have invested in Primentia Inc. But the AS-IS process flow between government and industry precludes this possibility.
There are other groups that one might mention here: human factors scholars, Human Markup Language, various work in computational linguistics and quantum linguistics, polylogics, schemalogics, Acappellasoftware Inc, Russian work in semiotics and situational control, the work discussed between Citkina and Prueitt on comparative terminology science, other framework-based information technology like Zachman’s frameworks, complexity theory, etc.
I do not fault InXight for saying that they have a “new” model for information retrieval. However, many are ready to move on to something that is truly revolutionary, not, I might add, for the sake of revolution, but because the old information science clearly is not all that can be deployed now.