Monday, August 23, 2004
Foundational work in Knowledge Science
In the current data encoding for ReadWare technology, each word or phrase is mapped to a hash container, either an existing one or a new one, as needed. A simple logical transform tells us whether a new word should be mapped to an existing container. If so, the word's unique name is added (linked) to that container and becomes related to an existing reference in the conceptbase. If a new container is needed, it is placed into a special repository until a rewrite of the complete set of containers is possible.
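The container bookkeeping described above can be sketched roughly as follows. This is only an illustration, not the ReadWare code: the class name, the method names, and above all the key function (here simply the first four lowercased letters) are assumptions standing in for the actual logical transform.

```python
class ContainerStore:
    """Illustrative only: committed hash containers plus a holding area."""

    def __init__(self, key_fn):
        self.key_fn = key_fn   # hypothetical stand-in for the logical transform
        self.containers = {}   # committed containers: key -> set of unique names
        self.pending = {}      # new containers awaiting the next full rewrite

    def add(self, word):
        key = self.key_fn(word)
        if key in self.containers:
            # The word belongs to an existing container: link its name there.
            self.containers[key].add(word)
            return "linked"
        # Otherwise park it in the special repository until a rewrite.
        self.pending.setdefault(key, set()).add(word)
        return "pending"

    def rewrite(self):
        """Fold every pending container into the committed set."""
        for key, names in self.pending.items():
            self.containers.setdefault(key, set()).update(names)
        self.pending.clear()


# Assumed key function: first four letters, lowercased.
store = ContainerStore(lambda w: w.lower()[:4])
first = store.add("reading")    # no container yet, so it is held as pending
store.rewrite()                 # the complete set of containers is rewritten
second = store.add("reader")    # now links to the existing container
```

The point of the holding area is that lookups against the committed containers stay cheap, while the cost of restructuring is paid only when a full rewrite is convenient.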
The 1984 patent shows a sophisticated method based upon morphology. The goal was to reach a unique word stem by reducing a word to three significant consonants. One might think of dozens of ways to hash three letters. Initially, we had even more sophisticated methods that included rules that mitigated some of the effects of language change. It worked, though not as well as we had hoped.
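To make the idea of a three-consonant stem concrete, here is a deliberately naive sketch. It is not the method of the patent, whose morphological rules are not reproduced here: it simply drops the vowels a, e, i, o, u, keeps the first three remaining letters, and hashes them with a base-26 polynomial, which is only one of the dozens of possible ways. The function names and the table size are assumptions.

```python
VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant


def consonant_stem(word, size=3):
    """Keep the first `size` consonants of a word (naive, rule-free)."""
    consonants = [c for c in word.lower() if c.isalpha() and c not in VOWELS]
    return "".join(consonants[:size])


def bucket(stem, table_size=1024):
    """Map a short stem to a table slot with a base-26 polynomial hash."""
    h = 0
    for c in stem:
        h = h * 26 + (ord(c) - ord("a"))
    return h % table_size


stems = [consonant_stem(w) for w in ("write", "wrote", "written", "writer")]
```

In this toy version all four forms of "write" reduce to the same stem and therefore land in the same slot. Unrelated words can collide as well, which is one reason a scheme this simple falls short.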
We saw a slightly modified standard hashing scheme as a means of solving table-lookup problems and lessening computational load. Our data encoding process was not the object of our focus; it was a solution born out of necessity. It does its work well. We have a very fast system. Texts are compiled into "mathematical signatures" in which each significant word and all concept guesses from the content of the text are "optimally" stored. Each signature corresponds to a sequence of words in a text.
Each text or document signature is a record in the coding and structure of a random access database. There is a header with some information about the content and a body with some information content. There is nothing new there. Your notes on Orb encoding seem to have several new things.
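For readers who want a picture of such a header-plus-body record, a minimal sketch follows. The field layout is an assumption made for illustration (a 32-bit document id and a 16-bit count in the header, then one 32-bit container code per significant word in the body); the actual ReadWare record format is not shown here.

```python
import struct

# Assumed header layout: document id (uint32) and number of codes (uint16),
# little-endian with no padding.
HEADER = struct.Struct("<IH")


def pack_signature(doc_id, codes):
    """Serialize one signature as a header followed by a body of codes."""
    body = struct.pack(f"<{len(codes)}I", *codes)
    return HEADER.pack(doc_id, len(codes)) + body


def unpack_signature(blob):
    """Read the header first, then exactly as many codes as it announces."""
    doc_id, count = HEADER.unpack_from(blob, 0)
    codes = struct.unpack_from(f"<{count}I", blob, HEADER.size)
    return doc_id, list(codes)


record = pack_signature(7, [997, 12, 305])
restored = unpack_signature(record)
```

Because the header states how long the body is, a reader can skip from one record to the next without decoding the body, which is what makes random access over a file of signatures practical.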
We would like to know more about what the processor does to express functional relations between the same category and class of things. Recently we created the conceptbase -- a fixed taxonomy of word-themes for all the major word-forms of a language. That was really hard work.
As you have noted, we used a methodology that can be generalized.
The ReadWare signature file can be more sophisticated.
I will send you a parse of one of the bead-game pages, with the byte-wise location of all <idioms, terms, names, concepts, categories, topics, issues and probes>, in an XML rendition. Then we can open up the discussion about the use of Hilbert encoding and the integration of other components of an anticipatory technology.
Ken
A Systematic Review of all Software Patents