Friday, December 02, 2005
Center of Excellence Proposal
The Taos Institute
(on the possibilities)
[bead thread on curriculum reform]
Ballard’s communication on information theory advance [258]
Communication from Richard Ballard
Paul Werbos:
This quick note is to confirm that the Physical Theory of Knowledge and Computation is founded in purest physics, not on automata or anything digital. That, I think, is why Peter Freeman challenged me to find the science.
The fundamental breakthrough that launched this quest came in Sept-Oct 1975 while working with Alfred Bork on our Physics Computer Development project at UCI, funded by NSF through Andy Molnar's Technical Innovations in Education program. [Is Andy retired but still around? I would love to tell him what has happened.] In 1971 I had created a very successful "New World" graphic simulation of classical mechanics called "Motion," and Andy sweetened the pot when I proposed to give the same open-ended "create your own world" treatment to quantum mechanics (Schrodinger's time-dependent wave equation). At the time everyone's favorite examples were the 3-minute film loops of wave functions crashing into potential barriers -- films made at Lawrence Livermore by Harry Shay and Judah Schwartz.
I had prototyped what I hoped to use with forward difference equations, a la Euler's method, before writing the proposal and thought they were no problem. But when I finally started in earnest, I discovered that forward and backward differences were unstable. Harry Shay came to visit and gave me the news that parabolic partial differential equations were stable only with central difference methods (i.e., simultaneous equations or matrix inversion). He and Judah had spent hundreds of thousands of dollars of "free" [after-hours] supercomputer time to get each 3-minute film loop. The loops could not be any longer because the wave functions would reflect off the memory-limit boundaries and come back into the camera's field of view as unwanted eddies.
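[An illustrative sketch, not Ballard's code: the stability problem he describes shows up already in the simplest parabolic model, u_t = u_xx. The explicit forward-difference step blows up once the time step exceeds the dx^2/2 limit, while the implicit central-difference step -- the "simultaneous equations" route Shay recommended -- only ever requires solving one tridiagonal system per time step.]

    # Sketch (assumed model problem u_t = u_xx, not Ballard's quantum simulation):
    # explicit vs. implicit time-stepping on the same grid.
    import numpy as np
    from scipy.linalg import solve_banded

    n, dx, dt = 200, 0.01, 0.001                 # note dt >> dx**2/2, explicit limit violated
    x = np.linspace(0.0, (n - 1) * dx, n)
    u0 = np.exp(-((x - x.mean()) ** 2) / 0.01)   # localized initial bump

    def explicit_step(u):
        # forward-Euler in time, central second difference in space
        lap = np.zeros_like(u)
        lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
        return u + dt * lap

    def implicit_step(u):
        # backward-Euler in time: (I - dt*Laplacian) u_new = u, a tridiagonal solve
        r = dt / dx**2
        ab = np.zeros((3, u.size))               # banded storage: super-, main, sub-diagonal
        ab[0, 1:] = -r
        ab[1, :] = 1 + 2 * r
        ab[2, :-1] = -r
        return solve_banded((1, 1), ab, u)

    ue, ui = u0.copy(), u0.copy()
    for _ in range(100):
        ue, ui = explicit_step(ue), implicit_step(ui)
    print("explicit max |u|:", np.abs(ue).max())  # enormous: the explicit scheme has blown up
    print("implicit max |u|:", np.abs(ui).max())  # stays bounded

The same structure carries over to the time-dependent Schrodinger equation, where the Crank-Nicolson variant of this implicit step is the standard choice.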
Being a UC Berkeley Ph.D. with highest honors, of course, I decided that I would just find some totally new solution to the problem. Christmas vacation was coming (as now) and I could always find at least 2 free weeks that time of year (it's harder now). North American Aviation (NAA) had put me through Berkeley on a scholarship, so every summer I was guaranteed a summer job. I started there as a "Technical Computer" -- Feynman's job at Los Alamos -- then traded my mechanical Monroes and Marchants in for a Bendix G-15 vacuum-tube machine (now in the Smithsonian) and did mathematical pick-and-shovel work simulating inertial navigation servo loops and accelerometer integrations on everything from the Nautilus to the Minuteman.
Naively it seemed obvious that taking a second-order PDE with at most 2n-1 non-zero matrix elements (a tridiagonal difference operator) and inverting it into a dense matrix with n-squared non-zero elements (the Green's function -- the field matrix) did not conserve information content. THE INVERSES WERE REVERSIBLE, SO THEIR INFORMATION MUST BE CONSERVED. Ten years earlier I had spent a summer learning von Neumann's paper on inverting large matrices in order to deal with the NAA engineers' problems of ill-conditioning. Von Neumann had gotten hung up on round-off error and took everyone with him from there on out, minimizing errors while maximizing information redundancy.
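[A quick way to see the fill-in he is pointing at (my example, not his): a symmetric tridiagonal operator carries only 2n-1 independent entries, yet its inverse -- the discrete Green's function -- is generically a fully dense n-by-n array.]

    # Sketch: sparsity of a tridiagonal operator vs. its dense inverse.
    import numpy as np

    n = 8
    T = (np.diag(2.0 * np.ones(n))
         + np.diag(-1.0 * np.ones(n - 1), 1)
         + np.diag(-1.0 * np.ones(n - 1), -1))   # symmetric tridiagonal: 2n-1 independent values
    print("nonzeros in T      :", np.count_nonzero(T))               # 3n - 2 stored entries
    print("nonzeros in inv(T) :", np.count_nonzero(np.linalg.inv(T)))  # n * n, fully dense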
I started routinely with Gaussian reduction (decomposition into diagonal, upper-, and lower-triangular factors), then found everything went to n-squared when inverting the upper and lower triangles. So I vowed not to do that. Instead I wrote the matrix elements out algebraically in terms of the last 2n-1 variables I had derived so far. Pushing these around, I recognized a familiar algebraic pattern.
Nine years earlier NAA had assigned me the summer-job problem of reprogramming all the trigonometric subroutines in a missile computer (probably Regulus or Hound Dog) because its arctan program had the wrong principal interval. During a test flight the missile had pointed at some magic angle and gotten a new vector that turned it 180 degrees. It ultimately crashed in a farmer's field in northern Florida.
As a summer hire I showed up just in time to try to fix it. There was no memory space left to make the change, so they asked me to convert all the Taylor series to continued fractions and find them 3 words of space -- the rumor was that continued fractions would converge faster than Taylor series.
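[The rumor was right, and it is easy to check today (my illustration, not the original missile code): a classical continued fraction expansion of the arctangent, evaluated bottom-up, converges far faster near x = 1 than the Taylor series does.]

    import math

    def arctan_cf(x, depth=12):
        # Continued fraction: atan(x) = x/(1 + x^2/(3 + 4x^2/(5 + 9x^2/(7 + ...))))
        acc = 2 * depth + 1                       # innermost denominator
        for k in range(depth, 0, -1):
            acc = (2 * k - 1) + (k * k * x * x) / acc
        return x / acc

    def arctan_taylor(x, terms=12):
        # Truncated Taylor series: atan(x) = x - x^3/3 + x^5/5 - ...
        return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

    x = 1.0
    print("math.atan     :", math.atan(x))
    print("cont. fraction:", arctan_cf(x))        # agrees with math.atan to many digits
    print("12-term Taylor:", arctan_taylor(x))    # error still a few percent at x = 1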
Now, 9 years later, those matrix-element formulae looked exactly like those familiar continued fractions. I made a change of variable and everything collapsed into "continued products" formed using only those 2n-1 continued fractions. I could get everything I needed without creating any matrix at all, by computing only the 2n-1 fractions with about the same number of scalar multiplications. Information storage and computing execution both had costs proportional to information content. Shannon was right! I was hooked -- and all in 2 weeks at Christmas.
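[For reference, the structure Ballard describes matches what is now a standard closed form for the inverse of a tridiagonal matrix, often credited to Usmani; I quote it as a plausible reconstruction of his "continued products," not as his actual derivation. For a tridiagonal T with diagonal a_i, super-diagonal b_i, and sub-diagonal c_i:]

    \theta_0 = 1,\quad \theta_1 = a_1,\quad \theta_i = a_i\,\theta_{i-1} - b_{i-1}c_{i-1}\,\theta_{i-2}
    \phi_{n+1} = 1,\quad \phi_n = a_n,\quad \phi_i = a_i\,\phi_{i+1} - b_i c_i\,\phi_{i+2}
    (T^{-1})_{ij} = (-1)^{i+j}\, b_i \cdots b_{j-1}\,\theta_{i-1}\,\phi_{j+1}/\theta_n \quad (i \le j)
    (T^{-1})_{ij} = (-1)^{i+j}\, c_j \cdots c_{i-1}\,\theta_{j-1}\,\phi_{i+1}/\theta_n \quad (i > j)

[The ratios \theta_i/\theta_{i-1} and \phi_i/\phi_{i+1} obey exactly the kind of continued-fraction recursions he mentions, and every entry of the inverse is a product built from those ratios -- no dense matrix ever has to be formed.]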
I told Drs. Alfred Bork, Joe Marasco, Andy Molnar, and other UCI physicists, then dove into the books -- with so much structure there, mathematicians must surely have discovered all this already. But they hadn't.
Richard Hamming visited UCI the next summer on sabbatical before moving on to the Naval Postgraduate School in Monterey. He heard me give my spiel and then told me for certain -- mathematicians knew none of this. In the meantime, I had already started the first micro-based publishing company for William C. Brown and would soon move on to the Apple Education Foundation for Steve Jobs, Woz, and Mike Markkula. They were moving out of their garage and wanted me to give away computers and kick-start an educational software publishing industry.
Hamming then took on the role of pushing me to continue my matrix work whenever we talked. But it wasn't until 1986 (10 years later), after the educational software publishing industry crashed, that I got back to working on it at times other than Christmas. By then I had two missions: (1) to make (educational) knowledge capture cost-effective, and (2) to solve the SDI problem of adaptive discrimination by designing a battle-management testbed that used lasers to learn proper mid-course signatures (in 15 minutes) and then machine-teach those signatures to infrared telescopes and terminal-phase interceptors. It turned out that Brilliant Pebbles had to be brilliant in order to invert a covariance matrix. I knew I could do it with Dumb Pebbles, and we built demos for President Reagan to show him how the pebbles would peel off one by one after picking out their own targets (on an Atari game machine).
More important, Feynman had found by inspection that his electron-electron scattering matrix elements could be approximated simply by using products of the diagonal and semi-diagonal matrix elements, and had written that discovery up as a curious note. His calculations were based upon perturbation theory and he had no way to predict the sign of each element. He assumed that because the perturbation series might need a small coupling constant to guarantee convergence, his results and QED calculations could only guarantee the right answers for "weakly coupled interactions."
By contrast, I knew the sign of every element and could compute "exact" inverse magnitudes. The formulae did not change as a function of field strength. If the magnitudes truly depended on the diagonal and semi-diagonals alone, then the scattering was accomplished with only a single photon exchange. All his other assumed virtual states were easily accounted for through the continued-product terms. The outstanding philosophical debate has been "are the virtual states physical?" The answer seems clearly NO: there are no missing degrees of freedom that might independently necessitate or populate some different virtual states. The single photon exchange accounts for everything.
I had met Feynman twice before, at Berkeley and at UCI, and tried to set up a meeting at Caltech. Friends at Caltech suggested I work out the inverses first for both 3 and 4 dimensions -- non-trivial. In the meantime I had met with Peter Freeman (1987) and he insisted that I first prove that information was conserved in computation. No one else seemed to believe that, perhaps including Feynman. Landauer, Bennett, and others were still trying to understand Shannon's mistake in calling information entropy. Everyone assumed that reversible computations had to be done quasi-statically to be thermodynamically reversible. Shannon's entropy mistake continued to vex everyone. Even while real heat losses plummeted as voltages dropped, Nature kept publishing articles on the someday-future need for quasi-static thermal reversibility.
It seemed clear to me that information remained information (Leon Brillouin had caught the sign error and called it negentropy) while still relatively noise-free within abundant bandwidth. Information became entropy when signals overflowed bandwidth and became the noise overburden that effectively destroyed bands that had previously passed information unimpeded. Feynman's quote "finite space contains finite information" always seemed clear enough to me.
I will write soon of all that came while developing the Theory of Knowledge, but let me take up immediately your challenge of "localization." My understanding of the term deals with the issue of what becomes of some point-like energy disturbance inserted within the field. Does it remain localized where placed, does it diffuse, does it travel as a bundle, or, if propagating along some axis, does it lose form? More significantly, does it tend to grow explosively greater in magnitude as it spreads instead of diffusing with decreasing density? Any tendency toward explosive non-conservation and inability to localize would surely signal that the "mathematical solution" had become decidedly non-physical.
I thought of the localization problem before Hamming quizzed me on my plan of research. It seemed clear that starting with 2n-1 non-zero tridiagonal constants was a measure of matrix dimension n and not strictly of information content. If the three diagonals were made up uniformly of just one or two distinct constants, does the cost vary with the number of constants (informational degrees of freedom) or with the matrix dimension (space encompassed)? The derived results remained proportional to the number of unique constants (information content), so massive matrix computations might be achieved requiring only two or three scalar multiplications.
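[One concrete instance of that "cost proportional to the number of distinct constants" (my illustration, not Ballard's): the n-by-n second-difference matrix with 2 on the diagonal and -1 on both off-diagonals -- a single pair of constants -- has the classical closed-form inverse]

    (T^{-1})_{ij} = \frac{\min(i,j)\,\bigl(n + 1 - \max(i,j)\bigr)}{n + 1}

[so any individual entry of the inverse costs only a couple of scalar operations, independent of the matrix dimension n.]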
Last year, as an exercise, I took up the infinite scattering problem for a week. I wanted to generate an existence theorem for tridiagonal inverses. Originally I had built all my inverses over the space of complex numbers and not confined them to real values. The time-dependent Schrodinger equation is not relativistically covariant and is first order in time, so complex QM representations are often convenient. The other item of study was to leave the differential delta-x's explicitly finite and not assume they disappear in Newtonian limits. Real space and/or time should retain its finite QM granularities in nanotech.
I reduced all the calculations to a simple map and then tried last night to transform it into a PDF image you might enjoy studying. Clearly, Paul W., you appear to be another potential reviewer when I settle down to writing. Unfortunately I found that Acrobat Distiller did not reproduce the Greek letters and other symbols used in my equations. I can print them out and send copies to you. I will test tomorrow to see whether MS Word or HTML or some other intermediary format might preserve what Acrobat failed to translate in my drawings.
The point is that localization is the test I use in judging whether a given region of "complex information parameter space" will produce an inverse that is well behaved. We can talk in greater detail when both of us (plus John Sowa) have the map before us. The possible answers to your challenge can be tested. We can go through any catalog of well-known second-order differential equations, such as the Handbook of Mathematical Functions by Abramowitz and Stegun (I have the third printing, 1965). Their values, derivatives, and integrals are all well tabulated. We can plot their positions on the map, noting the conditions under which they remain well behaved, and examine graphically just how their tabulated behaviors correspond to the parameter values I have plotted. I have done this before myself, though not exhaustively. The results have been striking.
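[A sketch of that map exercise -- mine, and hypothetical: the coordinate conventions are taken from the description in the next paragraphs, reading s as the lower-over-upper semi-diagonal ratio and delta as the signed diagonal-over-upper ratio. It discretizes a second-order equation y'' + p(x) y' + q(x) y = 0 with central differences and asks where each grid point lands relative to the parabola s = delta^2/4.]

    import numpy as np

    def map_coordinates(p, q, x, h):
        # 3-point central-difference stencil for y'' + p y' + q y at grid points x
        lower = 1.0 / h**2 - p(x) / (2.0 * h)    # coefficient of y_{k-1}
        diag  = -2.0 / h**2 + q(x)               # coefficient of y_k
        upper = 1.0 / h**2 + p(x) / (2.0 * h)    # coefficient of y_{k+1}
        return diag / upper, lower / upper       # (delta, s)

    h = 0.05
    x = np.arange(1.0, 5.0, h)

    # Airy equation y'' - x y = 0: no first-derivative term, so s = 1 everywhere.
    d_airy, s_airy = map_coordinates(lambda t: 0.0 * t, lambda t: -t, x, h)

    # Bessel-type equation y'' + (1/x) y' + (1 - 1/x**2) y = 0: the first-derivative
    # term makes the stencil non-symmetric, so s drifts away from 1.
    d_bess, s_bess = map_coordinates(lambda t: 1.0 / t, lambda t: 1.0 - 1.0 / t**2, x, h)

    for name, d, s in [("Airy", d_airy, s_airy), ("Bessel", d_bess, s_bess)]:
        inside = s > d**2 / 4.0                  # inside the parabola: complex regime
        print(name, "| s range:", s.min(), s.max(), "| points inside parabola:", inside.sum())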
The map is characterized by two parameters. The vertical coordinate is the symmetry (s), the ratio of the lower over the upper semi-diagonal terms; first-order derivative terms make this ratio non-symmetric. The horizontal coordinate measures diagonal dominance (delta), the ratio of the diagonal element magnitude to the upper semi-diagonal magnitude.
The striking feature of this map is a parabola starting at the origin (delta = 0, s = 0) and cupping upward along the curve s = delta^2/4. The boundary values nearest that parabolic curve are either asymptotic or strongly cusped. Variables within this parabola all have complex values, EXCEPT on the line of symmetry s = 1. There they are constant and always real -- we have named it variously the "Line of Reality" or "Line of Physics."
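[My reading of why a parabola would be the boundary, offered as an assumption rather than Ballard's construction: in the constant-coefficient case the tridiagonal operator is a three-term recurrence, and its characteristic roots go complex exactly inside that parabola.]

    \ell\,y_{k-1} + d\,y_k + u\,y_{k+1} = 0
    \;\Longrightarrow\;
    r^2 + \delta r + s = 0, \qquad \delta = d/u,\; s = \ell/u,
    \qquad r = \frac{-\delta \pm \sqrt{\delta^2 - 4s}}{2}

[Outside the parabola s = delta^2/4 the two roots are real and distinct; inside it they are complex conjugates; on the boundary they coalesce, which would account for the cusped behavior near the curve. On the line s = 1 inside the parabola the roots are unit-modulus conjugates, so symmetric combinations of them stay real and bounded -- one possible reason that line behaves so specially.]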
There are two parameters that are functions of diagonal dominance (delta) and of symmetry (s). One, g, appears to have an inertia-like mass; the other, f, appears to be capable of sustaining motion, as with momentum. Their expressions look remarkably Lorentzian. On that line of reality f has the constant value 1, which I think is the upper limit on stable localization. A number of well-known special functions ride this line of symmetric reality like a rail, but slip off easily into a well-localized reality (f < 1) where the line emerges out of the parabola onto the (everywhere real-valued) plane outside.
The inner parabolic region immediately below the s = 1 line seems most explosively non-localized (if f > 1 indicates that), but amplitudes there are all complex. The triangular region just below the parabola, from (s = 1, delta = -2) down to (s = -1, delta = 0) and up to (s = 1, delta = +2), may be modestly non-localized with f > 1, but all amplitudes there are real.
Outside the parabola, g always has real values that cusp very much like a Lorentzian mass approaching the speed of light. The cusp seems knife-edge sharp, but complex phase values of modest constant magnitude sit infinitesimally near on the other side of the cusped rise. If the real world truly abhors infinities, this transition looks like a potential leap to "warp drive." Inside the parabola, g's complex phase angles change smoothly and amplitudes remain nearly constant.
Paul, you are going to love this map of potential realities. I think it has everything needed to build a whole new intuition.
The tridiagonal matrix is isomorphic to the PDE. God and Physics have already shown us how delighted they are with PDEs. Its reversibility has been a constant, perhaps by definition. Is there any way it can be reversible and not conserve what it started with when first defined? Does switching from particle to field perspectives and back involve some hysteresis loop, consuming some observable resource -- and if so, what?
Dick