# 6.3 Hierarchical Analogy

We have not yet discussed the possibility that analogies might be justified by analogy.
Obviously, analogy as a general mode of thought cannot be justified by analogy; that would be
circular reasoning. But, just as particular analogies can be justified by inductive or deductive
background knowledge, so can particular analogies be justified by analogical background
knowledge. In some cases, the mind may use analogical reasoning to determine how probable it
is that similarity A between x and x′ will imply similarity B between x and x′, by observing
which similarities have, in similar situations, led to which other similarities.
Actually, this sort of analogical background information is a special kind of inductive
background information, but it is worth distinguishing.
Let us be more precise. Assume that processor P1 executes a long sequence of analogical
reasoning processes, and processor P2 observes this sequence, recording for each instance a
vector of the form (x,x′,f,w,w′,v,v′,R,r), where r is a number measuring the total
prominence of the patterns recognized in that instance, and R is the set of patterns located.
The prominence of a pattern may — as in the previous chapter — be defined as the product of
its intensity with its importance. The prominence of a set of patterns S may be crudely defined as
|S|·K, where |S| is the structural complexity of the set and K is some number representing the
prominence of the patterns in the set. A very crude way to define K is as the average over all (y,z) in S of
the importance of (y,z). A more accurate definition could be formulated by a procedure similar to
Algorithm 3.1.
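The definitions just given can be sketched as follows. This is a minimal illustration, not the text's own construction: the `Pattern` fields are hypothetical stand-ins, and the structural complexity |S| is crudely approximated by the cardinality of S, since the text defines |S| elsewhere.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pattern:
    y: str            # the process part of the pattern (y, z)
    z: str            # the entity the process is applied to
    intensity: float
    importance: float

def prominence(p: Pattern) -> float:
    """Prominence of a single pattern: its intensity times its importance."""
    return p.intensity * p.importance

def set_prominence(S: list[Pattern]) -> float:
    """Crude prominence |S| * K of a pattern set S.

    Assumptions: |S| is approximated by the number of patterns in S,
    and K by the average importance of the patterns (y, z) in S.
    """
    if not S:
        return 0.0
    K = sum(p.importance for p in S) / len(S)
    return len(S) * K
```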
Then processor P2 can seek to recognize patterns (y,z) with the property that when x is a
pattern in (x,x′,f,w,w′,v,v′,R), r tends to be large. This is an optimization problem:
maximize the correlation of r with the intensity of x, over the space of patterns in the first six
components of the vector. But it is a particularly difficult optimization problem in the following
sense: determining what entities lie in the space over which optimization is taking place is, in
itself, a very difficult optimization problem. In other words, it is a constrained optimization
problem with very unpleasant constraints. One very simple approach would be the following:
1. Using straightforward optimization or analogy, seek to recognize patterns in
(x,x′,f,w,w′,v,v′,R).
2. Over the space of patterns recognized, see which ones correlate best with large r.
3. Seek to recognize new patterns in (x,x′,f,w,w′,v,v′,R) in the vicinity of the answer(s)
obtained in Step 2.
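The three steps above can be sketched as a control loop. The pattern recognition machinery itself is left unspecified by the text, so `recognize`, `intensity`, and `neighbours` below are hypothetical callables standing in for it; only the recognize-correlate-refine flow is the point.

```python
import statistics

def correlation(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def second_level_search(instances, recognize, intensity, neighbours, top=3):
    """instances: list of (record, r) pairs, where each record stands for
    the tuple (x, x', f, w, w', v, v', R) and r is its prominence score.
    recognize(record), intensity(pattern, record) and neighbours(pattern)
    are assumed callables, not part of the original text."""
    # Step 1: recognize candidate patterns in the records.
    candidates = set()
    for rec, _ in instances:
        candidates.update(recognize(rec))
    # Step 2: rank candidates by how well their intensity in each record
    # correlates with that record's prominence score r.
    rs = [r for _, r in instances]
    scored = sorted(
        candidates,
        key=lambda p: correlation([intensity(p, rec) for rec, _ in instances], rs),
        reverse=True,
    )
    # Step 3: return new search targets in the vicinity of the best answers.
    return [q for p in scored[:top] for q in neighbours(p)]
```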
Perhaps some other approach would be superior, but the difficulty is that one cannot expect to
find patterns in a given narrow vicinity merely because functions in that region correlate well
with r. The focus must be on the location of patterns, not the search for large correlation with r.
In this way analogy could be used to determine which analogies are likely to pay off. This
might be called second-level analogy, or "learning by analogy about how to learn by analogy."
And the same approach could be applied to the analogies involved in analyzing analogies,
yielding third-level analogy, or "learning by analogy how to learn by analogy how to learn by
analogy." Et cetera. These are tremendously difficult optimization problems, so that learning on
these levels is likely to be rather slow. On the other hand, each insight on such a high level will
probably have a great impact on the effectiveness of lower-level analogies.
Let us be more precise about these higher levels of learning. A processor which learns by
second-level analogy must be connected to a processor which learns by analogy, in such a way
that it has access to the inputs and the outputs of this processor. Similarly, a processor which
learns by third-level analogy must be connected to a processor which learns on the second level
in such a way that it has access to the inputs and the outputs of this second-level processor — and
the inputs and outputs of this second-level processor include all the inputs and outputs of at least
one first-level processor. In general, the absolute minimum number of inputs required for an n’th-
level analogy processor is proportional to n: this is the case, for instance, if every n’th level
processor is connected to exactly one (n-1)’th level processor. If each n-level processor is
connected to k (n-1)’th level processors for some k>1, then the number of inputs required for an
n’th level processor is (1 - k^(n+1))/(1 - k).
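The count just stated is a geometric sum: a processor at level n whose hierarchy branches with factor k sees 1 + k + k² + … + kⁿ processors in total, which equals (1 - k^(n+1))/(1 - k) for k > 1. A quick numerical check (the function names are illustrative only):

```python
def processors_below(n: int, k: int) -> int:
    """Total processors visible from the top of a depth-n hierarchy in
    which every processor watches k processors one level down, counting
    the top processor itself: 1 + k + k^2 + ... + k^n."""
    return sum(k ** i for i in range(n + 1))

def closed_form(n: int, k: int) -> float:
    """The same count as a closed-form geometric sum, valid for k > 1."""
    return (1 - k ** (n + 1)) / (1 - k)

# e.g. n = 3, k = 2: 1 + 2 + 4 + 8 = 15
```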
In general, if a set of analogical reasoning processors, with N_k learning on level k for each
k ≤ n, is arranged such that each processor learning on level k is connected to all the inputs and
outputs of some set of n_k processors on level k-1, then the question of network architecture is
the question of the relation between the N_k and the n_k. For instance, if N_k·n_k = 8·N_(k-1),
then each (k-1)-level processor is being analyzed by eight different k-level processors; but if
N_k·n_k = N_(k-1), then each (k-1)-level processor is being analyzed by only one k-level processor.
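The architecture relation can be checked with a one-line calculation: the level-k processors make N_k × n_k observation links in total, spread over the lower-level processors. This is a small illustrative sketch, with hypothetical argument names:

```python
def analyzers_per_processor(N_k: int, n_k: int, N_km1: int) -> float:
    """Average number of level-k processors analyzing each level-(k-1)
    processor: N_k level-k processors each watch n_k lower processors,
    so the N_k * n_k links are spread over the N_km1 lower processors."""
    return N_k * n_k / N_km1

# If N_k * n_k = 8 * N_(k-1), each lower-level processor has 8 analyzers;
# if N_k * n_k = N_(k-1), it has exactly one.
```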
This is a hierarchical analogy network: a hierarchy of processors learning by analogy how to
best learn by analogy how to best learn by analogy how to best learn by analogy… how to best
learn by analogy. As will be explained in later chapters, in order to be effective it must be
coupled with a structurally associative memory network, which provides a knowledge base
according to which the process of analogy can be executed.
Source: A New Mathematical Model of Mind