We have not yet discussed the possibility that analogies might be justified by analogy.

Obviously, analogy as a general mode of thought cannot be justified by analogy; that would be

circular reasoning. But, just as particular analogies can be justified by inductive or deductive

background knowledge, so can particular analogies be justified by analogical background

knowledge. In some cases, the mind may use analogical reasoning to determine how probable it

is that similarity A between x and x' will imply similarity B between x and x', by observing which similarities have, in similar situations, led to which other similarities.

Actually, this sort of analogical background information is a special kind of inductive

background information, but it is worth distinguishing.

Let us be more precise. Assume that processor P1 executes a long sequence of analogical

reasoning processes, and processor P2 observes this sequence, recording for each instance a

vector of the form (x,x',f,w,w',v,v',R,r), where r is a number measuring the total

prominence of the patterns recognized in that instance, and R is the set of patterns located.

The prominence of a pattern may — as in the previous chapter — be defined as the product of

its intensity with its importance. The prominence of a set of patterns S may be crudely defined as |S|K, where |S| is the structural complexity of the set and K is some number representing the prominence of the patterns in S. A very crude way to define K is as the average over all (y,z) in S of the importance of (y,z). A more accurate definition could be formulated by a procedure similar to

Algorithm 3.1.
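The prominence definitions above can be stated as a short Python sketch. All numeric values here are hypothetical placeholders, and the structural complexity |S| is taken as a given input rather than computed:

```python
def prominence(intensity: float, importance: float) -> float:
    """Prominence of a single pattern, defined as intensity times importance."""
    return intensity * importance

def set_prominence(structural_complexity: float, importances: list[float]) -> float:
    """Crude prominence |S|*K of a pattern set S, taking K (very crudely)
    as the average importance over the patterns (y,z) in S."""
    K = sum(importances) / len(importances)
    return structural_complexity * K

# Example with made-up numbers:
p = prominence(0.8, 0.5)                    # intensity 0.8, importance 0.5
s = set_prominence(10.0, [0.5, 0.3, 0.4])   # |S| = 10, average importance 0.4
```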

Then processor P2 can seek to recognize patterns (y,z) with the property that when (y,z) is a pattern in (x,x',f,w,w',v,v',R), r tends to be large. This is an optimization problem: maximize the correlation of r with the intensity of (y,z), over the space of patterns in the first eight

components of the vector. But it is a particularly difficult optimization problem in the following

sense: determining what entities lie in the space over which optimization is taking place is, in

itself, a very difficult optimization problem. In other words, it is a constrained optimization

problem with very unpleasant constraints. One very simple approach would be the following:

1. Using straightforward optimization or analogy, seek to recognize patterns in

(x,x',f,w,w',v,v',R).

2. Over the space of patterns recognized, see which ones correlate best with large r.

3. Seek to recognize new patterns in (x,x',f,w,w',v,v',R) in the vicinity of the answer(s)

obtained in Step 2.
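The three steps can be sketched in Python. Everything here is hypothetical: real pattern recognition (Steps 1 and 3) is abstracted away, and each recorded instance is reduced to its prominence r plus the intensity of each candidate pattern in it; Step 2 then ranks candidates by Pearson correlation of intensity with r:

```python
def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def rank_by_correlation(candidates, instances):
    """Step 2: rank recognized patterns by how well their intensity
    across instances correlates with the prominence r."""
    rs = [inst["r"] for inst in instances]
    scored = [(pearson([inst["intensity"][p] for inst in instances], rs), p)
              for p in candidates]
    return sorted(scored, reverse=True)

# Made-up data: pattern "A"'s intensity tracks r exactly, "B"'s does not.
instances = [
    {"r": 1.0, "intensity": {"A": 0.1, "B": 0.9}},
    {"r": 2.0, "intensity": {"A": 0.4, "B": 0.2}},
    {"r": 3.0, "intensity": {"A": 0.7, "B": 0.8}},
]
ranking = rank_by_correlation(["A", "B"], instances)
best = ranking[0][1]  # Step 3 would then search for new patterns near this one.
```

Step 3, searching the vicinity of the best candidates, is exactly the part the surrounding text cautions about: patterns must actually be located there, not merely correlate well.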

Perhaps some other approach would be superior, but the difficulty is that one cannot expect to

find patterns in a given narrow vicinity merely because functions in that region correlate well

with r. The focus must be on the location of patterns, not the search for large correlation with r.

In this way analogy could be used to determine which analogies are likely to pay off. This

might be called second-level analogy, or "learning by analogy about how to learn by analogy."

And the same approach could be applied to the analogies involved in analyzing analogies,

yielding third-level analogy, or "learning by analogy how to learn by analogy how to learn by

analogy." Et cetera. These are tremendously difficult optimization problems, so that learning on

these levels is likely to be rather slow. On the other hand, each insight on such a high level will

probably have a great impact on the effectiveness of lower-level analogies.

Let us be more precise about these higher levels of learning. A processor which learns by

second level analogy must be connected to a processor which learns by analogy, in such a way

that it has access to the inputs and the outputs of this processor. Similarly, a processor which

learns by third level analogy must be connected to a processor which learns on the second level

in such a way that it has access to the inputs and the outputs of this second-level processor — and

the inputs and outputs of this second-level processor include all the inputs and outputs of at least

one first-level processor. In general, the absolute minimum number of inputs required for an n’th-

level analogy processor is proportional to n: this is the case, for instance, if every n’th level

processor is connected to exactly one (n-1)’th level processor. If each n-level processor is

connected to k (n-1)’th level processors for some k>1, then the number of inputs required for an

n’th level processor is [1 - k^(n+1)]/[1 - k].
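If the formula is read as the geometric sum 1 + k + k^2 + … + k^n (the size of the k-ary tree of processors beneath one n'th-level processor), it can be checked directly:

```python
def inputs_closed_form(k: int, n: int) -> int:
    """Closed form (1 - k^(n+1)) / (1 - k) of the geometric sum, for k > 1."""
    return (1 - k ** (n + 1)) // (1 - k)

def inputs_by_summation(k: int, n: int) -> int:
    """Direct sum 1 + k + k^2 + ... + k^n over the levels 0..n."""
    return sum(k ** j for j in range(n + 1))

for k in (2, 3, 5):
    for n in (1, 2, 4):
        assert inputs_closed_form(k, n) == inputs_by_summation(k, n)
# For example, k = 2, n = 2 gives 1 + 2 + 4 = 7.
```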

In general, if a set of analogical reasoning processors — N_k of them learning on level k, for k ≤ n — is arranged such that each processor learning on level k is connected to all the inputs and outputs of some set of n_k processors on level k-1, then the question of network architecture is the question of the relation between the N_k and the n_k. For instance, if N_k·n_k = 8N_(k-1), then each (k-1)-level processor is being analyzed by eight different k-level processors; but if N_k·n_k = N_(k-1), then each (k-1)-level processor is being analyzed by only one k-level processor.
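This bookkeeping is simple enough to state as a sketch (all numbers hypothetical): level k contributes N_k·n_k "analyst connections" aimed at level k-1, so each (k-1)-level processor is analyzed by N_k·n_k/N_(k-1) k-level processors on average:

```python
def analysts_per_processor(N_k: int, n_k: int, N_km1: int) -> float:
    """Average number of k-level processors analyzing each (k-1)-level one."""
    return N_k * n_k / N_km1

# N_k * n_k = 8 * N_(k-1): eightfold analysis of each lower-level processor.
eightfold = analysts_per_processor(N_k=16, n_k=4, N_km1=8)
# N_k * n_k = N_(k-1): each lower-level processor analyzed exactly once.
single = analysts_per_processor(N_k=2, n_k=4, N_km1=8)
```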

This is a hierarchical analogy network: a hierarchy of processors learning by analogy how to

best learn by analogy how to best learn by analogy how to best learn by analogy… how to best

learn by analogy. As will be explained in later chapters, in order to be effective it must be

coupled with a structurally associative memory network, which provides a knowledge base

according to which the process of analogy can be executed.

Source: A New Mathematical Model of Mind
