Let us now return to the problem of how a STRAM is to maintain its approximate network-of-emergence structure. One way to approach this optimization problem is via the multilevel

methodology (Goertzel, 1989). This application of the multilevel methodology bears a strong

similarity to Achi Brandt’s (1985) multilevel algorithm for the Ising spin problem.

The first level would be the following procedure:

1) assign nodes to any newly-introduced functions, and single out a number of

pre-existing nodes, in such a way that the nodes assigned and selected are

approximately evenly distributed across the network (so that no two are too

close to each other). Call these functions xi and their nodes N(xi).

2) switch the node of each of these functions xi with the node of one of the

functions y with which it shares a pattern — that is, assign xi node N(y) and y node N(xi) — and

see if, according to the new arrangement, the total amount of pattern which the network indicates

in xi and y is greater. If so, make the switch permanent (that is, exchange the values of N(xi) and N(y)) and apply Step 2 to xi again. If not, proceed to one of the other functions y with which xi shares a pattern (say, proceeding through the y in order of decreasing intensity of shared pattern). Once all the functions with which xi shares more than e (some small fixed number) worth of pattern have been exhausted, exit Step 2.
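The switch-test-undo loop of Step 2 can be sketched in code. The following Python fragment is only an illustration, under assumed representations that the text does not prescribe: functions occupy integer node positions, intensity[(a, b)] gives the intensity of the pattern shared by a and b, and the pattern the network "indicates" in a pair is taken to be that intensity discounted by node distance.

```python
def indicated(pos, intensity, f, g):
    """Total pattern the network indicates in f and g: each shared
    pattern's intensity, discounted by the node distance between
    the pair that shares it (an illustrative stand-in)."""
    return sum(w / (1 + abs(pos[a] - pos[b]))
               for (a, b), w in intensity.items()
               if f in (a, b) or g in (a, b))

def partners(x, intensity, eps):
    """Functions sharing more than eps worth of pattern with x, in
    order of decreasing intensity of shared pattern."""
    ps = [(b if a == x else a, w) for (a, b), w in intensity.items()
          if x in (a, b) and w > eps]
    return [y for y, _ in sorted(ps, key=lambda p: -p[1])]

def step2(x, pos, intensity, eps=0.01):
    """Step 2: tentatively switch x's node with each partner y and
    keep the switch only if the indicated pattern in x and y grows,
    in which case Step 2 is applied to x again."""
    restarted = True
    while restarted:
        restarted = False
        for y in partners(x, intensity, eps):
            before = indicated(pos, intensity, x, y)
            pos[x], pos[y] = pos[y], pos[x]       # tentative switch
            if indicated(pos, intensity, x, y) > before:
                restarted = True                  # switch made permanent;
                break                             # apply Step 2 to x again
            pos[x], pos[y] = pos[y], pos[x]       # undo the switch
    return pos
```

Step 1's even distribution of the selected functions across the network is left to the caller; step2 realizes only the local switching.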

This algorithm moves functions through the network toward their "proper position." The

problem is that, even if xi is not in its optimal position, its position could well be better for it than

the positions of its neighbors. Ideally, one might like to try each xi out in every possible position;

and although this is not possible, one may indeed improve upon the "neighbors-only" approach.

Following the multilevel methodology, one could seek to find the optimum node for xi among

those nodes at a distance between h1 and h2 from xi in the network (where the distance between

two nodes is measured by the number of links separating them). One could attempt this by the

Monte Carlo method, randomly seeking to switch xi with functions in this region, or one could

attempt this by randomly beginning the neighbors-only search given above from points in this

region.

And if one found a new home for xi in this region, one could execute the neighbors-only

search from the new home. And from the answer which this yielded, one could execute the

second-level search. And one could repeat this entire process on an arbitrary number of levels, according to the basic framework outlined in Chapter 2.
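A sketch of this second-level search follows (Python; all names are illustrative). Here ring collects the nodes at link-distance between h1 and h2, and second_level tries switching xi into each of them, keeping any switch that improves a supplied score function. For determinism the region is scanned exhaustively here, whereas the text suggests sampling it by the Monte Carlo method.

```python
from collections import deque

def ring(adj, start, h1, h2):
    """Nodes at link-distance between h1 and h2 from start, found
    by breadth-first search over the adjacency structure adj."""
    dist, frontier = {start: 0}, deque([start])
    while frontier:
        u = frontier.popleft()
        if dist[u] >= h2:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                frontier.append(v)
    return [v for v, d in dist.items() if h1 <= d <= h2]

def second_level(adj, node_of, x, score, h1, h2):
    """Seek a better node for x among those at distance in [h1, h2]
    from x's node; keep a switch only if it raises score(node_of)."""
    occupant = {n: f for f, n in node_of.items()}
    best = score(node_of)
    for n in ring(adj, node_of[x], h1, h2):
        y = occupant[n]                       # function currently at node n
        node_of[x], node_of[y] = node_of[y], node_of[x]  # tentative switch
        s = score(node_of)
        if s > best:
            best = s                          # switch made permanent
            occupant = {m: f for f, m in node_of.items()}
        else:
            node_of[x], node_of[y] = node_of[y], node_of[x]  # undo
    return node_of
```

From any improvement found this way, the neighbors-only search could then be restarted, as described above.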

Of course, this is only one possibility; the multilevel philosophy could be applied in many

different ways, and there are many other approaches to optimization. The important point is that,

by some specially-tailored optimization method, the network must continually reorganize itself

so as to better approximate the network of emergence structure, and so as to integrate new data in

a manner which maintains this structure. It would seem to be completely impossible to

determine, at this stage, the actual method by which the human brain reorganizes its memory

network.

HOLISTIC MEMORY

It might seem that no organism could afford to continually subject its memory to such a risky,

speculative optimization algorithm as that sketched above. It could be, of course, that there exists

some optimization algorithm which is substantially more effective than those presently known.

However, as emphasized in Chapter 2, most researchers doubt that there will ever exist a rapid,

highly accurate algorithm for the solution of difficult optimization problems. In the context of

associative memory, rapidity is of the essence, so it is probably true that rough approximate

answers must suffice.

If one were designing a brain, one way of partially eliminating the risk involved in adjusting

memory according to a highly approximate algorithm would be to provide it with a large number

of structurally associative memories, each storing largely the same functions. In fact, it has often

been observed that biological memory is holistic in the sense that it appears to "store almost

everything almost everywhere". Any small section of the cortex appears to contain a large part of

the data which the cortex stores. This is an empirical observation, validated repeatedly in the

laboratory (Hubel, 1988).

It has been suggested that holistic memory is necessary because of the error-prone nature of

the brain’s components. This is almost certainly true. However, the present considerations

suggest another good reason: the nature of structurally associative memory seems to require that

memory structure be continually subjected to radical, experimental transformations. In order for

these transformations not to interfere too much with continual memory use, the mind must

incorporate some sort of "workspace" in which unsuccessful transformations may be tried and

discarded without global effect. In short: holistic memory may be necessary not only because of

the error-prone nature of the brain’s "hardware", but also because of the error-prone nature of

high-level mental process.

PHYSICAL MEMORY STRUCTURES

In this context, let us return to the question of physical memory. A physical memory is,

obviously, not a weighted graph. Nonetheless we would like to call certain physical memories

structurally associative memories. Therefore it is necessary to assign to each physical memory M

a set of "naturally correspondent" structurally associative memories. We know of no completely

satisfactory way of doing this, but the following scheme is not unreasonable. It requires only that

we assign to each pair of elements (x,y) stored by M a distance DM(x,y) which measures the

difficulty of locating x in memory given that y has very recently been located. Let F be a

mapping which takes the set S of elements stored by M into the nodes of a graph; let L(x,y)

denote the number of links in the graph F(S) along the shortest path from F(x) to F(y). Then the

accuracy with which F represents M may be defined by the average, over all (x,y), of |L(x,y) - DM(x,y)|. This definition requires that, in a rough sense, distance in the memory M correspond

to distance in the graph F(S). Certainly there is more to be done in this direction, but the point is

that it is indeed possible to model any sort of memory as a graph, and hence to analyze any sort

of memory as structurally associative memory.
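Under these definitions the accuracy measure is straightforward to compute. In the sketch below (Python, with illustrative data), adj is the graph F(S), F maps stored elements to nodes, and D supplies the distances DM(x,y) for each stored pair; the link-distance L is obtained by breadth-first search.

```python
from collections import deque

def shortest_links(adj, s, t):
    """L(x, y): the number of links along the shortest path from
    s to t in the graph, found by breadth-first search."""
    dist, q = {s: 0}, deque([s])
    while q:
        u = q.popleft()
        if u == t:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    raise ValueError("no path between %r and %r" % (s, t))

def accuracy(adj, F, D):
    """Average of |L(x, y) - DM(x, y)| over all stored pairs; the
    smaller this is, the more faithfully the graph F(S) represents
    the physical memory M."""
    errors = [abs(shortest_links(adj, F[x], F[y]) - d)
              for (x, y), d in D.items()]
    return sum(errors) / len(errors)
```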

MEMORY AND ANALOGY

Finally, we must now explore the manner in which a structurally associative memory may be

utilized by cognitive processes. In the notation of the definition of analogy given above, consider

that f and y have already been chosen. The process of selection will be discussed briefly a bit

later. Then, given x, an appropriate x′ may be located by a simple process of "looking up" in

STRAM.

In the case of structural analogy all that is required is to look for something which has a large

amount of pattern in common with x. Specifically, a list must be made of the significant patterns

in x, and then the other functions in which these patterns also appear must be searched,

with the objective of locating those which have a large number of these patterns in common.

This may be a large search, and it will have to be approximate — e.g. a Monte Carlo search could

be implemented, skipping over at random some of the neighbors. The point is that the making of

the list of patterns in x, and of neighbors which may also possess these patterns, is merely a

matter of looking at the actual graph of the network. Of course, this method will only be accurate

to the extent that the STRAM approximates the network of emergence structure.
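A minimal sketch of this lookup, assuming STRAM's pattern index is available as a mapping from each function to its set of significant patterns (the names here are hypothetical; a Monte Carlo variant would sample candidates at random rather than scanning them all):

```python
def structural_analogues(x, patterns_in, k=1):
    """Return the k functions sharing the largest number of
    significant patterns with x, according to the index
    patterns_in (function -> set of its significant patterns)."""
    px = patterns_in[x]
    # score each other function by its overlap with x's patterns
    scores = {f: len(px & ps) for f, ps in patterns_in.items() if f != x}
    return sorted(scores, key=lambda f: -scores[f])[:k]
```

The accuracy of the result depends, as noted above, on how well the STRAM's graph approximates the network of emergence.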

In the case of contextual analogy, a list must be made of all the patterns which emerge

between x and v, and then a search of the neighbors of these patterns must be made, until a

function x′ is located such that many of the same patterns emerge between it and some other

entity v′.

If f is not the identity mapping, then things are slightly more involved. One knows only that

f(x′) is close to x in the network; one knows nothing about x′ itself. So, in general, a list must

be made of all the patterns which emerge in x or between x and w, and then a search of the

neighbors of these patterns must be made, until an entity z is located such that many of the same

patterns emerge between it and some other entity w′. Then x′ may be set equal to f⁻¹(z). This

involves the potentially difficult computation of f⁻¹, and it is generally much more difficult than

structural or contextual analogy. However, as suggested earlier, even speculative modeling

analogy may be useful, as in brainstorming.

The location of x′ is Step 1 of the general process of analogy. The next step is the recognition

of patterns in x′. This process may be decomposed into two stages: isolating the patterns

already "known" by STRAM to be part of the structure of x′, and finding new patterns not yet

"known" by STRAM. Of course, if f is the identity mapping, then it is trivial to locate all patterns

in x′ that have been identified by STRAM; in fact, this was already done to some degree of

approximation in the selection of x′. In this case the first stage is very easy to execute. But if f

is not the identity, then in order to find out what STRAM knows about x′, one must search

through STRAM for x′ (remember, it is f(x′) which is near x, not necessarily x′ itself). One

may treat this search as a minimization, over the space of all nodes of STRAM, of the function

d#(x,x′). It would not be wise to execute this minimization by local search (from each node

proceeding to that neighboring node which minimizes the function), because the structure of

STRAM is almost certainly imperfect; a focused Monte Carlo search, a simulated annealing

search, or a multilevel search would be much more effective.

The second stage of Step 2, the location of new patterns in x′, will be implicitly addressed

later, when we discuss pattern recognition in general.

Source: A New Mathematical Model of Mind
