It is natural to define a random sequence as one which has no statistical regularities (von

Mises, 1957). For instance, one might propose that in a random binary sequence 1 should occur

exactly as often as 0. Also, one might require that the four doublets 00, 01, 10 and 11 should

occur equally often. And perhaps the eight triplets 000, 001, 010, 011, 100, 101, 110, and 111

should occur with equal frequency. And so on. It would clearly be desirable to define a random

sequence as one in which all subsequences of length n occur with equal frequency. Let us call

this the natural definition.

Clearly, this definition does not apply to finite sequences. Each sequence of length n contains

exactly one subsequence of length n (itself), so it certainly does not contain all sequences of

length n equally often. According to this definition only an infinitely long sequence can be

random.

Early in this century it was discovered that there is a basic flaw in this approach. The

restrictions imposed by the natural definition are so stringent that no sequence, finite or infinite, can possibly satisfy them. However, they are not beyond repair. A normal sequence is defined as one in which, as you go further and further out in the sequence, the frequencies of all the subsequences of length n get closer and closer to being equal. For instance, if one took samples of a normal sequence near the beginning, one might well find a lot more 00s than 01s, 10s or 11s.

But eventually, if one took samples far enough out, one would have to find 00s, 01s, 10s and 11s

equally often.
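These frequency counts are easy to tally for any finite prefix. The following minimal sketch (the helper name is hypothetical, not from the text) counts every overlapping block of length n in a binary string:

```python
from collections import Counter

def block_frequencies(seq: str, n: int) -> Counter:
    """Count every (overlapping) length-n subsequence of a binary string."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

# A periodic sequence contains some length-2 blocks but not others,
# so it is plainly not normal: '01' and '10' dominate, '00' and '11' never occur.
periodic = "01" * 500
print(block_frequencies(periodic, 2))
```

For a normal sequence, the counts returned for every block of a given length would even out as the prefix grows.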

Intuitively speaking, if you tossed a coin and recorded 0 whenever tails came up, 1 whenever

heads came up, you would expect the list of 0’s and 1’s to be a normal sequence. Essentially, a

normal sequence is a sequence in which, as you go further and further out, each digit has less and less to do with the others, just as, in a series of coin tosses, each toss has essentially nothing to do with the others.

That is one approach to randomness. There is another approach, involving the KCS definition

of complexity, which also involves infinite sequences. What is remarkable is that the two

different approaches turn out to be closely related.

RANDOMNESS AND COMPLEXITY

Consider an infinitely long binary sequence x. Let x[n] denote the first n terms of x. For

instance, if x = 01001101010001001010…, then x[7] = 0100110. The idea behind the KCS

approach to randomness is that the complexity of the infinite sequence x can be defined in terms

of the complexities of the finite sequences x[n]. The first step is to ask: as n gets bigger and

bigger, what happens to the KCS complexity of x[n]? If x = 0000000…, then the question has an

easy answer. The sequence x[n] consists of n zeros, and its KCS complexity KCS(x[n]) is about log(n).

And, intuitively speaking, if x is totally structureless, then x[n] has a KCS complexity of about n.

These considerations lead up to the crucial insight, due to Kolmogorov and Per Martin-Löf. Look

at what happens to the ratio KCS(x[n])/n as n gets bigger and bigger.
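KCS complexity cannot be computed exactly, but an off-the-shelf compressor gives a crude stand-in for it. The sketch below (illustrative only, not from the text) uses Python's zlib, with compressed size playing the role of program length:

```python
import random
import zlib

def compressed_size(bits: str) -> int:
    """Bytes zlib needs to store the bit string: a crude stand-in for KCS complexity."""
    return len(zlib.compress(bits.encode()))

random.seed(0)  # reproducible "structureless" prefix
structured = "0" * 100_000                                    # like x = 000000...
patternless = "".join(random.choice("01") for _ in range(100_000))

# The all-zero prefix collapses to almost nothing, while the patternless
# prefix stays roughly proportional to its length (about one byte per
# eight characters here, since each character carries one bit of information).
print(compressed_size(structured), compressed_size(patternless))
```

The absolute numbers depend on the compressor; only the contrast between the two growth rates matters for the argument.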

If, as n gets bigger and bigger, KCS(x[n])/n gets closer and closer to 1, then it follows that for large n, KCS(x[n]) is close to n. And this means that, for large n, x[n] essentially has no

structure. On the other hand, if KCS(x[n])/n gets closer and closer to zero as n increases, this

means that for large n there is indeed some structure in x[n]. It means that, for large n, there is indeed a better way of computing x[n] than just saying "Print ‘x[n]’".

What if x looks like this: 0101000001000001000100010000000100010001…? Here every

other digit is a zero: the first, the third, the fifth, and so on. But the even-numbered digits follow no apparent pattern. What if x continued this way forever? Then x[n] could be computed by a program of the form "Print this sequence, putting a ‘0’ before each digit: ‘110010…’", where ‘110010…’ is the finite sequence consisting of the even-numbered digits of x[n]. How long is this program? Well, the sequence of even-numbered digits of x[n] is about n/2 digits long. So here KCS(x[n]) is about n/2, and KCS(x[n])/n is about 1/2.
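The short "decompression" program described above can be sketched directly (a minimal illustration; the helper name is mine, and the stored digits are the unpatterned half of the sequence):

```python
def interleave_zeros(stored_digits: str) -> str:
    """The short program from the text: print a '0' before each stored digit."""
    return "".join("0" + d for d in stored_digits)

# The stored data is about half as long as the sequence it reproduces,
# so the program length is roughly n/2.
print(interleave_zeros("110010"))   # -> '010100000100'
```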

Ignoring a number of technical issues, we may define a random infinite sequence x as a

sequence for which, as n gets bigger and bigger, KCS(x[n])/n does not approach zero, but rather

approaches some other number. A random infinite sequence x is one for which there is no easy way of computing x[n] when n gets large. It can be proved that almost all infinitely long sequences are random in this sense; they are said to be algorithmically random.

One way to understand this definition is as follows: A random sequence is an infinite sequence

which cannot be summed up in any formula of finite length. For instance, 00000000… can be

summed up in the formula "Repeat ‘0’ forever". And 010010001000010000010000001… can be

summed up in the formula "Repeat ‘0’ k times and then print ‘1’, for k=1,2,3,4,…." But a random sequence x cannot be summed up in any formula, because if it could then that formula would provide a way to compute x[n].
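The second formula can be written as a tiny generator (an illustrative sketch, not from the text), which makes vivid how a finite program sums up an infinite sequence:

```python
from itertools import count, islice

def zeros_then_one():
    """Yield the digits of 0 1 00 1 000 1 ...: k zeros then a '1', for k = 1, 2, 3, ..."""
    for k in count(1):
        yield from "0" * k
        yield "1"

prefix = "".join(islice(zeros_then_one(), 27))
print(prefix)   # -> '010010001000010000010000001'
```

A random sequence, by definition, admits no such finite generator.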

Clearly, not every sequence which is random in this sense is normal. Think about the sequence

discussed earlier, whose odd-numbered digits are all zeros but whose even-numbered

digits have no structure to them. No matter how far out you look, 0’s and 1’s are not going to

occur equally often in this sequence. There will always be more 0’s. The best you can say about

this sequence is that it has a subsequence — the sequence of its even-numbered digits — which

looks to be normal.

PROBLEMS WITH RANDOMNESS

The theories of randomness sketched above are not very useful in practice, for an obvious

reason: they deal only with infinitely long sequences. In reality, we are always faced with finite collections of data.

This restriction to infinite sequences leads to a couple of interesting conceptual paradoxes.

First of all, the very proof that random sequences exist is somewhat troublesome. We have

proved that almost every infinite sequence is random, but can we prove that any one particular

sequence is random? We cannot, because random sequences are precisely those infinite

sequences which cannot be summarized in a finite formula. In fact, the set of random sequences

is precisely the set of all sequences which we cannot write down in any way. We have proved

that the set exists, but we cannot demonstrate that any particular sequence belongs to it, because in order to do so we would have to write down that particular sequence. This is not exactly a logical paradox, but it is certainly disconcerting.

G. Spencer-Brown has discovered a particularly poignant way of illustrating the implications

of this logical peculiarity. Suppose, he says, that you have built a random number generator — a machine which is intended to generate numbers in such a way that each number it generates has

absolutely nothing to do with the others. Then how can you test it to see if it works?

Suppose you tested it and it gave out a hundred zeros in a row. You would probably assume

that it was broken. But, statistically speaking, a truly random number generator would generate a hundred zeros in a row sometimes. Not very often, but sometimes. There’s no reason that rare

occasion shouldn’t come first. After all, the sequence consisting of a hundred zeros in a row is no less likely than any other sequence of a hundred numbers.

So you run it some more. And the same thing happens: it keeps giving zeros. Still, you’re not

really justified in concluding that it doesn’t work. The same argument applies. The fact is that no matter what the machine does for the first n trials, it can be argued that a true random number generator would be just as likely to generate that sequence as any other. So no matter what the machine does, you can’t judge its effectiveness.
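The "equally likely" claim is just arithmetic. As a quick sanity check (illustrative only), the chance of any one particular string of a hundred bits from a fair generator is 2^-100:

```python
from fractions import Fraction

# Probability that a fair generator emits any one *particular* 100-bit string.
p_specific = Fraction(1, 2) ** 100

# A hundred zeros is exactly as likely as any other fixed string of a hundred bits.
print(float(p_specific))   # about 7.9e-31: rare, but no rarer than any other outcome
```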

You could examine the mechanism of your random number generator to see if it looks as

though it is operating randomly. For instance, you could supply it with a mechanical coin tosser.

Then you’d probably be confident its answers were random, since you’re probably confident the

results of a coin toss are random. But this is nothing more or less than intuition: you’re assuming something is random because you haven’t seen any structure to it in the past. An intuition is not the same as a theoretical guarantee.

Essentially, this paradox arises from the assumption that a random number generator must

give out every sequence of length n with equal frequency. But is there any other way to define a

random number generator? One could define a random number generator as a machine which

generates a normal sequence of numbers, but it is easy to see that there is no way to prove a

finite sequence is part of a normal sequence. This is because the definition of normality involves going "far enough out" in a sequence. Once you go far enough out in a normal sequence, all subsequences of length n must occur equally often. But as n increases, so does the distance you

must go to reach "far enough". Say you determine that the first million terms of a sequence present all subsequences of length 1000 or less with the appropriate frequencies. That still doesn’t tell you whether or not the sequence repeats a certain subsequence of length 1,000,000,000 too often. In fact, for all you know, the sequence could consist of the same million-digit-long sequence repeated over and over and over.

And remember, normality is in a sense a weaker concept than algorithmic randomness: it

would count the decimal expansion of pi (widely believed to be normal) as random, even though pi can be computed by a short program. It is even more obvious that there is no way to

tell if a finite sequence is part of an infinite (algorithmically) random sequence. After all, if we just see the fragment 01001001010011001001, how do we know it’s part of a random sequence

and not part of the sequence 01001001010011001001 01001001010011001001

01001001010011001001 01001001010011001001… which repeats the same fragment over and

over again? This dilemma is reminiscent of the problem of induction, to be discussed in Chapter

5.
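A compressor makes the dilemma concrete (again only as a rough proxy for KCS complexity; the repeat count is illustrative):

```python
import zlib

fragment = "01001001010011001001"
repeated = fragment * 5000          # looks locally "random", globally periodic
compressed = zlib.compress(repeated.encode())

# The periodic whole compresses drastically, revealing its low complexity --
# but the bare 20-digit fragment alone gives no hint of that.
print(len(repeated), len(compressed))
```

No finite fragment, however long, can settle whether the whole sequence it came from is compressible.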

Our discussion has been phrased in terms of binary sequences, but it could be generalized to

deal with any other mathematical objects in exactly the same way. The conclusions would not

change a bit. The only things that are truly, mathematically random are infinitely large entities which can never even be summarized in finite formulas. And there is no practical way to tell if a given physical machine produces mathematically random sequences.

In sum: randomness is a phantom. In reality, all we can do is assume that those things we’ve

detected no structure in are random. But this is a working assumption, not a mathematical

guarantee. And therefore, the question of whether the mind and brain are stochastic or

deterministic is a phantom as well. Thus there can never be an empirical reason to say that the

mind or the brain is a stochastic computer rather than a deterministic computer. Because,

scientifically speaking, to say that X is random is only to say that X has aspects in which we

cannot detect any order. To declare that there is consequently no order there is unjustified.
