# Computational Complexity

Computational complexity theory, also called algorithmic complexity theory, seeks to answer
two different kinds of questions: "How hard is this problem?", and "How effective is this
algorithm at solving this problem?". A number of difficult issues are involved here, and it is not possible to delve into them deeply without sophisticated mathematics. Here we shall only scratch the surface.

Questions of computational complexity are only meaningful in the context of a general theory
of computation. Otherwise one can only ask "How hard is this problem for this computer?", or
"How hard is this problem for this particular person?". What lets us ask "How hard is this
problem?", without any reference to who is actually solving the problem, is a theory which tells
us that problems are basically just as hard for one computer as for another. Here as in so many
other cases, it is theory which tells us what questions to ask.

According to the theory of Turing machines, any sufficiently powerful computer can simulate
any other computer. And this is not merely a theoretical possibility. In practice, computers such as PCs, mainframes and supercomputers are highly flexible. An IBM PC could be programmed to act just like a Macintosh; in fact, there are software packages which do something very close to this. Similarly, a Macintosh could be programmed to act just like an IBM PC. Turing proved that
there is a program which tells a computer, given appropriate information, how to simulate any
other computer. Therefore, any computer which is powerful enough to run this program can act
as a universal Turing machine. If it is equipped with enough memory capacity — e.g. enough disk
drives — it can impersonate any computer whatsoever.

True, this universal simulation program is very complex. But if a problem is sufficiently difficult, this doesn’t matter. Consider the problem of sorting a list of numbers into
increasing order. Suppose computer A is capable of solving this problem very fast. Then
computer B, if it is sufficiently powerful, can solve the problem by simulating computer A. If the problem is sorting the list {2,1,3}, then this would be a tremendous waste of effort, because simulating A is vastly more difficult than sorting the list {2,1,3}. But if the list in question is a billion numbers long, then it’s a different story. The point is that lists of numbers can get as long as you like, but the complexity of simulating another computer remains the same.

Let us make this example more precise. Assume that both A and B have an unlimited supply
of disk drives — an infinite memory tape — at their disposal. Suppose that the program for
simulating computer A is so slow that it takes computer B 10 time steps to simulate one of
computer A’s time steps. Suppose also that computer A is capable of sorting a list of n numbers in n² time steps. That is, it can sort 10 numbers in 100 time steps, 100 numbers in 10,000 time steps, and so on. Assume that computer B is not quite so bright, and it has a sorting program built into its hardware which takes n³ time steps to sort a list of n numbers.

Then, if B were given a list of 3 numbers, its hardware could sort it in 3³ = 27 time steps. If it tried to sort it by simulating A, it would take 10(3²) = 90 time steps. Clearly, it should rely on its built-in hardware. But if B were given a list of 10 numbers, it would take 10³ = 1000 steps to sort it using its hardware. If it tried to sort the list by simulating A, it would take 10(10²) = 1000 time steps — exactly the same amount of time. And if B were given a list of 1000 numbers, it would take 1000³ = 1,000,000,000 steps to sort it using its hardware, and only 10(1000²) = 10,000,000 steps to sort it by simulating A. The longer the list is, the more useful is the capacity for simulation, and the less useful is the built-in hardware.
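The arithmetic above can be checked mechanically. A minimal sketch, in which the two cost functions simply encode the text’s assumptions (n³ steps for B’s built-in sorter, and a factor-of-10 overhead for simulating A’s n² sorter):

```python
def hardware_cost(n):
    # B's built-in sort: n^3 time steps (the assumption made in the text)
    return n ** 3

def simulation_cost(n):
    # B simulating A: 10 steps per A-step, and A sorts n numbers in n^2 steps
    return 10 * n ** 2

# The three cases worked out in the text
for n in (3, 10, 1000):
    print(n, hardware_cost(n), simulation_cost(n))
```

The crossover at n = 10 is where the fixed simulation overhead stops mattering; for every larger n, simulating A wins.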

The point is that as the size of the problem, n, gets bigger and bigger, the differences between
computers become irrelevant. It is worth being a little more rigorous about this. Take any type of problem, and assign to each instance of it a "size" n. For example, if the problem is sorting lists of numbers, then each instance is a list of numbers, and its size is its length. Let A(n) denote the longest amount of time which computer A requires to solve any problem instance of size n. Let B(n) denote the longest amount of time which computer B requires to solve any problem instance of size n. Assume that the time required to solve an instance of the problem increases as n increases (just as the time required to sort a list of n numbers increases as n increases). Then it follows that the bigger n gets, the less significant is the difference between A(n) and B(n).
Mathematically, we say that as n goes to infinity, the ratio A(n)/B(n) goes to 1.

All this follows from the assumption that any sufficiently powerful computer can simulate any
other one, by running a certain "universal Turing machine" program of large but fixed size.

AVERAGE-CASE ANALYSIS
Note that the quantity A(n) is defined in terms of "worst-case" computation. It is the longest that computer A takes to solve any problem instance of size n. Any computer worth its salt can sort the list {1,2,3,4,5,6,7,8,9,10} faster than the list {5,7,6,4,10,3,8,9,2,1}. But A(n) ignores the easy cases. Out of all the possible instances, it only asks: how hard is the hardest?
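The gap between easy and hard instances can be made concrete by counting the comparisons a simple sorting algorithm performs. A sketch using insertion sort (an illustrative choice; the text does not name a particular algorithm):

```python
def insertion_sort_steps(lst):
    # Sort a copy of the list by insertion sort, counting every comparison.
    # An already-sorted list needs one comparison per element; a reversed
    # list forces the maximum amount of work.
    a = list(lst)
    steps = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            steps += 1
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
            else:
                break
    return steps

easy = insertion_sort_steps([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])   # 9 comparisons
hard = insertion_sort_steps([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])   # 45 comparisons
```

A worst-case quantity like A(10) records only the 45, never the 9.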
For some applications, this is a useful way to look at computation. But not always. To see
why, consider the following well-known problem. A salesman, driving a jeep, must visit a
number of cities in the desert. There are no mountains, rivers or other obstructions in the region. He wants to know what is the shortest route that goes through all the different cities. This is
known as the Traveling Salesman Problem. Each specific instance of the problem is a particular
collection of cities or, mathematically speaking, a set of points in the plane. The size of an
instance of the problem, n, is simply the number of cities involved.

How hard is this problem? When the data is presented pictorially, human beings can solve it
pretty well. However, we must remember that even if Maria is exceptionally good at solving the
problem, what Maria(n) measures is the longest it takes Maria to arrive at the correct solution for any collection of n cities. No human being does well according to this strict criterion. We do not always see the absolute shortest path between the n cities; we often identify a route which is close to correct, but not quite there. And we sometimes miss the mark entirely. So we are not very good at solving the Traveling Salesman Problem, in the sense that there are instances of the problem for which we get the answer wrong or take a long time to get to the answer. But we are good at it in the sense that most of the time we get reasonably close to the right answer, pretty fast. There are two different notions of proficiency involved here.

The simplest way to solve the Traveling Salesman problem is to list all the possible paths
between the cities, then compare all the lengths to see which one is the shortest. The problem is that there are just too many paths. For instance, if there are 5 cities, then there are (4×3×2)/2 = 12 paths. If there are 10 cities, then there are (9×8×7×6×5×4×3×2)/2 = 181,440 paths. If there are, say, 80 cities, then there are more paths than there are electrons in the universe. Using this method, the number of steps required to solve the Traveling Salesman problem increases very fast as the size of the problem increases. So, given a large Traveling Salesman problem, it might be better to apply erratic human intuition than to use a computer to investigate every possible path.

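This brute-force method is easy to write down, which is part of why it is the "simplest" way. A sketch (the city coordinates are invented for illustration):

```python
from itertools import permutations
from math import dist, factorial

def brute_force_tsp(cities):
    # Fix a starting city and try every ordering of the rest; this examines
    # each of the (n-1)!/2 distinct closed routes (twice, once per direction).
    start, rest = cities[0], list(cities[1:])
    best_len, best_tour = float("inf"), None
    for perm in permutations(rest):
        tour = [start, *perm, start]
        length = sum(dist(tour[i], tour[i + 1]) for i in range(len(tour) - 1))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

# Four cities on a unit square: the shortest closed tour is the perimeter, length 4
best_len, best_tour = brute_force_tsp([(0, 0), (0, 1), (1, 1), (1, 0)])
```

With `factorial` one can confirm the path counts above; the loop itself is what becomes hopeless as n grows.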
Let’s consider a simple analogy. Suppose you run a bank, and you have three loan officers
working for you. Officer A is very methodical and meticulous. He investigates every case with the
precision of a master detective, and he never makes a mistake. He never loans anyone more than
they can afford. Everyone he approves pays back their loans, and everyone he turns down for a
loan would not have paid it back anyway. The only problem is that he often takes a long time to
determine his answer. Officer B, on the other hand, works entirely by intuition. He simply looks
a person over, talks to them about golf or music or the weather, and then makes his decision on
the spot. He rejects some people who deserve loans, and he gives some people more or less
money than they can afford to pay back. He gives loans to a few questionable characters who
have neither the ability nor the inclination to pay the bank back.

Suppose that, although you really need both, you have been ordered to cut back expenses by
firing one of your loan officers. Which one should go? At first you might think "Officer B, of
course." But what if you have a lot of money to lend, and a great many people demanding loans?
Then A might be a poor choice — after all, B will serve a lot more customers each month. Even
though there are some cases where A is much better than B, and there are many cases where A is
a little better than B, the time factor may tip the balance in B’s favor.

You may be thinking "Well, a real bank executive would find someone who’s both fast and
accurate." In the case of the Traveling Salesman problem, however, no one has yet found an
algorithm which finds the exact shortest path every time much faster than the simple method
given above. And it seems likely that no such algorithm will ever be discovered. The Traveling
Salesman problem and hundreds of other important problems have been shown to be "NP-
complete", which means essentially that if there is a reasonably fast algorithm for solving any
one of them, then there is a reasonably fast algorithm for solving all of them. Many
mathematicians believe that the question of whether such algorithms exist is undecidable in the
sense of Gödel’s Incompleteness Theorem: that there’s no way to prove that they do, and there’s
no way to prove that they don’t.

Now, we have discovered algorithms which solve the Traveling Salesman problem faster than
people, and on the average come up with better answers (Peters, 1985). But there are still some
collections of cities for which they give the wrong answer, or take a ridiculously long time to
solve. In the case of the Traveling Salesman problem, it seems that there is no point in looking
for algorithms which solve the problem exactly, every time. All the algorithms which do that are
just too slow. Rather, it seems to be more intelligent to look for algorithms that solve the
problem pretty well a lot of the time.
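One example of an algorithm that solves the problem "pretty well a lot of the time" is the nearest-neighbour heuristic: always drive to the closest unvisited city. It is an illustrative choice, not one the text names; it takes roughly n² steps but can miss the shortest tour:

```python
from math import dist

def nearest_neighbour_tour(cities):
    # Greedy heuristic: from the current city, always move to the closest
    # unvisited one. Fast, but the resulting tour is only approximately shortest.
    unvisited = list(cities[1:])
    tour = [cities[0]]
    while unvisited:
        nearest = min(unvisited, key=lambda c: dist(tour[-1], c))
        unvisited.remove(nearest)
        tour.append(nearest)
    tour.append(cities[0])  # return to the starting city
    length = sum(dist(tour[i], tour[i + 1]) for i in range(len(tour) - 1))
    return tour, length

tour, length = nearest_neighbour_tour([(0, 0), (0, 1), (1, 1), (1, 0)])
```

On the unit square it happens to find the optimal perimeter tour; on other instances it can be noticeably worse, which is exactly the trade-off the loan-officer analogy describes.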

It turns out that most of the mathematical problems involved in thought and perception are a
lot like the Traveling Salesman problem. They are "NP-complete". So when, in later chapters, we
discuss the algorithms of thought, we shall virtually never be discussing algorithms that solve
problems perfectly. The relevant concept is rather the PAC algorithm — the algorithm which is
Probably Approximately Correct.

PARALLELISM
One interesting aspect of the McCulloch-Pitts neural network is the way it does many things at once. At every time step, all the neurons act. The original formulation of the Turing machine was not like that; it did only one thing at a time. It moved the tape, then looked in its memory to see what to do next. Of course, the McCulloch-Pitts network and the original Turing machine are fundamentally equivalent; anything one can do, so can the other. But the McCulloch-Pitts network will, in most cases, get things done faster.

The computers in popular use today are like the original Turing machine: they only do one
thing at a time. This is true of everything from PCs to huge mainframe computers — Cybers, VAXes and so forth. They are serial computers. Some supercomputers and special-purpose
research computers, however, can work in parallel: they can do up to hundreds of thousands of
things at once. The advantage of parallelism is obvious: speed. By using a parallel computer, one trades off space for time.

There are many different kinds of parallel computers. Some are so-called single-instruction
machines. They can do many things at once, as long as these things are all the same. For
instance, a typical single-instruction machine could multiply fifty numbers by four all at the
same time. But it might not be able to multiply one number by four at the same time as it added
six to another number. Multiple-instruction machines are more interesting, but also more
difficult to build and to program. A multiple-instruction parallel computer is like a bunch of
serial computers connected to each other. Each one can execute a different program, and
communicate the results of its computation to certain others. In a way, it is like a society of serial computers. Thinking Machines Corporation, in Cambridge, Massachusetts, has manufactured a
number of powerful multiple-instruction parallel computers called Connection Machines. They
are now being used in science and industry — for, among other things, modeling the behavior of
fluids, analyzing visual data, and generating computer graphics.
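The single-instruction versus multiple-instruction distinction can be caricatured in software. This is a sketch only: real single-instruction machines do the lanes simultaneously in hardware, and Python threads are not genuine parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

def single_instruction(data):
    # Single-instruction style: the same operation (multiply by four)
    # applied across all the data items; a SIMD machine would do every
    # multiplication at once.
    return [x * 4 for x in data]

def multiple_instruction(tasks):
    # Multiple-instruction style: each worker executes a different program
    # on its own data and reports back, like a society of serial computers.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn, arg) for fn, arg in tasks]
        return [f.result() for f in futures]

lanes = single_instruction([3, 1, 4, 1, 5])
results = multiple_instruction([(lambda x: x * 4, 12),
                                (lambda x: x + 6, 7),
                                (sum, [1, 2, 3])])
```

The first function can only do "the same thing" to every item; the second lets each worker run its own program, which is what makes multiple-instruction machines harder to build and to program.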

Why is all this relevant? Some may dispute the neurophysiological relevance of the McCulloch-Pitts model and its contemporary descendants. But everyone agrees that, if the brain is a computer, it must be a parallel computer. The brain contains about 100 billion neurons, all operating at once, and besides that it is continually swirling with chemical activity. The diversity of its activity leaves little doubt that, if it is indeed a computer, it is a multiple-instruction parallel computer. This is the intuition behind the recent spurt of research in parallel distributed processing.

In Chapter 11 I will take this one step further and argue that the brain should be modeled as a
multiple-instruction parallel quantum computer. By then, it will be clear just how different
such a computer is from today’s serial computers. We are talking about a computer which does
billions of different things at once and incorporates a huge amount of chance into its operations.

As we shall see later, it is a computer whose state is not completely measurable by any sequence
of physical observations. It is a computer which, in a physically precise sense, plays a significant role in the continual creation of the universe. It could be argued that a computer with all these properties should not be called a "computer". But, mathematical theories aside, the intuitive concept of computation has always been somewhat fuzzy. As warned in the Introduction, the limitations of present-day computers should not be taken as fundamental restrictions on the nature of computation.
