# Science and Hypothesis/Chapter 11

CHAPTER XI.

THE CALCULUS OF PROBABILITIES.

Unfortunately the difficulty does not thus disappear. A gambler wants to try a coup, and he asks my advice. If I give it him, I use the calculus of probabilities; but I shall not guarantee success. That is what I shall call subjective probability. In this case we might be content with the explanation of which I have just given a sketch. But assume that an observer is present at the play, that he knows of the coup, and that play goes on for a long time, and that he makes a summary of his notes. He will find that events have taken place in conformity with the laws of the calculus of probabilities. That is what I shall call objective probability, and it is this phenomenon which has to be explained. There are numerous Insurance Societies which apply the rules of the calculus of probabilities, and they distribute to their shareholders dividends, the objective reality of which cannot be contested. In order to explain them, we must do more than invoke our ignorance and the necessity of action. Thus, absolute scepticism is not admissible. We may distrust, but we cannot condemn en bloc. Discussion is necessary.

I. Classification of the Problems of Probability.—In order to classify the problems which are presented to us with reference to probabilities, we must look at them from different points of view, and first of all, from that of generality. I said above that probability is the ratio of the number of favourable to the number of possible cases. What for want of a better term I call generality will increase with the number of possible cases. This number may be finite, as, for instance, if we take a throw of the dice in which the number of possible cases is 36. That is the first degree of generality. But if we ask, for instance, what is the probability that a point within a circle is within the inscribed square, there are as many possible cases as there are points in the circle—that is to say, an infinite number. This is the second degree of generality. Generality can be pushed further still. We may ask the probability that a function will satisfy a given condition. There are then as many possible cases as one can imagine different functions. This is the third degree of generality, which we reach, for instance, when we try to find the most probable law after a finite number of observations.

Yet we may place ourselves at a quite different point of view. If we were not ignorant there would be no probability, there could only be certainty. But our ignorance cannot be absolute, for then there would be no longer any probability at all. Thus the problems of probability may be classed according to the greater or less depth of this ignorance. In mathematics we may set ourselves problems in probability. What is the probability that the fifth decimal of a logarithm taken at random from a table is a 9? There is no hesitation in answering that this probability is 1/10. Here we possess all the data of the problem. We can calculate our logarithm without having recourse to the table, but we need not give ourselves the trouble. This is the first degree of ignorance.
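
The first two degrees of generality can be illustrated by a minimal sketch. It counts favourable cases among the 36 throws of two dice, and then estimates the chance that a point of a circle lies in the inscribed square. The second part assumes that points are spread uniformly over the circle, which is itself a convention and not something the text asserts; the function names, the sample size and the seed are illustrative choices.

```python
import itertools
import math
import random


def two_dice_probability(total):
    """First degree of generality: favourable cases / 36 possible cases."""
    cases = list(itertools.product(range(1, 7), repeat=2))
    favourable = [c for c in cases if sum(c) == total]
    return len(favourable) / len(cases)


def inscribed_square_probability(samples=200_000, seed=0):
    """Second degree: infinitely many cases, estimated by sampling.

    Assumption: points are uniformly distributed over the unit circle.
    """
    rng = random.Random(seed)
    half_side = 1 / math.sqrt(2)  # half the side of the square inscribed in radius 1
    in_circle = in_square = 0
    while in_circle < samples:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y > 1:
            continue  # point fell outside the circle, draw again
        in_circle += 1
        if abs(x) <= half_side and abs(y) <= half_side:
            in_square += 1
    return in_square / in_circle


p_seven = two_dice_probability(7)
p_square = inscribed_square_probability()
# Under the uniform convention the exact ratio of areas is 2 / pi.
print(p_seven, p_square, 2 / math.pi)
```

With a different density over the circle the second number would change, which is the kind of dependence on convention examined in the later sections.
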
In the physical sciences our ignorance is already greater. The state of a system at a given moment depends on two things—its initial state, and the law according to which that state varies. If we know both this law and this initial state, we have a simple mathematical problem to solve, and we fall back upon our first degree of ignorance. Then it often happens that we know the law and do not know the initial state. It may be asked, for instance, what is the present distribution of the minor planets? We know that from all time they have obeyed the laws of Kepler, but we do not know what was their initial distribution. In the kinetic theory of gases we assume that the gaseous molecules follow rectilinear paths and obey the laws of impact of elastic bodies; yet as we know nothing of their initial velocities, we know nothing of their present velocities. The calculus of probabilities alone enables us to predict the mean phenomena which will result from a combination of these velocities. This is the second degree of ignorance. Finally, it is possible that not only the initial conditions but the laws themselves are unknown. We then reach the third degree of ignorance, and in general we can no longer affirm anything at all as to the probability of a phenomenon.

It often happens that instead of trying to discover an event by means of a more or less imperfect knowledge of the law, the events may be known, and we want to find the law; or that, instead of deducing effects from causes, we wish to deduce the causes from the effects. Now, these problems are classified as probability of causes, and are the most interesting of all from their scientific applications. I play at écarté with a gentleman whom I know to be perfectly honest. What is the chance that he turns up the king? It is 1/8. This is a problem of the probability of effects. I play with a gentleman whom I do not know. He has dealt ten times, and he has turned the king up six times. What is the chance that he is a sharper? This is a problem in the probability of causes. It may be said that it is the essential problem of the experimental method. I have observed n values of x and the corresponding values of y. I have found that the ratio of the latter to the former is practically constant. There is the event; what is the cause? Is it probable that there is a general law according to which y would be proportional to x, and that small divergencies are due to errors of observation? This is the type of question that we are ever asking, and which we unconsciously solve whenever we are engaged in scientific work.

I am now going to pass in review these different categories of problems by discussing in succession what I have called subjective and objective probability.
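
The écarté question can be put in numbers only after some conventions are fixed. The sketch below applies Bayes' rule to six kings in ten deals. The prior chance that an unknown gentleman is a sharper (0.01) and the chance that a sharper turns the king on a given deal (0.6) are purely hypothetical inputs, not values from the text; only the honest chance of 1/8 comes from the passage above.

```python
from math import comb


def binomial_likelihood(kings, deals, p_king):
    """Chance of exactly `kings` kings in `deals` independent deals."""
    return comb(deals, kings) * p_king**kings * (1 - p_king) ** (deals - kings)


def posterior_sharper(kings, deals, prior_sharper, p_king_sharper, p_king_honest=1 / 8):
    """Bayes' rule for the cause 'sharper' given the observed effect.

    prior_sharper and p_king_sharper are assumed conventions (hypothetical).
    """
    like_sharper = binomial_likelihood(kings, deals, p_king_sharper)
    like_honest = binomial_likelihood(kings, deals, p_king_honest)
    weighted = prior_sharper * like_sharper
    return weighted / (weighted + (1 - prior_sharper) * like_honest)


posterior = posterior_sharper(kings=6, deals=10, prior_sharper=0.01, p_king_sharper=0.6)
posterior_trusting = posterior_sharper(kings=6, deals=10, prior_sharper=0.0001, p_king_sharper=0.6)
print(posterior, posterior_trusting)
```

The two results differ widely although the observations are the same, which suggests why a problem in the probability of causes cannot be posed without some prior convention.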

III. Probability in the Physical Sciences.—We now come to the problems which are connected with what I have called the second degree of ignorance—namely, those in which we know the law but do not know the initial state of the system. I could multiply examples, but I shall take only one. What is the probable present distribution of the minor planets on the zodiac? We know they obey the laws of Kepler. We may even, without changing the nature of the problem, suppose that their orbits are circular and situated in the same plane, a plane which we are given. On the other hand, we know absolutely nothing about their initial distribution. However, we do not hesitate to affirm that this distribution is now nearly uniform. Why? Let b be the longitude of a minor planet in the initial epoch, that is to say, the epoch zero. Let a be its mean motion. Its longitude at the present time, i.e., at the time t, will be at + b. To say that the present distribution is uniform is to say that the mean value of the sines and cosines of multiples of at + b is zero. Why do we assert this? Let us represent our minor planet by a point in a plane—namely, the point whose co-ordinates are a and b. All these representative points will be contained in a certain region of the plane, but as they are very numerous this region will appear dotted with points. We know nothing else about the distribution of the points. Now what do we do when we apply the calculus of probabilities to such a question as this? What is the probability that one or more representative points may be found in a certain portion of the plane? In our ignorance we are compelled to make an arbitrary hypothesis. To explain the nature of this hypothesis I may be allowed to use, instead of a mathematical formula, a crude but concrete image. Let us suppose that over the surface of our plane has been spread imaginary matter, the density of which is variable, but varies continuously.
We shall then agree to say that the probable number of representative points to be found on a certain portion of the plane is proportional to the quantity of this imaginary matter which is found there. If there are, then, two regions of the plane of the same extent, the probabilities that a representative point of one of our minor planets is in one or other of these regions will be as the mean densities of the imaginary matter in one or other of the regions. Here then are two distributions, one real, in which the representative points are very numerous, very close together, but discrete like the molecules of matter in the atomic hypothesis; the other remote from reality, in which our representative points are replaced by imaginary continuous matter. We know that the latter cannot be real, but we are forced to adopt it through our ignorance. If, again, we had some idea of the real distribution of the representative points, we could arrange it so that in a region of some extent the density of this imaginary continuous matter may be nearly proportional to the number of representative points, or, if it is preferred, to the number of atoms which are contained in that region. Even that is impossible, and our ignorance is so great that we are forced to choose arbitrarily the function which defines the density of our imaginary matter. We shall be compelled to adopt a hypothesis from which we can hardly get away; we shall suppose that this function is continuous. That is sufficient, as we shall see, to enable us to reach our conclusion.

What is at the instant t the probable distribution of the minor planets—or rather, what is the mean value of the sine of the longitude at the moment t, i.e., of sin (at + b)? We made at the outset an arbitrary convention, but if we adopt it, this probable value is entirely defined. Let us decompose the plane into elements of surface. Consider the value of sin (at + b) at the centre of each of these elements. Multiply this value by the surface of the element and by the corresponding density of the imaginary matter. Let us then take the sum for all the elements of the plane. This sum, by definition, will be the probable mean value we seek, which will thus be expressed by a double integral. It may be thought at first that this mean value depends on the choice of the function Φ which defines the density of the imaginary matter, and as this function Φ is arbitrary, we can, according to the arbitrary choice which we make, obtain any mean value whatever. But this is not the case. A simple calculation shows us that our double integral decreases very rapidly as t increases. Thus, I cannot tell what hypothesis to make as to the probability of this or that initial distribution, but when once the hypothesis is made the result will be the same, and this gets me out of my difficulty. Whatever the function Φ may be, the mean value tends towards zero as t increases, and as the minor planets have certainly accomplished a very large number of revolutions, I may assert that this mean value is very small. I may give to Φ any value I choose, with one restriction: this function must be continuous; and, in fact, from the point of view of subjective probability, the choice of a discontinuous function would have been unreasonable. What reason could I have, for instance, for supposing that the initial longitude might be exactly 0°, but that it could not lie between 0° and 1°?
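
The recipe just described (value at the centre of each element, times its surface, times the density, summed over the plane) can be sketched numerically. The density Φ below is one hypothetical continuous choice, defined for a between 0 and 1 and b between 0 and 2π and normalised to total mass 1; the grid sizes and the values of t are likewise arbitrary illustrations.

```python
import math


def density(a, b):
    """Hypothetical continuous density Φ(a, b); it integrates to 1 over
    0 <= a <= 1, 0 <= b < 2*pi and favours some initial longitudes."""
    return 6 * a * (1 - a) * (1 + 0.8 * math.sin(b)) / (2 * math.pi)


def mean_sine(t, na=1000, nb=64):
    """Midpoint-rule approximation of the double integral of
    sin(a*t + b) * Φ(a, b) over the (a, b) plane."""
    da = 1 / na
    db = 2 * math.pi / nb
    total = 0.0
    for i in range(na):
        a = (i + 0.5) * da          # centre of the element in a
        for j in range(nb):
            b = (j + 0.5) * db      # centre of the element in b
            total += math.sin(a * t + b) * density(a, b)
    return total * da * db          # multiply by the surface of an element


means = {t: mean_sine(t) for t in (0, 10, 100)}
for t, value in means.items():
    print(t, value)
```

For this particular Φ the mean starts near 0.4 at the epoch zero and shrinks as t grows; replacing `density` by another continuous function changes the starting value but, on the argument of the text, not the eventual decay.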

The difficulty reappears if we look at it from the point of view of objective probability; if we pass from our imaginary distribution in which the supposititious matter was assumed to be continuous, to the real distribution in which our representative points are formed as discrete atoms. The mean value of sin (at + b) will be represented quite simply by

${\displaystyle \frac{1}{n}\sum \sin(at+b),}$

n being the number of minor planets. Instead of a double integral referring to a continuous function, we shall have a sum of discrete terms. However, no one will seriously doubt that this mean value is practically very small. Our representative points being very close together, our discrete sum will in general differ very little from an integral. An integral is the limit towards which a sum of terms tends when the number of these terms is indefinitely increased. If the terms are very numerous, the sum will differ very little from its limit—that is to say, from the integral, and what I said of the latter will still be true of the sum itself. But there are exceptions. If, for instance, for all the minor planets b = π/2 - at, the longitude of all the planets at the time t would be π/2, and the mean value in question would be evidently unity. For this to be the case at the time 0, the minor planets must have all been lying on a kind of spiral of peculiar form, with its spires very close together. All will admit that such an initial distribution is extremely improbable (and even if it were realised, the distribution would not be uniform at the present time—for example, on the 1st January 1900; but it would become so a few years later). Why, then, do we think this initial distribution improbable? This must be explained, for if we are wrong in rejecting as improbable this absurd hypothesis, our inquiry breaks down, and we can no longer affirm anything on the subject of the probability of this or that present distribution. Once more we shall invoke the principle of sufficient reason, to which we must always recur. We might admit that at the beginning the planets were distributed almost in a straight line. We might admit that they were irregularly distributed. But it seems to us that there is no sufficient reason for the unknown cause that gave them birth to have acted along a curve so regular and yet so complicated, which would appear to have been expressly chosen so that the distribution at the present day would not be uniform.
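
The discrete sum and its exceptional case can be tried on invented data. In the sketch below the number of planets, the range of mean motions, the clustered initial longitudes and the epoch `t_now` are all hypothetical; the "spiral" is built by setting b = π/2 - a·t_now for every planet, as in the exception described above.

```python
import math
import random


def mean_sine(planets, t):
    """(1/n) * sum of sin(a*t + b) over planets given as (a, b) pairs."""
    return sum(math.sin(a * t + b) for a, b in planets) / len(planets)


rng = random.Random(0)
n = 2000
t_now = 1000.0  # hypothetical present epoch, in arbitrary units

# Initial longitudes clustered around 1 radian: far from uniform at epoch zero.
clustered = [(rng.uniform(0.5, 1.5), rng.gauss(1.0, 0.3)) for _ in range(n)]

# Exceptional initial distribution chosen so that every longitude is pi/2 at t_now.
spiral = [(a, math.pi / 2 - a * t_now) for a, _ in clustered]

mean_clustered_start = mean_sine(clustered, 0.0)
mean_clustered_now = mean_sine(clustered, t_now)
mean_spiral_now = mean_sine(spiral, t_now)
mean_spiral_later = mean_sine(spiral, t_now + 200.0)
print(mean_clustered_start, mean_clustered_now, mean_spiral_now, mean_spiral_later)
```

The clustered system has a large mean at the start and a small one at `t_now`; the spiral gives a mean of unity at `t_now` only, and a small one again some time later.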

IV. Rouge et Noir.—The questions raised by games of chance, such as roulette, are, fundamentally, quite analogous to those we have just treated. For example, a wheel is divided into thirty-seven equal compartments, alternately red and black. A ball is spun round the wheel, and after having moved round a number of times, it stops in front of one of these sub-divisions. The probability that the division is red is obviously 1/2. The needle describes an angle θ, including several complete revolutions. I do not know what is the probability that the ball is spun with such a force that this angle should lie between θ and θ+dθ, but I can make a convention. I can suppose that this probability is Φ(θ)dθ. As for the function Φ(θ), I can choose it in an entirely arbitrary manner. I have nothing to guide me in my choice, but I am naturally induced to suppose the function to be continuous. Let ε be a length (measured on the circumference of the circle of radius unity) of each red and black compartment. We have to calculate the integral of Φ(θ)dθ, extending it on the one hand to all the red, and on the other hand to all the black compartments, and to compare the results. Consider an interval 2ε comprising two consecutive red and black compartments. Let M and m be the maximum and minimum values of the function Φ(θ) in this interval. The integral extended to the red compartments will be smaller than ΣMε; extended to the black it will be greater than Σmε. The difference will therefore be smaller than Σ(M - m)ε. But if the function Φ is supposed continuous, and if on the other hand the interval ε is very small with respect to the total angle described by the needle, the difference M - m will be very small. The difference of the two integrals will be therefore very small, and the probability will be very nearly 1/2. We see that without knowing anything of the function Φ we must act as if the probability were 1/2. And on the other hand it explains why, from the objective point of view, if I watch a certain number of coups, observation will give me almost as many black coups as red. All the players know this objective law; but it leads them into a remarkable error, which has often been exposed, but into which they are always falling. When the red has won, for example, six times running, they bet on black, thinking that they are playing an absolutely safe game, because they say it is a very rare thing for the red to win seven times running. In reality their probability of winning is still 1/2. Observation shows, it is true, that the series of seven consecutive reds is very rare, but series of six reds followed by a black are also very rare. They have noticed the rarity of the series of seven reds; if they have not remarked the rarity of six reds and a black, it is only because such series strike the attention less.
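
Both halves of this section can be checked on a small scale. The sketch below is only an illustration under stated assumptions: the two densities Φ (given through their cumulative functions) and the scale values are hypothetical, the compartments are taken to alternate indefinitely along θ with width ε = 2π/37, and the enumeration of series supposes red and black equally likely and independent.

```python
import math
from itertools import product

EPS = 2 * math.pi / 37  # angular width of one compartment on the unit circle


def gamma_cdf(theta, scale):
    """Cumulative function of the smooth density θ·exp(-θ/scale)/scale² (hypothetical Φ)."""
    return 1 - (1 + theta / scale) * math.exp(-theta / scale)


def rayleigh_cdf(theta, scale):
    """Cumulative function of the smooth density θ·exp(-θ²/(2·scale²))/scale² (hypothetical Φ)."""
    return 1 - math.exp(-theta**2 / (2 * scale**2))


def probability_red(cdf, scale, eps=EPS):
    """Integral of Φ over the red compartments, taken as every other interval of width eps."""
    n_compartments = int(40 * scale / eps) + 1  # beyond this the remaining mass is negligible
    red = 0.0
    for k in range(0, n_compartments, 2):
        red += cdf((k + 1) * eps, scale) - cdf(k * eps, scale)
    return red


# Strong spin: the angle spreads over many compartments.
p_red_gamma = probability_red(gamma_cdf, scale=20.0)
p_red_rayleigh = probability_red(rayleigh_cdf, scale=20.0)
# Weak spin: eps is no longer small compared with the angle described.
p_red_weak = probability_red(gamma_cdf, scale=0.05)

# Series of seven coups: 2**7 equally likely cases.
series = list(product("RB", repeat=7))
six_reds_first = [s for s in series if s[:6] == tuple("RRRRRR")]
p_seven_reds = sum(1 for s in series if s == tuple("RRRRRRR")) / len(series)
p_six_reds_then_black = sum(1 for s in series if s == tuple("RRRRRRB")) / len(series)
p_black_after_six_reds = sum(1 for s in six_reds_first if s[6] == "B") / len(six_reds_first)
print(p_red_gamma, p_red_rayleigh, p_red_weak, p_black_after_six_reds)
```

The two strongly spun cases agree on a value close to 1/2 whatever the density, while the weakly spun case does not, which is the condition on ε stated in the text. The two rare series are equally rare, so the chance of black after six reds remains 1/2.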

V. The Probability of Causes.—We now come to the problems of the probability of causes, the most important from the point of view of scientific applications. Two stars, for instance, are very close together on the celestial sphere. Is this apparent contiguity a mere effect of chance? Are these stars, although almost on the same visual ray, situated at very different distances from the earth, and therefore very far indeed from one another? or does the apparent correspond to a real contiguity? This is a problem on the probability of causes.

Let us pass on to an example of a more scientific character. I require to determine an experimental law; this law, when discovered, can be represented by a curve. I make a certain number of isolated observations, each of which may be represented by a point. When I have obtained these different points, I draw a curve between them as carefully as possible, giving my curve a regular form, avoiding sharp angles, accentuated inflexions, and any sudden variation of the radius of curvature. This curve will represent to me the probable law, and not only will it give me the values of the functions intermediary to those which have been observed, but it also gives me the observed values more accurately than direct observation does; that is why I make the curve pass near the points and not through the points themselves.
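
The claim that the curve gives the observed values more accurately than direct observation can be tried on invented data. In the sketch below the true law (a straight line), the size of the errors and the seed are all hypothetical, and an ordinary least-squares line stands in for the carefully drawn regular curve.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rms(values):
    """Root mean square of a list of deviations."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def true_law(x):
    """Hypothetical true law, unknown to the observer."""
    return 2.0 * x + 1.0


rng = random.Random(1)
xs = [i / 10 for i in range(50)]
observed = [true_law(x) + rng.gauss(0, 0.5) for x in xs]  # isolated observations with errors

slope, intercept = fit_line(xs, observed)
fitted = [slope * x + intercept for x in xs]

rms_observed = rms([o - true_law(x) for o, x in zip(observed, xs)])
rms_fitted = rms([f - true_law(x) for f, x in zip(fitted, xs)])
print(rms_observed, rms_fitted)
```

The fitted values lie closer to the true law than the raw points do, but only because the form assumed for the curve happens to match the law; that assumption is an à priori choice.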

Here, then, is a problem in the probability of causes. The effects are the measurements I have recorded; they depend on the combination of two causes—the true law of the phenomenon and errors of observation. Knowing the effects, we have to find the probability that the phenomenon shall obey this law or that, and that the observations have been accompanied by this or that error. The most probable law, therefore, corresponds to the curve we have traced, and the most probable error is represented by the distance of the corresponding point from that curve. But the problem has no meaning if before the observations I had no à priori idea of the probability of this law or that, or of the chances of error to which I am exposed. If my instruments are good (and I knew whether this is so or not before beginning the observations), I shall not draw the curve far from the points which represent the rough measurements. If they are inferior, I may draw it a little farther from the points, so that I may get a less sinuous curve; much will be sacrificed to regularity.

Why, then, do I draw a curve without sinuosities? Because I consider à priori a law represented by a continuous function (or function the derivatives of which to a high order are small), as more probable than a law not satisfying those conditions. But for this conviction the problem would have no meaning; interpolation would be impossible; no law could be deduced from a finite number of observations; science would cease to exist.

Fifty years ago physicists considered, other things being equal, a simple law as more probable than a complicated law. This principle was even invoked in favour of Mariotte's law as against that of Regnault. But this belief is now repudiated; and yet, how many times are we compelled to act as though we still held it! However that may be, what remains of this tendency is the belief in continuity, and as we have just seen, if the belief in continuity were to disappear, experimental science would become impossible.

VI. The Theory of Errors.—We are thus brought to consider the theory of errors which is directly connected with the problem of the probability of causes. Here again we find effects—to wit, a certain number of irreconcilable observations, and we try to find the causes which are, on the one hand, the true value of the quantity to be measured, and, on the other, the error made in each isolated observation. We must calculate the probable à posteriori value of each error, and therefore the probable value of the quantity to be measured. But, as I have just explained, we cannot undertake this calculation unless we admit à priori, i.e., before any observations are made, that there is a law of the probability of errors. Is there a law of errors? The law to which all calculators assent is Gauss's law, that is represented by a certain transcendental curve known as the "bell."

But it is first of all necessary to recall the classic distinction between systematic and accidental errors. If the metre with which we measure a length is too long, the number we get will be too small, and it will be no use to measure several times—that is a systematic error. If we measure with an accurate metre, we may make a mistake, and find the length sometimes too large and sometimes too small, and when we take the mean of a large number of measurements, the error will tend to grow small. These are accidental errors.
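
The distinction can be seen in a small simulation. All numbers below are invented: a true length of 10, accidental errors drawn from a Gaussian of spread 0.05, and a metre that is too long so that every reading comes out 0.02 too small.

```python
import random

TRUE_LENGTH = 10.0  # hypothetical true value


def mean_error(n, bias, noise=0.05, seed=0):
    """Error of the mean of n measurements, each equal to
    true value + systematic bias + accidental Gaussian error."""
    rng = random.Random(seed)
    readings = [TRUE_LENGTH + bias + rng.gauss(0, noise) for _ in range(n)]
    return sum(readings) / n - TRUE_LENGTH


sizes = (10, 1000, 100_000)
accidental_only = {n: mean_error(n, bias=0.0) for n in sizes}
with_systematic = {n: mean_error(n, bias=-0.02) for n in sizes}  # metre too long
for n in sizes:
    print(n, accidental_only[n], with_systematic[n])
```

As the number of measurements grows, the first error shrinks towards zero while the second settles on the bias of the instrument, so that repetition is of no use against it.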

It is clear that systematic errors do not satisfy Gauss's law, but do accidental errors satisfy it? Numerous proofs have been attempted, almost all of them crude paralogisms. But starting from the following hypotheses we may prove Gauss's law: the error is the result of a very large number of partial and independent errors; each partial error is very small and obeys any law of probability whatever, provided the probability of a positive error is the same as that of an equal negative error. It is clear that these conditions will be often, but not always, fulfilled, and we may reserve the name of accidental for errors which satisfy them.
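
The hypotheses named here can be tried numerically. In the sketch below each total error is the sum of 48 independent partial errors, each uniform between -δ and +δ; the uniform law, the count of 48, the value of δ and the seed are arbitrary illustrations of "any law" that is symmetric, not choices taken from the text. The fractions of errors within one and two standard deviations are compared with those of the bell curve.

```python
import math
import random


def total_error(rng, parts, delta):
    """Sum of `parts` small independent errors, each symmetric about zero."""
    return sum(rng.uniform(-delta, delta) for _ in range(parts))


def gauss_fraction(k):
    """Fraction of a Gaussian law lying within k standard deviations."""
    return math.erf(k / math.sqrt(2))


rng = random.Random(2)
parts, delta = 48, 0.01
errors = [total_error(rng, parts, delta) for _ in range(20_000)]

# The variance of a uniform law on (-delta, delta) is delta**2 / 3.
sigma = delta * math.sqrt(parts / 3)
within_1 = sum(1 for e in errors if abs(e) <= sigma) / len(errors)
within_2 = sum(1 for e in errors if abs(e) <= 2 * sigma) / len(errors)
print(within_1, gauss_fraction(1), within_2, gauss_fraction(2))
```

A partial law that is not symmetric, or a single partial error much larger than the rest, would break the hypotheses and need not give the same agreement.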

We see that the method of least squares is not legitimate in every case; in general, physicists are more distrustful of it than astronomers. This is no doubt because the latter, apart from the systematic errors to which they and the physicists are subject alike, have to contend with an extremely important source of error which is entirely accidental—I mean atmospheric undulations. So it is very curious to hear a discussion between a physicist and an astronomer about a method of observation. The physicist, persuaded that one good measurement is worth more than many bad ones, is pre-eminently concerned with the elimination by means of every precaution of the final systematic errors; the astronomer retorts: "But you can only observe a small number of stars, and accidental errors will not disappear."

What conclusion must we draw? Must we continue to use the method of least squares? We must distinguish. We have eliminated all the systematic errors of which we have any suspicion; we are quite certain that there are others still, but we cannot detect them; and yet we must make up our minds and adopt a definitive value which will be regarded as the probable value; and for that purpose it is clear that the best thing we can do is to apply Gauss's law. We have only applied a practical rule referring to subjective probability. And there is no more to be said.

Yet we want to go farther and say that not only the probable value is so much, but that the probable error in the result is so much. This is absolutely invalid: it would be true only if we were sure that all the systematic errors were eliminated, and of that we know absolutely nothing. We have two series of observations; by applying the law of least squares we find that the probable error in the first series is twice as small as in the second. The second series may, however, be more accurate than the first, because the first is perhaps affected by a large systematic error. All that we can say is, that the first series is probably better than the second because its accidental error is smaller, and that we have no reason for affirming that the systematic error is greater for one of the series than for the other, our ignorance on this point being absolute.
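
The case of the two series can be made concrete with invented numbers. In the sketch below the first series has half the accidental scatter of the second but carries a hidden systematic error of 1.0; the true value, the sample sizes and the seed are hypothetical, and the standard error of the mean is used as a stand-in for the probable error, to which it is proportional under Gauss's law.

```python
import math
import random


def summarize(series):
    """Return (mean, standard error of the mean) computed from the data alone."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / (n - 1)
    return mean, math.sqrt(variance / n)


TRUE_VALUE = 100.0  # hypothetical, unknown to the observer
rng = random.Random(3)

# First series: small accidental errors, hidden systematic error of 1.0.
first = [TRUE_VALUE + 1.0 + rng.gauss(0, 0.5) for _ in range(100)]
# Second series: accidental errors twice as large, no systematic error.
second = [TRUE_VALUE + rng.gauss(0, 1.0) for _ in range(100)]

mean_1, se_1 = summarize(first)
mean_2, se_2 = summarize(second)
print(se_1, abs(mean_1 - TRUE_VALUE))
print(se_2, abs(mean_2 - TRUE_VALUE))
```

The computed error of the first series is the smaller, yet its mean is the farther from the true value; nothing in the data themselves reveals this.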

VII. Conclusions.—In the preceding lines I have set several problems, and have given no solution. I do not regret this, for perhaps they will invite the reader to reflect on these delicate questions.

However that may be, there are certain points which seem to be well established. To undertake the calculation of any probability, and even for that calculation to have any meaning at all, we must admit, as a point of departure, an hypothesis or convention which has always something arbitrary about it. In the choice of this convention we can be guided only by the principle of sufficient reason. Unfortunately, this principle is very vague and very elastic, and in the cursory examination we have just made we have seen it assume different forms. The form under which we meet it most often is the belief in continuity, a belief which it would be difficult to justify by apodeictic reasoning, but without which all science would be impossible. Finally, the problems to which the calculus of probabilities may be applied with profit are those in which the result is independent of the hypothesis made at the outset, provided only that this hypothesis satisfies the condition of continuity.