Popular Science Monthly/Volume 76/April 1910/The Population of the United States

 THE POPULATION OF THE UNITED STATES
By Professor JAMES S. STEVENS

UNIVERSITY OF MAINE

Of all branches of statistics, those which relate to the population of a country or city are of most general interest. The interest felt in the question of our national population culminates every ten years when the census is taken. To be sure, no great importance is to be attached to mere numbers; yet we can not help feeling a little pride if we belong to the biggest religious denomination, the biggest university, or the biggest country.

The plots in Fig. 1 represent the growth in population of various countries as indicated by the census of 1900. The curves were made by Mr. W. R. Wilcox and were printed in the census report for that year.

An examination of these curves shows that for the most part the growth of a country is constant; for example, the lines representing France, Spain, Sweden and Norway, Turkey and Italy, are nearly straight. This indicates that while the population of those countries is increasing slightly, there is no great gain from year to year. The population of the United States is represented by a curve which is well known in mathematics. In the chart below I have redrawn this curve, and with it one which is a true parabola.

It will be seen that these two are strikingly similar. Now if the population of the United States increased in such a manner as always to follow this parabolic form, the census enumeration would be unnecessary, as one could predict the future population from the past. Unfortunately, however, this is not the case; and it is only by a somewhat tedious method that we are able to predict the future population with any degree of certainty.

There are two kinds of formulæ: rational and empirical. A rational formula is one which is mathematically true under all conditions. That the volume of a cylinder equals π multiplied by the square of the radius, multiplied by the length, is a fact that does not depend upon any external circumstances whatever. On the other hand, the value of the acceleration due to gravitation is not a constant quantity, but differs with the latitude and altitude of the observer. This latter is one of the most important physical constants in nature and a great deal of time and money have been expended in determining its value. While there is no mathematical formula that expresses this value, an empirical formula has been devised in which, if one substitutes the latitude and altitude of the place of observation, a close approximation of the value of this constant may be obtained. To take another illustration: The relation between the velocity of water flow and its depth below the surface in a river is not a constant one, but repeated measurements have shown that by the use of an empirical formula containing certain constants, this relation may be quite well established. On the other hand, the relation between the space passed over by a freely falling body and the time of its fall is a perfectly definite one, and may be correctly expressed by saying that the space passed over varies as the square of the time.

Fig. 1.

Fig. 2.

These illustrations will be sufficient to make clear the distinction between rational, or mathematical, and empirical formulæ. It will be hardly necessary to state that the growth of a country in population follows the latter rather than the former. In order to predict the future population, it is necessary to determine what kind of a curve the previous numbers representing the population will plot. As has been stated above, this curve comes out a parabola and the equation for this parabola is

${\displaystyle P=S+TX+UX^{2}.}$

There are also certain terms of higher orders which may be omitted. In this equation, P represents the population at any time, X the number of the decade and S, T and U are constants, which are to be determined. The determination of these constants involves a somewhat technical process, which may be briefly stated as follows: We first write down the population of the United States for each decade since the census began to be taken.

| Year | Population (millions) |
|------|-----------------------|
| 1790 | 3.9 |
| 1800 | 5.3 |
| 1810 | 7.2 |
| 1820 | 9.6 |
| 1830 | 12.9 |
| 1840 | 17.1 |
| 1850 | 23.2 |
| 1860 | 31.4 |
| 1870 | 38.6 |
| 1880 | 50.2 |
| 1890 | 62.6 |
| 1900 | 76.3 |

From these observations we form what are called "observation equations" by substituting for P and X their proper values.

These observation equations follow:

Normal equations for each constant are formed from these observation equations by multiplying each equation by the coefficient of the constant concerned in the equation and adding. This gives us three equations containing three unknown quantities. These unknown quantities are determined by any method and substituted in the general formula for ${\displaystyle S,T}$ and ${\displaystyle U,}$ respectively. For example, in 1900, before the census returns for that year were available, the process above outlined yielded the following equations:

When these equations are solved, it is found that

${\displaystyle S=6.08,\ T=0.690,\ U=0.622}$

If we substitute these in the formula, we get

${\displaystyle P=6.08+6.9+62.2,\ {\text{or}}\ P=75.2\ {\text{millions}},}$

which is the forecast for 1900.

(It should be observed that in this work 1800 was taken as the origin, so that the year 1790 was considered −1 and the year 1900 was 10.)

This estimate proved somewhat low, as the census returns reported 76.3 millions for 1900. This indicates that the population of the country is growing a little more rapidly than would be indicated from its past history.
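The procedure described above can be sketched in Python. This is a minimal illustration, not the author's own computation: it assumes a plain, unweighted least-squares fit to the eleven censuses from 1790 to 1890, with 1800 coded as X = 0 (so that 1790 is −1 and 1900 is 10). The names `decade_index`, `normal_equations`, `solve3`, `fit_parabola` and `predict` are hypothetical helpers introduced only for this sketch.

```python
# Census counts in millions, 1790-1890, as tabulated in the article.
CENSUS = {
    1790: 3.9, 1800: 5.3, 1810: 7.2, 1820: 9.6, 1830: 12.9, 1840: 17.1,
    1850: 23.2, 1860: 31.4, 1870: 38.6, 1880: 50.2, 1890: 62.6,
}


def decade_index(year):
    """Assumed coding of X: 1800 -> 0, so 1790 -> -1 and 1900 -> 10."""
    return (year - 1800) / 10


def normal_equations(xs, ps):
    """Build the three normal equations for P = S + T*X + U*X**2."""
    # Each observation equation reads S*1 + T*x + U*x**2 = p.
    rows = [(1.0, x, x * x) for x in xs]
    # Multiply every observation equation by the coefficient of one
    # constant and add; doing this for S, T and U gives three equations.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * p for r, p in zip(rows, ps)) for i in range(3)]
    return a, b


def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_parabola(census):
    """Return the constants (S, T, U) fitted to a {year: population} table."""
    years = sorted(census)
    xs = [decade_index(y) for y in years]
    ps = [census[y] for y in years]
    return solve3(*normal_equations(xs, ps))


def predict(constants, year):
    """Evaluate P = S + T*X + U*X**2 for the given year."""
    s, t, u = constants
    x = decade_index(year)
    return s + t * x + u * x * x


constants = fit_parabola(CENSUS)
forecast_1900 = predict(constants, 1900)
print([round(c, 3) for c in constants], round(forecast_1900, 1))
```

A fit of this unweighted kind need not reproduce the constants quoted in the article, since the author may have weighted or rounded the observations differently; it should land in the same neighborhood, a little below the census figure for 1900.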

While the government authorities are at work on the census for 1910, it will be interesting to try this method of forecasting, and to see how well our results will compare with those to be announced later on. I have made a number of equations which are supposed to represent empirically the growth of the population of our country. These have been made in various ways, but all depend upon the parabolic formula, and the method outlined above.

The equations yield the following values for the census of 1910:

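| Equation | Forecast for 1910 (millions) |
|----------|------------------------------|
| A | ${\displaystyle P=89.7}$ |
| B | ${\displaystyle P=89.3}$ |
| C | ${\displaystyle P=89.7}$ |
| D | ${\displaystyle P=91.4}$ |
| E | ${\displaystyle P=91.4}$ |
| F | ${\displaystyle P=91.3}$ |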

It will be observed that these forecasts fall into two classes, in one of which the numbers run between eighty-nine and ninety millions and in the other a little over ninety-one millions. The former are based more strictly upon the formula as it stands, including the entire set of observations. In the latter, greater weight is given to more recent observations, as it is supposed that they represent more nearly the present rate of increase in the population. The last formula (F) is based upon three observations only, those for 1880, 1890, and 1900. It is probable that while the formulæ yielding the lower results conform more nearly to the population of our country in the past, the results which are yielded by the other set of formulæ are more correct for 1910. As an illustration of the closeness with which the formulæ conform to past conditions, we will determine the results for each census by means of the formula:

${\displaystyle P=5.13+0.358X+0.666X^{2}.}$

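| Year | P (Census) | P (By formula) | Difference |
|------|------------|----------------|------------|
| 1800 | 5.3 | 5.3 | 0.0 |
| 1810 | 7.2 | 6.2 | 1.0 |
| 1820 | 9.6 | 8.5 | 1.1 |
| 1830 | 12.9 | 12.2 | 0.7 |
| 1840 | 17.1 | 17.2 | 0.1 |
| 1850 | 23.2 | 23.6 | 0.4 |
| 1860 | 31.4 | 31.3 | 0.1 |
| 1870 | 38.6 | 40.3 | 1.7 |
| 1880 | 50.2 | 50.7 | 0.5 |
| 1890 | 62.6 | 62.4 | 0.2 |
| 1900 | 76.3 | 75.4 | 0.9 |
| Sum | | | 6.7 |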
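A comparison of this kind can be repeated with a few lines of Python. The sketch below assumes that X counts decades from 1800 (1800 is 0, 1900 is 10) and uses the rounded constants exactly as printed, so individual values may differ in the last digit from the printed table. The names `formula`, `deviations`, `total_deviation` and `worst_year` are introduced here for illustration only.

```python
# Census counts in millions, 1800-1900, as given in the article.
CENSUS = {
    1800: 5.3, 1810: 7.2, 1820: 9.6, 1830: 12.9, 1840: 17.1, 1850: 23.2,
    1860: 31.4, 1870: 38.6, 1880: 50.2, 1890: 62.6, 1900: 76.3,
}


def formula(year):
    """P = 5.13 + 0.358 X + 0.666 X^2, with X assumed to be decades since 1800."""
    x = (year - 1800) / 10
    return 5.13 + 0.358 * x + 0.666 * x * x


# Absolute difference between census and formula for every decade.
deviations = {year: abs(p - formula(year)) for year, p in CENSUS.items()}
total_deviation = sum(deviations.values())
worst_year = max(deviations, key=deviations.get)

for year in sorted(CENSUS):
    print(year, CENSUS[year], round(formula(year), 1), round(deviations[year], 1))
print("sum of deviations:", round(total_deviation, 1))
print("largest deviation in:", worst_year)
```

Summing absolute deviations, as done here, treats an overestimate and an underestimate alike, which is the sense in which the article's column of differences is added up.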

The formula published by Dr. H. S. Pritchett, in The Popular Science Monthly, of November, 1900, agreed more closely with the results of past censuses than the one used here. It will be noted that while the sum of the various deviations resulting from each application of the formula is 6.7 millions, that from Dr. Pritchett's formula is only about 4 millions. This formula, however, does not seem to fit the future conditions so well as the one employed here, for it gives a population of 77.5 millions for 1900, while the census returns show it to be 76.3 millions.

As a method of determining the population of the United States during the coming decades, the application of these formulæ is interesting. By the use of formula E—

${\displaystyle P=5.4+0.12X+0.7X^{2}}$

we obtain the following forecasts:

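| Year | Forecast (millions) |
|------|---------------------|
| 1910 | 91.4 |
| 1920 | 107.6 |
| 1930 | 125.3 |
| 1940 | 144.3 |
| 1950 | 164.7 |
| 1960 | 186.5 |
| 1970 | 209.7 |
| 1980 | 234.4 |
| 1990 | 260.4 |
| 2000 | 287.8 |
| 2500 | 3,443.8 |
| 3000 | 10,099.8 |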
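These forecasts follow directly from formula E once X is fixed. As a small sketch, assuming X counts decades from 1800 (so that 1910 is 11), the function below evaluates the formula for any year; `forecast_e` is a name chosen here for illustration.

```python
def forecast_e(year):
    """Formula E, P = 5.4 + 0.12 X + 0.7 X^2, in millions."""
    x = (year - 1800) / 10  # assumed coding: 1800 -> 0, 1910 -> 11
    return 5.4 + 0.12 * x + 0.7 * x * x


for year in (1910, 1950, 2000, 2270, 2500, 3000):
    print(year, round(forecast_e(year), 1))
```

Because of the squared term, the gain from one decade to the next is 0.12 + 0.7(2X + 1) millions, so each decade adds 1.4 millions more than the decade before it.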

Since one of the factors used in the formula is a square, it is noticeable that the increase is quite a rapid one as the years go on. In the year 2270, which is not so very remote, the estimate is 1,557 millions, which is about the population of the globe at the present day. The predictions reached by this formula are somewhat smaller than those given by the formula of Dr. Pritchett in the article referred to. An interesting point in the curve is noticed for the year 1870. It will be observed that the census population for this year differs more widely from the predicted population than that for any other year. This is probably due to two causes. In the first place, the effects of the civil war are shown in the reduction of the population, and, secondly, it is probable that the census of 1870 was not so accurately taken as that of any other decade. This latter reason is given by Mr. Robert Porter in the Census Bulletin No. 12, 1890.

There is another method of forecasting the census, which depends upon reported estimates of the population in various centers. "The World Almanac," for example, secures the best available data from government and other officials, and each decade estimates the census which is to be taken. In January, 1900, this estimate was 79.4 millions, while the census enumeration showed 76.3 millions. The estimate was thus about 4 per cent too high. In January, 1910, the estimate is given as 92.3 millions. If this is reduced in the same ratio, it gives a result of 88.8 millions for the year 1910.
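The correction applied to the Almanac estimate is a simple proportion, and can be checked as follows. The sketch assumes the 1910 estimate is scaled by the exact ratio of the 1900 census to the 1900 estimate; the variable names are illustrative only. Carried out this way, the result falls within about a tenth of a million of the figure quoted above, the small difference presumably being one of rounding.

```python
estimate_1900 = 79.4  # Almanac estimate made in January 1900, millions
census_1900 = 76.3    # census enumeration for 1900, millions
estimate_1910 = 92.3  # Almanac estimate made in January 1910, millions

# Fraction by which the 1900 estimate exceeded the census count.
overshoot = estimate_1900 / census_1900 - 1

# Scale the 1910 estimate down in the same ratio.
corrected_1910 = estimate_1910 * census_1900 / estimate_1900

print(f"1900 estimate too high by {overshoot:.1%}")
print(f"corrected 1910 estimate: {corrected_1910:.1f} millions")
```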

In conclusion, it may be stated that the results of empirical formulæ, unlike those of the mathematical formulæ, are never perfectly reliable or correct. It is, therefore, impossible to predict the population for 1910 with any such degree of certainty as one can predict the free fall of a body in a given interval of time. It is to these empirical formulæ, however, that science owes much of its progress, and the governments of civilized countries are spending thousands of dollars in order to bring the constants in these various formulæ a little nearer the truth. In its application to the problem before us, it may be stated that if the population for 1910 shall be found to conform to the general trend of increase in the population since the first census was taken, we may feel certain that it will come out about 89.7 millions. If, on the other hand, it shall be found to conform more nearly to the growth made in the last few decades, it will be about 91.3 millions. Of course, there is always the possibility that the case may be an abnormal one and that the growth of our country will be found to conform to no formula which represents its past growth. In such a case, all empirical methods are helpless. It should be added that the formulæ used in this article are anti-expansion formulæ; that is, they do not include the insular possessions of the United States.