The Value of Statistics
Popular Science Monthly, Volume 39, August 1891
By Hon. CARROLL D. WRIGHT.
THE German historian Schlosser has said that history is statistics ever advancing, and statistics is stationary history. Looking beneath the words of Schlosser, one must conclude that he means that the constant accumulation of statistical data from period to period, or from epoch to epoch—that is, statistics ever in motion—creates history, history being made up of the ever-advancing events of life, which are shown through statistical methods, but that statistics of one epoch constitutes the permanent history thereof. The statistician, therefore, in the truest sense, writes history, and he writes it in the most crystallized form which can be adopted. He uses symbols, it is true, but with them he unlocks the facts of his period, so that they may be made plain to all students coming after him; he tells the story of our present state, that when the age we live in becomes the past, that story shall be found to exist in true and just proportions. The very word "statistics" indicates the soundness of the German writer's thought. The word is from the French statistique—from the Greek statos, meaning fixed, settled; statos being based on the stem sta, meaning to stand. Statistics, then, is used to illustrate fixed and settled conditions.
As a department of political science, statistics is used to classify, arrange, and discuss facts relating to a part or the whole of a country or people, or facts relating to classes of individuals or interests in different countries, and especially those facts which illustrate the physical, social, moral, intellectual, political, industrial, and economical condition or changes of condition of the people, in so far as such conditions may be indicated through numerical and tabular statements.
It is not a matter of much consequence whether statistics is a science or a method. English writers on statistics generally consider that it constitutes a method. Continental writers more generally insist upon its being denominated a science. The American opinion follows that of the Continent. It is true that statistical research can be called a scientific method of determining facts, and for studying various phenomena from which laws relating to life, production, distribution, consumption, etc., can be drawn; and the method must be considered scientific, because by it facts can be clearly stated, classified, and analyzed, elements which make science. We speak of the science of botany, because, for one reason, all the facts relating to botany can be classified; and so as to other departments of knowledge, classification or the lack of it determining the scientific or unscientific character of the knowledge. Science demands a classification of facts so rigid that all men will consent to its integrity and to its use.
Whether statistics is a science or a scientific method, its use is sometimes empirical, deceptive, and illusory, and even dishonest; and because of these things the method is often condemned. We frequently hear it said that nothing is so uncertain as figures, and many writers contend that no thorough dependence can be placed upon statistical data. One long engaged in statistical work feels more and more keenly, as the results of original investigation, not only the limitations of statistics, but the fact that perfectly honest and truthful statistical tables may not only be vicious in themselves, but may also lead to the most worthless conclusions, the tables themselves not indicating, and it not being possible to fully indicate by them, the exact truth they contain. The method, I believe, is the surest for ascertaining conditions, and the truest on which to base conclusions; but the method must be supplemented by full and frank analysis. A statistical table, independent of such analysis, is to me what a red flag is to a bull. It immediately excites antagonism and invites attack. The value of any statistical presentation must depend upon the basis upon which it is made, the integrity of the collection of the various elements of it, and the analysis which accompanies it. No one has any right to quote statistical tables without using and understanding the analysis of them. It is because of the flippant and careless use of statistics by writers and speakers that it receives their condemnation. 
No one thinks, however, of condemning anæsthetics because the burglar chloroforms his victim; or the elementary rules of arithmetic, the means by which all honest accounts are kept, simply because dishonest accounts are made possible by the same means; yet I know that, because so many instances of the lying use of honest statistics meet one's observation, it is not remarkable that there are so many surprising denunciations of the method, and that we hear the oft-repeated statement that anything can be proved by statistics—a statement usually made for the purpose of belittling the importance and value of the method under discussion. It is perfectly true that one so disposed can, by dropping an essential element of a table, show the exact reverse of the truth, or, by a combination of truthful elements, prove an untruth; just as the foolish man thought he could prove by the Bible that there was no God, by quoting the exact language of the Psalms in the statement "There is no God," the whole statement being, "The fool hath said in his heart, There is no God." Such a use of statistics belongs to the theorist, who cares more for his idea than for the truth; who cares more for his view of the conditions of life than for the historical fact. Statistics is as dangerous in the hands of such a person as it is in the hands of the statistical mechanic—the man who is ready at all times to construct tables showing such points as may be ordered of him. This latter person is a nuisance. He perverts facts and shows how to reach conclusions false in themselves by the use of truthful statements. Some practical illustrations drawn from actual experience may be of value, and show how we can guard against vicious conclusions when the statistical statements from which the conclusions are drawn are true. This method will also show the true value of statistics.
I have seen it argued that the production of cheese in the United States is growing smaller and smaller, and is likely to cease entirely. This argument is based upon the following figures, taken from the agricultural statistics of the Federal censuses of 1850, '60, '70, and '80: The production of cheese, according to such statistics, was, in 1850, 105,535,893 pounds; in 1860, 103,663,927 pounds; in 1870, 53,492,153 pounds; and in 1880, 27,272,489 pounds. Nothing could be more conclusive as evidence than these figures, because, while the statement for 1850 might be erroneous on account of imperfect enumeration, the enumerations have grown more and more complete; and if the 1880 figures show only about one quarter of the product of 1850, the cheese industry must soon cease to exist. A very little study and a very small amount of common knowledge would have taught the one using the argument I have quoted to examine the statistics of manufactures, as the production of cheese has been thrown largely into the factory. By consulting this side of the census, it is found that the production of cheese in factories in 1880 was 171,750,495 pounds; the truth being, therefore, that the production of cheese in the United States is, as one would naturally suppose, largely on the increase.
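The cheese argument can be checked by simple arithmetic. The sketch below uses only the figures quoted above; the variable and function names are illustrative. Since the text gives no factory figure for 1850, the 1880 total is compared here with the 1850 farm figure alone, which is an assumption of this sketch and not a statement about the census.

```python
# Cheese production in pounds, as quoted in the text from the Federal censuses.
farm_cheese = {
    1850: 105_535_893,
    1860: 103_663_927,
    1870: 53_492_153,
    1880: 27_272_489,
}
factory_cheese_1880 = 171_750_495  # statistics of manufactures, 1880


def percent_change(old, new):
    """Percentage change from old to new (negative means a decline)."""
    return (new - old) / old * 100.0


# The misleading view: agricultural (farm) statistics only.
farm_only_change = percent_change(farm_cheese[1850], farm_cheese[1880])

# The fuller view: farm and factory production added together for 1880.
total_1880 = farm_cheese[1880] + factory_cheese_1880
combined_change = percent_change(farm_cheese[1850], total_1880)

print(f"Farm statistics alone, 1850 to 1880: {farm_only_change:+.1f} per cent")
print(f"Farm plus factory total for 1880:    {total_1880:,} pounds")
print(f"1880 total against 1850 farm figure: {combined_change:+.1f} per cent")
```

Both results come from the same truthful figures; the sign of the conclusion changes only because one essential element of the account was left out of the first.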
One of the most common statistical illustrations used in political arguments is that relating to the percentage of increase of population as compared with the percentage of increase in expenditures. I have prepared a little table (which I have never seen in print) showing the factors used in such an argument, supplemented by factors which ought to be used. The table is as follows:
Statement showing, decennially, the expenditures of the Government and the population of the United States from 1830 to 1890, inclusive, and the value of manufactured products in the United States from 1850 to 1880, inclusive, with the percentage of increase.
Looking at one part of the foregoing table, it will be seen that between 1850 and 1890 the population increased 170·45 per cent, while the expenditures of the Government for the same period increased 338·36 per cent, the percentage of increase in expenditures being enormously in excess of the increase in population. If, however, one should examine the business side, which offers the truer basis of comparison, so far as expenditures are concerned, it will be seen that the percentage of increase in 1880 over 1850 was 426·89 per cent. The figures for 1890 are not yet available. Taking the same year, that is, 1880, for the three elements, we find that the population increased in 1880 as over 1850 116·26 per cent, expenditures for the same period 218·17 per cent, and the value of manufactured products increased 426·89 per cent. To carry this illustration to its logical completeness, the statistics of valuations and some others should be added, but they would simply be accumulative on the business side of the comparison as against the simple comparison of expenditures with population.
Criminal statistics are, perhaps, the most misleading, even when absolutely correct. The attempt is often made to compare the criminal statistics of one State with those of another, in order to determine relative criminal conditions. Such a comparison is and must be thoroughly vicious in every element. One illustration will be sufficient. Suppose one should undertake, as has been the case, to compare the criminal conditions of Massachusetts and Virginia by means of the statistics of crime. Such a comparison would lead only to unjust conclusions, to angry discussion, and to general ill-feeling, for the reason that the criminal codes of the two States differ widely. A few years ago I had occasion to examine this subject, and I found that the criminal code of Massachusetts, at the time the study was made, provided for the punishment of 158 offenses designated as crimes, and the sentences under the statutes of course appeared in the criminal statistics of Massachusetts. The code of Virginia at the same time recognized but 108 such offenses as crimes punishable at law; that is, there were 50 distinct offenses known to the Massachusetts law which were not to be found in the criminal laws of Virginia. No honest comparison, therefore, could be made between the criminal statistics of the two States, and a truthful statement of such statistics in comparison would lead to the most dishonest conclusions. Even with parallel codes and with accurate statistics of the number of persons in prison for crime, no conclusions would be justifiable, for of the offenses common to both States several were punishable by imprisonment in Massachusetts, but by fine only in Virginia. So the prison statistics would show sentences under grave crimes in Massachusetts, while the prison statistics of Virginia would show that no one had been sentenced for such crimes. 
The grave offenses coming under this distinction are adultery, fornication, lewd conduct, drunkenness, carrying concealed weapons, extortion. From official returns made to the Secretary of the Commonwealth of Massachusetts from the various prisons for the year in which the previous study was made, it appeared that more than 54 per cent of the commitments were for crimes which in Virginia would have been punished by fine only, and the persons so punished never would have appeared, of course, in the prison statistics of the former State. Statistically speaking, then, the only comparisons that would approach fairness, under the conditions named, would be to reduce the actual number of prisoners confined for crimes in Massachusetts 33 per cent to conform to the number of crimes known to the two States in common. The balance, then, would have to be reduced 54 per cent on account of the crimes punishable in Massachusetts by imprisonment for which a fine only is imposed in Virginia. This illustration indicates how unwise it is to undertake to prove the moral or immoral condition of one community as compared with another by criminal statistics. The more accurate the statistics, the more unjust and vicious the comparison.
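The two successive reductions described above may be sketched as follows. The starting number of prisoners is hypothetical and is chosen only to make the arithmetic easy to follow; the two percentages are those given in the text, and the function name is illustrative.

```python
def comparable_prisoners(massachusetts_prisoners,
                         code_reduction=0.33,
                         fine_only_share=0.54):
    """Apply the two successive reductions described in the text.

    code_reduction:  share removed for offenses not known to both codes.
    fine_only_share: share of the balance removed for offenses punished
                     by fine only in the other State.
    """
    after_common_code = massachusetts_prisoners * (1 - code_reduction)
    return after_common_code * (1 - fine_only_share)


# Hypothetical starting figure, not taken from any official return.
hypothetical_prisoners = 1000
remaining = comparable_prisoners(hypothetical_prisoners)
removed_share = (1 - remaining / hypothetical_prisoners) * 100.0

print(f"Comparable prisoners out of {hypothetical_prisoners}: {remaining:.1f}")
print(f"Share removed by the two adjustments: {removed_share:.1f} per cent")
```

On these assumptions fewer than one third of the recorded prisoners would remain for any comparison, which shows how far the raw totals stand from comparable ones.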
Another exceedingly effective illustration drawn from criminal statistics relates to some of the most perfect statistical showings I have ever had occasion to examine. From the year 1860 to 1879, inclusive, the criminal statistics of Massachusetts are perfect, and are the results of the certified reports of the clerks of all the criminal courts in the State. From the official statistics, as reported by the Massachusetts Bureau of Statistics of Labor in its Eleventh Annual Report, January, 1880, I have drawn certain comparative columns covering the crime of Massachusetts for the twenty years named. These columns are shown in the table on page 450.
What would a superficial examination of the foregoing figures, which, as I have said, are among the most accurate statistics I have ever examined, prove as to the progress of crime in the State named? I will use what I have seen or have known others to use, referring to these statistics. The increase in population in Massachusetts for the twenty years covered by the table was 50·4 per cent. The percentage of increase of crime for the same period was 70·4 per cent. If we look back to 1875, we shall find that the table shows that the population increased 34·1 per cent over 1860, and that for the same period crime increased 144 per cent, while in 1873 the increase of crime was 179·3 per cent. These figures, perfectly true and accurate, used deftly, give an exceedingly black eye to the State of Massachusetts, and no one can gainsay the bare statistical conclusion or attack the accuracy of the figures on which the conclusion is based. A very casual study of all the facts, however, relieves the Commonwealth of the oft-repeated statement that crime is increasing much faster than population. The total number of sentences for all offenses for the twenty years is 578,348. An examination of the column of the foregoing table headed "Drunkenness, including common drunkards," shows that the percentage of increase since 1860 is 155·9 per cent, the total being 340,814; that is to say, 60 per cent of the total number of crimes reported under all classes of offenses belong entirely to what may be called "rum offenses." An examination of the statistics of crimes other than drunkenness and liquor offenses shows that the increase for the twenty years from 1860 to 1879, inclusive, was but 20·1 per cent, as against an increase of 50·4 per cent in the population. But the truest comparison is based on what are called "high crimes." These are the crimes which represent criminal conditions more than any other. They are the crimes of abortion, felonious assault, burglary, breaking and entering, burning a building, embezzlement, forgery, incest, murder, manslaughter, robbery, and rape. The total number of sentences under all these high crimes for the twenty years is shown in the foregoing table, with percentage of increase since 1860 brought into comparison with the increase of population. This increase in 1879 was 39·6 per cent over 1860, while the population for the same time had increased 50·4 per cent. This side of the table shows that any argument made to prove that the crime of the State of Massachusetts for the twenty years named increased much more rapidly than the population is a vicious argument and the conclusion a false one.
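The comparison just made can be set out in a few lines. The sketch uses only the totals and percentages quoted in the text; the dictionary keys are shortened labels chosen for illustration and are not the headings of the original table.

```python
# Totals for the twenty years 1860 to 1879, as quoted in the text.
total_sentences = 578_348
drunkenness_sentences = 340_814

# Share of all sentences that belong to drunkenness.
drunkenness_share = drunkenness_sentences / total_sentences * 100.0

# Percentage increases, 1879 over 1860, as quoted in the text.
population_increase = 50.4
increase_by_class = {
    "all offenses": 70.4,
    "drunkenness": 155.9,
    "other than drunkenness and liquor offenses": 20.1,
    "high crimes": 39.6,
}

# Which classes of offense grew faster than the population?
faster_than_population = [
    name for name, pct in increase_by_class.items()
    if pct > population_increase
]

print(f"Drunkenness share of all sentences: {drunkenness_share:.1f} per cent")
for name, pct in increase_by_class.items():
    verdict = "faster" if pct > population_increase else "slower"
    print(f"{name}: {pct} per cent ({verdict} than population)")
```

The computed share is a little under the round figure of 60 per cent given above. Only the aggregate and the drunkenness column outrun the population; the classes that best represent criminal conditions fall below it.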
The high crimes are those that have been always recognized as crimes. The crime of drunkenness and other crimes belonging to liquor legislation are modern, and in the light of this statement the table receives valuable explanation. In Massachusetts the prohibitory law of 1855 was in force until 1868, when a license law was passed. Under the former the vigorous prosecutions of 1866 and 1867 caused the number of sentences to increase rapidly, and this led to a repeal of the prohibitory law and the enactment of the license law of 1868. The friends of the license law insisted upon its vigorous enforcement, and this action carried the number of rum convictions of all grades still higher, when another reaction secured the re-enactment of the prohibitory law, which went into effect July 1, 1869, and under the application of this law the lines of figures representing drunkenness and liquor offenses went to their highest point in 1872. In 1870 the Legislature allowed the free sale of “ale, porter, strong beer, and lager beer,” everywhere, unless prohibited by a vote of a city or town. This law was repealed in 1873. From the year 1873, either through the effect of the repeal of the beer law or of waning interest in the prohibitory law, resulting in decreased vigilance in prosecutions, the lines of figures dropped till 1875, when the prohibitory law was repealed. From 1876 to 1879, the last year named in the table, the figures constantly decreased.
It would be interesting in this respect to inquire whether the figures representing rum crimes are due to legislation wholly, or to vigorous or weak execution of law alternately applied, or to the positive decline of drunkenness through the efforts of reform movements. It is true that sentences for minor crimes and misdemeanors, and even felonious assaults and aggravated crimes, have risen or fallen, as indicated by the barometer of sentences for rum crimes alone. Legislative crimes—offenses which have been named crimes by legislative enactment—should not be used to show increase of crime in volume. Civilization has raised many things formerly considered as, perhaps, immoral and as offenses against the moral law to well-defined crimes. The result is, that we are constantly increasing the work of criminal courts; the number of sentences is thus increased comparatively, even when the volume of crime, as shown by the comparisons of crime per se, may decrease. So truthful statistics may show absolutely false conclusions, unless the elements are intelligently and honestly used.
Many illustrations as forcible as those cited might be drawn from the statistical work of the State and Federal Governments, but those given are sufficient to illustrate how dangerous truthful statistics may be; and to show, further, that the value of statistics depends upon not only the integrity of their basis, but also on their intelligent and honest analysis.
Another line of fallacies comes from the misuse of averages. Mr. W. L. Sargant, in his Essays published in London in 1870, has an exceedingly interesting chapter entitled The Lies of Statistics, and I am indebted to him to some extent for an illustration as to averages. The frequent fallacies in the practice of striking averages add greatly to the disturbing influences resulting from inaccurate enumerations, the perplexity and differences in international trade accounts, the miscalculations by individual inquirers, and the inadequate consideration of all the elements of tabular statements. M. Quetelet explained the principle which ought to guide us in the matter of averages. He pointed out that an average may indicate two different things. For instance, if one measures Nelson's Monument ten times, and always with a slightly different result, and then adds the measurements together and divides the sum by ten, the quotient is an average or mean. So one may accurately measure the Duke of York's Pillar, the Parisian obelisk, and the Column Vendome, add the measurements together and divide the sum by three, and declare the quotient to be the average or mean height of these three monuments. M. Quetelet contended, and properly, that the results in the two instances are of such different significance as to require two separate names. He would limit the term “average or mean” to cases represented by the first illustration—the repeated measurement of one monument—and he would apply the term “arithmetical mean” to cases represented by the second illustration—the measurement of several monuments. The repeated measurement of one monument results in a mean approximation to something actually existing, and this is an excellent definition of an average. The measurements and calculations having reference to a number of monuments result in no knowledge of anything existing; they simply and only indicate a relation among things actually existing. 
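The distinction drawn by M. Quetelet can be illustrated numerically. All measurements below are hypothetical and are not the heights of any actual monument; the same arithmetic is applied in both cases, and only the meaning of the quotient differs.

```python
from statistics import mean

# Hypothetical: ten measurements, in feet, of ONE monument.
repeated_measurements = [169.2, 169.5, 168.9, 169.4, 169.0,
                         169.3, 169.1, 169.6, 168.8, 169.2]

# Hypothetical: one accurate measurement, in feet, of each of THREE monuments.
three_monuments = [124.0, 75.0, 145.0]

# Case 1: an "average or mean" in Quetelet's sense. It approximates
# something that actually exists, namely the one monument's true height.
true_average = mean(repeated_measurements)
error_spread = max(repeated_measurements) - min(repeated_measurements)

# Case 2: an "arithmetical mean". It describes no monument at all,
# only a relation among three different things.
arithmetical_mean = mean(three_monuments)
height_spread = max(three_monuments) - min(three_monuments)

print(f"Mean of repeated measurements: {true_average:.1f} ft "
      f"(spread {error_spread:.1f} ft, due to measurement error)")
print(f"Mean of three monuments:       {arithmetical_mean:.1f} ft "
      f"(spread {height_spread:.1f} ft, due to real differences)")
```

In the first case the small spread reflects errors of measurement around a single real height; in the second the large spread reflects real differences, and no monument in the list stands at the computed mean.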
It is through a misunderstanding of these elements that we have so many misleading statements of statistics relating to wages and prices. The development of wage statistics has kept pace with all statistical methods. The great trouble is that, on account of faulty presentations in the past, no very satisfactory comparisons of the present conditions with the past can be made; and, generally speaking, those who use statistical comparisons covering a period of years should be exceedingly careful that the elements are approximately identical for the various years of the period.
Statistical science improves like all others, and this improvement is doing much to lead empirical statisticians into erroneous conclusions. The original faults of statistics are great enough, but the faults resulting from ignorant comparisons are greater still. On the whole, however, enormous as have been the errors, false as have been many of the statistical statements of official reports, inaccurate as have been many of the calculations, and fallacious and almost monstrous as have been many of the inferences, political economy has, nevertheless, profited greatly by what has been accomplished. The errors are gradually disappearing, and a very considerable remainder of truth has been left. We know far more than did our fathers of the progress of population, the resources of the nation, the earnings of the people, the cost of living, the efficiency of labor, more of criminal conditions, of mortality in town and country, of vagrancy and pauperism, of crowding and immigration; and, in fact, know more of all the conditions of life which make up sociology. Legislators and philanthropists could ill spare their statistical guides, lame and delusive though they be, for “know thyself” applies to nations as well as to men, and that nation which neglects to study its own conditions and affairs in the most searching and critical manner must fall into retrogression. History is, indeed, statistics ever advancing, and statistics is stationary history. Science is best taught by examples of errors. This is to statistical art what a chapter of fallacies is to logic.