# Popular Science Monthly/Volume 78/April 1911/Scientific versus Personal Distribution of College Credits

 SCIENTIFIC VERSUS PERSONAL DISTRIBUTION OF COLLEGE CREDITS[1]
By President WILLIAM T. FOSTER

REED COLLEGE, PORTLAND, OREGON

EARLIER articles in the Popular Science Monthly and in Science have shown that grades in college courses have no exact meaning. Yet college honors are everywhere awarded on the naïve assumption that grades in college courses are distributed on a scientific basis. For many important administrative purposes we assume that an A in one course is equivalent to an A in another course; that the 80 per cent. of one instructor indicates an achievement equal to the 80 per cent. of another instructor. Accordingly, we estimate the fitness of candidates for admission, determine eligibility for athletics, assign annually hundreds of thousands of dollars in scholarships and fellowships, award commencement honors, elect men to Phi Beta Kappa, and confer degrees wholly, or in large part, on the evidence secured by merely counting the number of As, the number of Bs, and so forth, that each student has to his credit. The pertinent question is to what extent our assumption of the equivalency of grades is warranted by the facts.

Our universities and colleges vary so little in this phase of the administration of the curriculum that the detailed distribution of the grades of a few institutions for a few years will fairly represent the practise, except in two or three universities, throughout the country. The grades A, B, C, D usually represent degrees of excellence between 100 per cent. and 60 per cent. of some undefined thing and are all pass marks. The grade E commonly indicates failure. In the figures here presented, the grades have these meanings. The per cent. of the students in each subject who receive each grade is graphically shown, so that a glance reveals the central tendency for each grade in each institution and the extreme deviations in both directions. In all cases the names of instructors and the exact designations of the courses are omitted, at the request of the several institutions concerned; though one may be pardoned the query: what objections could there be to publicity, if grades were distributed on a defensible basis?

Figs. 1 and 2 show the proportion each grade is of the whole number given at Harvard College in each of the elementary courses in twenty-one subjects during one academic year. Thus, the range of the highest credit (A) is from one per cent. in English to thirty-five per cent. in Greek. The range of grade B is from eleven per cent. in English, zoology, engineering and astronomy to thirty-three per cent. in fine arts. Grade C shows a minimum of eighteen per cent. in mathematics and a maximum of fifty-two per cent. in English. Grade D ranges from ten to thirty-one; grade E, from one to twenty-one. Courses with fewer than one hundred students are omitted, as the smaller courses are not fairly comparable in a single year with the larger ones on a percentage basis.

TABLE I

Harvard College

Distribution of 8,969 Grades, Elementary Courses

Still further to safeguard our comparisons, the intermediate and advanced courses are grouped by themselves. Some men believe that the credits in an advanced course, which to some extent represents the survival of the fittest students in the department, should be differently distributed from the credits in an elementary course in the same subject. College records everywhere show that a larger proportion of the high rank men than of the low rank men, in an introductory course, continue the subject in advanced courses. Indeed, one of the chief objects of the elective system is to enable students to specialize in fields in which they are likely to achieve distinction. But this hardly justifies the extreme and continued variations among the grade distributions of the intermediate group of courses, nor does it account in a satisfactory way for the diverse practises among advanced courses. Figs. 3 and 4 show a variation of two per cent. to sixty per cent. in the As given in intermediate courses in Harvard College; and extremes of seventeen per cent. and seventy-four per cent. in the case of grade B. Fig. 4 pictures the statistics of grades C, D and E.

Table II. indicates the comparative distribution of grades A-E for the three groups—elementary, intermediate and advanced.

TABLE II

Harvard College

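| Totals | %A | %B | %C | %D | %E | No. of Grades |
| --- | --- | --- | --- | --- | --- | --- |
| Group I. | 7 | 20 | 42 | 21 | 7 | 8,969 |
| Group II. | 12 | 28 | 37 | 13 | 4 | 2,426 |
| Group III. | 36 | 38 | 13 | 2 | 2 | 476 |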

Not only are there extreme variations among different courses, but there are variations in the same courses from year to year that can not be accounted for, apparently, by any of our scientific studies in the distribution of abilities among human beings. Fig. 5 shows the per cent. of As and the per cent. of Bs given in each of 53 courses at Harvard College for 1906-07 contrasted with the same data concerning the same courses for the following year. The data are taken from the published rank lists and from the enrollment in courses as given annually in the president's reports. A glance at the graphs in Fig. 5 reveals the fact that eight or ten courses show marked changes from one year to the next, whereas in eighty to ninety per cent. of the courses there are only negligible variations.

Another fact revealed in Fig. 5 is worth noting here. The six courses reaching highest in the A group are as follows: courses in Greek, Greek, Italian, Greek, Greek, Latin. The exact designations are withheld because of the respect of the administration for the feelings of individual instructors. Nevertheless, students are encouraged to examine the published rank lists, in which he who runs for high grades may read his chances. Reference to Fig. 1 shows that in 1903-04, as well, Greek far surpassed all other subjects in awarding high marks.

Figs. 6 and 7 show the range of percentages for each grade, A-E, for each instructor in Bowdoin College, Maine, giving in 1907-08 over one hundred grades. Fig. 8 gives the distribution of highest grades and of lowest grades, the honors and the failures, for various departments at the University of California in December, 1906. It is unnecessary to present statistics for other years or for other institutions. From Maine to California the administration of college credits, though alike in no other particular, agrees in this: that its basis is personal rather than scientific.[2] We do not know what the symbols mean: they have no defined meaning. To reply that a given symbol signifies a certain per cent. of an undefined something is to beg the question. The administration of the curriculum on a percentage basis (giving grade B in Greek, for example, a value of 80 to 90 per cent.) is apparently exact, but in reality far from it.

We have here used the statistics of institutions which do not attempt to grade students in many groups. The usual school method of grading on a one-hundred-per-cent. basis produces confusion. As a matter of fact, teachers can not use more than a dozen grades with discrimination. This has been proved repeatedly by careful tests, and is evident from the erratic clustering of grades around arbitrary points on the scale in every school where the attempt is made to use a scale with thirty or more divisions.

Individual instructors, in defense of their extreme variations from the mean proportion of high and low grades, often assert that the students who elect their subjects are much better than the students who elect other courses. Figs. 2 and 8 seem to indicate that quite the opposite is the truth. The poorer students elect a larger proportion of their work than the better students from courses in which the number of high grades given is relatively large. The better students elect a larger proportion of their work than the poorer students from courses in which the number of low grades given is relatively large. Furthermore, it is possible to show that the variations in grade distribution do not represent equal variations in the abilities of the groups of students concerned. We can demonstrate this by comparing the grades attained by a large number of students in certain departments with the grades attained by these same students in other departments.

Such a study is summarized in Table III. It exhibits the record in Harvard College of 363 men from twelve classes who later graduated with honor from the Harvard Law and Medical Schools. It gives the exact number of students receiving in a given subject a rank higher than their median rank for all subjects. Thus it appears that English, fine arts, mathematics, classics and modern languages, in the order named, constitute a group in which the grades assigned in this institution are comparatively low. On the other hand, natural sciences, philosophy and history and political sciences, in the order named, make up a group in which the grades assigned are comparatively high. At the two extremes stand English, in which 86 per cent. of the students received lower than their median rank in all subjects, and natural sciences, in which 71 per cent. of the same students received higher than their median rank. Furthermore, this table does not represent the extreme variations within departments. The eccentricities of the hardest markers in English and the easiest markers in natural sciences are here offset by the other markers in each department. We must conclude, therefore, that the diverse distribution of grades shown in the figures can not be justified by the unsupported assertion that the students electing certain subjects have far more ability than the students electing other subjects.
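The two extremes quoted above can be recomputed from the Summary of Table III. The following is only a sketch: it assumes that students standing exactly at their own average are left out of both counts, and the name `share_above` is introduced here for illustration. The fourth department is labelled "Philosophy", following the wording of the paragraph above.

```python
# Counts from the Summary of Table III: (number above, number below) the
# student's own general average, by department.
TABLE_III = {
    "Fine Arts": (56, 85),
    "Natural sciences": (156, 64),
    "Mathematics": (69, 81),
    "Philosophy": (131, 65),
    "History": (128, 82),
    "Modern languages": (117, 139),
    "English": (24, 155),
    "Classics": (66, 73),
}


def share_above(above, below):
    """Per cent. of the rated students who stood above their own average."""
    return 100.0 * above / (above + below)


shares = {dept: share_above(*counts) for dept, counts in TABLE_III.items()}

# List the departments from the severest marking to the most lenient.
for dept, pct in sorted(shares.items(), key=lambda item: item[1]):
    print(f"{dept:18s} above: {pct:5.1f}%   below: {100 - pct:5.1f}%")
```

On these counts English stands lowest, with roughly 86.6 per cent. of its students below their own average, and natural sciences highest, with roughly 70.9 per cent. above, which agrees with the figures of 86 and 71 quoted in the text.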

The question arises whether it is possible to supplant the personal equation, as the chief factor in the awarding of college grades, by scientific guidance. The immediate answer to this question depends on whether the distribution of mental traits in groups of individuals follows any regular law, and for the present on nothing else. The ultimate answer to this question will be the discovery of units of measurement in every school subject, and the construction, by scientific methods, of scales that can be applied as the foot-rule is now applied, regardless of time, place or persons. The best possible ratings of individuals by relative position are only temporary expedients that must some day give way to ratings by means of standard scales. The nearest approach to such a scale, and a perfect illustration of the method, is E. L. Thorndike's "Handwriting," Teachers College Record, March, 1910. The Courtis Standard Tests in Arithmetic also furnish a means of comparing the achievement of one school with that of another, and the work of one year with that of another. We are not likely to continue to spend billions of dollars on education and be satisfied with guessing at results. Measurements of results with quantitative precision will be made as soon as people know enough to demand such measurements.

Lacking the necessary units and scales, we may even now ask whether the differences among individuals in mental capacities are explainable by any simple causes and amenable to any single type of description. They are not, if we are to accept the tables and figures just presented as correct records of the abilities of college students. But fortunately we are not dependent on such unscientific data. Psychologists have recently given us many rigorously scientific studies of the distribution of mental traits.

These studies have shown that in any group of individuals representing a single species, the distribution of any trait not greatly influenced by natural selection appears to be that of a chance event. The surface of frequency for that trait approaches that of the probability integral. It is like the cross-section of a pile of sand dumped from a cart. The most convenient way to represent tables of frequencies is by means of diagrams in which distances along a base line represent the different quantities, or units of measurement, and the heights of columns erected upon it represent the frequencies of the several quantities. Fig. 9 presents several illustrations, D representing the results of a memory test. By such graphic representations rather than algebraic formulae, the answer to our question and the evidence for it can be made clear even to one unfamiliar with the mathematical properties of the surface of frequency of a chance event.
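A surface of frequency of a chance event can be drawn without any algebra. As a minimal sketch, the tossing of ten coins is assumed as the chance event (an illustration chosen here, not an experiment reported in the article), and the number of ways of obtaining each count of heads is printed as a column of marks laid on its side.

```python
from math import comb


def chance_frequencies(tosses):
    """Number of ways to obtain 0..tosses heads when `tosses` coins are thrown."""
    return [comb(tosses, heads) for heads in range(tosses + 1)]


freqs = chance_frequencies(10)
for heads, count in enumerate(freqs):
    # One '#' for every 6 combinations, rounded down, keeps the rows short.
    print(f"{heads:2d} {count:4d} {'#' * (count // 6)}")
```

The rows rise to a single peak in the middle and fall away symmetrically on both sides, which is the shape of the sand pile described above.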

Fig. 9, A, gives the distribution, or surface of frequency, of the type to which we assume that all distributions of mental traits conform. Fig. 9, B, is the same type of distribution with a coarser separation into grades. This type is called the normal surface of frequency. It describes, for example, the distribution of accidental errors in scientific observation. Thorndike's numerous measurements show a remarkable uniformity in the distribution of mental traits among individuals. Fig. 9, D, showing the memory span for digits in 123 American women students, is a good example. In all cases the distribution closely approximates the normal type.

Does the distribution of the complex abilities that determine excellence in college courses approximate this normal type? Theoretically it should, and our theory is supported in a striking way by the distribution of 8,969 grades in twenty-one elementary courses for two years, 1907-08, at Harvard College. The curve in Fig. 11, representing this distribution, is nearly normal. The percentages for the grades A-E are, respectively, 7, 20, 42, 21, 7. Yet there are wide variations among the instructors in these very courses. In fact not a single instructor came as near to a normal distribution as the sum of all their grades. Now, no one of these markers is as likely to tell the truth as all together. Their several errors correct each other and thus give us, in Fig. 11 (Group 1), a close approximation to the type of curve we should expect to have with an infinite number of cases. In 1909-10, the grades in certain elementary courses in Harvard College (Chemistry 1, Comparative Literature 1, English A, Government 1, History 1, Mathematics F, Philosophy C, Zoology 1) were distributed in the following percentages: A = 5.5, B = 21, C = 44, D = 19.5, E = 9. The curve for these facts was printed and sent to each instructor with an explanation of its meaning, and a superimposed red curve showing in each case precisely how the instructor's distribution differed from the norm. A table was prepared showing the distribution of grades for all courses having 80 or more students.[3] The range for each grade in percentages was as follows:

| Grade | Range (per cent.) |
| --- | --- |
| A | 0.7-20 |
| B | 6-39 |
| C | 27-62 |
| D | 0-31 |
| E | 0-20 |

Accordingly, we have scientific grounds for assuming that a theoretically correct distribution of the grades of college students will approach the normal surface of frequency (Fig. 9, A and B) unless the group is subject to selection. In that case the curve would be skewed negatively or positively as in Fig. 10.

As a matter of fact, college students are a selected group.[4] If the surface A (Fig. 9) represents the distribution of all elementary school pupils at a given time, then most of those pupils who are to go to college fall in the upper end of that surface. If our colleges took the best students and only the best, if they made a clean cut off the top, then the distribution of their abilities would be represented by a surface closely approximating EFG of surface A. But for various reasons, including our extremely inaccurate means of attempting to determine fitness for entrance, our colleges do not admit merely those who are best fitted to pursue higher study, that is, the upper end of the surface. Some pupils find ways into college who occupy stations in the surface not far above the median, or the line of mediocre ability. This is clearly shown in Professor Dearborn's study of the relative standing in scholarship of students in high school and in college. Consequently the lower end of the surface would not be clean-cut as in EFG, but rather like the heavy line of Fig. 10. It would, of course, be skewed positively, for there could not possibly be many cases near G. Most of them would have to fall in the larger space near EF. The curve would be similar to that for incomes. The heavy line in Fig. 10, therefore, though not representing with precision[5] the scientifically correct distribution of college grades, does certainly come nearer the correct frequency curve than the normal curve, or than that representing the present practise of any college or university in the country. In institutions where many sons of rich parents are dragged just above the failure line by tutors, the curve would be skewed even more than in Fig. 10.
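The positive skew of a selected group can be checked for the simplest case, the clean cut EFG. The sketch below assumes a normal surface measured in standard deviations and a cut made exactly at the median; the function name `selected_group` is introduced only for illustration, and the ragged lower edge described in the text is not modelled.

```python
from statistics import NormalDist

unit = NormalDist(mu=0.0, sigma=1.0)


def selected_group(cutoff):
    """Mean and median of the part of a normal surface lying above `cutoff`.

    `cutoff` is measured in standard deviations; a clean cut at the median
    of the whole population is cutoff = 0.
    """
    kept = 1.0 - unit.cdf(cutoff)          # share of the population retained
    mean = unit.pdf(cutoff) / kept         # mean of the retained upper tail
    median = unit.inv_cdf(unit.cdf(cutoff) + kept / 2.0)
    return mean, median


mean, median = selected_group(0.0)
print(f"mean {mean:.3f}, median {median:.3f}")
```

The mean of the retained group lies above its median, the usual mark of a curve skewed positively: most cases crowd toward the cut, and a thin tail stretches toward the highest abilities.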

As we proceed upward through the years of school and college we should thus expect to find the curve skewed more and more in a positive direction, provided the standards are appropriately higher each year and a new base line is taken for each successive group. Those who accept the principle of normal distribution only for freshman courses in college, or for any single period in the school life of the child, would be at a loss to prove its peculiar fitness for that period.

TABLE III

Records in Harvard College of 363 Honor Graduates in Law and Medicine

Number of Students receiving Grades in Certain Departments ABOVE their General Average in all Departments

| | Fine Arts | Natural sciences | Mathematics | Philosophy | History | Modern languages | English | Classics |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Law graduates | 24 | 68 | 33 | 78 | 86 | 67 | 12 | 41 |
| Medical graduates | 32 | 88 | 36 | 53 | 42 | 50 | 12 | 24 |
| Total | 56 | 156 | 69 | 131 | 128 | 117 | 24 | 66 |

Number of Students receiving Grades in Certain Departments BELOW their General Average in all Departments

| | Fine Arts | Natural sciences | Mathematics | Philosophy | History | Modern languages | English | Classics |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Law graduates | 40 | 55 | 33 | 32 | 29 | 84 | 46 | 41 |
| Medical graduates | 45 | 19 | 48 | 33 | 53 | 55 | 109 | 32 |
| Total | 85 | 64 | 81 | 65 | 82 | 139 | 155 | 73 |

Summary

| Subject | Fine Arts | Natural sciences | Mathematics | Philosophy | History | Modern languages | English | Classics |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Number above | 56 | 156 | 69 | 131 | 128 | 117 | 24 | 66 |
| Number below | 85 | 64 | 81 | 65 | 82 | 139 | 155 | 73 |

At least two institutions now enforce a distribution of grades on a scientific basis. At the University of Missouri, an A is approximately equal to an A, a B equal to a B, in a defined sense; so that grades may be intelligently and fairly used for administrative purposes. According to the definitions adopted in June, 1908, grades A + B must equal 25 per cent., grade C, 50 per cent., and grades D + E, 25 per cent. of the total number given by each instructor. Under the old system, forty teachers in five years graded their students so that 25 per cent. received A, 35 per cent. received B, and 32 per cent. received C. Moreover, the lack of uniformity among instructors was as great as at Harvard and California. Under the new system the irregularity of the grading was reduced the first year from one fourth to one tenth, or in the ratio of 5 to 2.

The distribution of 24,979 grades in percentages was as follows:

| Grade | Aug., 1908 | Feb., 1909 | June, 1909 | Feb., 1910 | Combined, Feb., 1910 |
| --- | --- | --- | --- | --- | --- |
| A | 7.7 | 4.6 | 4.6 | 4.7 | A + B: 26.0 |
| B | 23.3 | 20.7 | 21.0 | 21.3 | |
| C | 41.2 | 47.5 | 48.8 | 49.6 | |
| D | 8.7 | 13.7 | 13.8 | 14.4 | D + E: 20.9 |
| E | 15.6 | 8.5 | 8.0 | 6.5 | |
| Delayed | 3.5 | 5.0 | 3.8 | 3.5 | |

Not counting the delayed reports, the distribution of the 11,342 grades for the first year of the new system was, in percentages:

| Grade | Per Cent. | Combined |
| --- | --- | --- |
| A | 4.9 | A + B: 26.6 |
| B | 21.7 | |
| C | 50 | |
| D | 14.5 | D + E: 23.5 |
| E | 9 | |

In spite of the adopted definitions, the tendency remains to mark students too high. Every attempt to devise a system of marking whereby extraordinary achievement can be awarded the distinction it deserves has failed because of the democratic tendency in all institutions to place so large a proportion of students in the "distinguished" group. At Dartmouth College, a professor once announced to a large class, "Gentlemen, I must warn you that the committee on instruction has requested me to make my examinations harder, but, gentlemen, I am pleased to say that I shall continue to mark the papers." A century ago a Virginia academy attempted to have its students graded in six divisions,—bonus, melior, optimus, and malus, pejor, pessimus. But history records that "the continual tendency was to mark inferior students too high. Thus it came to pass that not half the bad scholars got malus, the worst almost never fell below it, and bonus, though a mark of approbation, came to be considered as a disgrace, while optimus, which ought to have been reserved for scholars of the highest merit, was commonly bestowed on all who rose above mediocrity." As the president of this old institution remarked, "a temporizing professor who loves popularity, and desires, like the old man in the fable, to please everybody, is sure to be guilty of this fault, and like many a politician, to sacrifice permanent good for temporary favor." This is still the tendency everywhere, in spite of the manifest absurdity of declaring a large proportion of students distinguished.

On the other hand, nearly every institution has instructors who occasionally refuse pass marks to large proportions of their students. It was when a professor in Missouri "flunked" his entire class, and the boards overruled him by passing the entire class, that some of the faculty urged the adoption of a scientific system of grading. The students at another college put more sense than lyrical charm into the following lines:

There was a professor named Bray
Who forgot the reflection on Bray,
When in two of his classes
He gave but few passes.
And frightened good students away.

If an instructor refuses to pass some of the median half of the surface of distribution, it must mean, as a rule, that his methods of instruction or discipline are faulty, or that an unwarranted proportion of students have been admitted to a course they are unprepared to take. In either case, the fault is not with the students, but with the administration of the college.

The distribution of grades by the various departments at the University of Missouri in 1909, under the new rules, showed a range in percentage of A's from twenty in the history of art to zero in political science. The narrow limits of 2 to 7 per cent. included seventy-two per cent. of the departments. Thirteen departments gave the median percentage of A's, which was 4. The entire distribution of grades by departments was published and sent to the instructors, together with a table locating the responsibility for the failure to hold to the adopted definition of grades. The table gave the name of each instructor whose percentage of A-B grades differed from 25 by more than 2, and the name of each instructor whose percentage of D-E grades differed from 25 by more than 4. The table would have been an invaluable guide to students who were seeking the easiest way to get high grades. It was in fact a table of chances.
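The rule by which instructors were named in that table lends itself to a mechanical check. The following is a minimal sketch, not the procedure actually used at Missouri: the function `flag_instructors`, the instructor names and all the grade counts are hypothetical, and delayed reports are assumed to be left out of the totals.

```python
def flag_instructors(records, target=25.0, high_tol=2.0, low_tol=4.0):
    """Return the instructors whose A+B or D+E share strays from the definition.

    `records` maps a name to a dict of grade counts, e.g. {"A": 3, "B": 20}.
    A+B may differ from `target` by at most `high_tol` points, and D+E by at
    most `low_tol` points, before the instructor is named.
    """
    flagged = {}
    for name, counts in records.items():
        total = sum(counts.values())
        high = 100.0 * (counts.get("A", 0) + counts.get("B", 0)) / total
        low = 100.0 * (counts.get("D", 0) + counts.get("E", 0)) / total
        reasons = []
        if abs(high - target) > high_tol:
            reasons.append(f"A+B = {high:.1f}%")
        if abs(low - target) > low_tol:
            reasons.append(f"D+E = {low:.1f}%")
        if reasons:
            flagged[name] = reasons
    return flagged


# Hypothetical grade counts, invented purely for illustration.
records = {
    "Instructor X": {"A": 5, "B": 20, "C": 50, "D": 15, "E": 10},
    "Instructor Y": {"A": 15, "B": 25, "C": 40, "D": 15, "E": 5},
    "Instructor Z": {"A": 4, "B": 22, "C": 56, "D": 12, "E": 6},
}
print(flag_instructors(records))
```

With these invented counts Instructor X conforms, Instructor Y is named on both counts, and Instructor Z is named only for giving too few low grades.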

As a result of this wholesome publicity, the instructors in 1910 showed an even closer approximation to the adopted scheme of distribution. This means that we come nearer to knowing what a grade stands for at the University of Missouri than at any other institution in the country.

Replies from 58 members of the faculty of the University of Missouri in 1910 show that 51 approve of the general principle of standardizing grades and 4 do not approve; only 1 reports that he does not aim to have his grades conform to the system in the long average; 21 tend in grading large, elementary classes to give low marks and offset them by higher marks given to advanced classes, 20 do not; 15 think that the effect, before the semester is over, is to discourage the efforts of some students appreciably, 23 do not; 26 believe that the effect of the system has been good, 7 regard it as bad, and 23 as inappreciable.

TABLE IV

University of Missouri

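| Subjects of Study | %A | %B | %C | %D | %E | %Del. | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Class. Arch. and History of Art | 15 | 25 | 39 | 9 | 4 | 8 | 297 |
| Botany | 9 | 19 | 44 | 16 | 10 | 2 | 557 |
| Physical Education | 8 | 17 | 50 | 10 | 15 | - | 649 |
| Latin | 7 | 25 | 45 | 18 | 5 | - | 323 |
| Germanic Languages | 7 | 23 | 45 | 13 | 9 | 3 | 1006 |
| Animal Husbandry | 7 | 22 | 51 | 14 | 4 | 2 | 594 |
| Economics | 7 | 15 | 43 | 23 | 11 | 1 | 369 |
| Agronomy | 6 | 26 | 57 | 4 | 6 | 1 | 321 |
| Horticulture | 6 | 23 | 47 | 13 | 8 | 3 | 495 |
| Music | 6 | 20 | 58 | 4 | 11 | 1 | 280 |
| Law | 5 | 24 | 52 | 10 | 2 | 7 | 3984 |
| Experimental Psychology | 5 | 20 | 53 | 14 | 7 | 1 | 497 |
| Mathematics | 5 | 20 | 49 | 12 | 11 | 3 | 962 |
| Philosophy | 5 | 14 | 45 | 20 | 12 | 4 | 336 |
| Veterinary Science | 4 | 27 | 59 | 7 | 2 | 1 | 292 |
| History | 4 | 26 | 49 | 13 | 5 | 3 | 1023 |
| Sociology | 4 | 23 | 51 | 16 | 5 | 1 | 594 |
| Education | 4 | 22 | 50 | 16 | 7 | 1 | 751 |
| Journalism | 4 | 21 | 59 | 12 | 3 | 2 | 342 |
| Political Science | 4 | 21 | 44 | 22 | 3 | 6 | 280 |
| Philosophy of Education | 4 | 20 | 53 | 19 | 3 | 1 | 365 |
| Home Economics | 4 | 20 | 53 | 15 | 2 | 6 | 220 |
| Physics | 4 | 20 | 43 | 19 | 8 | 6 | 1030 |
| Electrical Engineering | 3 | 22 | 49 | 18 | 4 | 4 | 491 |
| Mech. Draw. and Hydr. Engin. | 3 | 22 | 49 | 12 | 8 | 6 | 726 |
| Mechanical Engineering | 3 | 21 | 53 | 16 | 6 | 1 | 642 |
| Romance Languages | 3 | 21 | 49 | 14 | 10 | 3 | 468 |
| Shopwork | 3 | 20 | 59 | 12 | 3 | 3 | 376 |
| Theory and Practise of Art | 3 | 19 | 48 | 15 | 9 | 6 | 289 |
| English | 3 | 18 | 50 | 18 | 8 | 3 | 1583 |
| Chemistry | 3 | 16 | 46 | 20 | 12 | 3 | 1379 |
| Elocution | 2 | 25 | 50 | 21 | 2 | - | 232 |
| Civil Engineering | 2 | 21 | 52 | 18 | 5 | 2 | 836 |
| Zoology | 2 | 19 | 49 | 20 | 8 | 2 | 391 |
| Geology and Mineralogy | 2 | 17 | 57 | 16 | 6 | 2 | 344 |
| Military Education | 2 | 16 | 52 | 6 | - | 24 | 293 |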

Discarding the arbitrary divisions employed wherever undefined symbols and numbers are used, we may divide the area of the normal surface of frequency as it is always divided for other scientific purposes. We may mark off a middle area equal to the sum of the two areas left at the sides. Half the students of any group will be represented by this middle area. We may designate this group by the symbol C, or K, or 76, or 13, or 289, or we may name it after the chairman of the school board. Much will be gained when we rid ourselves of the notion that the letters and numbers we now use so widely necessarily have any particular meaning. What we call this group does not matter: the significant thing is that it stands for an ability above and below which half the cases lie. It means that a student taken at random from a class of one hundred has one chance in four of falling above the middle group. It means that if we represent the ability of this group by C, we know precisely what an instructor means when he gives a student that grade. He means that the ability of the student in his course is greater than that of one fourth of the course and less than that of another fourth of the course. This median group ought to be the largest, for it is where most human beings fall, as shown by the height of the probability curve.

We can not indicate real distinction, however, unless we subdivide the upper quartile. We can do this arbitrarily or we can turn to a table of values of the normal probability integral.[6] Here the extreme ability is called 3. The point of the vertical line which separates the median group from the inferior group is .68. Half way between 3 and .68 is 1.84. Accepting this as the division point for the upper and the lower quartile, we find at the upper end of the surface of distribution three per cent. of the whole, and at the lower end three per cent. If we indicate the five sections, from the upper end to the lower, by the symbols A, B, C, D, E, we have the following distribution of grades:

| Grade | Per Cent. |
| --- | --- |
| A | 3 |
| B | 22 |
| C | 50 |
| D | 22 |
| E | 3 |
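These five percentages can be recovered from the normal probability integral itself. The sketch below assumes that the division points .68 and 1.84 are measured in standard deviations from the median, an interpretation which the article does not state outright; the helper `area_below` is a name introduced here.

```python
from math import erf, sqrt


def area_below(z):
    """Area of the normal surface lying below z (z in standard deviations)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Division points from the text: .68, and half way between 3 and .68.
LOWER = 0.68
UPPER = (3.0 + 0.68) / 2.0

shares = {
    "A": 1.0 - area_below(UPPER),
    "B": area_below(UPPER) - area_below(LOWER),
    "C": area_below(LOWER) - area_below(-LOWER),
    "D": area_below(-LOWER) - area_below(-UPPER),
    "E": area_below(-UPPER),
}
for grade, share in shares.items():
    print(f"{grade}: {100 * share:.1f} per cent.")
```

Rounded to whole numbers, the areas come to 3, 22, 50, 22 and 3, in agreement with the table.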

If, on the other hand, we assume that the distribution of abilities of college students is not normal, but skewed, the following percentages for each grade would more nearly represent the facts:

| Grade | Per Cent. | Combined |
| --- | --- | --- |
| A | 2 | A + B: 20 |
| B | 18 | |
| C | 50 | |
| D | 24 | D + E: 30 |
| E | 6 | |

As variation in the abilities of those who elect a given course is sure to occur from year to year, some would prefer an elastic definition of the grades; for example:

| Grade | Per Cent. |
| --- | --- |
| A | 0-6 |
| B | 15-21 |
| C | 45-55 |
| D | 20-28 |
| E | 0-10 |
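An elastic definition of this kind is easily tested against any actual distribution. As a sketch, two distributions quoted earlier in the article are tried against the limits just given: the Harvard elementary courses (7, 20, 42, 21, 7) and the first year of the new system at Missouri (4.9, 21.7, 50, 14.5, 9). The function name `departures` is introduced here, and the limits are assumed to be inclusive at both ends.

```python
# Elastic limits for each grade, in per cent., taken from the table above.
ELASTIC = {"A": (0, 6), "B": (15, 21), "C": (45, 55), "D": (20, 28), "E": (0, 10)}


def departures(distribution, limits=ELASTIC):
    """List the grades whose per cent. falls outside the elastic limits."""
    return [
        grade
        for grade, (low, high) in limits.items()
        if not low <= distribution[grade] <= high
    ]


# Percentages quoted earlier in the article.
harvard_elementary = {"A": 7, "B": 20, "C": 42, "D": 21, "E": 7}
missouri_first_year = {"A": 4.9, "B": 21.7, "C": 50, "D": 14.5, "E": 9}

print("Harvard: ", departures(harvard_elementary))
print("Missouri:", departures(missouri_first_year))
```

Under these limits the Harvard totals depart in grades A and C, and the Missouri totals in grades B and D; the choice of limits, of course, remains for each faculty to make.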

Any one of these definitions of the meaning of the five groups would come nearer to telling the truth, be more serviceable for administrative purposes, and convert the vast amount of labor now used in making out grades into more valuable data for the scientific study of education than the present personal distribution of college credits. A defensible definition of grades should be adopted by each faculty and its members should be required to adhere closely to it, in the long run, at least in all courses primarily for undergraduates, until we can supplant the method of grading by relative position by scales made up of equal units.

After the definition of grades is adopted, a table should be sent to each instructor, as often as grades are required at the college office, showing the distribution of grades in each course in the college and emphasizing those that depart far in either direction from the adopted mean. Every instructor should be requested to justify his eccentricities, at least in a series of years. If such publicity does not accomplish sufficient uniformity for administrative purposes, insurgent and careless instructors should be reminded by the appropriate authorities that it is for the interest of all for each to abide by the decision of the faculty.

To rate instructors solely with respect to the proportion of high grades awarded by them, or solely with respect to the quality of students attracted to their courses, is evidently inadequate. An instructor may give more high grades than his associates, because he has more students who deserve distinction. But if this is the case, the administrators of the college curriculum can readily devise a means of measurement which will show at a glance the justification for any conspicuous deviation from the normal distribution of grades. All the instructors of any institution may be located on a scale which shall take account not only of the grades awarded, but as well of the quality of the students electing each course.

For example, as part of an investigation conducted at Williams College by a committee in accordance with a resolution of the faculty, Dean Ferry, at the request of the president, devised a plan for measuring the relative quality of the classes in the elective courses of junior and senior years and of the grades given in each. Taking the work of the first two years, where the courses are nearly all prescribed, as a basis for the determination of the scholarship of the students, statistics were carefully worked out for the elective courses of three successive classes. The results of his extensive study are summarized in Table V. Column I. gives each instructor his position with respect to the quality of students in his courses. The larger the proportion of men attracted to his courses from the upper half of the student body in general scholarship, the larger the plus rating of the instructor. For purposes of comparison, Column II. gives each instructor his position with respect to the proportion of high grades and low grades assigned by him. Thus, for example, instructor number 4 has the high rating of 41 in the quality of his students and the low rating of -23 in the assignment of grades. Instructor number 26, on the contrary, has the low rating of -21 in quality of students and the high rating of 52 in grades assigned. In other words, he has a conspicuously large proportion of the students whose general scholarship is low, and to these poor students he awards a conspicuously large proportion of high grades. Many a teacher would be surprised to discover his standing on such a scale, and the college administrator who undertakes to deal with such discrepancies, through discussion with individual members of the faculty, will do well to provide himself with a quantitative presentation of the facts.

TABLE V

A Rating of Elective Classes in Williams College

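| Instructor | I | II | Instructor | I | II |
|---|---|---|---|---|---|
| 1 | 113 | 0 | 16 | 2 | 41 |
| 2 | 113 | 0 | 17 | 1 | 42 |
| 3 | 77 | 27 | 18 | -1 | 56 |
| 4 | 41 | -23 | 19 | -2 | 6 |
| 5 | 39 | 23 | 20 | -4 | -11 |
| 6 | 39 | -21 | 21 | -5 | 89 |
| 7 | 24 | 3 | 22 | -7 | 63 |
| 8 | 20 | 49 | 23 | -8 | 59 |
| 9 | 17 | 50 | 24 | -14 | 40 |
| 10 | 15 | 34 | 25 | -17 | 95 |
| 11 | 13 | 20 | 26 | -21 | 52 |
| 12 | 9 | 41 | 27 | -22 | 89 |
| 13 | 7 | 32 | 28 | -30 | 114 |
| 14 | 6 | 58 | 29 | -33 | 66 |
| 15 | 5 | 63 | 30 | -40 | 73 |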
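The comparison of Columns I. and II. described above may be sketched in Python, using the figures of Table V. The sketch assumes, as the article nowhere states, that the two ratings lie on comparable scales, so that their simple difference is a fair index of discrepancy. The names TABLE_V, grade_gap, most_lenient and most_severe are illustrative only and form no part of Dean Ferry's plan.

```python
# Table V as printed: instructor number -> (Column I, Column II).
# Column I  = rating for the quality of students attracted to the courses.
# Column II = rating for the proportion of high grades assigned.
TABLE_V = {
    1: (113, 0),   2: (113, 0),   3: (77, 27),   4: (41, -23),  5: (39, 23),
    6: (39, -21),  7: (24, 3),    8: (20, 49),   9: (17, 50),   10: (15, 34),
    11: (13, 20),  12: (9, 41),   13: (7, 32),   14: (6, 58),   15: (5, 63),
    16: (2, 41),   17: (1, 42),   18: (-1, 56),  19: (-2, 6),   20: (-4, -11),
    21: (-5, 89),  22: (-7, 63),  23: (-8, 59),  24: (-14, 40), 25: (-17, 95),
    26: (-21, 52), 27: (-22, 89), 28: (-30, 114), 29: (-33, 66), 30: (-40, 73),
}


def grade_gap(instructor):
    """Column II minus Column I for one instructor.

    ASSUMPTION (not from the article): the two ratings are on comparable
    scales, so a large positive gap suggests grades more generous than the
    quality of the students would warrant.
    """
    quality, grades = TABLE_V[instructor]
    return grades - quality


def most_lenient(n):
    """The n instructors whose grade rating most exceeds their quality rating."""
    return sorted(TABLE_V, key=grade_gap, reverse=True)[:n]


def most_severe(n):
    """The n instructors whose grade rating falls furthest below their quality rating."""
    return sorted(TABLE_V, key=grade_gap)[:n]


if __name__ == "__main__":
    for number in most_lenient(3) + most_severe(3):
        quality, grades = TABLE_V[number]
        print(f"instructor {number}: I={quality}, II={grades}, gap={grade_gap(number)}")
```

On these figures instructor 28 shows the widest excess of grades over quality of students (a gap of 144), while instructors 1 and 2 show the widest deficit (-113). An administrator would presumably examine the extremes of such a list first.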

Such regulation will be resented by many college teachers as an infringement on their rights. But academic freedom that allows each member of a faculty to do as he pleases in matters that reach far beyond the interests of his own department is intolerable license. As President Eliot has said:

A faculty can properly criticize the results of any professor's, or other instructor's, work as they appear in certain easily visible ways. Among such visible evidences are . . . the resort of obviously incompetent or uninterested students to his courses; examination papers of a trivial or pedantic sort; uniform high grades or uniform low grades returned by the professor; an extraordinary number of distinctions earned in his courses; or an extraordinary number of rejections and failures. These are legitimate subjects of inquiry by a faculty committee or by faculty officials, and can be dealt with by a faculty without impairing just academic freedom. The knowledge that this power of revision resides in a faculty is a valuable control over individual eccentricities.

It is sometimes said that "there are usually some courses in a university which, from year to year, secure only an inferior grade of pupils, and other lines of work which, for various reasons, secure a proportionate number of superior students. Classical students in the high school and university, and students in the advanced courses in mathematics, are often examples of such selected groups of students. The above principle would not be equitable in these cases." In answer to this argument, it should be noted, first, that it is, in large part, the very grading to which objection is raised that has caused the resort of poor students to certain courses; and, second, if the better men do resort in larger proportions to certain courses, that fact can be readily shown by statistics. It is one of the many educational questions on which speculation and opinion are quite out of place.

Without a scientific administration of college credits, the other safeguards of the elective system are insufficient. There will always be students who are more interested in getting through their courses than in getting profit from them. The poorer students seek the courses which give the larger proportions of high grades. Earnest but needy students, too, are under great temptation to elect courses with a view to winning money scholarships, as long as scholarships are awarded on the false assumption that an A is equal to an A. To all students who are prompted by unworthy motives in the election of studies, Figs. 1 to 11 are charts pointing the easiest courses to a degree. And students in all colleges are guided by such charts, more or less accurately plotted. It is futile for the authorities to try to suppress such information and protect their instructors from the notoriety they deserve. Nor is the elective system to blame for the presence of snap courses and the relative ease with which high grades are secured from certain instructors. Nor is the credit-for-quality plan to be condemned because it accentuates the evils of our marking devices. The best way to safeguard the elective system and the credit-for-quality plan against the evils here set forth is to enforce a scientific distribution of college credits.

1. Popular Science Monthly, Vol. LXVI., pp. 367-378, 1905, by J. McKeen Cattell. Science, N. S., Vol. XXVIII., No. 712, pp. 243-250, 1908, by Max Meyer.
2. W. F. Dearborn, in Bulletin 368 of the University of Wisconsin, presents numerous unscientific distributions of grades in Wisconsin.
3. In Bulletin 368 of the University of Wisconsin, Professor Dearborn attempts to justify the normal distribution of grades "from the fact that it is used in actual practise." Two objections may be made to this contention: first, very few instructors do closely approximate the normal distribution; second, as their practises have no scientific basis, any one of them could only by accident indicate the theoretically correct distribution. If, however, all of Professor Dearborn's curves were represented by one, made from thousands of grades by scores of instructors, it would conform more closely to the general biological law of variation than any of the curves he presents.
4. Thorndike, E. L., "The Selective Influence of the College," Educational Review, 30, 1.
5. "The curve of probability gives us the only precise meaning of the term 'scientific knowledge.' We have seen that human observations and measurements are never precisely accurate. Generalizations, in like manner, are never precisely true. The formulation of a law of nature can never be made absolutely exact. Scientific knowledge, therefore, is not that absolutely exact and certain knowledge which the popular mind assumes it to be. It is certainty or exactness within a range of error, and to diminish that range is the object of scientific endeavor." Giddings, F. H., "Sociology," Columbia University Press, New York, 1908, p. 24.
6. A table of values of the normal probability integral is found on page 148 of Thorndike's "Mental and Social Measurements." In Science, 712, 243, Max Meyer uses this basis for dividing the probability surface.