Page:Proceedings of the Royal Society of London Vol 60.djvu/513

478 Mr. G. U. Yule. On the Significance of Bravais' Formulae

subjecting the distances of the means from the line to some minimal condition. If the slope of RR is positive we may say that large values of x are on the whole associated with large values of y, if it is negative large values of x are associated with small values of y. Further, if the slope of RR to the vertical be given we shall have a measure of a rough practical kind of the shift of the mean of an x-array when its type y is altered. The equation to RR consequently gives a concise and definite answer to two most important statistical questions. It is also evident that if the means of the arrays actually lie in a straight line (as in normal correlation), the equation to RR must be the equation to the line of regression.

Let n be the number of observations in any x-array, and let d be the horizontal distance of the mean of this array from the line RR. I propose to subject the line to the condition that the sum of all quantities like nd² shall be a minimum, i.e., I shall use the condition of least squares. I do this solely for convenience of analysis; I do not claim for the method adopted any peculiar advantage as regards the probability of its results. It would, in fact, be absurd to do so, for I am postulating at the very outset that the curve of regression is only exceptionally a straight line; there can consequently be no meaning in seeking for the most probable straight line to represent the regression.
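The least-squares condition just stated can be illustrated numerically. The following is a minimal sketch, not part of the original paper: it assumes the line RR is written as x = a + b·y, so that the horizontal distance of an array mean from the line is d = (mean of x in the array) − (a + b·y). The function name and the sample figures are hypothetical and serve only to show the weighted minimisation of the sum of n·d².

```python
import numpy as np


def fit_line_to_array_means(y_types, x_means, counts):
    """Fit x = a + b*y by minimising sum(n * d**2).

    d is the horizontal distance of each array mean from the line,
    d = x_mean - (a + b*y); n is the number of observations in the array.
    (The parametrisation x = a + b*y is an assumption of this sketch.)
    """
    y = np.asarray(y_types, dtype=float)
    m = np.asarray(x_means, dtype=float)
    n = np.asarray(counts, dtype=float)

    total = n.sum()
    y_bar = (n * y).sum() / total  # weighted mean of the array types
    m_bar = (n * m).sum() / total  # weighted mean of the array means

    # Closed-form solution of the weighted normal equations.
    b = (n * (y - y_bar) * (m - m_bar)).sum() / (n * (y - y_bar) ** 2).sum()
    a = m_bar - b * y_bar
    return a, b


# Hypothetical illustrative data: four x-arrays of types y = 1..4.
y_types = [1.0, 2.0, 3.0, 4.0]
x_means = [2.0, 2.5, 4.5, 5.0]
counts = [10, 30, 40, 20]

a, b = fit_line_to_array_means(y_types, x_means, counts)
print(f"x = {a:.4f} + {b:.4f} * y")
```

At the minimum the weighted distances satisfy the two normal equations, sum(n·d) = 0 and sum(n·d·y) = 0, which gives a quick check on any fitted line.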

Let x, y be a pair of associated deviations, let σ be the standard deviation of any array about its own mean, and let