1911 Encyclopædia Britannica/Algebra
ALGEBRA (from the Arab. al-jebr waʼl-muqābala, transposition and removal [of terms of an equation], the name of a treatise by Mahommed ben Musa al-Khwarizmi), a branch of mathematics which may be defined as the generalization and extension of arithmetic.
The subject-matter of algebra will be treated in the following article under three divisions:—A. Principles of ordinary algebra; B. Special kinds of algebra; C. History. Special phases of the subject are treated under their own headings, e.g. Algebraic Forms; Binomial; Combinatorial Analysis; Determinants; Equation; Continued Fraction; Function; Groups, Theory of; Logarithm; Number; Probability; Series.
A. Principles of Ordinary Algebra
1. The above definition gives only a partial view of the scope of algebra. It may be regarded as based on arithmetic, or as dealing in the first instance with formal results of the laws of arithmetical number; and in this sense Sir Isaac Newton gave the title Universal Arithmetic to a work on algebra. Any definition, however, must have reference to the state of development of the subject at the time when the definition is given.
2. The earliest algebra consists in the solution of equations. The distinction between algebraical and arithmetical reasoning then lies mainly in the fact that the former is in a more condensed form than the latter; an unknown quantity being represented by a special symbol, and other symbols being used as a kind of shorthand for verbal expressions. This form of algebra was extensively studied in ancient Egypt; but, in accordance with the practical tendency of the Egyptian mind, the study consisted largely in the treatment of particular cases, very few general rules being obtained.
3. For many centuries algebra was confined almost entirely to the solution of equations; one of the most important steps being the enunciation by Diophantus of Alexandria of the laws governing the use of the minus sign. The knowledge of these laws, however, does not imply the existence of a conception of negative quantities. The development of symbolic algebra by the use of general symbols to denote numbers is due to Franciscus Vieta (François Viète, 1540–1603). This led to the idea of algebra as generalized arithmetic.
4. The principal step in the modern development of algebra was the recognition of the meaning of negative quantities. This appears to have been due in the first instance to Albert Girard (1595–1632), who extended Vieta’s results in various branches of mathematics. His work, however, was little known at the time, and later was overshadowed by the greater work of Descartes (1596–1650).
5. The main work of Descartes, so far as algebra was concerned, was the establishment of a relation between arithmetical and geometrical measurement. This involved not only the geometrical interpretation of negative quantities, but also the idea of continuity; this latter, which is the basis of modern analysis, leading to two separate but allied developments, viz. the theory of the function and the theory of limits.
6. The great development of all branches of mathematics in the two centuries following Descartes has led to the term algebra being used to cover a great variety of subjects, many of which are really only ramifications of arithmetic, dealt with by algebraical methods, while others, such as the theory of numbers and the general theory of series, are outgrowths of the application of algebra to arithmetic, which involve such special ideas that they must properly be regarded as distinct subjects. Some writers have attempted unification by treating algebra as concerned with functions, and Comte accordingly defined algebra as the calculus of functions, arithmetic being regarded as the calculus of values.
7. These attempts at the unification of algebra, and its separation from other branches of mathematics, have usually been accompanied by an attempt to base it, as a deductive science, on certain fundamental laws or general rules; and this has tended to increase its difficulty. In reality, the variety of algebra corresponds to the variety of phenomena. Neither mathematics itself, nor any branch or set of branches of mathematics, can be regarded as an isolated science. While, therefore, the logical development of algebraic reasoning must depend on certain fundamental relations, it is important that in the early study of the subject these relations should be introduced gradually, and not until there is some empirical acquaintance with the phenomena with which they are concerned.
8. The extension of the range of subjects to which mathematical methods can be applied, accompanied as it is by an extension of the range of study which is useful to the ordinary worker, has led in the latter part of the 19th century to an important reaction against the specialization mentioned in the preceding paragraph. This reaction has taken the form of a return to the alliance between algebra and geometry (§5), on which modern analytical geometry is based; the alliance, however, being concerned with the application of graphical methods to particular cases rather than to general expressions. These applications are sometimes treated under arithmetic, sometimes under algebra; but it is more convenient to regard graphics as a separate subject, closely allied to arithmetic, algebra, mensuration and analytical geometry.
9. The association of algebra with arithmetic on the one hand, and with geometry on the other, presents difficulties, in that geometrical measurement is based essentially on the idea of continuity, while arithmetical measurement is based essentially on the idea of discontinuity; both ideas being equally matters of intuition. The difficulty first arises in elementary mensuration, where it is partly met by associating arithmetical and geometrical measurement with the cardinal and the ordinal aspects of number respectively (see Arithmetic). Later, the difficulty recurs in an acute form in reference to the continuous variation of a function. Reference to a geometrical interpretation seems at first sight to throw light on the meaning of a differential coefficient; but closer analysis reveals new difficulties, due to the geometrical interpretation itself. One of the most recent developments of algebra is the algebraic theory of number, which is devised with the view of removing these difficulties. The harmony between arithmetical and geometrical measurement, which was disturbed by the Greek geometers on the discovery of irrational numbers, is restored by an unlimited supply of the causes of disturbance.
10. Two other developments of algebra are of special importance. The theory of sequences and series is sometimes treated as a part of elementary algebra; but it is more convenient to regard the simpler cases as isolated examples, leading up to the general theory. The treatment of equations of the second and higher degrees introduces imaginary and complex numbers, the theory of which is a special subject.
11. One of the most difficult questions for the teacher of algebra is the stage at which, and the extent to which, the ideas of a negative number and of continuity may be introduced. On the one hand, the modern developments of algebra began with these ideas, and particularly with the idea of a negative number. On the other hand, the lateness of occurrence of any particular mathematical idea is usually closely correlated with its intrinsic difficulty. Moreover, the ideas which are usually formed on these points at an early stage are incomplete; and, if the incompleteness of an idea is not realized, operations in which it is implied are apt to be purely formal and mechanical. What are called negative numbers in arithmetic, for instance, are not really negative numbers but negative quantities (§ 27 (i.)); and the difficulties incident to the ideas of continuity have already been pointed out.
12. In the present article, therefore, the main portions of elementary algebra are treated in one section, without reference to these ideas, which are considered generally in two separate sections. These three sections may therefore be regarded as to a certain extent concurrent. They are preceded by two sections dealing with the introduction to algebra from the arithmetical and the graphical sides, and are followed by a section dealing briefly with the developments mentioned in §§ 9 and 10 above.
I. Arithmetical Introduction to Algebra
13. Order of Arithmetical Operations.—It is important, before beginning the study of algebra, to have a clear idea as to the meanings of the symbols used to denote arithmetical operations.
(i.) Additions and subtractions are performed from left to right. Thus 3 ℔ + 5 ℔ − 7 ℔ + 2 ℔ means that 5 ℔ is to be added to 3 ℔, 7 ℔ subtracted from the result, and 2 ℔ added to the new result.
(ii.) The above operation is performed with 1 ℔ as the unit of counting, and the process would be the same with any other unit; e.g. we should perform the same process to find 3s. + 5s. − 7s. + 2s. Hence we can separate the numbers from the common unit, and replace 3 ℔ + 5 ℔ − 7 ℔ + 2 ℔ by (3 + 5 − 7 + 2) ℔, the additions and subtractions being then performed by means of an addition-table.
(iii.) Multiplications, represented by ×, are performed from right to left. Thus 5×3×7×1 ℔ means 5 times 3 times 7 times 1 ℔; i.e. it means that 1 ℔ is to be multiplied by 7, the result by 3, and the new result by 5. We may regard this as meaning the same as 5×3×7 ℔, since 7 ℔ itself means 7×1 ℔, and the ℔ is the unit in each case. But it does not mean the same as 5×21 ℔, though the two are equal, i.e. give the same result (see § 23).
This rule as to the meaning of × is important. If it is intended that the first number is to be multiplied by the second, a special sign such as >< should be used.
(iv.) The sign ÷ means that the quantity or number preceding it is to be divided by the quantity or number following it.
(v.) The use of the solidus / separating two numbers is for convenience of printing fractions or fractional numbers. Thus 16/4 does not mean 16 ÷ 4, but the fraction ¹⁶⁄₄.
(vi.) Any compound operation not coming under the above descriptions is to have its meaning made clear by brackets; the use of a pair of brackets indicating that the expression between them is to be treated as a whole. Thus we should not write 8×7+6, but (8×7)+6, or 8×(7+6). The sign × coming immediately before, or immediately after, a bracket may be omitted; e.g. 8×(7+6) may be written 8(7+6).
This rule as to using brackets is not always observed, the convention sometimes adopted being that multiplications or divisions are to be performed before additions or subtractions. The convention is even pushed to such an extent as to make “4½+3⅔ of 7+5” mean “4½+(3⅔ of 7)+5”; though it is not clear what “Find the value of 4½+3⅔ times 7+5” would then mean. There are grave objections to an arbitrary rule of this kind, the chief being the useless waste of mental energy in remembering it.
(vii.) The only exception that may be made to the above rule is that an expression involving multiplication-dots only, or a simple fraction written with the solidus, may have the brackets omitted for additions or subtractions, provided the figures are so spaced as to prevent misunderstanding. Thus 8+(7×6)+3 may be written 8+7.6+3, and 8+⁷⁄₆+3 may be written 8+7/6+3. But a fraction with numerator 3.5 and denominator 2.4 should be written (3.5)/(2.4), not 3.5/2.4.
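The effect of these conventions may be illustrated in modern notation; the following Python sketch (the function name and the token list are merely illustrative, not part of the article's apparatus) carries out a chain of additions and subtractions strictly from left to right, and shows that the two bracketings mentioned in (vi.) give different results.

```python
def evaluate_left_to_right(tokens):
    """Apply each + or - in turn to the running result, as in rule (i.)."""
    result = tokens[0]
    for sign, value in zip(tokens[1::2], tokens[2::2]):
        if sign == '+':
            result += value
        elif sign == '-':
            result -= value
        else:
            raise ValueError("only + and - are dealt with here")
    return result

# 3 lb + 5 lb - 7 lb + 2 lb, the numbers being separated from the unit (rule (ii.))
print(evaluate_left_to_right([3, '+', 5, '-', 7, '+', 2]))   # 3

# Rule (vi.): (8 x 7) + 6 and 8 x (7 + 6) are different statements.
print(8 * 7 + 6, 8 * (7 + 6))                                # 62 104
```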
14. Latent Equations.—The equation exists, without being shown as an equation, in all those elementary arithmetical processes which come under the head of inverse operations; i.e. processes which consist in obtaining an answer to the question “Upon what has a given operation to be performed in order to produce a given result?” or to the question “What operation of a given kind has to be performed on a given quantity or number in order to produce a given result?”
(i.) In the case of subtraction the second of these two questions is perhaps the simpler. Suppose, for instance, that we wish to know how much will be left out of 10s. after spending 3s., or how much has been spent out of 10s. if 3s. is left. In either case we may put the question in two ways:—(a) What must be added to 3s. in order to produce 10s., or (b) To what must 3s. be added in order to produce 10s. If the answer to the question is X, we have either
(a) 10s. = 3s.+X, ∴ X = 10s.−3s.,
or
(b) 10s. = X+3s., ∴ X = 10s.−3s.
(ii.) (a) If 24d. is divided into 4 equal portions, how much will each portion be?
Let the answer be X; then
24d. = 4×X, ∴ X = ¼ of 24d.
(b) Into how many equal portions of 6d. each may 24d. be divided?
Let the answer be x; then
24d. = x×6d., ∴ x = 24d.÷6d.
(iii.) Where the direct operation is involution, for which there is no commutative law, the two inverse operations are different in kind.
(a) What would be the dimensions of a cubical vessel which would exactly hold 125 litres; a litre being a cubic decimetre?
Let the answer be X; then
X.X.X = 125 cub. dec., ∴ X = ∛(125 cub. dec.).
(b) To what power must 5 be raised to produce 125?
Let the answer be x; then
125 = 5ˣ, ∴ x = log₅125.
15. With regard to the above, the following points should be noted.
(1) When what we require to know is a quantity, it is simplest to deal with this quantity as a whole. In (i.), for instance, we want to find the amount by which 10s. exceeds 3s., not the number of shillings in this amount. It is true that we obtain this result by subtracting 3 from 10 by means of a subtraction-table (concrete or ideal); but this table merely gives the generalized results of a number of operations of addition or subtraction performed with concrete units. We must count with something; and the successive somethings obtained by the addition of successive units are in fact numerical quantities, not numbers.
Whether this principle may legitimately be extended to the notation adopted in (iii.) (a) of § 14 is a moot point. But the present tendency is to regard the early association of arithmetic with linear measurement as important; and it seems to follow that we may properly (at any rate at an early stage of the subject) multiply a length by a length, and the product again by another length, the practice being dropped when it becomes necessary to give a strict definition of multiplication.
(2) The results may be stated briefly as follows, the more usual form being adopted under (iii.) (a):—
The important thing to notice is that where, in any of these five cases, one statement is followed by another, the second is not to be regarded as obtained from the first by logical reasoning involving such general axioms as that “if equals are taken from equals the remainders are equal”; the fact being that the two statements are merely different ways of expressing the same relation. To say, for instance, that X is equal to A − B, is the same thing as to say that X is a quantity such that X and B, when added, make up A; and the above five statements of necessary connexion between two statements of equality are in fact nothing more than definitions of the symbols by which the results of the inverse operations are denoted.
An apparent difficulty is that we use a single symbol − to denote the result of the two different statements in (i.) (a) and (i.) (b) of § 14. This is due to the fact that there are really two kinds of subtraction, respectively involving counting forwards (complementary addition) and counting backwards (ordinary subtraction); and it suggests that it may be wise not to use the one symbol − to represent the result of both operations until the commutative law for addition has been fully grasped.
16. In the same way, a statement as to the result of an inverse operation is really, by the definition of the operation, a statement as to the result of a direct operation. If, for instance, we state that A=X−B, this is really a statement that X=A+B. Thus, corresponding to the results under § 15 (2), we have the following:—
(1) Where the inverse operation is performed on the unknown quantity or number:—
(2) Where the inverse operation is performed with the unknown quantity or number:—
In each of these cases, however, the reasoning which enables us to replace one statement by another is of a different kind from the reasoning in the corresponding cases of § 15. There we proceeded from the direct to the inverse operations; i.e. so far as the nature of arithmetical operations is concerned, we launched out on the unknown. In the present section, however, we return from the inverse operation to the direct; i.e. we rearrange our statement in its simplest form. The statement, for instance, that 32−x=25, is really a statement that 32 is the sum of x and 25.
17. The five equalities which stand first in the five pairs of equalities in §15(2) may therefore be taken as the main types of a simple statement of equality. When we are familiar with the treatment of quantities by equations, we may ignore the units and deal solely with numbers; and (ii.) (a) and (ii.) (b) may then, by the commutative law for multiplication, be regarded as identical. The five processes of deduction then reduce to four, which may be described as (i.) subtraction, (ii.) division, (iii.) (a) taking a root, (iii.) (b) taking logarithms. It will be found that these (and particularly the first three) cover practically all the processes legitimately adopted in the elementary theory of the solution of equations; other processes being sometimes liable to introduce roots which do not satisfy the original equation.
18. It should be noticed that we are still dealing with the elementary processes of arithmetic, and that all the numbers contemplated in §§ 14-17 are supposed to be positive integers. If, for instance, we are told that 15 = ¾ of (x−2), what is meant is that (1) there is a number u such that x=u+2, (2) there is a number v such that u=4 times v, and (3) 15=3 times v. From these statements, working backwards, we find successively that v=5, u=20, x=22. The deductions follow directly from the definitions, and such mechanical processes as “clearing of fractions” find no place (§ 21 (ii.)). The extension of the methods to fractional numbers is part of the establishment of the laws governing these numbers (§ 27 (ii.)).
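The backward working just described may be checked mechanically; the following Python sketch (a modern illustration only, the variable names following those of the text) reproduces the three steps for 15 = ¾ of (x−2).

```python
from fractions import Fraction

# (3) 15 = 3 times v  =>  v = 5
# (2) u = 4 times v   =>  u = 20
# (1) x = u + 2       =>  x = 22
v = Fraction(15, 3)
u = 4 * v
x = u + 2
print(v, u, x)                              # 5 20 22
assert Fraction(3, 4) * (x - 2) == 15       # the original statement is satisfied
```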
19. Expressed Equations.—The simplest forms of arithmetical equation arise out of abbreviated solutions of particular problems. In accordance with § 15, it is desirable that our statements should be statements of equality of quantities rather than of numbers; and it is convenient in the early stages to have a distinctive notation, e.g. to represent the former by capital letters and the latter by small letters.
As an example, take the following. I buy 2 ℔ of tea, and have 6s. 8d. left out of 10s.; how much per ℔ did tea cost?
(1) In ordinary language we should say: Since 6s. 8d. was left, the amount spent was 10s. − 6s. 8d., i.e. was 3s. 4d. Therefore 1 ℔ of tea cost 1s. 8d.
(2) The first step towards arithmetical reasoning in such a case is the introduction of the sign of equality. Thus we say:—
Cost of 2 ℔ tea+6s. 8d.=10s.
∴ Cost of 2 ℔ tea=10s.−6s. 8d.=3s. 4d.
∴ Cost of 1 ℔ tea=1s. 8d.
(3) The next step is to show more distinctly the unit we are dealing with (in addition to the money unit), viz. the cost of 1 ℔ tea. We write:—
(2 × cost of 1 ℔ tea)+6s. 8d.=10s.
∴ 2 × cost of 1 ℔ tea=10s.−6s. 8d.=3s. 4d.
∴ Cost of 1 ℔ tea=1s. 8d.
(4) The stage which is introductory to algebra consists merely in replacing the unit “cost of 1 ℔ tea” by a symbol, which may be a letter or a mark such as the mark of interrogation, the asterisk, &c. If we denote this unit by X, we have
2X+6s. 8d.=10s.
∴ 2X=10s.−6s. 8d.=3s. 4d.
∴ X=1s. 8d.
20. Notation of Multiples.—The above is arithmetic. The only thing which it is necessary to import from algebra is the notation by which we write 2X instead of 2 × X or 2 . X. This is rendered possible by the fact that we can use a single letter to represent a single number or numerical quantity, however many digits are contained in the number.
It must be remembered that, if a is a number, 3a means 3 times a, not a times 3; the latter must be represented by a × 3 or a . 3.
The number by which an algebraical expression is to be multiplied is called its coefficient. Thus in 3a the coefficient of a is 3. But in 3 . 4a the coefficient of 4a is 3, while the coefficient of a is 3 . 4.
21. Equations with Fractional Coefficients.—As an example of a special form of equation we may take
½x+⅓x=10.
(i.) There are two ways of proceeding.
(a) The statement is that (1) there is a number u such that x=2u, (2) there is a number v such that x=3v, and (3) u+v=10. We may therefore conveniently take as our unit, in place of x, a number y such that x=6y. Then
3y+2y=10,
5y=10, y=2, x=6y=12.
(b) We can collect coefficients, i.e. combine the separate quantities or numbers expressed in terms of x as unit into a single quantity or number so expressed, obtaining
⅚x=10.
By successive stages we obtain (§ 18) ⅙x=2, x=12; or we may write at once x=(1÷⅚) of 10=⁶⁄₅ of 10=12. The latter is the more advanced process, implying some knowledge of the laws of fractional numbers, as well as an application of the associative law (§ 26 (i.)).
(ii.) Perhaps the worst thing we can do, from the point of view of intelligibility, is to “clear of fractions” by multiplying both sides by 6. It is no doubt true that, if ½x+⅓x=10, then 3x+2x=60 (and similarly, if ½x+⅓x+⅙x=10, then 3x+2x+x=60); but the fact, however interesting it may be, is of no importance for our present purpose. In the method (a) above there is indeed a multiplication by 6; but it is a multiplication arising out of subdivision, not out of repetition (see Arithmetic), so that the total (viz. 10) is unaltered.
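The two methods of (i.) may likewise be verified; a minimal Python sketch (illustrative only, using exact fractions so that no “clearing of fractions” is involved):

```python
from fractions import Fraction

# Method (a): take a new unit y with x = 6y, so that (1/2)x = 3y and (1/3)x = 2y.
y = Fraction(10, 5)                 # 3y + 2y = 10, so 5y = 10 and y = 2
x_a = 6 * y                         # x = 6y = 12

# Method (b): collect the coefficients 1/2 + 1/3 = 5/6 and undo the single multiplier.
coefficient = Fraction(1, 2) + Fraction(1, 3)
x_b = 10 / coefficient              # 6/5 of 10 = 12

print(x_a, x_b)                     # 12 12
assert x_a / 2 + x_a / 3 == 10
```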
22. Arithmetical and Algebraical Treatment of Equations.—The following will illustrate the passage from arithmetical to algebraical reasoning. “Coal costs 3s. a ton more this year than last year. If 4 tons last year cost 104s., how much does a ton cost this year?”
If we write X for the cost per ton this year, we have
4(X−3s.)=104s.
From this we can deduce successively X−3s.=26s., X=29s. But, if we transform the equation into
4X−12s.=104s.,
we make an essential alteration. The original statement was with regard to X−3s. as the unit; and from this, by the application of the distributive law (§ 26 (i.)), we have passed to a statement with regard to X as the unit. This is an algebraical process.
In the same way, the transition from (x²+4x+4)−4=21 to x²+4x+4=25, or from (x+2)²=25 to x+2=√25, is arithmetical; but the transition from x²+4x+4=25 to (x+2)²=25 is algebraical, since it involves a change of the number we are thinking about.
Generally, we may say that algebraic reasoning in reference to equations consists in the alteration of the form of a statement rather than in the deduction of a new statement; i.e. it cannot be said that “If A=B, then E=F” is arithmetic, while “If C=D, then E=F” is algebra. Algebraic treatment consists in replacing either of the terms A or B by an expression which we know from the laws of arithmetic to be equivalent to it. The subsequent reasoning is arithmetical.
23. Sign of Equality.—The various meanings of the sign of equality (=) must be distinguished.
(i.) 4 × 3 ℔=12 ℔.
This states that the result of the operation of multiplying 3 ℔ by 4 is 12 ℔.
(ii.) 4 × 3 ℔=3 × 4 ℔.
This states that the two operations give the same result; i.e. that they are equivalent.
(iii.) A’s share=5s., or
3 × A’s share=15s.
Either of these is a statement of fact with regard to a particular quantity; it is usually called an equation, but sometimes a conditional equation, the term “equation” being then extended to cover (i.) and (ii.).
(iv.) x³=x.x.x. This is a definition of x³; the sign = is in such cases usually replaced by ≡.
This is usually regarded as being, like (ii.), a statement of equivalence. It is, however, only true if 1s. is equivalent to 12d., and the correct statement is then
If the operator is omitted, the statement is really an equation, giving 1s. in terms of 1d. or vice versa.
The following statements should be compared:—
X=A’s share= of £10=3×5£=£15.
X =A’s share= of £10 = of £30=£15.
In each case, the first sign of equality comes under (iv.) above, the second under (iii.), and the fourth under (i.); but the third sign comes under (i.) in the first case (the statement being that ½ of £10=£5) and under (ii.) in the second.
It will be seen from § 22 that the application of algebra to equations consists in the interchange of equivalent expressions, and therefore comes under (i.) and (ii.). We replace 4(x−3), for instance, by 4x−4.3, because we know that, whatever the value of x may be, the result of subtracting 3 from it and multiplying the remainder by 4 is the same as the result of finding 4x and 4.3 separately and subtracting the latter from the former.
A statement such as (i.) or (ii.) is sometimes called an identity.
The two expressions whose equality is stated by an equation or an identity are its members.
24. Use of Letters in General Reasoning.—It may be assumed that the use of letters to denote quantities or numbers will first arise in dealing with equations, so that the letter used will in each case represent a definite quantity or number; such general statements as those of §§ 15 and 16 being deferred to a later stage. In addition to these, there are cases in which letters can usefully be employed for general arithmetical reasoning.
(i.) There are statements, such as A+B=B+A, which are particular cases of the laws of arithmetic, but need not be expressed as such. For multiplication, for instance, we have the statement that, if P and Q are two quantities, containing respectively p and q of a particular unit, then p×Q=q×P; or the more abstract statement that p×q=q×p.
(ii.) The general theory of ratio and proportion requires the use of general symbols.
(iii.) The general statement of the laws of operation of fractions is perhaps best deferred until we come to fractional numbers, when letters can be used to express the laws of multiplication and division of such numbers.
(iv.) Variation is generally included in text-books on algebra, but apparently only because the reasoning is general. It is part of the general theory of quantitative relation, and in its elementary stages is a suitable subject for graphical treatment (§ 31).
25. Preparation for Algebra.—The calculation of the values of simple algebraical expressions for particular values of letters involved is a useful exercise, but its tediousness is apt to make the subject repulsive.
What is more important is to verify particular examples of general formulae. These formulae are of two kinds:—(a) the general properties, such as m(a+b)=ma+mb, on which algebra is based, and (b) particular formulae such as (x−a)(x+a)=x²−a². Such verifications are of value for two reasons. In the first place, they lead to an understanding of what is meant by the use of brackets and by such a statement as 3(7+2)=3 . 7+3 . 2. This does not mean (cf. § 23) that the algebraic result of performing the operation 3(7+2) is 3 . 7+3 . 2; it means that if we convert 7+2 into the single number 9 and then multiply by 3 we get the same result as if we converted 3 . 7 and 3 . 2 into 21 and 6 respectively and added the results. In the second place, particular cases lay the foundation for the general formula.
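Such verification by particular cases is easily mechanized; a small Python sketch (ours, for illustration) tests random instances of the general property m(a+b)=ma+mb and of the particular formula (x−a)(x+a)=x²−a².

```python
import random

for _ in range(5):
    m, a, b, x = (random.randint(1, 20) for _ in range(4))
    assert m * (a + b) == m * a + m * b          # the general property (a)
    assert (x - a) * (x + a) == x * x - a * a    # the particular formula (b)

# The remark on 3(7+2) = 3.7 + 3.2: both routes of calculation give 27.
print(3 * (7 + 2), 3 * 7 + 3 * 2)                # 27 27
```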
Exercises in the collection of coefficients of various letters occurring in a complicated expression are usually performed mechanically, and are probably of very little value.
26. General Arithmetical Theorems.
(i.) The fundamental laws of arithmetic (q.v.) should be constantly borne in mind, though not necessarily stated. The following are some special points.
(a) The commutative law and the associative law are closely related, and it is best to establish each law for the case of two numbers before proceeding to the general case. In the case of addition, for instance, suppose that we are satisfied that in a+b+c+d+e we may take any two, as b and c, together (association) and interchange them (commutation). Then we have a+b+c+d+e=a+c+b+d+e. Thus any pair of adjoining numbers can be interchanged, so that the numbers can be arranged in any order.
(b) The important form of the distributive law is m(A+B)=mA+mB. The form (m+n)A=mA+nA follows at once from the fact that A is the unit with which we are dealing.
(c) The fundamental properties of subtraction and of division are that A−B+B=A and m×(¹⁄ₘ of A)=A, since in each case the second operation restores the original quantity with which we started.
(ii.) The elements of the theory of numbers belong to arithmetic. In particular, the theorem that if n is a factor of a and of b it is also a factor of pa±qb, where p and q are any integers, is important in reference to the determination of greatest common divisor and to the elementary treatment of continued fractions. Graphic methods are useful here (§ 34 (iv.)). The law of relation of successive convergents to a continued fraction involves more advanced methods (see § 42 (iii.) and Continued Fraction).
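The theorem, and its use in finding the greatest common divisor by successive division, may be illustrated by a short Python sketch (the function name and the particular numbers are merely illustrative):

```python
# If n is a factor of a and of b, it is a factor of pa + qb for any integers p, q.
n = 7
a, b = 7 * 12, 7 * 45
for p, q in [(3, 5), (10, -4), (1, 1)]:
    assert (p * a + q * b) % n == 0

def greatest_common_divisor(p, q):
    """Successive division: replace (p, q) by (q, remainder of p by q)
    until the remainder vanishes; the last divisor is the G.C.D."""
    while q:
        p, q = q, p % q
    return p

print(greatest_common_divisor(84, 315))   # 21
```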
(iii.) There are important theorems as to the relative value of fractions; e.g.
(a) If a/b=c/d, then each=(a+c)/(b+d).
(b) (a+1)/(b+1) is nearer to 1 than a/b is; and, generally, if a/b ≠ c/d, then (a+c)/(b+d) lies between the two. (All the numbers are, of course, supposed to be positive.)
27. Negative Quantities and Fractional Numbers.—(i.) What are usually called “negative numbers” in arithmetic are in reality not negative numbers but negative quantities. If a person has to receive 7s. and pay 5s., with a net result of +2s., the order of the operations is immaterial. If he pays first, he then has −5s. This is sometimes treated as a debt of 5s.; an alternative method is to recognize that our zero is really arbitrary, and that in fact we shift it with every operation of addition or subtraction. But when we say “−5s.” we mean “−(5s.),” not “(−5)s.”; the idea of (−5) as a number with which we can perform such operations as multiplication comes later (§ 49).
(ii.) On the other hand, the conception of a fractional number follows directly from the use of fractions, involving the subdivision of a unit. We find that fractions follow certain laws corresponding exactly with those of integral multipliers, and we are therefore able to deal with the fractional numbers as if they were integers.
28. Miscellaneous Developments in Arithmetic.—The following are matters which really belong to arithmetic; they are usually placed under algebra, since the general formulae involve the use of letters.
(i.) Arithmetical Progressions such as 2, 5, 8, . . .—The formula for the rth term is easily obtained. The problem of finding the sum of r terms is aided by graphic representation, which shows that the terms may be taken in pairs, working from the outside to the middle; the two cases of an odd number of terms and an even number of terms may be treated separately at first, and then combined by the ordinary method, viz. writing the series backwards.
In this, as in almost all other cases, particular examples should be worked before obtaining a general formula.
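As an instance of working a particular case, the device of writing the series backwards may be sketched in Python (illustrative only; the function name is ours):

```python
def sum_arithmetical_progression(first, difference, r):
    """Sum r terms by pairing each term with the corresponding term of the
    series written backwards: every pair adds up to (first + last)."""
    last = first + (r - 1) * difference
    return r * (first + last) // 2

terms = [2 + 3 * k for k in range(10)]                       # 2, 5, 8, ... (10 terms)
print(sum_arithmetical_progression(2, 3, 10), sum(terms))    # 155 155
```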
(ii.) The law of indices (positive integral indices only) follows at once from the definition of a², a³, a⁴, . . . as abbreviations of a.a, a.a.a, a.a.a.a, . . ., or (by analogy with the definitions of 2, 3, 4, . . . themselves) of a.a, a.a², a.a³, . . . successively. The treatment of roots and of logarithms (all being positive integers) belongs to this subject; a=ᵖ√n and p=logₐn being the inverses of n=aᵖ (cf. §§ 15, 16). The theory may be extended to the cases of p=1 and p=0; so that a³ means a.a.a.1, a² means a.a.1, a¹ means a.1, and a⁰ means 1 (there being then none of the multipliers a).
The terminology is sometimes confused. In n=aᵖ, a is the root or base, p is the index or logarithm, and n is the power or antilogarithm. Thus a, a², a³, . . . are the first, second, third, . . . powers of a. But aᵖ is sometimes incorrectly described as “a to the power p”; the power being thus confused with the index or logarithm.
(iii.) Scales of Notation lead, by considering, e.g., how to express in the scale of 10 a number whose expression in the scale of 8 is 2222222, to
(iv.) Geometrical Progressions.—It should be observed that the radix of the scale is exactly the same thing as the root mentioned under (ii.) above; and it is better to use the term “root” throughout. Denoting the root by a, and the number 2222222 in this scale by N, we have
N= 2222222.
aN=22222220.
Thus by adding 2 to aN we can subtract N from aN+2, obtaining 20000000, which is = 2 . a⁷; and from this we easily pass to the general formula for the sum of a geometrical progression having a given number of terms.
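The passage from this particular case to the general formula can be traced numerically; a brief Python sketch (the names are ours, the root 8 being that of the text):

```python
a = 8                                   # the root (radix) of the scale
N = sum(2 * a**k for k in range(7))     # the number written 2222222 in that scale

# Adding 2 to aN and subtracting N leaves 2.a^7, i.e. 20000000 in the scale of a.
assert a * N + 2 - N == 2 * a**7

# Hence the general formula for the sum of a geometrical progression:
# first + first.a + ... + first.a^(r-1) = first.(a^r - 1)/(a - 1).
def geometric_sum(first, root, r):
    return first * (root**r - 1) // (root - 1)

print(N, geometric_sum(2, a, 7))        # both give 599186
```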
(v.) Permutations and Combinations may be regarded as arithmetical recreations; they become important algebraically in reference to the binomial theorem (§§ 41, 44).
(vi.) Surds and Approximate Logarithms.—From the arithmetical point of view, surds present a greater difficulty than negative quantities and fractional numbers. We cannot solve the equation 7s.+X=4s.; but we are accustomed to transactions of lending and borrowing, and we can therefore invent a negative quantity −3s. such that −3s.+3s.=0. We cannot solve the equation 7X=4s.; but we are accustomed to subdivision of units, and we can therefore give a meaning to X by inventing a unit 1/7 of 1s. such that 7×(1/7 of 1s.)=1s., and can thence pass to the idea of fractional numbers. When, however, we come to the equation x²=5, where we are dealing with numbers, not with quantities, we have no concrete facts to assist us. We can, however, find a number whose square shall be as nearly equal to 5 as we please, and it is this number that we treat arithmetically as √5. We may take it to (say) 4 places of decimals; or we may suppose it to be taken to 1000 places. In actual practice, surds mainly arise out of mensuration; and we can then give an exact definition by graphical methods.
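The process of finding a number whose square is as nearly equal to 5 as we please can be imitated directly; a rough Python sketch (the function name and the search method are ours) finds, for a given number of decimal places, the largest terminating decimal whose square does not exceed the given number.

```python
def approximate_square_root(n, places):
    """Largest multiple of 10**(-places) whose square does not exceed n,
    found by a simple search over integers (no surds are assumed)."""
    scale = 10**places
    low, high = 0, n * scale
    while low < high:                        # binary search for the largest m
        mid = (low + high + 1) // 2          # with m*m <= n * scale**2
        if mid * mid <= n * scale * scale:
            low = mid
        else:
            high = mid - 1
    return low, scale                        # the approximation is low/scale

m, scale = approximate_square_root(5, 4)
print(m, scale, m * m <= 5 * scale * scale)  # 22360 10000 True, i.e. 2.2360
```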
When, by practice with logarithms, we become familiar with the correspondence between additions of length on the logarithmic scale (on a slide-rule) and multiplication of numbers in the natural scale (including fractional numbers), √5 acquires a definite meaning as the number corresponding to the extremity of a length x, on the logarithmic scale, such that 5 corresponds to the extremity of 2x. Thus the concrete fact required to enable us to pass arithmetically from the conception of a fractional number to the conception of a surd is the fact of performing calculations by means of logarithms.
In the same way we regard log₁₀2, not as a new kind of number, but as an approximation.
(vii.) The use of fractional indices follows directly from this parallelism. We find that the product aᵐ × aᵐ × aᵐ is equal to a³ᵐ; and, by definition, the product ∛a × ∛a × ∛a is equal to a, which is a¹. This suggests that we should write ∛a as a^(1/3); and we find that the use of fractional indices in this way satisfies the laws of integral indices. It should be observed that, by analogy with the definition of a fraction, a^(p/q) means (a^(1/q))^p, not (a^p)^(1/q).
II. Graphical Introduction to Algebra
29. The science of graphics is closely related to that of mensuration. While mensuration is concerned with the representation of geometrical magnitudes by numbers, graphics is concerned with the representation of numerical quantities by geometrical figures, and particularly by lengths. An important development, covering such diverse matters as the equilibrium of forces and the algebraic theory of complex numbers (§ 66), has relation to cases where the numerical quantity has direction as well as magnitude. There are also cases in which graphics and mensuration are used jointly; a variable numerical quantity is represented by a graph, and the principles of mensuration are then applied to determine related numerical quantities. General aspects of the subject are considered under Mensuration; Vector Analysis; Infinitesimal Calculus.
30. The elementary use of graphic methods is qualitative rather than quantitative; i.e. it is for purposes of illustration and suggestion rather than for purposes of deduction and exact calculation. We start with related facts, and adopt a particular method of visualizing the relation. One of the relations most commonly illustrated in this way is the time-relation; the passage of time being associated with the passage of a point along a straight line, so that equal intervals of time are represented by equal lengths.
31. It is important to begin the study of graphics with concrete cases rather than with tracing values of an algebraic function. Simple examples of the time-relation are—the number of scholars present in a class, the height of the barometer, and the reading of the thermometer, on successive days. Another useful set of graphs comprises those which give the relation between the expressions of a length, volume, &c., on different systems of measurement. Mechanical, commercial, economic and statistical facts (the latter usually involving the time-relation) afford numerous examples.
32. The ordinary method of representation is as follows. Let X and Y be the related quantities, their expressions in terms of selected units A and B being x and y, so that X = x . A, Y = y . B. For graphical representation we select units of length L and M, not necessarily identical. We take a fixed line OX, usually drawn horizontally; for each value of X we measure a length or abscissa ON equal to x . L, and draw an ordinate NP at right angles to OX and equal to the corresponding value of y . M. The assemblage of ordinates NP is then the graph of Y.
The series of values of X will in general be discontinuous, and the graph will then be made up of a succession of parallel and (usually) equidistant ordinates. When the series is theoretically continuous, the theoretical graph will be a continuous figure of which the lines actually drawn are ordinates. The upper boundary of this figure will be a line of some sort; it is this line, rather than the figure, that is sometimes called the “graph.” It is better, however, to treat this as a secondary meaning. In particular, the equality or inequality of values of two functions is more readily grasped by comparison of the lengths of the ordinates of the graphs than by inspection of the relative positions of their bounding lines.
33. The importance of the bounding line of the graph lies in the fact that we can keep it unaltered while we alter the graph as a whole by moving OX up or down. We might, for instance, read temperature from 60° instead of from 0°. Thus we form the conception, not only of a zero, but also of the arbitrariness of position of this zero (cf. § 27 (i.)); and we are assisted to the conception of negative quantities. On the other hand, the alteration in the direction of the bounding line, due to alteration in the unit of measurement of Y, is useful in relation to geometrical projection.
This, however, applies mainly to the representation of values of Y. Y is represented by the length of the ordinate NP, so that the representation is cardinal; but this ordinate really corresponds to the point N, so that the representation of X is ordinal. It is therefore only in certain special cases, such as those of simple time-relations (e.g. “J is aged 40, and K is aged 26; when will J be twice as old as K?”), that the graphic method leads without arithmetical reasoning to the properties of negative values. In other cases the continuation of the graph may constitute a dangerous extrapolation.
34. Graphic representation thus rests on the principle that equal numerical quantities may be represented by equal lengths, and that a quantity mA may be represented by a length mL, where A and L are the respective units; and the science of graphics rests on the converse property that the quantity represented by pL is pA, i.e. that pA is determined by finding the number of times that L is contained in pL. The graphic method may therefore be used in arithmetic for comparing two particular magnitudes of the same kind by comparing the corresponding lengths P and Q measured along a single line OX from the same point O.
(i.) To divide P by Q, we cut off from P successive portions each equal to Q, till we have a piece R left which is less than Q. Thus P = kQ+R, where k is an integer.
(ii.) To continue the division we may take as our new unit a submultiple of Q, such as Q/r, where r is an integer, and repeat the process. We thus get P = kQ+m . Q/r+S = (k+m/r)Q+S, where S is less than Q/r. Proceeding in this way, we may be able to express P÷Q as the sum of a finite number of terms k+m/r+n/r²+...; or, if r is not suitably chosen, we may not. If, e.g. r = 10, we get the ordinary expression of P/Q as an integer and a decimal; but, if P/Q were equal to 1/3, we could not express it as a decimal with a finite number of figures.
(iii.) In the above method the choice of r is arbitrary. We can avoid this arbitrariness by a different procedure. Having obtained R, which is less than Q, we now repeat with Q and R the process that we adopted with P and Q; i.e. we cut off from Q successive portions each equal to R. Suppose we find Q = sR+T, then we repeat the process with R and T; and so on. We thus express P÷Q in the form of a continued fraction, k + 1/(s + 1/(t + . . . )), which is usually written, for conciseness, k + 1/s+ 1/t+ &c.
(iv.) If P and Q can be expressed in the forms pL and qL, where p and q are integers, R will be equal to (p−kq)L, which is both less than pL and less than qL. Hence the successive remainders are successively smaller multiples of L, but still integral multiples, so that the series of quotients k, s, t, . . . will ultimately come to an end. Moreover, if the last divisor is uL, then it follows from the theory of numbers (§ 26 (ii.)) that (a) u is a factor of p and of q, and (b) any number which is a factor of p and q is also a factor of u. Hence u is the greatest common measure of p and q.
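The whole process of (iii.) and (iv.) is, in modern terms, the ordinary algorithm for a continued fraction; a minimal Python sketch (illustrative only, the function name being ours):

```python
def continued_fraction_quotients(p, q):
    """Cut off from P as many pieces Q as possible, then repeat with Q and
    the remainder R, and so on; the successive quotients k, s, t, ... are
    collected, and the last divisor is the greatest common measure."""
    quotients = []
    while q:
        quotients.append(p // q)
        p, q = q, p % q
    return quotients, p

print(continued_fraction_quotients(84, 31))   # ([2, 1, 2, 2, 4], 1)
```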
35. In relation to algebra, the graphic method is mainly useful in connexion with the theory of limits (§§ 58, 61) and the functional treatment of equations (§ 60). As regards the latter, there are two classes of cases. In the first class come equations in a single unknown; here the function which is equated to zero is the Y whose values for different values of X are traced, and the solution of the equation is the determination of the points where the ordinates of the graph are zero. The second class of cases comprises equations involving two unknowns; here we have to deal with two graphs, and the solution of the equation is the determination of their common ordinates.
Graphic methods also enter into the consideration of irrational numbers (§ 65).
III. Elementary Algebra of Positive Numbers
36. Monomials.—(i.) An expression such as a.2.a.a.b.c.3.a.a.c, denoting that a series of multiplications is to be performed, is called a monomial; the numbers (arithmetical or algebraical) which are multiplied together being its factors. An expression denoting that two or more monomials are to be added or subtracted is a multinomial or polynomial, each of the monomials being a term of it. A multinomial consisting of two or of three terms is a binomial or a trinomial.
(ii.) By means of the commutative law we can collect like terms of a monomial, numbers being regarded as like terms. Thus the above expression is equal to 6a⁵bc², which is, of course, equal to other expressions, such as 6ba⁵c². The numerical factor 6 is called the coefficient of a⁵bc² (§ 20); and, generally, the coefficient of any factor or of the product of any factors is the product of the remaining factors.
(iii.) The multiplication and division of monomials is effected by means of the law of indices. Thus 6a⁵bc² ÷ 5a²bc = ⁶⁄₅a³c, since b⁰ = 1. It must, of course, be remembered (§ 23) that this is a statement of arithmetical equality; we call the statement an “identity,” but we do not mean that the expressions are the same, but that, whatever the numerical values of a, b and c may be, the expressions give the same numerical result.
In order that a monomial containing aᵐ as a factor may be divisible by a monomial containing aᵖ as a factor, it is necessary that p should be not greater than m.
(iv.) In algebra we have a theory of highest common factor and lowest common multiple, but it is different from the arithmetical theory of greatest common divisor and least common multiple. We disregard numerical coefficients, so that by the H.C.F. or L.C.M. of 6a⁵bc² and 12a⁴b²cd we mean the H.C.F. or L.C.M. of a⁵bc² and a⁴b²cd. The H.C.F. is then an expression of the form a^p b^q c^r d^s, where p, q, r, s have the greatest possible values consistent with the condition that each of the given expressions shall be divisible by a^p b^q c^r d^s. Similarly the L.C.M. is of the form a^p b^q c^r d^s, where p, q, r, s have the least possible values consistent with the condition that a^p b^q c^r d^s shall be divisible by each of the given expressions. In the particular case it is clear that the H.C.F. is a⁴bc and the L.C.M. is a⁵b²c²d.
The extension to multinomials forms part of the theory of factors (§ 51).
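The rule of (iv.) amounts to taking, for each letter, the least index for the H.C.F. and the greatest for the L.C.M.; a small Python sketch (the representation of a monomial as a map from letters to indices is merely a convenience of ours):

```python
def hcf_lcm_of_monomials(m1, m2):
    """H.C.F. and L.C.M. of two monomials given as maps letter -> index,
    numerical coefficients being disregarded as in the text."""
    letters = sorted(set(m1) | set(m2))
    hcf = {x: min(m1.get(x, 0), m2.get(x, 0)) for x in letters}
    lcm = {x: max(m1.get(x, 0), m2.get(x, 0)) for x in letters}
    return ({x: i for x, i in hcf.items() if i},
            {x: i for x, i in lcm.items() if i})

# a^5.b.c^2 and a^4.b^2.c.d
print(hcf_lcm_of_monomials({'a': 5, 'b': 1, 'c': 2},
                           {'a': 4, 'b': 2, 'c': 1, 'd': 1}))
# ({'a': 4, 'b': 1, 'c': 1}, {'a': 5, 'b': 2, 'c': 2, 'd': 1})
```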
37. Products of Multinomials.—(i.) Special arithmetical results may often be used to lead up to algebraical formulae. Thus a comparison of numbers occurring in a table of squares
1²=1
11²=121
suggests the formula (A+a)²=A²+2Aa+a². Similarly the equalities
99 × 101=9999=10000 − 1
98 × 102=9996=10000 − 4
97 × 103=9991=10000 − 9
. . .
. . .
. . .
lead up to (A−a)(A+a)=A²−a². These, with (A−a)²=A²−2Aa+a², are the most important in elementary work.
(ii.) These algebraical formulae involve not only the distributive law and the law of signs, but also the commutative law. Thus (A+a)²=(A+a)(A+a)=A(A+a)+a(A+a)=AA+Aa+aA+aa; and the grouping of the second and third terms as 2Aa involves treating Aa and aA as identical. This is important when we come to the binomial theorem (§ 41, and cf. § 54 (i.)).
(iii.) By writing (A+a)²=A²+2Aa+a² in the form (A+a)²=A²+(2A+a)a, we obtain the rule for extracting the square root in arithmetic.
(iv.) When the terms of a multinomial contain various powers of x, and we are specially concerned with x, the terms are usually arranged in descending (or ascending) order of the indices; terms which contain the same power being grouped so as to give a single coefficient. Thus 2bx−4x²+6ab+3ax would be written −4x²+(3a+2b)x+6ab. It is not necessary to regard −4 here as a negative number; all that is meant is that 4x² has to be subtracted.
(v.) When we have to multiply two multinomials arranged according to powers of x, the method of detached coefficients enables us to omit the powers of x during the multiplication. If any power is absent, we treat it as present, but with coefficient 0. Thus, to multiply x³−2x+1 by 2x²+4, we write the process
1 + 0 − 2 + 1
2 + 0 + 4
2 + 0 − 4 + 2
        4 + 0 − 8 + 4
2 + 0 + 0 + 2 − 8 + 4
giving 2x⁵+2x²−8x+4 as the result.
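The method of detached coefficients is, in effect, a multiplication of coefficient lists; a short Python sketch (ours, for illustration) reproduces the example just given.

```python
def multiply_detached_coefficients(p, q):
    """Multiply two multinomials arranged by descending powers of x, each
    given as its list of detached coefficients, absent powers being 0."""
    product = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return product

# (x^3 - 2x + 1)(2x^2 + 4): coefficients 1, 0, -2, 1 and 2, 0, 4
print(multiply_detached_coefficients([1, 0, -2, 1], [2, 0, 4]))
# [2, 0, 0, 2, -8, 4], i.e. 2x^5 + 2x^2 - 8x + 4
```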
38. Construction and Transformation of Equations.—(i.) The statement of problems in equational form should precede the solution of equations.
(ii.) The solution of equations is effected by transformation, which may be either arithmetical or algebraical. The principles of arithmetical transformation follow from those stated in §§ 15-18 by replacing X, A, B, m, M, x, n, a and p by any expressions involving or not involving the unknown quantity or number and representing positive numbers or (in the case of X, A, B and M) positive quantities. The principle of algebraic transformation has been stated in § 22; it is that, if A=B is an equation (i.e. if either or both of the expressions A and B involves x, and A is arithmetically equal to B for the particular value of x which we require), and if B=C is an identity (i.e. if B and C are expressions involving x which are different in form but are arithmetically equal for all values of x), then the statement A=C is an equation which is true for the same value of x for which A=B is true.
(iii.) A special rule of transformation is that any expression may be transposed from one side of an equation to the other, provided its sign is changed. This is the rule of transposition. Suppose, for instance, that P+Q−R+S=T. This may be written (P+Q−R)+S=T; and this statement, by definition of the sign −, is the same as the statement that (P+Q−R)=T−S. Similarly the statements P+Q−R−S=T and P+Q−R=T+S are the same. These transpositions are purely arithmetical. To transpose a term which is not the last term on either side we must first use the commutative law, which involves an algebraical transformation. Thus from the equation P+Q−R+S=T and the identity P+Q−R+S=P−R+S+Q we have the equation P−R+S+Q=T, which is the same statement as P−R+S=T−Q.
(iv.) The procedure is sometimes stated differently, the transposition being regarded as a corollary from a general theorem that the roots of an equation are not altered if the same expression is added to or subtracted from both members of the equation. The objection to this (cf. § 21 (ii.)) is that we do not need the general theorem, and that it is unwise to cultivate the habit of laying down a general law as a justification for an isolated action.
(v.) An alternative method of obtaining the rule of transposition is to change the zero from which we measure. Thus from P+Q−R+S=T we deduce P+(Q−R+S)=P+(T−P). If instead of measuring from zero we measure from P, we find Q−R+S=T−P. The difference between this and (iii.) is that we transpose the first term instead of the last; the two methods corresponding to the two cases under (i.) of § 15 (2).
(vi.) In the same way, we do not lay down a general rule that an equation is not altered by multiplying both members by the same number. Suppose, for instance, that ⅖(x+1)=⁴⁄₃(x−2). Here each member is a number, and the equation may, by the commutative law for multiplication, be written 2(x+1)/5=4(x−2)/3. This means that, whatever unit A we take, 2(x+1) . A/5 and 4(x−2) . A/3 are equal. We therefore take A to be 15, and find that 6(x+1)=20(x−2). Thus, if we have an equation P=Q, where P and Q are numbers involving fractions, we can clear of fractions, not by multiplying P and Q by a number m, but by applying the equal multiples P and Q to a number m as unit. If the P and Q of our equation were quantities expressed in terms of a unit A, we should restate the equation in terms of a unit A/m, as explained in §§ 18 and 21 (i.) (a).
(vii.) One result of the rule of transposition is that we can transpose all the terms in x to one side of equation, and all the terms not containing x to the other. An equation of the form ax=b, where a and b do not contain x, is the standard form of simple equation.
(viii.) The quadratic equation is the equation of two expressions, monomial or multinomial, none of the terms involving any power of x except x and x². The standard form is usually taken to be
ax²+bx+c=0,
from which we find, by transformation,
(2ax+b)²=b²−4ac,
x={√(b²−4ac)−b}/2a.
This only gives one root. As to the other root, see § 47 (iii.).
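The transformation of (viii.) may be followed step by step in a short Python sketch (illustrative only; as in the text, only the root given by the positive square root is found):

```python
from math import sqrt

def quadratic_root(a, b, c):
    """From ax^2 + bx + c = 0 to (2ax + b)^2 = b^2 - 4ac, taking the
    positive square root: x = (sqrt(b^2 - 4ac) - b) / 2a."""
    return (sqrt(b * b - 4 * a * c) - b) / (2 * a)

# x^2 + 4x + 4 = 25 (cf. § 22), i.e. x^2 + 4x - 21 = 0, gives x = 3
print(quadratic_root(1, 4, -21))    # 3.0
```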
39. Fractional Expressions.—An equation may involve a fraction of the form P/Q, where Q involves x.
(i.) If P and Q can (algebraically) be written in the forms RA and SA respectively, where A may or may not involve x, then P/Q=RA/SA=R/S, provided A is not=0.
(ii.) In an equation of the form P/Q=U/V, the expressions P, Q, U, V are usually numerical. We then have P/Q . QV=U/V . QV, or PV=UQ, as in § 38 (vi.). This is the rule of cross-multiplication.
(iii.) The restriction in (i.) is important. Thus (x²−1)/(x²+x−2)={(x−1)(x+1)}/{(x−1)(x+2)} is equal to (x+1)/(x+2), except when x=1. For this latter value it becomes 0/0, which has no direct meaning, and requires interpretation (§ 61).
40. Powers of a Binomial.—We know that (A+a)²=A²+2Aa+a². Continuing to develop the successive powers of A+a into multinomials, we find that (A+a)³=A³+3A²a+3Aa²+a³, &c.; each power containing one more term than the preceding power, and the coefficients, when the terms are arranged in descending powers of A, being given by the following table:—
1
1   1
1   2   1
1   3   3   1
1   4   6   4   1
1   5   10   10   5   1
1   6   15   20   15   6   1
&c.
where the first line stands for (A+a)⁰=1 . A⁰a⁰, and the successive numbers in the (n+1)th line are the coefficients of Aⁿa⁰, Aⁿ⁻¹a¹, . . . A⁰aⁿ in the n+1 terms of the multinomial equivalent to (A+a)ⁿ.
In the same way we have (A−a)²=A²−2Aa+a², (A−a)³=A³−3A²a+3Aa²−a³, . . . , so that the multinomial equivalent to (A−a)ⁿ has the same coefficients as the multinomial equivalent to (A+a)ⁿ, but with signs alternately + and −.
The multinomial which is equivalent to (A ± a)ⁿ, and has its terms arranged in ascending powers of a, is called the expansion of (A ± a)ⁿ.
41. The binomial theorem gives a formula for writing down the coefficient of any stated term in the expansion of any stated power of a given binomial.
(i.) For the general formula, we need only consider (A+a)ⁿ. It is clear that, since the numerical coefficients of A and of a are each 1, the coefficients in the expansions arise from the grouping and addition of like terms (§ 37 (ii.)). We therefore determine the coefficients by counting the grouped terms individually, instead of adding them. To individualize the terms, we replace (A+a) (A+a) (A+a) … by (A+a) (B+b) (C+c) …, so that no two terms are the same; the “like”-ness which determines the placing of two terms in one group being the fact that they become equal (by the commutative law) when B, C, … and b, c, … are each replaced by A and a respectively.
Suppose, for instance, that n=5, so that we take five factors (A+a) (B+b) (C+c) (D+d) (E+e) and find their product. The coefficient of A²a³ in the expansion of (A+a)⁵ is then the number of terms such as ABcde, AbcDe, AbCde, . . ., in each of which there are two large and three small letters. The first term is ABCDE, in which all the letters are large; and the coefficient of A²a³ is therefore the number of terms which can be obtained from ABCDE by changing three, and three only, of the large letters into small ones.
We can begin with any one of the 5 letters, so that the first change can be made in 5 ways. There are then 4 letters left, and we can change any one of these. Then 3 letters are left, and we can change any one of these. Hence the change can be made in 3. 4. 5 ways.
If, however, the 3. 4. 5 results of making changes like this are written down, it will be seen that any one term in the required product is written down several times. Consider, for instance, the term AbcDe, in which the small letters are bce. Any one of these 3 might have appeared first, any one of the remaining 2 second, and the remaining 1 last. The term therefore occurs 1. 2. 3 times. This applies to each of the terms in which there are two large and three small letters. The total number of such terms in the multinomial equivalent to (A+a) (B+b) (C+c) (D+d) (E+e) is therefore (3. 4. 5) ÷ (1. 2. 3); and this is therefore the coefficient of A²a³ in the expansion of (A+a)⁵.
The reasoning is quite general; and, in the same way, the coefficient of Aⁿ⁻ʳaʳ in the expansion of (A+a)ⁿ is {(n−r+1) (n−r+2) . . . (n−1)n} ÷ {1. 2. 3 . . . r}. It is usual to write this as a fraction, inverting the order of the factors in the numerator. Then, if we denote it by n(r), so that
n(r) ≡ {n(n−1)...(n−r+1)}/{1. 2. 3...r} | (1),
we have
(A+a)ⁿ=n(0)Aⁿ + n(1)Aⁿ⁻¹a + ... + n(r)Aⁿ⁻ʳaʳ + ... + n(n)aⁿ | (2),
where n(0), introduced for consistency of notation, is defined by
n(0)≡1 | (3), |
This is the binomial theorem for a positive integral index.
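The counting argument of (i.) translates directly into a calculation; a minimal Python sketch (the function name is ours) forms n(r) as the quotient of the two continued products.

```python
def coefficient(n, r):
    """n(r) of equation (1): n(n-1)...(n-r+1) divided by 1.2.3...r."""
    numerator = 1
    for k in range(r):
        numerator *= n - k
    factorial_r = 1
    for k in range(1, r + 1):
        factorial_r *= k
    return numerator // factorial_r

# Coefficient of A^2.a^3 in (A + a)^5: (3.4.5)/(1.2.3) = 10
print(coefficient(5, 3))                        # 10
print([coefficient(5, r) for r in range(6)])    # [1, 5, 10, 10, 5, 1]
```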
(ii.) To verify this, let us denote the true coefficient of Aⁿ⁻ʳaʳ by (n, r), so that we have to prove that (n, r)=n(r), where n(r) is defined by (1); and let us inspect the actual process of multiplying the expansion of (A+a)ⁿ⁻¹ by A+a in order to obtain that of (A+a)ⁿ. Using detached coefficients (§ 37 (v.)), the multiplication is represented by the following:—
1 + (n−1, 1) + (n−1, 2) + ... + (n−1, r) + ... + 1
      1 + (n−1, 1) + ... + (n−1, r−1) + ... + (n−1, n−2) + 1
1 + (n, 1) + (n, 2) + ... + (n, r) + ... + (n, n−1) + 1,
so that (n, r)=(n−1, r) + (n−1, r−1).
But, if the coefficients in the expansion of (A+a)ⁿ⁻¹ are those given by (1), so that (n−1, r)=(n−1)(r) and (n−1, r−1)=(n−1)(r−1), then, since
n(r)=(n−1)(r) + (n−1)(r−1) | (4),
it follows that
(n, r)=n(r).
Hence the formula (2) is also true for the nth power of A+a. But it is true for the 1st and the 2nd powers; therefore it is true for the 3rd; therefore for the 4th; and so on. Hence it is true for all positive integral powers of n.
(iii.) The product 1. 2. 3 . . . r is denoted by |r or r!, and is called factorial r. The form r! is better for printing, but the form |r is more convenient for ordinary use. If we denote n(n−1) . . . (n−r+1) (r factors) by n^(r), then n(r) ≡ n^(r)/r!.
(iv.) We can write n(r) in the more symmetrical form
n(r) = n!/{r! (n−r)!} | (5),
which shows that
n(r) = n(n−r) | (6). |
We should have arrived at this form in (i.) by considering the selection of terms in which there are to be two large and three small letters, the large letters being written down first. The terms can be built up in 5! ways; but each will appear 2! 3! times.
(v.) Since n(r) is an integer, n^(r) is divisible by r!; i.e. the product of any r consecutive integers is divisible by r! (see § 42 (ii.)).
(vi.) The product r ! arose in (i.) by the successive multiplication of r, r − 1, r − 2, . . . 1. In practice the successive factorials 1!, 2!, 3! . . . are supposed to be obtained successively by introduction of new factors, so that
r ! = r . (r −1)! | (7). |
Thus in defining r ! as 1. 2. 3 . . . r we regard the multiplications as taking place from left to right; and similarly in n(r). A product in which multiplications are taken in this order is called a continued product.
(vii.) In order to make the formula (5) hold for the extreme values n(0) and n(n) we must adopt the convention that
0! = 1 | (8). |
This is consistent with (7), which gives 1! = 1.0!. It should be observed that, for r = 0, (4) is replaced by
n(0) = (n − 1)(0) | (9), |
and similarly, for the final terms, we should note that
p(q) = 0 if q > p | (10). |
(viii.) If ur denotes the term involving a^r in the expansion of (A+a)^n, then ur/ur−1 = {(n−r+1)/r}.a/A. This decreases as r increases; its value ranging from na/A to a/(nA). If na < A, the terms will decrease from the beginning; if nA < a, the terms will increase up to the end; if na > A and nA > a, the terms will first increase up to a greatest term (or two consecutive equal greatest terms) and then decrease.
(ix.) The position of the greatest term will depend on the relative values of A and a; if a/A is small, it will be near the beginning. Advantage can be taken of this, when n is large, to make approximate calculations, by omitting terms that are negligible.
(a) Let Sr denote the sum u0 + u1 + . . . + ur, this sum being taken so as to include the greatest term (or terms); and let ur+1/ur = θ, so that θ < 1. Then the sum of the remaining terms ur+1 + ur+2 + . . . + un is less than (1 + θ + θ^2 + . . . + θ^(n−r−1))ur+1, which is less than ur+1/(1−θ); and therefore (A+a)^n lies between Sr and Sr + ur+1/(1−θ). We can therefore stop as soon as ur+1/(1−θ) becomes negligible.
(b) In the same way, for the expansion of (A − a)n, let σr denote u0−u1+ . . . ± ur. Then, provided σr includes the greatest term, it will be found that (A − a)n lies between σr and σr+1.
For actual calculation it is most convenient to write the theorem in the form
(A±a)^n = A^n(1±x)^n = A^n ± (n/1)x.A^n + {(n−1)/2}x.(n/1)x.A^n ± . . . ,
where x ≡ a/A; thus the successive terms are obtained by successive multiplication. To apply the method to the calculation of N^n, it is necessary that we should be able to express N in the form A+a or A−a, where a is small in comparison with A, A^n is easy to calculate, and a/A is convenient as a multiplier.
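The stopping rule of (ix.) (a) can be put into a short computation; the Python sketch below is an illustration only, and the sample values of A, a, n and the tolerance are assumptions of the example.

```python
def approx_power(A, a, n, tol=1e-6):
    # (A+a)^n = A^n(1+x)^n with x = a/A (x assumed small and positive).
    # Terms satisfy u_r = u_{r-1} * {(n-r+1)/r} * x; after forming S_r,
    # the neglected remainder is less than u_{r+1}/(1 - theta), theta = u_{r+1}/u_r.
    x = a / A
    u = float(A) ** n                 # u_0
    s = u                             # S_0
    r = 0
    while True:
        u_next = u * ((n - r) / (r + 1)) * x      # u_{r+1}
        theta = u_next / u
        if theta < 1 and u_next / (1 - theta) < tol:
            return s                  # S_r is within tol of (A+a)^n
        s += u_next
        u = u_next
        r += 1

print(approx_power(10, 0.2, 8), (10.2) ** 8)   # the two values agree to within the tolerance
```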
42. The reasoning adopted in § 41 (ii.) illustrates two general methods of procedure. We know that (A+a)n is equal to a multinomial of n+1 terms with unknown coefficients, and we require to find these coefficients. We therefore represent them by separate symbols, in the same way that we represent the unknown quantity in an equation by a symbol. This is the method of undetermined coefficients. We then obtain a set of equations, and by means of these equations we establish the required result by a process known as mathematical induction. This process consists in proving that a property involving p is true when p is any positive integer by proving (1) that it is true when p = 1, and (2) that if it is true when p = n, where n is any positive integer, then it is true when p = n+1. The following are some further examples of mathematical induction.
(i.) By adding successively 1, 3, 5 . . . we obtain 1, 4, 9, . . . This suggests that, if un is the sum of the first n odd numbers, then un = n2. Assume this true for u1, u2, . . ., un. Then un+1 = un+(2n+1) = n2+(2n+1) = (n+1)2, so that it is true for un+1. But it is true for u1. Therefore it is true generally.
(ii.) We can prove the theorem of § 41 (v.) by a double application of the method.
(a) It is clear that every integer is divisible by 1!.
(b) Let us assume that the product of every set of p consecutive integers is divisible by p!, and let us try to prove that the product of every set of p+1 consecutive integers is divisible by (p+1)!. Denote the product n(n+1) . . . (n+r−1) by n^[r]. Then the assumption is that, whatever positive integral value n may have, n^[p] is divisible by p!.
(1) n^[p+1]−(n−1)^[p+1] = n(n+1) . . . (n+p−1){(n+p)−(n−1)} = (p+1).n^[p]. But, by hypothesis, n^[p] is divisible by p!; therefore n^[p+1]−(n−1)^[p+1] is divisible by (p+1)!. Therefore, if (n−1)^[p+1] is divisible by (p+1)!, n^[p+1] is divisible by (p+1)!.
(2) But 1^[p+1] = (p+1)!, which is divisible by (p+1)!.
(3) Therefore n^[p+1] is divisible by (p+1)!, whatever positive integral value n may have.
(c) Thus, if the theorem of § 41 (v.) is true for r = p, it is true for r = p+1. But it is true for r = 1. Therefore it is true generally.
(iii.) Another application of the method is to proving the law of formation of consecutive convergents to a continued fraction (see Continued Fractions).
43. Binomial Coefficients.—The numbers denoted by n(r) in § 41 are the binomial coefficients shown in the table in § 40; n(r) being the (r+1)th number in the (n+1)th row. They have arisen as the coefficients in the expansion of (A+a)n; but they may be considered independently as a system of numbers defined by (1) of § 41. The individual numbers are connected by various relations, some of which are considered in this section.
(i.) From (4) of § 41 we have
n(r)−(n−1)(r) = (n−1)(r −1) | (11). |
Changing n into n−1, n−2, . . ., and adding the results,
n(r)−(n−s)(r) = (n−1)(r −1)+(n−2)(r −1)+ ... +(n−s)(r −1) | (12). |
In particular,
n(r) = (n−1)(r −1)+(n−2)(r −1)+...+(r −1)(r −1) | (13). |
Similarly, by writing (4) in the form
n(r)−(n−1)(r −1) = (n−1)(r) | (14), |
changing n and r into n−1 and r −1, repeating the process, and adding, we find, taking account of (9),
n(r) = (n−1)(r)+(n−2)(r −1)+...+(n−r −1)0 | (15). |
(ii.) It is therefore more convenient to rearrange the table of § 40 as shown below, on the left; the table on the right giving the key to the arrangement.
1                          0(0)
1                          1(1)
1   1                      1(0)   2(2)
2   1                      2(1)   3(3)
1   3   1                  2(0)   3(2)   4(4)
3   4   1                  3(1)   4(3)   5(5)
1   6   5   1              3(0)   4(2)   5(4)   6(6)
4  10   6   1              4(1)   5(3)   6(5)   7(7)
1  10  15   7   1          4(0)   5(2)   6(4)   7(6)   8(8)
&c.,                       &c.,
Here 0(0) is defined by
0(0) = 1   (16),
which is consistent with the relations in (i.). In this table any number is equal to the sum of the numbers which lie horizontally above it in the preceding column, and the difference of any two numbers in a column is equal to the sum of the numbers horizontally between them in the preceding column.
The coefficients in the expansion of (A+a)n for any particular value of n are obtained by reading diagonally upwards from left to right from the (n+1)th number in the first column.
(iii.) The table might be regarded as constructed by successive applications of (9) and (4); the initial data being (16) and (10). Alternatively, we might consider that we start with the first diagonal row (downwards from the left) and construct the remaining diagonal rows by successive applications of (15). Constructed in this way, the successive diagonal rows, commencing with the first, give the figurate numbers of the first, second, third, . . . order. The (r+1)th figurate number of the nth order, i.e. the (r+1)th number in the nth diagonal row, is n(n+1) . . . (n+r−1)/r! = n^[r]/r!; this may, by analogy with the notation of § 41, be denoted by n[r]. We then have
(n+1)[r] = (r+1)[n] = (n+r)!/(n! r!) = (n+r)(r) = (n+r)(n)   (17).
(iv.) By means of (17) the relations between the binomial coefficients in the form p(q) may be replaced by others with the coefficients expressed in the form p[q]. The table in (ii.) may be written
1[0]
1[1]
2[0]   1[2]
2[1]   1[3]
3[0]   2[2]   1[4]
3[1]   2[3]   1[5]
4[0]   3[2]   2[4]   1[6]
4[1]   3[3]   2[5]   1[7]
5[0]   4[2]   3[4]   2[6]   1[8]
&c.,
The most important relations are
n[r] = n[r−1] + (n−1)[r]   (18);
0[r] = 0   (19);
n[r] − (n−s)[r] = n[r−1] + (n−1)[r−1] + . . . + (n−s+1)[r−1]   (20);
n[r] = n[r−1] + (n−1)[r−1] + . . . + 1[r−1]   (21).
(v.) It should be mentioned that the notation of the binomial coefficients, and of the continued products such as n(n−1) . . . (n−r+1), is not settled. Some writers, for instance, use the symbol n_r in place, in some cases, of n(r), and, in other cases, of n^(r). It is convenient to retain x_r to denote x^r/r!, so that we have the consistent notation
x_r = x^r/r!,  n(r) = n^(r)/r!,  n[r] = n^[r]/r!.
The binomial theorem for positive integral index may then be written
(x+y)_n = x_n y_0 + x_(n−1) y_1 + . . . + x_(n−r) y_r + . . . + x_0 y_n.
This must not be confused with the use of suffixes to denote particular terms of a series or a progression (as in § 41 (viii.) and (ix.)).
44. Permutations and Combinations.—The discussion, in § 41 (i.), of the number of terms of a particular kind in a particular product, forms part of the theory of combinatorial analysis (q.v.), which deals with the grouping and arrangement of individuals taken from a defined stock. The following are some particular cases; the proof usually follows the lines already indicated. Certain of the individuals may be distinguishable from the remainder of the stock, but not from each other; these may be called a type.
(i.) A permutation is a linear arrangement, read in a definite direction of the line. The number (nPr) of permutations of r individuals out of a stock of n, all being distinguishable, is n^(r). In particular, the number of permutations of the whole stock is n!.
If a of the stock are of one type, b of another, c of another, . . . the number of distinguishable permutations of the whole stock is n!÷(a!b!c! . . .).
(ii.) A combination is a group of individuals without regard to arrangement. The number (nCr) of combinations of r individuals out of a stock of n has in effect been proved in § 41 (i.) to be n(r). This property enables us to establish, by simple reasoning, certain relations between binomial coefficients. Thus (4) of § 41 (ii.) follows from the fact that, if A is any one of the n individuals, the nCr groups of r consist of n−1Cr−1 which contain A and n−1Cr which do not contain A. Similarly, considering the various ways in which a group of r may be obtained from two stocks, one containing m and the other containing n, we find that
m+nCr=mCr·nC0+mCr−1·nC1+ ... + mC0·nCr,
which gives
(m+n)(r)=m(r)·n(0)+m(r-1)·n(1)+...+m(0)·n(r) | (22). |
This may also be written
(m+n)^(r) = m^(r).n^(0) + r(1).m^(r−1).n^(1) + . . . + r(r).m^(0).n^(r)   (23).
If r is greater than m or n (though of course not greater than m+n), some of the terms in (22) and (23) will be zero.
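Relation (22) is easily verified for particular numbers; the following sketch in Python (illustrative only) uses the library function math.comb for the binomial coefficients, including cases in which r exceeds m or n.

```python
from math import comb

def vandermonde(m, n, r):
    # (m+n)(r) = sum of m(s).n(r-s) for s = 0, 1, ..., r; comb returns 0 when s > m.
    return sum(comb(m, s) * comb(n, r - s) for s in range(r + 1))

for m, n, r in [(4, 3, 2), (5, 5, 7), (2, 6, 5), (3, 3, 8)]:
    assert vandermonde(m, n, r) == comb(m + n, r)
print("relation (22) verified for the sample values")
```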
(iii.) If there are n types, the number of individuals in each type being unlimited (or at any rate not less than r), the number (nHr) of distinguishable groups of r individuals out of the total stock is n[r]. This is sometimes called the number of homogeneous products of r dimensions formed out of n letters; i.e. the number of products, such as x^r, x^(r−3)y^3, x^(r−2)z^2, . . . , that can be formed with positive integral indices out of n letters x, y, z, . . ., the sum of the indices in each product being r.
(iv.) Other developments of the theory deal with distributions, partitions, &c. (see Combinatorial Analysis).
(v.) The theory of probability (q.v.) also comes under this head. Suppose that there are a number of arrangements of r terms or elements, the first of which a is always either A or not-A, the second b is B or not-B, the third c is C or not-C, and so on. If, out of every N cases, where N may be a very large number, a is A in pN cases and not-A in (1−p)N cases, where p is a fraction such that pN is an integer, then p is the probability or frequency of occurrence of A. We may consider that we are dealing always with a single arrangement abc . . .. and that the number of times that a is made A bears to the number of times that a is made not-A the ratio of p to 1−p; or we may consider that there are N individuals, for pN of which the attribute a is A, while for (1−p)N it is not-A. If, in this latter case, the proportion of cases in which b is B to cases in which b is not-B is the same for the group of pN individuals in which a is A as for the group of (1−p)N in which a is not-A, then the frequencies of A and of B are said to be independent; if this is not the case they are said to be correlated. The possibilities of a, instead of being A and not-A, may be A1, A2, . . ., each of these having its own frequency; and similarly for b, c, . . . If the frequency of each A is independent of the frequency of each B, then the attributes a and b are independent; otherwise they are correlated.
45. Application of Binomial Theorem to Rational Integral Functions.—An expression of the form c0x^n + c1x^(n−1) + . . . + cn, where c0, c1, . . . do not involve x, and the indices of the powers of x are all positive integers, is called a rational integral function of x of degree n.
If we represent this expression by f(x), the expression obtained by changing x into x+h is f(x+h); and each term of this may be expanded by the binomial theorem. Thus we have
f(x+h) = c0x^n + nc0x^(n−1).h/1! + n(n−1)c0x^(n−2).h^2/2! + . . .
  + c1x^(n−1) + (n−1)c1x^(n−2).h/1! + (n−1)(n−2)c1x^(n−3).h^2/2! + . . .
  + c2x^(n−2) + (n−2)c2x^(n−3).h/1! + (n−2)(n−3)c2x^(n−4).h^2/2! + . . .
  + &c.
= {c0x^n + c1x^(n−1) + c2x^(n−2) + . . .}
  + {nc0x^(n−1) + (n−1)c1x^(n−2) + (n−2)c2x^(n−3) + . . .} h/1!
  + {n(n−1)c0x^(n−2) + (n−1)(n−2)c1x^(n−3) + . . .} h^2/2!
  + &c.
It will be seen that the expression in curled brackets in each line after the first is obtained from the corresponding expression in the preceding line by a definite process; viz. x^r is replaced by r.x^(r−1), except for r = 0, when x^0 is replaced by 0. The expressions obtained in this way are called the first, second, . . . derived functions of f(x). If we denote these by f1(x), f2(x), . . ., so that fn(x) is obtained from fn−1(x) by the above process, we have
f(x+h) = f(x) + f1(x).h + f2(x).h^2/2! + . . . + fr(x).h^r/r! + . . .
This is a particular case of Taylor’s theorem (see Infinitesimal Calculus).
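The rule by which each derived function is formed, and the expansion of f(x+h), can be imitated directly; in the Python sketch below (the helper names are arbitrary) the coefficients are held as a list with the highest power first.

```python
from math import factorial

def derived(coeffs):
    # coeffs = [c0, c1, ..., cn] for c0*x^n + c1*x^(n-1) + ... + cn;
    # the derived function replaces x^r by r*x^(r-1).
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def value(coeffs, x):
    v = 0
    for c in coeffs:
        v = v * x + c
    return v

f = [2, -3, 0, 5]              # 2x^3 - 3x^2 + 5
x, h = 1.5, 0.25
taylor, g, r = 0.0, f, 0
while g:
    taylor += value(g, x) * h**r / factorial(r)
    g, r = derived(g), r + 1
print(taylor, value(f, x + h))  # the two agree
```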
46. Relation of Binomial Coefficients to Summation of Series.—(i.) The sum of the first n terms of an ordinary arithmetical progression (a+b), (a+2b), . . . (a+nb) is (§ 28 (i.)) ½n{(a+b)+(a+nb)} = na + ½n(n+1)b = n[1].a + n[2].b. Comparing this with the table in § 43 (iv.), and with formula (21), we see that the series expressing the sum may be regarded as consisting of two, viz. a+a+ . . . and b+2b+3b+ . . . ; for the first series we multiply the table (i.e. each number in the table) by a, and for the second series we multiply it by b, and the terms and their successive sums are given for the first series by the first and the second columns, and for the second series by the second and the third columns.
(ii.) In the same way, if we multiply the table by c, the sum of the first n numbers in any column is equal to the nth number in the next following column. Thus we get a formula for the sum of n terms of a series such as
2.4.6+4.6.8+..., or 6.8.10.12+8.10.12.14+...
(iii.) Suppose we have such a series as 2.5+5.8+8.11+...This cannot be summed directly by the above method. But the nth term is (3n−1)(3n+2) = 18n[2]−6n[1]−2. The sum of n terms is therefore (§ 43 (iv.))
18n[3]−6n[2]−2n[1] = 3n3+6n2+n.
(iv.) Generally, let N be any rational integral function of n of degree r. Then, since n[r] is also a rational integral function of n of degree r, we can find a coefficient cr, not containing n, and such as to make N−crn[r] contain no power of n higher than nr−1. Proceeding in this way, we can express N in the form cr.n[r]+cr−1n[r−1]+ . . ., where cr, cr−1, cr−2, . . . do not contain n; and thence we can obtain the sum of the numbers found by putting n = 1, 2, 3, . . . n successively in N. These numbers constitute an arithmetical progression of the rth order.
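The procedure of (iii.) and (iv.) may be checked numerically; the sketch below (an illustration, taking the series of (iii.) as the worked case) relies on the fact, from (21) of § 43, that 1[r] + 2[r] + . . . + n[r] = n[r+1].

```python
def figurate(n, r):
    # n[r] = n(n+1)...(n+r-1)/r!  (figurate number of the r-th order)
    num, den = 1, 1
    for k in range(r):
        num *= n + k
        den *= k + 1
    return num // den

# nth term of 2.5 + 5.8 + 8.11 + ... is (3n-1)(3n+2) = 18 n[2] - 6 n[1] - 2
n = 7
direct = sum((3*k - 1) * (3*k + 2) for k in range(1, n + 1))
by_figurates = 18 * figurate(n, 3) - 6 * figurate(n, 2) - 2 * figurate(n, 1)
assert direct == by_figurates == 3*n**3 + 6*n**2 + n
print(direct)
```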
(v.) A particular case is that of the sum 1r+2r+3r + . . . + nr, where r is a positive integer. It can be shown by the above reasoning that this can be expressed as a series of terms containing descending powers of n, the first term being nr+1/(r+1). The most important cases are
1 + 2 + 3 + . . . + n = ½n(n+1),
1^2 + 2^2 + 3^2 + . . . + n^2 = ⅙n(n+1)(2n+1),
1^3 + 2^3 + 3^3 + . . . + n^3 = ¼n^2(n+1)^2 = (1+2+ . . . +n)^2.
The general formula (which is established by more advanced methods) is
½n^r + 1^r + 2^r + . . . + (n−1)^r = 1/(r+1).{n^(r+1) + B1(r+1)(2)n^(r−1) − B2(r+1)(4)n^(r−3) + . . . },
where B1, B2, . . . are certain numbers known as Bernoulli’s numbers, and the terms within the bracket, after the first, have signs alternately + and −. The values of the first ten of Bernoulli’s numbers are
B1 = 1/6, B2 = 1/30, B3 = 1/42, B4 = 1/30, B5 = 5/66, B6 = 691/2730, B7 = 7/6, B8 = 3617/510, B9 = 43867/798, B10 = 174611/330.
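These values can be reproduced by exact rational arithmetic; in the following Python sketch (a modern check) the numbers B1, B2, . . . of the text are obtained as the absolute values of the even-suffixed Bernoulli numbers of the usual modern recurrence.

```python
from fractions import Fraction
from math import comb

def bernoulli(m):
    # Modern Bernoulli numbers via sum_{k=0}^{n} C(n+1, k) B_k = 0  (n >= 1), with B_0 = 1.
    B = [Fraction(1)]
    for n in range(1, m + 1):
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-s / (n + 1))
    return B

B = bernoulli(20)
old_style = [abs(B[2 * r]) for r in range(1, 11)]   # B1, B2, ... B10 of the text
print([str(b) for b in old_style])                  # 1/6, 1/30, 1/42, 1/30, 5/66, 691/2730, ...
```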
IV. Negative Numbers and Formal Algebra.
47. Negative quantities will have arisen in various ways, e.g.
(i.) The logical result of the commutative law, applied to a succession of additions and subtractions, is to produce a negative quantity −3s. such that −3s. + 3s. = 0(§ 28 (vi.)).
(ii.) Simple equations, especially equations in which the unknown quantity is an interval of time, can often only be satisfied by a negative solution (§ 33).
(iii.) In solving a quadratic equation by the method of § 38 (viii.) we may be led to a result which is apparently absurd. If, for instance, we inquire as to the time taken to reach a given height by a body thrown upwards with a given velocity, we find that the time increases as the height decreases. Graphical representation shows that there are two solutions, and that an equation X2 = 9a2 may be taken to be satisfied not only by X = 3a but also by X = −3a.
48. The occurrence of negative quantities does not, however, involve the conception of negative numbers. In (iii.) of § 47, for instance, “−3a” does not mean that a is to be taken (−3) times, but that a is to be taken 3 times, and the result treated as subtractive; i.e. −3a means −(3a), not (−3)a (cf. § 27 (i.)).
In the graphic method of representation the sign − may be taken as denoting a reversal of direction, so that, if + 3 represents a length of 3 units measured in one direction, −3 represents a length of 3 units measured in the other direction. But even so there are two distinct operations concerned in the −3, viz. the multiplication by 3 and the reversal of direction. The graphic method, therefore, does not give any direct assistance towards the conception of negative numbers as operators, though it is useful for interpreting negative quantities as results.
49. In algebraical transformations, however, such as (x−a)2 = x2−2ax+a2, the arithmetical rule of signs enables us to combine the sign − with a number and to treat the result as a whole, subject to its own laws of operation. We see first that any operation with 4a−3b can be regarded as an operation with (+)4a+(−)3b, subject to the conditions (1) that the signs (+) and (−) obey the laws (+)(+) = (+), (+)(−) = (−)(+) = (−), (−)(−) = (+), and (2) that, when processes of multiplication are completed, a quantity is to be added or subtracted according as it has the sign (+) or (−) prefixed. We are then able to combine any number with the + or the − sign inside the bracket, and to deal with this constructed symbol according to special laws; i.e. we can replace pr or −pr by (+p)r or (−p)r, subject to the conditions that (+p) (+q) = (−p)(+q) = (−pq), and that + (−s) means that s is to be subtracted.
These constructed symbols may be called positive and negative coefficients; or a symbol such as (−p) may be called a negative number, in the same way that we call ⅔ a fractional number.
This increases the extent of the numbers with which we have to deal; but it enables us to reduce the number of formulae. The binomial theorem may, for instance, be stated for (x+a)n alone; the formula for (x−a)n being obtained by writing it as {x+(−)a}n or {x+(−a)}n, so that
(x−a)n = xn−n(1)xn−1a+...+(−)rn(r)xn−rar+...,
where + (−)r means − or + according as r is odd or even.
The result of the extension is that the number or quantity represented by any symbol, such as P, may be either positive or negative. The numerical value is then represented by |P|; thus “|x|<1” means that x is between −1 and +1.
50. The use of negative coefficients leads to a difference between arithmetical division and algebraical division (by a multinomial), in that the latter may give rise to a quotient containing subtractive terms. The most important case is division by a binomial, as illustrated by the following examples:—
(1)
2.10 + 1 ) 6.100 + 5.10 + 1 ( 3.10 + 1
           6.100 + 3.10
                   2.10 + 1
                   2.10 + 1

(2)
2.10 + 1 ) 6.100 + 1.10 − 1 ( 3.10 − 1
           6.100 + 3.10
                  −2.10 − 1
                  −2.10 − 1
In (1) the division is both arithmetical and algebraical, while in (2) it is algebraical, the quotient for arithmetical division being 2.10+9.
It may be necessary to introduce terms with zero coefficients. Thus, to divide 1 by 1+x algebraically, we may write it in the form 1+0.x+0.x2+0.x3+0.x4, and we then obtain
1/(1+x) = (1 + 0.x + 0.x^2 + 0.x^3 + 0.x^4)/(1+x) = 1 − x + x^2 − x^3 + x^4/(1+x),
where the successive terms of the quotient are obtained by a process which is purely formal.
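The purely formal process is easily mechanized; the following Python sketch (illustrative only) produces successive coefficients of the quotient when a multinomial, arranged in ascending powers of x, is divided by another whose first coefficient is not zero.

```python
def formal_quotient(dividend, divisor, terms):
    # dividend, divisor: coefficient lists in ascending powers of x.
    # Returns the first `terms` coefficients of the formal quotient.
    q = []
    work = list(dividend) + [0] * terms
    for r in range(terms):
        c = work[r] / divisor[0]
        q.append(c)
        for k, d in enumerate(divisor):        # subtract c * x^r * divisor
            if r + k < len(work):
                work[r + k] -= c * d
    return q

print(formal_quotient([1], [1, 1], 6))   # 1/(1+x): 1, -1, 1, -1, 1, -1
```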
51. If we divide the sum of x2 and a2 by the sum of x and a, we get a quotient x−a and remainder 2a2, or a quotient a−x and remainder 2x2, according to the order in which we work. Algebraical division therefore has no definite meaning unless dividend and divisor are rational integral functions of some expression such as x which we regard as the root of the notation (§ 28 (iv.)), and are arranged in descending or ascending powers of x. If P and M are rational integral functions of x, arranged in descending powers of x, the division of P by M is complete when we obtain a remainder R whose degree (§ 45) is less than that of M. If R=0, then M is said to be a factor of P.
The highest common factor (or common factor of highest degree) of two rational integral functions of x is therefore found in the same way as the G.C.M. in arithmetic; numerical coefficients of the factor as a whole being ignored (cf. § 36 (iv.)).
52. Relation between Roots and Factors.—
(i.) If we divide the multinomial
P ≡ p0xn+p1xn−1+...+pn
by x−a, according to algebraical division, the remainder is
R ≡ p0an+p1an−1+...+pn.
This is the remainder-theorem; it may be proved by induction.
(ii.) If x=a satisfies the equation P=0, then p0an+p1an−1+...+pn=0; and therefore the remainder when P is divided by x−a is 0, i.e. x−a is a factor of P.
(iii.) Conversely, if x−a is a factor of P, then p0an+p1an−1+...+pn=0; i.e. x=a satisfies the equation P=0.
(iv.) Thus the problems of determining the roots of an equation P=0 and of finding the factors of P, when P is a rational integral function of x, are the same.
(v.) In particular, the equation P=0, where P has the value in (i.), cannot have more than n different roots.
The consideration of cases where two roots are equal belongs to the theory of equations (see Equation).
(vi.) It follows that, if two multinomials of the nth degree in x have equal values for more than n values of x, the corresponding coefficients are equal, so that the multinomials are equal for all values of x.
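The remainder-theorem and the factor property may be verified by the ordinary synthetic process; a minimal Python sketch (the polynomial chosen is arbitrary):

```python
def divide_by_x_minus_a(p, a):
    # p = [p0, p1, ..., pn] (descending powers). Returns (quotient, remainder).
    q, carry = [], 0
    for c in p:
        carry = carry * a + c
        q.append(carry)
    return q[:-1], q[-1]

p = [1, -6, 11, -6]            # x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)
quotient, R = divide_by_x_minus_a(p, 2)
assert R == sum(c * 2 ** (len(p) - 1 - i) for i, c in enumerate(p))   # remainder-theorem
assert R == 0                  # x - 2 is a factor, so x = 2 is a root
print(quotient)                # coefficients of x^2 - 4x + 3
```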
53. Negative Indices and Logarithms.—(i.) Applying the general principles of §§ 47-49 to indices, we find that we can interpret X−m as being such that
Xm.X−m=X0=1; i.e. X−m=1/Xm.
In the same way we interpret X−p/q as meaning 1/Xp/q.
(ii.) This leads to negative logarithms (see Logarithm).
54. Laws of Algebraic Form.—(i.) The results of the addition, subtraction and multiplication of multinomials (including monomials as a particular case) are subject to certain laws which correspond with the laws of arithmetic (§ 26(i.)) but differ from them in relating, not to arithmetical value, but to algebraic form. The commutative law in arithmetic, for instance, states that a+b and b+a, or ab and ba, are equal. The corresponding law of form regards a+b and b+a, or ab and ba, as being not only equal but identical (cf. § 37 (ii.)), and then says that A+B and B+A, or AB and BA, are identical, where A and B are any multinomials. Thus a(b+c) and (b+c)a give the same result, though it may be written in various ways, such as ab+ac, ca+ab, &c. In the same way the associative law is that A(BC) and (AB)C give the same formal result.
These laws can be established either by tracing the individual terms in a sum or a product or by means of the general theorem in § 52 (vi.).
(ii.) One result of these laws is that, when we have obtained any formula involving a letter a, we can replace a by a multinomial. For instance, having found that (x+a)2=x2+2ax+a2, we can deduce that (x+b+c)2={x+(b+c)}2=x2+2(b+c)x+(b+c)2.
(iii.) Another result is that we can equate coefficients of like powers of x in two multinomials obtained from the same expression by different methods of expansion. For instance, by equating coefficients of xr in the expansions of (1 + x)m+n and of (1+x)m.(1+x)n we obtain (22) of § 44 (ii.).
(iv.) On the other hand, the method of equating coefficients often applies without the assumption of these laws. In § 41 (ii.), for instance, the coefficient of A^(n−r)a^r in the expansion of (A+a)(A+a)^(n−1) has been called (n, r); and it has then been shown that (n, r) = (n−1, r) + (n−1, r−1). This does not involve any assumption of the identity of results obtained in different ways; for the expansions of (A+a)^2, (A+a)^3, . . . are there supposed to be obtained in one way only, viz. by successive multiplications by A+a.
55. Algebraical Division.—In order to extend these laws so as to include division, we need a definition of algebraical division. The divisions in §§ 50-52 have been supposed to be performed by a process similar to the process of arithmetical division, viz. by a series of subtractions. This latter process, however, is itself based on a definition of division in terms of multiplication (§§ 15, 16). If, moreover, we examine the process of algebraical division as illustrated in § 50, we shall find that, just as arithmetical division is really the solution of an equation (§ 14), and involves the tacit use of a symbol to denote an unknown quantity or number, so algebraical division by a multinomial really implies the use of undetermined coefficients (§ 42). When, for instance, we find that the quotient, when 6+5x+7x2+13x3+5x4 is divided by 2+3x+x2, is made up of three terms +3, −2x, and +5x2, we are really obtaining successively the values of c0, c1, and c2 which satisfy the identity 6+5x+7x2+13x3+5x4=(c0+c1x+c2x2)(2+3x+x2); and we could equally obtain the result by expanding the right-hand side of this identity and equating coefficients in the first three terms, the coefficients in the remaining terms being then compared to see that there is no remainder. We therefore define algebraical division by means of algebraical multiplication, and say that, if P and M are multinomials, the statement "P/M=Q" means that Q is a multinomial such that MQ (or QM) and P are identical. In this sense, the laws mentioned in § 54 apply also to algebraical division.
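The determination of the coefficients in the worked example can be carried out exactly as described; in the following sketch (illustrative only) c0, c1, c2 are found in succession and the remaining coefficients are then compared to verify that there is no remainder.

```python
P = [6, 5, 7, 13, 5]      # 6 + 5x + 7x^2 + 13x^3 + 5x^4  (ascending powers)
M = [2, 3, 1]             # 2 + 3x + x^2

# Determine c_r successively from the coefficient of x^r in (c0 + c1 x + c2 x^2) M.
c = []
for r in range(len(P) - len(M) + 1):
    known = sum(c[j] * M[r - j] for j in range(len(c)) if 0 <= r - j < len(M))
    c.append((P[r] - known) // M[0])

# Check the remaining coefficients: the product must reproduce P exactly.
product = [0] * (len(c) + len(M) - 1)
for i, ci in enumerate(c):
    for j, mj in enumerate(M):
        product[i + j] += ci * mj
assert product == P
print(c)                  # [3, -2, 5], i.e. 3 - 2x + 5x^2
```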
56. Extensions of the Binomial Theorem.—It has been mentioned in § 41 (ix.) that the binomial theorem can be used for obtaining an approximate value for a power of a number; the most important terms only being taken into account. There are extensions of the binomial theorem, by means of which approximate calculations can be made of fractions, surds, and powers of fractions and of surds; the main difference being that the number of terms which can be taken into account is unlimited, so that, although we may approach nearer and nearer to the true value, we never attain it exactly. The argument involves the theorem that, if θ is a positive quantity less than 1, θt can be made as small as we please by taking t large enough; this follows from the fact that t log θ can be made as large (numerically) as we please.
(i.) By algebraical division,
1/(1+x) = {1 + 0.x + 0.x^2 + . . . + 0.x^(r+1)}/(1+x) = 1 − x + x^2 − . . . + (−)^r x^r + (−)^(r+1)x^(r+1)/(1+x)   (24).
If, therefore, we take 1/(1+x) as equal to 1 − x + x^2 − . . . + (−)^r x^r, there is an error whose numerical magnitude is |x^(r+1)/(1+x)|; and, if |x| < 1, this can be made as small as we please.
This is the foundation of the use of recurring decimals; thus we can replace 4/11 {= 36/99 = (36/100)/(1 − 1/100)} by ·363636 (= 36/10^2 + 36/10^4 + 36/10^6), with an error (in defect) of only 36/(10^6 . 99).
(ii.) Repeated divisions of (24) by 1+x, r being replaced by r+1 before each division, will give
(1+x)^(−2) = 1 − 2x + 3x^2 − 4x^3 + . . . + (−)^r(r+1)x^r + (−)^(r+1)x^(r+1){(r+1)(1+x)^(−1) + (1+x)^(−2)},
(1+x)^(−3) = 1 − 3x + 6x^2 − 10x^3 + . . . + (−)^r.½(r+1)(r+2)x^r + (−)^(r+1)x^(r+1){½(r+1)(r+2)(1+x)^(−1) + (r+1)(1+x)^(−2) + (1+x)^(−3)}, &c.
Comparison with the table of binomial coefficients in § 43 suggests that, if m is any positive integer,
(1+x)^(−m) = Sr + Rr   (25),
where
Sr ≡ 1 − m[1]x + m[2]x^2 − . . . + (−)^r m[r]x^r   (26),
Rr ≡ (−)^(r+1)x^(r+1){m[r](1+x)^(−1) + (m−1)[r](1+x)^(−2) + . . . + 1[r](1+x)^(−m)}   (27).
This can be verified by induction. The same result would (§ 55) be obtained if we divided 1+0.x+0.x2+... at once by the expansion of (1+x)m.
(iii.) From (21) of § 43 (iv.) we see that |Rr| is less than m[r+1]x^(r+1) if x is positive, or than |m[r+1]x^(r+1)(1+x)^(−m)| if x is negative; and it can hence be shown that, if |x| < 1, |Rr| can be made as small as we please by taking r large enough, so that we can make Sr approximate as closely as we please to (1+x)^(−m).
(iv.) To assimilate this to the binomial theorem, we extend the definition of n(r) in (1) of § 41 (i) so as to cover negative integral values of n; and we then have
(−m)(r) = (−m)(−m−1) . . . (−m−r+1)/r! = (−)^r m[r]   (28),
so that, if n ≡ −m,
Sr ≡ 1 + n(1)x+n(2)x2+...+n(r)xr | (29). |
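The behaviour described in (iii.) may be observed numerically; the sketch below (the values of m and x are merely illustrative) compares the partial sums Sr of (29) with the true value of (1+x)^(−m).

```python
def extended_coeff(n, r):
    # n(r) = n(n-1)...(n-r+1)/r!, for any (here negative integral) n.
    c = 1.0
    for k in range(r):
        c *= (n - k) / (k + 1)
    return c

m, x = 3, 0.1
true = (1 + x) ** (-m)
for r in (2, 5, 10, 20):
    S = sum(extended_coeff(-m, k) * x**k for k in range(r + 1))
    print(r, S, true - S)      # the error shrinks rapidly since |x| < 1
```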
(v.) The further extension to fractional values (positive or negative) of n depends in the first instance on the establishment of a method of algebraical evolution which bears the same relation to arithmetical evolution (calculation of a surd) that algebraical division bears to arithmetical division. In calculating √2, for instance, we proceed as if 2.000. . . were the exact square of some number of the form c0+c1/10+c2/102+. . .
In the same way, to find X1/q, where X ≡ 1+a1x+a2x2+... and q is a positive integer, we assume that X1/q=1+b1x+b2x2..., and we then (cf. § 55) determine b1, b2, . . . in succession so that (1+b1x+ b2x2+. . .)q shall be identical with X.
The application of the method to the calculation of (1+x)n, when n=p/q, q being a positive integer and p a positive or negative integer, involves, as in the case where n is a negative integer, the separate consideration of the form of the coefficients b1, b2, . . . and of the numerical value of 1 + b1x+b2x2+. . .+brxr.
(vi.) The definition of n(r), which has already been extended in (iv.) above, has to be further extended so as to cover fractional values of n, positive or negative. Certain relations still hold, the most important being (22) of § 44 (ii.), which holds whatever the values of m and of n may be; r, of course, being a positive integer. This may be proved either by induction or by the method of § 52 (vi.). The relation, when written in the form (23), is known as Vandermonde's theorem. By means of this theorem it can be shown that, whatever the value of n may be,
{1+(p/q)(1)x+(p/q)(2)x2+...+(p/q)(r)xr}q=1+p(1)x+p(2)x2+...+p(r)xr+terms in xr+1, xr+2,...xqr.
(vii.) The comparison of the numerical value of 1 + n(1)x + n(2)x^2 + . . . + n(r)x^r, when n is fractional, with that of (1+x)^n, involves advanced methods (§ 64). It is found that this expression can be used for approximating to the value of (1+x)^n, provided that |x| < 1; the results are as follows, where ur denotes n(r)x^r and Sr denotes u0 + u1 + u2 + . . . + ur.
(a) If n > −1, then, provided r > n,
(1) If 1 > x > 0, (1+x)n lies between Sr and Sr+1;
(2) If 0 > x > −1 , (1+x)n lies between Sr and Sr +ur+1/(1+x).
(b) If n < −1, the successive terms will either constantly decrease (numerically) from the beginning or else increase up to a greatest term (or two equal consecutive greatest terms) and then constantly decrease. If Sr is taken so as to include the greatest term (or terms), then,
(1) If 1 > x > 0, (1+x)n lies between Sr and Sr+1;
(2) If 0 > x > −1, (1+x)n lies between Sr and Sr + ur+1/(1−ur+1/ur).
The results in (b) apply also if n is a negative integer.
(viii.) In applying the theorem to concrete cases, conversion of a number into a continued fraction is often useful. Suppose, for instance, that we require to calculate (23/13)^(3/2). We want to express (23/13)^3 in the form a^2b, where b is nearly equal to 1. We find that (3/2) log10 (23/13) = .3716767 = log10(2.3533) = log10(40/17) nearly; and thence that (23/13)^(3/2) = (40/17)(1 + 1063/3515200)^(1/2), which can be calculated without difficulty to a large number of significant figures.
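The identity underlying this example, (23/13)^3 = (40/17)^2(1 + 1063/3515200), is exact, and the final equation can be checked directly; a minimal sketch in Python:

```python
from fractions import Fraction

# Exact step: (23/13)^3 = (40/17)^2 * (1 + 1063/3515200)
assert Fraction(23, 13) ** 3 == Fraction(40, 17) ** 2 * (1 + Fraction(1063, 3515200))

# Hence (23/13)^(3/2) = (40/17) * (1 + 1063/3515200)^(1/2), a binomial with small x.
lhs = (23 / 13) ** 1.5
rhs = (40 / 17) * (1 + 1063 / 3515200) ** 0.5
print(lhs, rhs)                 # agree to the limits of floating point
```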
(ix.) The extension of n(r), and therefore of n[r], to negative and fractional values of n, enables us to extend the applicability of the binomial coefficients to the summation of series (§ 46 (ii.)). Thus the nth term of the series 2⋅5 + 5⋅8 + 8⋅11 + . . . in § 46 (iii.) is 18(n − ⅓)[2]; formula (20) of § 43 (iv.) holds for the extended coefficients, and therefore the sum of n terms of this series is 18⋅(n − ⅓)[3] − 18⋅(−⅓)[3] = 3n^3 + 6n^2 + n. In this way we get the general rule that, to find the sum of n terms of a series, the rth term of which is (a + rb)(a + r+1⋅b) . . . (a + r+p−1⋅b), we divide the product of the p+1 factors which occur either in the nth or in the (n+1)th term by p+1, and by the common difference of the factors, and add a constant, whose value is found by putting n = 0.
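The general rule may be tried on the same series; the sketch below (an illustration, with a = −1, b = 3, p = 2 for the series 2⋅5 + 5⋅8 + 8⋅11 + . . .) forms the product of the p+1 factors occurring in the nth or the (n+1)th term, divides by p+1 and by the common difference, and fixes the constant by putting n = 0.

```python
from fractions import Fraction

def term(a, b, p, r):
    # (a + rb)(a + (r+1)b) ... (a + (r+p-1)b): p factors in arithmetical progression.
    prod = 1
    for k in range(p):
        prod *= a + (r + k) * b
    return prod

def sum_by_rule(a, b, p, n):
    # Product of the p+1 factors occurring in the nth or the (n+1)th term,
    # divided by p+1 and by the common difference b; constant fixed by n = 0.
    big = lambda r: term(a, b, p + 1, r)
    return Fraction(big(n) - big(0), (p + 1) * b)

a, b, p, n = -1, 3, 2, 6          # series 2.5 + 5.8 + 8.11 + ...
assert sum_by_rule(a, b, p, n) == sum(term(a, b, p, r) for r in range(1, n + 1))
print(sum_by_rule(a, b, p, n), 3*n**3 + 6*n**2 + n)    # both 870
```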
57. Generating Functions.—The series 1 − m[1]x + m[2]x^2 − . . . obtained by dividing 1 + 0⋅x + 0⋅x^2 + . . . by (1+x)^m, or the series 1 + (p/q)(1)x + (p/q)(2)x^2 + . . . obtained by taking the qth root of 1 + p(1)x + p(2)x^2 + . . . , is an infinite series, i.e. a series whose successive terms correspond to the numbers 1, 2, 3, . . . It is often convenient, as in § 56 (ii.) and (vi.), to consider the mode of development of such a series, without regard to arithmetical calculation; i.e. to consider the relations between the coefficients of powers of x, rather than the values of the terms themselves. From this point of view, the function which, by algebraical operations on 1 + 0⋅x + 0⋅x^2 + . . . , produces the series, is called its generating function. The generating functions of the two series, mentioned above, for example, are (1+x)^(−m) and (1+x)^(p/q). In the same way, the generating function of the series 1 + 2x + x^2 + 0⋅x^3 + 0⋅x^4 + . . . is (1+x)^2.
Considered in this way, the relations between the coefficients of the powers of x in a series may sometimes be expressed by a formal equality involving the series as a whole. Thus (4) of § 41 (ii.) may be written in the form
1 + n(1)x+n(2)x2+...+n(r)xr+...=f (1+x){1+(n−1)(1)x+(n−1)(2)x2+...+(n−1)(r)xr+...};
the symbol “ =f ” being used to indicate that the equality is only formal, not arithmetical.
This accounts for the fact that the same table of binomial coefficients serves for the expansions of positive powers of 1 + x and of negative powers of 1 − x. For (4) may (§ 43 (iv.)) be written
(n − 1)[r]=n[r]−n[r−1],
and this leads to relations of the form
1+2x+3x2+... =f (1−x)(1+3x+6x2+10x3+...) | (30), |
each set of coefficients being the numbers in a downward diagonal of the table. In the same way (21) of § 43 (iv.) leads to such relations as
1+3x+6x2+... =f (1+x+x2+...)(1+2x+3x2+...) | (31), |
the relation of which to (30) is obvious.
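Formal equalities such as (30) and (31) can be checked, so far as any finite number of coefficients is concerned, by multiplying truncated coefficient lists; a brief Python sketch (illustrative only):

```python
def multiply(a, b, terms):
    # Coefficient lists in ascending powers of x, truncated to `terms` coefficients.
    out = [0] * terms
    for i, ai in enumerate(a[:terms]):
        for j, bj in enumerate(b[:terms - i]):
            out[i + j] += ai * bj
    return out

N = 8
nat = [r + 1 for r in range(N)]                     # 1 + 2x + 3x^2 + ...
tri = [(r + 1) * (r + 2) // 2 for r in range(N)]    # 1 + 3x + 6x^2 + 10x^3 + ...
ones = [1] * N                                      # 1 + x + x^2 + ...
assert nat == multiply([1, -1], tri, N)             # relation (30)
assert tri == multiply(ones, nat, N)                # relation (31)
print("relations (30) and (31) hold for the first", N, "coefficients")
```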
An application of the method is to the summation of a recurring series, i.e. a series c0+c1x+c2x2+...whose coefficients are connected by a relation of the form p0cr+p1cr−1+...+pkcr−k=0, where p0, p1, . . . pk are independent of x and of r.
58. Approach to a Limit.—There are two kinds of approach to a limit, which may be illustrated by the series forming the expansion of (x+h)n, where n is a negative integer and 1 > h/x > 0.
(i) Denote n(r)xn−rhr by ur, and u0+u1+ . . . +ur by Sr. Then (§ 56 (iii.)) (x+h)n lies between Sr and Sr+1; and provided Sr includes the numerically greatest term, |Sr+1−Sr| constantly decreases as r increases, and can be made as small as we please by taking r large enough. Thus by taking r=0, 1, 2, . . . we have a sequence S0, S1, S2, . . . (i.e. a succession of numbers corresponding to the numbers 1, 2, 3, . . .) which possesses the property that, by starting far enough in the sequence, the range of variation of all subsequent terms can be made as small as we please, but (x+h)n always lies between the two values determining the range. This is expressed by saying that the sequence converges to (x+h)n as its limit; it may be stated concisely in any of the three ways,
(x+h)n=lim(xn+n(1)xn−1h+...+n(r)xn−rhr+...), (x+h)n=lim Sr, Sr ≐ (x+h)n.
It will be noticed that, although the differences between successive terms of the sequence will ultimately become indefinitely small, there will always be intermediate numbers that do not occur in the sequence. The approach to the limit will therefore be by a series of jumps, each of which, however small, will be finite; i.e. the approach will be discontinuous.
(ii) Instead of examining what happens as r increases, let us examine what happens as h/x decreases, r remaining unaltered. Denote h/x by θ, where 1 > θ > 0; and suppose further that θ < |1/n|, so that the first term of the series u0+u1+u2+ . . . is the greatest (numerically). Then {(x+h)n−Sr}/h r+1 lies between n(r+1)xn−r−1 and n(r+1)xn−r−1(1+θ)n; and the difference between these can be made as small as we please by taking h small enough. Thus we can say that the limit of {(x+h)n−Sr}/hr+1 is n(r+1)xn−r−1; but the approach to this limit is of a different kind from that considered in (i.), and its investigation involves the idea of continuity.
V. Continuity.
59. The idea of continuity must in the first instance be introduced from the graphical point of view; arithmetical continuity being impossible without a considerable extension of the idea of number (§ 65). The idea is utilized in the elementary consideration of a differential coefficient; and its importation into the treatment of certain functions as continuous is therefore properly associated with the infinitesimal calculus.
60. The first step consists in the functional treatment of equations. Thus, to solve the equation ax2+bx+c = 0, we consider, not merely the value of x for which ax2+bx+c is 0, but the value of ax2+bx+c for every possible value of x. By graphical treatment we are able, not merely to see why the equation has usually two roots, and also to understand why there is in certain cases only one root (i.e. two equal roots) and in other cases no root, but also to see why there cannot be more than two roots.
Simultaneous equations in two unknowns x and y may be treated in the same way, except that each equation gives a functional relation between x and y. (“Indeterminate equations” belong properly to the theory of numbers.)
61. From treating an expression involving x as a. function of x which may change continuously when x changes continuously, we are led to regard two functions x and y as changing together, so that (subject to certain qualifications) to any succession of values of x or of y there corresponds a succession of values of y or of x; and thence, if (x, y) and (x+h, y+k) are pairs of corresponding values, we are led to consider the limit (§ 58 (ii.)) of the ratio k/h when h and k are made indefinitely small. Thus we arrive at the differential coefficient of ƒ(x) as the limit of the ratio of ƒ(x+θ)−ƒ(x) to θ when θ is made indefinitely small; and this gives an interpretation of nxn−1 as the derived function of xn (§ 45).
This conception of a limit enables us to deal with algebraical expressions which assume such forms as 0/0 for particular values of the variable (§ 39 (iii.)). We cannot, for instance, say that the fraction (x^2−1)/(x−1) is arithmetically equal to x+1 when x = 1, as well as for other values of x; but we can say that the limit of the ratio of x^2−1 to x−1 when x becomes indefinitely nearly equal to 1 is the same as the limit of x+1.
On the other hand, if ƒ(y) has a definite and finite value for y = x, it must not be supposed that this is necessarily the same as the limit which ƒ(y) approaches when y approaches the value x, though this is the case with the functions with which we are usually concerned.
62. The elementary idea of a differential coefficient is useful in reference to the logarithmic and exponential series. We know that log10N(1+θ) = log10N+log10(1+θ), and inspection of a table of logarithms shows that, when θ is small, log10(1+θ) is approximately equal to λθ, where λ is a certain constant, whose value is .434... If we took logarithms to base a, we should have
loga(1+θ) = μθ
approximately, where μ is a constant depending on a. If therefore we choose a quantity e such that
loge(1+θ) = θ
approximately, when θ is small, which gives (by more accurate calculation)
e = 2.71828 . . . ,
the deduction of the expansions
e^x = 1 + x + x^2/2! + x^3/3! + . . . ,   loge(1+x) = x − ½x^2 + ⅓x^3 − . . .
is then more simply obtained by the differential calculus than by ordinary algebraic methods.
63. The theory of inequalities is closely connected with that of maxima and minima, and therefore seems to come properly under this head. The more simple properties, however, only require the use of elementary methods. Thus to show that the arithmetic mean of n positive numbers is greater than their geometric mean (i.e. than the nth root of their product) we show that if any two are unequal their product may be increased, without altering their sum, by making them equal, and that if all the numbers are equal their arithmetic mean is equal to their geometric mean.
VI. Special Developments.
64. One case of convergence of a sequence has already been considered in § 58 (i.). The successive terms of the sequence in that case were formed by successive additions of terms of a series; the series is then also said to converge to the limit which is the limit of the sequence.
Another example of a sequence is afforded by the successive convergents to a continued fraction of the form a0 + 1/(a1 + 1/(a2 + . . . )), where a0, a1, a2, . . . are integers. Denoting these convergents by P0/Q0, P1/Q1, P2/Q2, . . . they may be regarded as obtained from a series P0/Q0 + (P1/Q1 − P0/Q0) + (P2/Q2 − P1/Q1) + . . . ; the successive terms of this series, after the first, are alternately positive and negative, and consist of fractions with numerators 1 and denominators continually increasing.
Another kind of sequence is that which is formed by introducing the successive factors of a continued product; e.g. the successive factors on the right-hand side of Wallis’s theorem
½π = (2/1).(2/3).(4/3).(4/5).(6/5).(6/7) . . .
A continued product of this kind can, by taking logarithms, be replaced by an infinite series.
In the particular case considered in § 58 (i.) we were able to examine the approach of the sequence S0, S1, S2, . . . to its limit X by direct examination of the value of X−Sr. In most cases this is not possible; and we have first to consider the convergence of the sequence or of the series which it represents, and then to determine its limit by indirect methods. This constitutes the general theory of convergence of series (see Series).
The word “sequence,” as defined in § 58 (i.), includes progressions such as the arithmetical and geometrical progressions, and, generally, the succession of terms of a series. It is usual, however, to confine it to those sequences (e.g. the sequence formed by taking successive sums of a series) which have to be considered in respect of their convergence or non-convergence.
In order that numerical results obtained by summing the first few terms of a series may be of any value, it is usually necessary that the series should converge to a limit; but there are exceptions to this rule. For instance, when n is large, n! is approximately equal to √(2πn)⋅(n/e)n; the approximation may be improved by Stirling's theorem
loge2 + loge3 + . . . + loge(n−1) + ½logen = ½loge(2π) + n logen − n + B1/(1.2.n) − B2/(3.4.n^3) + . . . + (−)^(r−1)Br/{(2r−1).2r.n^(2r−1)} + . . . ,
where B1, B2, . . . are Bernoulli's numbers (§ 46 (v.)), although the series is not convergent.
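The use of a non-convergent series for numerical approximation may be illustrated by taking a few terms of Stirling's series; in the sketch below the choices n = 10 and three Bernoulli terms are made only for illustration.

```python
from math import log, pi

n = 10
lhs = sum(log(k) for k in range(2, n)) + 0.5 * log(n)   # log 2 + ... + log(n-1) + (1/2) log n
B = [1/6, 1/30, 1/42]                                   # B1, B2, B3 of section 46 (v.)
rhs = 0.5 * log(2 * pi) + n * log(n) - n
for r, Br in enumerate(B, start=1):
    rhs += (-1) ** (r - 1) * Br / ((2*r - 1) * (2*r) * n ** (2*r - 1))
print(lhs, rhs)      # the two values agree closely
```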
65. Consideration of the binomial theorem for fractional index, or of the continued fraction representing a surd, or of theorems such as Wallis's theorem (§ 64), shows that a sequence, every term of which is rational, may have as its limit an irrational number, i.e. a number which cannot be expressed as the ratio of two integers.
These are isolated cases of irrational numbers. Other cases arise when we consider the continuity of a function. Suppose, for instance, that y = x2; then to every rational value of x there corresponds a rational value of y, but the converse does not hold. Thus there appear to be discontinuities in the values of y.
The difficulty is due to the fact that number is naturally not continuous, so that continuity can only be achieved by an artificial development. The development is based on the necessity of being able to represent geometrical magnitude by arithmetical magnitude; and it may be regarded as consisting of three stages. Taking any number n to be represented by a point on a line at distance nL from a fixed point O, where L is a unit of length, we start with a series of points representing the integers 1, 2, 3, . . . This series is of course discontinuous. The next step is to suppose that fractional numbers are represented in the same way. This extension produces a change of character in the series of numbers. In the original integral series each number had a definite number next to it, on each side, except 1, which began the series. But in the new series there is no first number, and no number can be said to be next to any other number, since, whatever two numbers we take, others can be inserted between them. On the other hand, this new series is not continuous; for we know that there are some points on the line which represent surds and other irrational numbers, and these numbers are not contained in our series. We therefore take a third step, and obtain theoretical continuity by considering that every point on the line, if it does not represent a rational number, represents something which may be called an irrational number.
This insertion of irrational numbers (with corresponding negative numbers) requires for its exact treatment certain special methods, which form part of the algebraic theory of number, and are dealt with under Number.
66. The development of the theory of equations leads to the amplification of real numbers, rational and irrational, positive and negative, by imaginary and complex numbers. The quadratic equation x^2 + b^2 = 0, for instance, has no real root; but we may treat the roots as being +b√−1 and −b√−1, if √−1 is treated as something which obeys the laws of arithmetic and emerges into reality under the condition √−1.√−1 = −1. Expressions of the form b√−1 and a+b√−1, where a and b are real numbers, are then described as imaginary and complex numbers respectively; the former being a particular case of the latter.
Complex numbers are conveniently treated in connexion not only with the theory of equations but also with analytical trigonometry, which suggests the graphic representation of a+b√−1 by a line of length (a^2+b^2)^(1/2) drawn in a direction different from that of the line along which real numbers are represented.
References.—W. K. Clifford, The Common Sense of the Exact Sciences (1885), Chapters i. and iii., forms a good introduction to algebra. As to the teaching of algebra, see references under Arithmetic to works on the teaching of elementary mathematics. Among school-books may be mentioned those of W. M. Baker and A. A. Bourne, W. G. Borchardt, W. D. Eggar, F. Gorse, H. S. Hall and S. R. Knight, A. E. F. Layng, R. B. Morgan. G. Chrystal, Introduction to Algebra (1898); H. B. Fine, A College Algebra (1905); C. Smith, A Treatise on Algebra (1st ed. 1888, 3rd ed. 1892), are more suitable for revision purposes; the second of these deals rather fully with irrational numbers. For the algebraic theory of number, and the convergence of sequences and of series, see T. J. I’A. Bromwich, Introduction to the Theory of Infinite Series (1908); H. S. Carslaw, Introduction to the Theory of Fourier’s Series (1906); H. B. Fine, The Number-System of Algebra (1891); H. P. Manning, Irrational Numbers (1906); J. Pierpont, Lectures on the Theory of Functions of Real Variables (1905). For general reference, G. Chrystal, Text-Book of Algebra (pt. i. 5th ed. 1904. pt. ii. 2nd ed. 1900) is indispensable; unfortunately, like many of the works here mentioned, it lacks a proper index. Reference may also be made to the special articles mentioned at the commencement of the present article, as well as to the articles on Differences, Calculus of; Infinitesimal Calculus; Interpolation; Vector Analysis. The following may also be consulted:—E. Borel and J. Drach, Introduction à l’étude de la théorie des nombres et de l’algèbre supérieure (1895); C. de Comberousse, Cours de mathématiques, vols. i. and iii. (1884–1887); H. Laurent, Traité d’analyse, vol. i. (1885); E. Netto, Vorlesungen über Algebra (vol. i. 1896, vol. ii. 1900); S. Pincherle, Algebra complementare (1893); G. Salmon, Lessons introductory to the Modern Higher Algebra (4th ed., 1885); J. A. Serret, Cours d’algèbre supérieure (4th ed., 2 vols., 1877); O. Stolz and J. A. Gmeiner, Theoretische Arithmetik (pt. i. 1900, pt. ii. 1902) and Einleitung in die Funktionen-theorie (pt. i. 1904, pt. ii. 1905)—these being developments from O. Stolz, Vorlesungen über allgemeine Arithmetic (pt. i. 1885, pt. ii. 1886); J. Tannery, Introduction à la théorie des fonctions d’une variable (1st ed. 1886, 2nd ed. 1904); H. Weber, Lehrbuch der Algebra, 2 vols. (1st ed. 1895–1896, 2nd ed. 1898–1899; vol. i. of 2nd ed. transl. by Griess as Traité d’algèbre supérieure, 1898). For a fuller bibliography, see Encyclopädie der math. Wissenschaften (vol. i., 1898). A list of early works on algebra is given in Encyclopedia Britannica, 9th ed., vol. i. p. 518. (W. F. Sh.)
B. Special Kinds of Algebra
1. A special algebra is one which differs from ordinary algebra in the laws of equivalence which its symbols obey. Theoretically, no limit can be assigned to the number of possible algebras; the varieties actually known use, for the most part, the same signs of operation, and differ among themselves principally by their rules of multiplication.
2. Ordinary algebra developed very gradually as a kind of shorthand, devised to abbreviate the discussion of arithmetical problems and the statement of arithmetical facts. Although the distinction is one which cannot be ultimately maintained, it is convenient to classify the signs of algebra into symbols of quantity (usually figures or letters), symbols of operation, such as +, √, and symbols of distinction, such as brackets. Even when the formal evolution of the science was fairly complete, it was taken for granted that its symbols of quantity invariably stood for numbers, and that its symbols of operation were restricted to their ordinary arithmetical meanings. It could not escape notice that one and the same symbol, such as √(a−b), or even (a−b), sometimes did and sometimes did not admit of arithmetical interpretation, according to the values attributed to the letters involved. This led to a prolonged controversy on the nature of negative and imaginary quantities, which was ultimately settled in a very curious way. The progress of analytical geometry led to a geometrical interpretation both of negative and also of imaginary quantities; and when a “meaning” or, more properly, an interpretation, had thus been found for the symbols in question, a reconsideration of the old algebraic problem became inevitable, and the true solution, now so obvious, was eventually obtained. It was at last realized that the laws of algebra do not depend for their validity upon any particular interpretation, whether arithmetical, geometrical or other; the only question is whether these laws do or do not involve any logical contradiction. When this fundamental truth had been fully grasped, mathematicians began to inquire whether algebras might not be discovered which obeyed laws different from those obtained by the generalization of arithmetic. The answer to this question has been so manifold as to be almost embarrassing. All that can be done here is to give a sketch of the more important and independent special algebras at present known to exist.
3. Although the results of ordinary algebra will be taken for granted, it is convenient to give the principal rules upon which it is based. They are
(a+b)+c = a+(b+c)   (a)          (a×b)×c = a×(b×c)   (a′)
a+b = b+a   (c)                  a×b = b×a   (c′)
a(b+c) = ab+ac   (d)
(a−b)+b = a   (i)                (a÷b)×b = a   (i′)
These formulae express the associative and commutative laws of the operations + and ×, the distributive law of ×, and the definitions of the inverse symbols − and ÷, which are assumed to be unambiguous. The special symbols 0 and 1 are used to denote a−a and a÷a. They behave exactly like the corresponding symbols in arithmetic; and it follows from this that whatever “meaning” is attached to the symbols of quantity, ordinary algebra includes arithmetic, or at least an image of it. Every ordinary algebraic quantity may be regarded as of the form α+β√−1, where α, β are “real”; that is to say, every algebraic equivalence remains valid when its symbols of quantity are interpreted as complex numbers of the type α+β√−1 (cf. Number). But the symbols of ordinary algebra do not necessarily denote numbers; they may, for instance, be interpreted as coplanar points or vectors. Evolution and involution are usually regarded as operations of ordinary algebra; this leads to a notation for powers and roots, and a theory of irrational algebraic quantities analogous to that of irrational numbers.
4. Non-numerical algebra.—The only known type of algebra which does not contain arithmetical elements is substantially due to George Boole. Although originally suggested by formal logic, it is most simply interpreted as an algebra of regions in space. Let i denote a definite region of space; and let a, b, &c., stand for definite parts of i. Let a+b denote the region made up of a and b together (the common part, if any, being reckoned only once), and let a × b or ab mean the region common to a and b. Then a+a = aa = a; hence numerical coefficients and indices are not required. The inverse symbols −, ÷ are ambiguous, and in fact are rarely used. Each symbol a is associated with its supplement ā which satisfies the equivalences a+ā = i, aā = 0, the latter of which means that a and ā have no region in common. Finally, there is a law of absorption expressed by a+ab = a. From every proposition in this algebra a reciprocal one may be deduced by interchanging + and ×, and also the symbols 0 and i. For instance, x+y = x+x̄y and xy = x(x̄+y) are reciprocal. The operations + and × obey all the ordinary laws a, c, d (§ 3).
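The laws quoted can be imitated with finite sets standing for regions; in the Python sketch below (the universe i and the parts a, b are arbitrary choices) + is read as union, × as intersection, and the supplement is taken with respect to i.

```python
i = frozenset(range(10))                 # the whole region
a = frozenset({1, 2, 3, 4})
b = frozenset({3, 4, 5, 6})
supp = lambda s: i - s                   # the supplement of a region

assert a | a == a and a & a == a                         # a + a = aa = a
assert a | supp(a) == i and a & supp(a) == frozenset()   # a + supp(a) = i, a.supp(a) = 0
assert a | (a & b) == a                                  # law of absorption: a + ab = a
assert a | b == a | (supp(a) & b)                        # x + y = x + (supp x)y
assert a & b == a & (supp(a) | b)                        # the reciprocal: xy = x((supp x) + y)
print("the laws hold for these sample regions")
```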
5. Möbius’s barycentric calculus.—A point A in space may be associated with a (real, positive, or negative) numerical quantity α, called its weight, and denoted by the symbol αA. The sum of two weighted points αA, βB is, by definition, the point (α+β)G, where G divides AB so that AG : GB = β : α. It can be proved by geometry that
(αA + βB) + γC = αA + (βB + γC) = (α+β+γ)P,
where P is in fact the centroid of masses α, β, γ placed at A, B, C respectively. So, in general, if we put
αA+βB+γC+...+λL = (α+β+γ+...+λ)X.
X is, in general, a determinate point, the barycentre of αA, βB, &c. (or of A, B, &c. for the weights α, β, &c.). If (α+β+...+λ) happens to be zero, X lies at infinity in a determinate direction; unless −αA is the barycentre of βB, γC,...λL, in which case αA+βB+...+λL vanishes identically, and X is indeterminate. If ABCD is a tetrahedron of reference, any point P in space is determined by an equation of the form
(α+β+γ+δ)P = αA + βB + γC + δD:
α, β, γ, δ are, in fact, equivalent to a set of homogeneous coordinates of P. For constructions in a fixed plane three points of reference are sufficient. It is remarkable that Möbius employs the symbols AB, ABC, ABCD in their ordinary geometrical sense as lengths, areas and volumes, except that he distinguishes their sign; thus AB = −BA, ABC = −ACB, and so on. If he had happened to think of them as “products,” he might have anticipated Grassmann's discovery of the extensive calculus. From a merely formal point of view, we have in the barycentric calculus a set of “special symbols of quantity” or “extraordinaries” A, B, C, &c., which combine with each other by means of operations + and − which obey the ordinary rules, and with ordinary algebraic quantities by operations × and ÷, also according to the ordinary rules, except that division by an extraordinary is not used.
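The rule of combination for weighted points is easily written out in coordinates; the following sketch (plane points and weights chosen arbitrarily) adds weighted points two at a time and shows that the result is the ordinary centroid, however the additions are grouped.

```python
def add(wp1, wp2):
    # Sum of weighted points (alpha, A) and (beta, B) is (alpha+beta, G), where G
    # divides AB so that AG : GB = beta : alpha, i.e. G = (alpha*A + beta*B)/(alpha+beta).
    (al, A), (be, B) = wp1, wp2
    w = al + be
    G = tuple((al * ax + be * bx) / w for ax, bx in zip(A, B))
    return (w, G)

pts = [(2.0, (0.0, 0.0)), (3.0, (4.0, 0.0)), (5.0, (0.0, 4.0))]
w1, X1 = add(add(pts[0], pts[1]), pts[2])
w2, X2 = add(pts[0], add(pts[1], pts[2]))
print(w1, X1)          # total weight 10, barycentre (1.2, 2.0)
print(w2, X2)          # the same barycentre, however the additions are grouped
```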
6. Hamilton’s quaternions.—A quaternion is best defined as a symbol of the type
q = Σαses = α0e0 + α1e1 + α2e2 + α3e3,
where e0, . . . e3 are independent extraordinaries and α0, . . . α3 ordinary algebraic quantities, which may be called the co-ordinates of q. The sum and product of two quaternions are defined by the formulae
Σαses + Σβses = Σ(αs + βs)es,
Σαrer × Σβses = ΣΣαrβs.eres,
where the products eres are further reduced according to the following multiplication table, in which, for example, the
     |  e0 |  e1 |  e2 |  e3
  e0 |  e0 |  e1 |  e2 |  e3
  e1 |  e1 | −e0 |  e3 | −e2
  e2 |  e2 | −e3 | −e0 |  e1
  e3 |  e3 |  e2 | −e1 | −e0
second line is to be read e1e0 = e1, e1² = −e0, e1e2 = e3, e1e3 = −e2. The effect of these definitions is that the sum and the product of two quaternions are also quaternions; that addition is associative and commutative; and that multiplication is associative and distributive, but not commutative. Thus e1e2 = −e2e1, and if q, q′ are any two quaternions, qq′ is generally different from q′q. The symbol e0 behaves exactly like 1 in ordinary algebra; Hamilton writes 1, i, j, k instead of e0, e1, e2, e3, and in this notation all the special rules of operation may be summed up by the equalities
i² = j² = k² = ijk = −1.
Putting q = α+βi+γj+δk, Hamilton calls α the scalar part of q, and denotes it by Sq; he also writes Vq for βi+γj+δk, which is called the vector part of q. Thus every quaternion may be written in the form q=Sq+Vq, where either Sq or Vq may separately vanish; so that ordinary algebraic quantities (or scalars, as we shall call them) and pure vectors may each be regarded as special cases of quaternions.
The equations q′+x = q and y+q′ = q are satisfied by the same quaternion, which is denoted by q−q′. On the other hand, the equations q′x = q and yq′ = q have, in general, different solutions. It is the value of y which is generally denoted by q÷q′; a special symbol for x is desirable, but has not been established. If we put q′0 = Sq′−Vq′, then q′0 is called the conjugate of q′, and the scalar q′q′0 = q′0q′ is called the norm of q′ and written Nq′. With this notation the values of x and y may be expressed in the forms
x = q′0q/Nq′, y = qq′0/Nq′,
which are free from ambiguity, since scalars are commutative with quaternions. The values of x and y are different, unless V(qq′0)=0.
In the applications of the calculus the co-ordinates of a quaternion are usually assumed to be numerical; when they are complex, the quaternion is further distinguished by Hamilton as a biquaternion. Clifford’s biquaternions are quantities ξq+ηr, where q, r are quaternions, and ξ, η are symbols (commutative with quaternions) obeying the laws ξη=ηξ=0 (cf. Quaternions).
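The definitions of § 6 can be illustrated by a short computation. The following sketch (Python; the function names and the sample coordinates are illustrative assumptions, not Hamilton's notation) multiplies quaternions according to the table given above, exhibits the relation e1e2 = −e2e1, and forms the two quotients x = q′0q/Nq′ and y = qq′0/Nq′ by means of the conjugate and the norm.

```python
# A rough sketch, not part of the article: a quaternion is kept as the 4-tuple of its
# co-ordinates (α0, α1, α2, α3), and products are formed by the multiplication table.
def qmul(p, q):
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(q):                                   # q0 = Sq − Vq, the conjugate
    return (q[0], -q[1], -q[2], -q[3])

def norm(q):                                   # Nq = qq0, a scalar
    return sum(c * c for c in q)

def scale(s, q):
    return tuple(s * c for c in q)

e1, e2 = (0, 1, 0, 0), (0, 0, 1, 0)
print(qmul(e1, e2), qmul(e2, e1))              # (0, 0, 0, 1) and (0, 0, 0, -1): e1e2 = −e2e1

q, r = (1, 2, 3, 4), (2, -1, 1, 3)
x = scale(1 / norm(r), qmul(conj(r), q))       # solves rx = q
y = scale(1 / norm(r), qmul(q, conj(r)))       # solves yr = q
print(qmul(r, x))                              # recovers q (up to rounding error)
print(qmul(y, r))                              # recovers q; note that x and y differ
```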
7. In the extensive calculus of the nth category, we have, first of all, n independent “units,” e1, e2, . . . en. From these are derived symbols of the type
A1 = α1e1+α2e2+...+αnen=∑αe,
which we shall call extensive quantities of the first species (and, when necessary, of the nth category). The coordinates α1, . . . αn are scalars, and in particular applications may be restricted to real or complex numerical values.
If B1=∑βe, there is a law of addition expressed by
A1 + B1 = ∑(αi + βi)ei = B1 + A1.
This law of addition is associative as well as commutative. The inverse operation is free from ambiguity, and, in fact,
A1 − B1 = ∑(αi − βi)ei.
To multiply A1 by a scalar, we apply the rule
ξA1 = A1ξ = ∑(ξαi)ei,
and similarly for division by a scalar.
All this is analogous to the corresponding formulae in the barycentric calculus and in quaternions; it remains to consider the multiplication of two or more extensive quantities. The binary products of the units ei are taken to satisfy the equalities
ei² = 0, eiej = −ejei;
this reduces them to ½n(n−1) distinct values, exclusive of zero. These values are assumed to be independent, so we have ½n(n−1) derived units of the second species or order. Associated with these new units there is a system of extensive quantities of the second species, represented by symbols of the type
A2 = ∑αiEi(2) [i = 1, 2, ... ½n(n−1)],
where E1(2), E2(2), &c., are the derived units of the second species. If A1 = ∑αiei, B1 = ∑βiei, the distributive law of multiplication is preserved by assuming
A1B1 = ∑(αiβj)eiej ;
it follows that A1B1 = −B1A1, and that A1² = 0.
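The product of two quantities of the first species may be computed directly from these rules. The following sketch (Python, for the fourth category; the function name and the sample coefficients are illustrative) returns A1B1 as a set of coefficients of the derived units eiej with i < j, and verifies that A1B1 = −B1A1 and that A1² = 0.

```python
# Illustrative sketch: the outer product of two first-species quantities in the
# fourth category (n = 4), using ei² = 0 and eiej = −ejei.
def outer(A, B):
    prod = {}
    n = len(A)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue                                        # ei·ei contributes nothing
            key, sign = ((i, j), 1) if i < j else ((j, i), -1)  # eiej = −ejei
            prod[key] = prod.get(key, 0) + sign * A[i] * B[j]
    return prod

A1 = [1, 2, 0, -1]
B1 = [3, 0, 1, 2]
AB, BA = outer(A1, B1), outer(B1, A1)
print(AB)                                                  # coefficients αiβj − αjβi of eiej
print(all(BA[k] == -AB[k] for k in AB))                    # True: A1B1 = −B1A1
print(all(v == 0 for v in outer(A1, A1).values()))         # True: A1² = 0
```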
By assuming the truth of the associative law of multiplication, and taking account of the reducing formulae for binary products, we may construct derived units of the third, fourth . . . nth species. Every unit of the r th species which does not vanish is the product of r different units of the first species; two such units are independent unless they are permutations of the same set of primary units ei, in which case they are equal or opposite according to the usual rule employed in determinants. Thus, for instance—
e1.e2e3=e1e2.e3=e1e2e3=−e2e1e3=e2e3e1;
and, in general, the number of distinct units of the rth species in the nth category (r ≤ n) is Cn,r. Finally, it is assumed that (in the nth category) e1e2e3 . . . en = 1, the suffixes being in their natural order.
Let Ar=ΣαE(r) and Bs=ΣβE(s) be two extensive quantities of species r and s; then if r+s ≤ n, they may be multiplied by the rule
ArBs=Σ(αβ)E(r)E(s)
where the products E(r)E(s) may be expressed as derived units of species (r+s). The product BsAr is equal or opposite to ArBs, according as rs is even or odd. This process may be extended to the product of three or more factors such as ArBsCt . . . provided that r+s+t+ . . . does not exceed n. The law is associative; thus, for instance, (AB)C=A(BC). But the commutative law does not always hold; thus, indicating species, as before, by suffixes, ArBsCt=(−1)^(rs+st+tr)CtBsAr, with analogous rules for other cases.
If r+s > n, a product such as ErEs, worked out by the previous rules, comes out to be zero. A characteristic feature of the calculus is that a meaning can be attached to a symbol of this kind by adopting a new rule, called that of regressive multiplication, as distinguished from the foregoing, which is progressive. The new rule requires some preliminary explanation. If E is any extensive unit, there is one other unit E′, and only one, such that the (progressive) product EE′=1. This unit is called the supplement of E, and denoted by |E. For example, when n=4,
|e1 = e2e3e4, |e1e2 = e3e4, |e2e3e4 = −e1,
and so on. Now when r+s > n, the product ErEs is defined to be that unit of which the supplement is the progressive product |Er|Es. For instance, if n=4, Er=e1e3, Es=e2e3e4, we have
|Er|Es = (−e2e4)(−e1) = e1e2e4 = |e3;
consequently, by the rule of regressive multiplication,
e1e3.e2e3e4=e3.
Applying the distributive law, we obtain, when r+s>n,
ArBs=ΣαE(r)ΣβE(s)=Σ(αβ)ErEs,
where the regressive products ErEs are to be reduced to units of species (r+s−n) by the foregoing rule.
If A=ΣαE, then, by definition, |A=Σα|E, and hence
A|(B+C)=A|B+A|C.
Now this is formally analogous to the distributive law of multiplication; and in fact we may look upon A|B as a particular way of multiplying A and B (not A and |B). The symbol A|B, from this point of view, is called the inner product of A and B, as distinguished from the outer product AB. An inner product may be either progressive or regressive. In the course of reducing such expressions as (AB)C, (AB){C(DE)} and the like, where a chain of multiplications has to be performed in a certain order, the multiplications may be all progressive, or all regressive, or partly one, partly the other. In the first two cases the product is said to be pure, in the third case mixed. A pure product is associative; a mixed product, speaking generally, is not.
The outer and inner products of two extensive quantities A, B are in many ways analogous to the quaternion symbols Vab and Sab respectively. As in quaternions, so in the extensive calculus, there are numerous formulae of transformation which enable us to deal with extensive quantities without expressing them in terms of the primary units. Only a few illustrations can be given here. Let a, b, c, d, e, f be quantities of the first species in the fourth category; A, B, C . . . quantities of the third species in the same category. Then
(de) (abc)=(abde)c + (cade)b + (bcde)a
= (abce)d − (abcd)e,
(ab)(AB)=(aA)(bB) − (aB)(bA)
ab|c=(a|c)b − (b|c)a, (ab|cd)=(a|c)(b|d) − (a|d)(b|c).
These may be compared and contrasted with such quaternion formulae as
S(VabVcd)=SadSbc − SacSbd
dSabc=aSbcd − bScda + cSdab
where a, b, c, d denote arbitrary vectors.
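Both sets of formulae admit of direct numerical verification. The following sketch (Python; the multiplication rule is that of § 6, and the random test vectors are merely illustrative) checks the two quaternion identities just quoted for arbitrarily chosen vectors a, b, c, d.

```python
# Illustrative check, not part of the article: the quaternion formulae above are
# verified numerically for random vectors, using the multiplication table of § 6.
import random

def qmul(p, q):
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def S(q): return q[0]                         # scalar part
def V(q): return (0,) + q[1:]                 # vector part
def add(p, q): return tuple(u + v for u, v in zip(p, q))
def scale(s, q): return tuple(s * u for u in q)
def vec(): return (0,) + tuple(random.uniform(-1, 1) for _ in range(3))

for _ in range(100):
    a, b, c, d = vec(), vec(), vec(), vec()
    # S(Vab Vcd) = Sad Sbc − Sac Sbd
    lhs = S(qmul(V(qmul(a, b)), V(qmul(c, d))))
    rhs = S(qmul(a, d)) * S(qmul(b, c)) - S(qmul(a, c)) * S(qmul(b, d))
    assert abs(lhs - rhs) < 1e-9
    # dSabc = aSbcd − bScda + cSdab
    lhs = scale(S(qmul(qmul(a, b), c)), d)
    rhs = add(add(scale(S(qmul(qmul(b, c), d)), a),
                  scale(-S(qmul(qmul(c, d), a)), b)),
              scale(S(qmul(qmul(d, a), b)), c))
    assert max(abs(u - v) for u, v in zip(lhs, rhs)) < 1e-9
```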
8. An n-tuple linear algebra (also called a complex number system) deals with quantities of the type A=Σαiei derived from n special units e1, e2 . . . en. The sum and product of two quantities are defined in the first instance by the formulae
Σαe+Σβe=Σ(α+β)e, Σαiei × Σβjej=Σ(αiβj)eiej,
so that the laws a, c, d of § 3 are satisfied. The binary products eiej, however, are expressible as linear functions of the units ei by means of a “multiplication table” which defines the special characteristics of the algebra in question. Multiplication may or may not be commutative, and in the same way it may or may not be associative. The types of linear associative algebras, not assumed to be commutative, have been enumerated (with some omissions) up to sextuple algebras inclusive by B. Peirce. Quaternions afford an example of a quadruple algebra of this kind; ordinary algebra is a special case of a duplex linear algebra. If, in the extensive calculus of the nth category, all the units (including 1 and the derived units E) are taken to be homologous instead of being distributed into species, we may regard it as a (2ⁿ−1)-tuple linear algebra, which, however, is not wholly associative. It should be observed that while the use of special units, or extraordinaries, in a linear algebra is convenient, especially in applications, it is not indispensable. Any linear quantity may be denoted by a symbol (a1, a2, . . . an) in which only its scalar coefficients occur; in fact, the special units only serve, in the algebra proper, as umbrae or regulators of certain operations on scalars (see Number). This idea finds fuller expression in the algebra of matrices, as to which it must suffice to say that a matrix is a symbol consisting of a rectangular array of scalars, and that matrices may be combined by a rule of addition which obeys the usual laws, and a rule of multiplication which is distributive and associative, but not, in general, commutative. Various special algebras (for example, quaternions) may be expressed in the notation of the algebra of matrices.
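The closing remark may be illustrated for quaternions. In the following sketch (Python with the numpy library; the particular 2×2 complex matrices chosen for 1, i, j, k form one conventional representation, and are an assumption of the example rather than anything given in the article) the matrices are shown to reproduce the relations i² = j² = k² = ijk = −1, the non-commutativity of multiplication appearing as that of the matrices themselves.

```python
# Illustrative sketch: quaternions expressed in the notation of the algebra of
# matrices, by one conventional choice of 2×2 complex matrices.
import numpy as np

one = np.eye(2, dtype=complex)
i = np.array([[1j, 0], [0, -1j]])
j = np.array([[0, 1], [-1, 0]], dtype=complex)
k = np.array([[0, 1j], [1j, 0]])

assert np.allclose(i @ i, -one) and np.allclose(j @ j, -one) and np.allclose(k @ k, -one)
assert np.allclose(i @ j @ k, -one)                       # i² = j² = k² = ijk = −1
assert np.allclose(i @ j, k) and np.allclose(j @ i, -k)   # multiplication is not commutative

def as_matrix(a, b, c, d):
    # the quaternion a + bi + cj + dk written as a matrix
    return a * one + b * i + c * j + d * k

q = as_matrix(1, 2, 3, 4)
print(np.allclose(q @ as_matrix(1, -2, -3, -4), 30 * one))  # qq0 = Nq = 30
```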
9. In ordinary algebra we have the disjunctive law that if ab=0, then either a=0 or b=0. This applies also to quaternions, but not to extensive quantities, nor is it true for linear algebras in general. One of the most important questions in investigating a linear algebra is to decide the necessary relations between a and b in order that this product may be zero.
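A simple instance of the failure of the disjunctive law is furnished by matrices. In the following sketch (Python with numpy; the two matrices are arbitrary illustrations) neither factor vanishes, yet the product is zero.

```python
# Illustrative sketch: zero divisors in the algebra of matrices.
import numpy as np

a = np.array([[1, 0], [0, 0]])
b = np.array([[0, 0], [0, 1]])
print(a @ b)      # [[0 0] [0 0]], although neither a nor b is the zero matrix
```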
10. The algebras discussed up to this point may be considered as independent in the sense that each of them deals with a class of symbols of quantity more or less homogeneous, and a set of operations applying to them all. But when an algebra is used with a particular interpretation, or even in the course of its formal development, it frequently happens that new symbols of operation are, so to speak, superposed upon the algebra, and are found to obey certain formal laws of combination of their own. For instance, there are the symbols Δ, D, E used in the calculus of finite differences; Aronhold’s symbolical method in the calculus of invariants; and the like. In most cases these subsidiary algebras, as they may be called, are inseparable from the applications in which they are used; but in any attempt at a natural classification of algebra (at present a hopeless task), they would have to be taken into account. Even in ordinary algebra the notation for powers and roots disturbs the symmetry of the rational theory; and when a schoolboy illegitimately extends the distributive law by writing √(a + b)=√a + √b, he is unconsciously emphasizing this want of complete harmony.
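The formal behaviour of such superposed symbols may be illustrated by the operators of the calculus of finite differences. The following sketch (Python; the unit step and the test function are assumptions made only for the example) realizes Δ and E as operators acting on functions and checks the formal relations E = 1 + Δ and Δ² = (E − 1)².

```python
# Illustrative sketch: the subsidiary symbols Δ and E of the calculus of finite
# differences, taken with a unit step, so that Ef(x) = f(x+1) and Δf(x) = f(x+1) − f(x).
def E(f):
    return lambda x: f(x + 1)

def Delta(f):
    return lambda x: f(x + 1) - f(x)

f = lambda x: x ** 3
for x in range(5):
    assert E(f)(x) == f(x) + Delta(f)(x)                          # E = 1 + Δ
    assert Delta(Delta(f))(x) == E(E(f))(x) - 2 * E(f)(x) + f(x)  # Δ² = E² − 2E + 1
```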
Authorities.—A. de Morgan, “On the Foundation of Algebra,” Trans. Camb. P.S. (vii., viii., 1839–1844); G. Peacock, Symbolical Algebra (Cambridge, 1845); G. Boole, Laws of Thought (London, 1854); E. Schröder, Lehrbuch der Arithmetik u. Algebra (Leipzig, 1873), Vorlesungen über die Algebra der Logik (ibid., 1890–1895); A. F. Möbius, Der barycentrische Calcul (Leipzig, 1827) (reprinted in his collected works, vol. i., Leipzig, 1885); W. R. Hamilton, Lectures on Quaternions (Dublin, 1853), Elements of Quaternions (ibid., 1866); H. Grassmann, Die lineale Ausdehnungslehre (Leipzig, 1844), Die Ausdehnungslehre (Berlin, 1862) (these are reprinted with valuable emendations and notes in his Gesammelte math. u. phys. Werke, vol. i., Leipzig (2 parts), 1894, 1896), and papers in Grunert’s Arch. vi., Crelle, xlix.-lxxxiv., Math. Ann. vii. xii.; B. and C. S. Peirce, “Linear Associative Algebra,” Amer. Journ. Math. iv. (privately circulated, 1871); A. Cayley, on Matrices, Phil. Trans. cxlviii., on Multiple Algebra, Quart. M. Journ. xxii.; J. J. Sylvester, on Universal Algebra (i.e. Matrices), Amer. Journ. Math. vi.; H. J. S. Smith, on Linear Indeterminate Equations, Phil. Trans. cli.; R. S. Ball, Theory of Screws (Dublin, 1876); and papers in Phil. Trans. clxiv., and Trans. R. Ir. Ac. xxv.; W. K. Clifford, on Biquaternions, Proc. L. M. S. iv.; A. Buchheim, on Extensive Calculus and its Applications, Proc. L. M. S. xv.-xvii.; H. Taber, on Matrices, Amer. J. M. xii.; K. Weierstrass, “Zur Theorie der aus n Haupteinheiten gebildeten complexen Grössen,” Götting. Nachr. (1884); G. Frobenius, on Bilinear Forms, Crelle, lxxxiv., and Berl. Ber. (1896); L. Kronecker, on Complex Numbers and Modular Systems, Berl. Ber. (1888); G. Scheffers, “Complexe Zahlensysteme,” Math. Ann. xxxix. (this contains a bibliography up to 1890); S. Lie, Vorlesungen über continuirliche Gruppen (Leipzig, 1893), ch. xxi.; A. M‘Aulay, “Algebra after Hamilton, or Multenions,” Proc. R. S. E., 1908, 28. p. 503. For a more complete account see H. Hankel Theorie der complexen Zahlensysteme (Leipzig, 1867); O. Stolz, Vorlesungen über allgemeine Arithmetik (ibid., 1883); A. N. Whitehead, A Treatise on Universal Algebra, with Applications (vol. i., Cambridge, 1898) (a very comprehensive work, to which the writer of this article is in many ways indebted); and the Encyclopädie d. math. Wissenschaften (vol. i., Leipzig, 1898), &c., §§ A 1 (H. Schubert), A 4 (E. Study), and B 1 c (G. Landsberg). For the history of the development of ordinary algebra M. Cantor’s Vorlesungen über Geschichte der Mathematik is the standard authority. (G. B. M.)
C. History
Various derivations of the word “algebra,” which is of Arabian origin, have been given by different writers. The first mention of the word is to be found in the title of a work by Mahommed ben Musa al-Khwarizmi (Hovarezmi), who flourished about the beginning of the 9th century. The full title is ilm al-jebr wa’l-muqābala, which contains the ideas of restitution and comparison, or opposition and comparison, or resolution and equation, jebr being derived from the verb jabara, to reunite, and muqābala, from gabala, to make equal. (The root jabara is also met with in the word algebrista, which means a “bone-setter,” and is still in common use in Spain.) The same derivation is given by Lucas Paciolus (Luca Pacioli), who reproduces the phrase in the transliterated form alghebra e almucabala, and ascribes the invention of the art to the Arabians.
Other writers have derived the word from the Arabic particle al (the definite article), and geber, meaning “man.” Since, however, Geber happened to be the name of a celebrated Moorish philosopher who flourished in about the 11th or 12th century, it has been supposed that he was the founder of algebra, which has since perpetuated his name. The evidence of Peter Ramus (1515–1572) on this point is interesting, but he gives no authority for his singular statements. In the preface to his Arithmeticae libri duo et totidem Algebrae (1560) he says: “The name Algebra is Syriac, signifying the art or doctrine of an excellent man. For Geber, in Syriac, is a name applied to men, and is sometimes a term of honour, as master or doctor among us. There was a certain learned mathematician who sent his algebra, written in the Syriac language, to Alexander the Great, and he named it almucabala, that is, the book of dark or mysterious things, which others would rather call the doctrine of algebra. To this day the same book is in great estimation among the learned in the oriental nations, and by the Indians, who cultivate this art, it is called aljabra and alboret; though the name of the author himself is not known.” The uncertain authority of these statements, and the plausibility of the preceding explanation, have caused philologists to accept the derivation from al and jabara. Robert Recorde in his Whetstone of Witte (1557) uses the variant algeber, while John Dee (1527–1608) affirms that algiebar, and not algebra, is the correct form, and appeals to the authority of the Arabian Avicenna.
Although the term “algebra” is now in universal use, various other appellations were used by the Italian mathematicians during the Renaissance. Thus we find Paciolus calling it l’Arte Magiore; ditta dal vulgo la Regula de la Cosa over Alghebra e Almucabala. The name l’arte magiore, the greater art, is designed to distinguish it from l’arte minore, the lesser art, a term which he applied to the modern arithmetic. His second variant, la regula de la cosa, the rule of the thing or unknown quantity, appears to have been in common use in Italy, and the word cosa was preserved for several centuries in the forms coss or algebra, cossic or algebraic, cossist or algebraist, &c. Other Italian writers termed it the Regula rei et census, the rule of the thing and the product, or the root and the square. The principle underlying this expression is probably to be found in the fact that it measured the limits of their attainments in algebra, for they were unable to solve equations of a higher degree than the quadratic or square.
Franciscus Vieta (François Viète) named it Specious Arithmetic, on account of the species of the quantities involved, which he represented symbolically by the various letters of the alphabet. Sir Isaac Newton introduced the term Universal Arithmetic, since it is concerned with the doctrine of operations, not effected on numbers, but on general symbols.
Notwithstanding these and other idiosyncratic appellations, European mathematicians have adhered to the older name, by which the subject is now universally known.
It is difficult to assign the invention of any art or science definitely to any particular age or race. The few fragmentary records, which have come down to us from past civilizations, must not be regarded as representing the totality of their knowledge, and the omission of a science or art does not necessarily imply that the science or art was unknown. It was formerly the custom to assign the invention of algebra to the Greeks, but since the decipherment of the Rhind papyrus by Eisenlohr this view has changed, for in this work there are distinct signs of an algebraic analysis. The particular problem—a heap (hau) and its seventh makes 19—is solved as we should now solve a simple equation; but Ahmes varies his methods in other similar problems. This discovery carries the invention of algebra back to about 1700 B.C., if not earlier.
It is probable that the algebra of the Egyptians was of a most rudimentary nature, for otherwise we should expect to find traces of it in the works of the Greek geometers, of whom Thales of Miletus (640–546 B.C.) was the first. Notwithstanding the prolixity of writers and the number of the writings, all attempts at extracting an algebraic analysis from their geometrical theorems and problems have been fruitless, and it is generally conceded that their analysis was geometrical and had little or no affinity to algebra. The first extant work which approaches to a treatise on algebra is by Diophantus (q.v.), an Alexandrian mathematician, who flourished about A.D. 350. The original, which consisted of a preface and thirteen books, is now lost, but we have a Latin translation of the first six books and a fragment of another on polygonal numbers by Xylander of Augsburg (1575), and Latin and Greek translations by Gaspar Bachet de Merizac (1621–1670). Other editions have been published, of which we may mention Pierre Fermat’s (1670), T. L. Heath’s (1885) and P. Tannery’s (1893–1895). In the preface to this work, which is dedicated to one Dionysius, Diophantus explains his notation, naming the square, cube and fourth powers, dynamis, cubus, dynamodynamis, and so on, according to the sum in the indices. The unknown he terms arithmos, the number, and in solutions he marks it by the final ς; he explains the generation of powers, the rules for multiplication and division of simple quantities, but he does not treat of the addition, subtraction, multiplication and division of compound quantities. He then proceeds to discuss various artifices for the simplification of equations, giving methods which are still in common use. In the body of the work he displays considerable ingenuity in reducing his problems to simple equations, which admit either of direct solution, or fall into the class known as indeterminate equations. This latter class he discussed so assiduously that they are often known as Diophantine problems, and the methods of resolving them as the Diophantine analysis (see Equation, Indeterminate). It is difficult to believe that this work of Diophantus arose spontaneously in a period of general stagnation. It is more than likely that he was indebted to earlier writers, whom he omits to mention, and whose works are now lost; nevertheless, but for this work, we should be led to assume that algebra was almost, if not entirely, unknown to the Greeks.
The Romans, who succeeded the Greeks as the chief civilized power in Europe, failed to set store on their literary and scientific treasures; mathematics was all but neglected; and beyond a few improvements in arithmetical computations, there are no material advances to be recorded.
In the chronological development of our subject we have now to turn to the Orient. Investigation of the writings of Indian mathematicians has exhibited a fundamental distinction between the Greek and Indian mind, the former being pre-eminently geometrical and speculative, the latter arithmetical and mainly practical. We find that geometry was neglected except in so far as it was of service to astronomy; trigonometry was advanced, and algebra improved far beyond the attainments of Diophantus.
The earliest Indian mathematician of whom we have certain knowledge is Aryabhatta, who flourished about the beginning of the 6th century of our era. The fame of this astronomer and mathematician rests on his work, the Aryabhattiyam, the third chapter of which is devoted to mathematics. Ganessa, an eminent astronomer, mathematician and scholiast of Bhaskara, quotes this work and makes separate mention of the cuttaca (“pulveriser”), a device for effecting the solution of indeterminate equations. Henry Thomas Colebrooke, one of the earliest modern investigators of Hindu science, presumes that the treatise of Aryabhatta extended to determinate quadratic equations, indeterminate equations of the first degree, and probably of the second. An astronomical work, called the Surya-siddhanta (“knowledge of the Sun”), of uncertain authorship and probably belonging to the 4th or 5th century, was considered of great merit by the Hindus, who ranked it only second to the work of Brahmagupta, who flourished about a century later. It is of great interest to the historical student, for it exhibits the influence of Greek science upon Indian mathematics at a period prior to Aryabhatta. After an interval of about a century, during which mathematics attained its highest level, there flourished Brahmagupta (b. A.D. 598), whose work entitled Brahma-sphuta-siddhanta (“The revised system of Brahma”) contains several chapters devoted to mathematics. Of other Indian writers mention may be made of Cridhara, the author of a Ganita-sara (“Quintessence of Calculation”), and Padmanabha, the author of an algebra.
A period of mathematical stagnation then appears to have possessed the Indian mind for an interval of several centuries, for the works of the next author of any moment stand but little in advance of Brahmagupta. We refer to Bhaskara Acarya, whose work the Siddhanta-ciromani (“Diadem of an Astronomical System”), written in 1150, contains two important chapters, the Lilavati (“the beautiful [science or art]”) and Viga-ganita (“root-extraction”), which are given up to arithmetic and algebra.
English translations of the mathematical chapters of the Brahma-siddhanta and Siddhanta-ciromani by H. T. Colebrooke (1817), and of the Surya-siddhanta by E. Burgess, with annotations by W. D. Whitney (1860), may be consulted for details.
The question as to whether the Greeks borrowed their algebra from the Hindus or vice versa has been the subject of much discussion. There is no doubt that there was a constant traffic between Greece and India, and it is more than probable that an exchange of produce would be accompanied by a transference of ideas. Moritz Cantor suspects the influence of Diophantine methods, more particularly in the Hindu solutions of indeterminate equations, where certain technical terms are, in all probability, of Greek origin. However this may be, it is certain that the Hindu algebraists were far in advance of Diophantus. The deficiencies of the Greek symbolism were partially remedied; subtraction was denoted by placing a dot over the subtrahend; multiplication, by placing bha (an abbreviation of bhavita, “the product”) after the factors; division, by placing the divisor under the dividend; and square root, by inserting ka (an abbreviation of karana, irrational) before the quantity. The unknown was called yāvattāvat, and if there were several, the first took this appellation, and the others were designated by the names of colours; for instance, x was denoted by yā and y by kā (from kālaka, black).
A notable improvement on the ideas of Diophantus is to be found in the fact that the Hindus recognized the existence of two roots of a quadratic equation, but the negative roots were considered to be inadequate, since no interpretation could be found for them. It is also supposed that they anticipated discoveries of the solutions of higher equations. Great advances were made in the study of indeterminate equations, a branch of analysis in which Diophantus excelled. But whereas Diophantus aimed at obtaining a single solution, the Hindus strove for a general method by which any indeterminate problem could be resolved. In this they were completely successful, for they obtained general solutions for the equations ax ± by = c (since rediscovered by Leonhard Euler) and y² = ax² + 1. A particular case of the last equation, namely, y² = 61x² + 1, sorely taxed the resources of modern algebraists. It was proposed by Pierre de Fermat to Bernhard Frenicle de Bessy, and in 1657 to all mathematicians. John Wallis and Lord Brouncker jointly obtained a tedious solution which was published in 1658, and afterwards in 1668 by John Pell in his Algebra. A solution was also given by Fermat in his Relation. Although Pell had nothing to do with the solution, posterity has termed the equation Pell’s Equation, or Problem, when more rightly it should be the Hindu Problem, in recognition of the mathematical attainments of the Brahmans.
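The difficulty of the particular case mentioned above may be judged from the size of its least solution. The following sketch (Python; it uses the modern continued-fraction method, not the cyclic method of the Hindus) computes the fundamental solution of y² = ax² + 1 for a not a square, and for a = 61 yields y = 1766319049, x = 226153980.

```python
# Illustrative sketch: the least solution of y² = ax² + 1 by the continued-fraction
# expansion of √a (a modern method; not the procedure of the Hindu algebraists).
import math

def pell(a):
    a0 = math.isqrt(a)
    m, d, c = 0, 1, a0
    y1, y = 1, a0            # convergent numerators
    x1, x = 0, 1             # convergent denominators
    while y * y - a * x * x != 1:
        m = d * c - m
        d = (a - m * m) // d
        c = (a0 + m) // d
        y1, y = y, c * y + y1
        x1, x = x, c * x + x1
    return y, x

print(pell(61))              # (1766319049, 226153980)
```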
Hermann Hankel has pointed out the readiness with which the Hindus passed from number to magnitude and vice versa. Although this transition from the discontinuous to continuous is not truly scientific, yet it materially augmented the development of algebra, and Hankel affirms that if we define algebra as the application of arithmetical operations to both rational and irrational numbers or magnitudes, then the Brahmans are the real inventors of algebra.
The integration of the scattered tribes of Arabia in the 7th century by the stirring religious propaganda of Mahomet was accompanied by a meteoric rise in the intellectual powers of a hitherto obscure race. The Arabs became the custodians of Indian and Greek science, whilst Europe was rent by internal dissensions. Under the rule of the Abbasids, Bagdad became the centre of scientific thought; physicians and astronomers from India and Syria flocked to their court; Greek and Indian manuscripts were translated (a work commenced by the Caliph Mamun (813–833) and ably continued by his successors); and in about a century the Arabs were placed in possession of the vast stores of Greek and Indian learning. Euclid’s Elements were first translated in the reign of Harun-al-Rashid (786–809), and revised by the order of Mamun. But these translations were regarded as imperfect, and it remained for Tobit ben Korra (836–901) to produce a satisfactory edition. Ptolemy’s Almagest, the works of Apollonius, Archimedes, Diophantus and portions of the Brahmasiddhanta, were also translated. The first notable Arabian mathematician was Mahommed ben Musa al-Khwarizmi, who flourished in the reign of Mamun. His treatise on algebra and arithmetic (the latter part of which is only extant in the form of a Latin translation, discovered in 1857) contains nothing that was unknown to the Greeks and Hindus; it exhibits methods allied to those of both races, with the Greek element predominating. The part devoted to algebra has the title al-jebr wa’l-muqābala, and the arithmetic begins with “Spoken has Algoritmi,” the name Khwarizmi or Hovarezmi having passed into the word Algoritmi, which has been further transformed into the more modern words algorism and algorithm, signifying a method of computing.
Tobit ben Korra (836–901), born at Harran in Mesopotamia, an accomplished linguist, mathematician and astronomer, rendered conspicuous service by his translations of various Greek authors. His investigation of the properties of amicable numbers (q.v.) and of the problem of trisecting an angle, are of importance. The Arabians more closely resembled the Hindus than the Greeks in the choice of studies; their philosophers blended speculative dissertations with the more progressive study of medicine; their mathematicians neglected the subtleties of the conic sections and Diophantine analysis, and applied themselves more particularly to perfect the system of numerals (see Numeral), arithmetic and astronomy (q.v.). It thus came about that while some progress was made in algebra, the talents of the race were bestowed on astronomy and trigonometry (q.v.). Fahri des al Karhi, who flourished about the beginning of the 11th century, is the author of the most important Arabian work on algebra. He follows the methods of Diophantus; his work on indeterminate equations has no resemblance to the Indian methods, and contains nothing that cannot be gathered from Diophantus. He solved quadratic equations both geometrically and algebraically, and also equations of the form x²ⁿ + axⁿ + b = 0; he also proved certain relations between the sum of the first n natural numbers, and the sums of their squares and cubes.
Cubic equations were solved geometrically by determining the intersections of conic sections. Archimedes' problem of dividing a sphere by a plane into two segments having a prescribed ratio, was first expressed as a cubic equation by Al Mahani, and the first solution was given by Abu Gafar al Hazin. The determination of the side of a regular heptagon which can be inscribed or circumscribed to a given circle was reduced to a more complicated equation which was first successfully resolved by Abul Gud. The method of solving equations geometrically was considerably developed by Omar Khayyam of Khorassan, who flourished in the 11th century. This author questioned the possibility of solving cubics by pure algebra, and biquadratics by geometry. His first contention was not disproved until the 16th century, but his second was disposed of by Abul Wefa (940–998), who succeeded in solving the forms x⁴ = a and x⁴ + ax³ = b.
Although the foundations of the geometrical resolution of cubic equations are to be ascribed to the Greeks (for Eutocius assigns to Menaechmus two methods of solving the equations x³ = a and x³ = 2a³), yet the subsequent development by the Arabs must be regarded as one of their most important achievements. The Greeks had succeeded in solving an isolated example; the Arabs accomplished the general solution of numerical equations.
Considerable attention has been directed to the different styles in which the Arabian authors have treated their subject. Moritz Cantor has suggested that at one time there existed two schools, one in sympathy with the Greeks, the other with the Hindus; and that, although the writings of the latter were first studied, they were rapidly discarded for the more perspicuous Grecian methods, so that, among the later Arabian writers, the Indian methods were practically forgotten and their mathematics became essentially Greek in character.
Turning to the Arabs in the West we find the same enlightened spirit; Cordova, the capital of the Moorish empire in Spain, was as much a centre of learning as Bagdad. The earliest known Spanish mathematician is Al Madshritti (d. 1007), whose fame rests on a dissertation on amicable numbers, and on the schools which were founded by his pupils at Cordova, Dania and Granada. Gabir ben Aflah of Sevilla, commonly called Geber, was a celebrated astronomer and apparently skilled in algebra, for it has been supposed that the word “algebra” is compounded from his name.
When the Moorish empire began to wane the brilliant intellectual gifts which they had so abundantly nourished during three or four centuries became enfeebled, and after that period they failed to produce an author comparable with those of the 7th to the 11th centuries.
In Europe the decline of Rome was succeeded by a period, lasting several centuries, during which the sciences and arts were all but neglected. Political and ecclesiastical dissensions occupied the greatest intellects, and the only progress to be recorded is in the art of computing or arithmetic, and the translation of Arabic manuscripts. The first successful attempt to revive the study of algebra in Christendom was due to Leonardo of Pisa, an Italian merchant trading in the Mediterranean. His travels and mercantile experience had led him to conclude that the Hindu methods of computing were in advance of those then in general use, and in 1202 he published his Liber Abaci, which treats of both algebra and arithmetic. In this work, which is of great historical interest, since it was published about two centuries before the art of printing was discovered, he adopts the Arabic notation for numbers, and solves many problems, both arithmetical and algebraical. But it contains little that is original, and although the work created a great sensation when it was first published, the effect soon passed away, and the book was practically forgotten. Mathematics was more or less ousted from the academic curricula by the philosophical inquiries of the schoolmen, and it was only after an interval of nearly three centuries that a worthy successor to Leonardo appeared. This was Lucas Paciolus (Lucas de Burgo), a Minorite friar, who, having previously written works on algebra, arithmetic and geometry, published, in 1494, his principal work, entitled Summa de Arithmetica, Geometria, Proportioni et Proportionalita. In it he mentions many earlier writers from whom he had learnt the science, and although it contains very little that cannot be found in Leonardo’s work, yet it is especially noteworthy for the systematic employment of symbols, and the manner in which it reflects the state of mathematics in Europe during this period. These works are the earliest printed books on mathematics. The renaissance of mathematics was thus effected in Italy, and it is to that country that the leading developments of the following century were due. The first difficulty to be overcome was the algebraical solution of cubic equations, the pons asinorum of the earlier mathematicians. The first step in this direction was made by Scipio Ferro (d. 1526), who solved the equation x³ + ax = b. Of his discovery we know nothing except that he declared it to his pupil Antonio Marie Floridas. An imperfect solution of the equation x³ + ax² = b was discovered by Nicholas Tartalea (Tartaglia) in 1530, and his pride in this achievement led him into conflict with Floridas, who proclaimed his own knowledge of the form resolved by Ferro. Mutual recriminations led to a public discussion in 1535, when Tartalea completely vindicated the general applicability of his methods and exhibited the inefficiencies of that of Floridas. This contest over, Tartalea redoubled his attempts to generalize his methods, and by 1541 he possessed the means for solving any form of cubic equation. His discoveries had made him famous all over Italy, and he was earnestly solicited to publish his methods; but he abstained from doing so, saying that he intended to embody them in a treatise on algebra which he was preparing. At last he succumbed to the repeated requests of Girolamo or Geronimo Cardano, who swore that he would regard them as an inviolable secret.
Cardan or Cardano, who was at that time writing his great work, the Ars Magna, could not restrain the temptation of crowning his treatise with such important discoveries, and in 1545 he broke his oath and gave to the world Tartalea’s rules for solving cubic equations. Tartalea, thus robbed of his most cherished possession, was in despair. Recriminations ensued until his death in 1557, and although he sustained his claim for priority, posterity has not conceded to him the honour of his discovery, for his solution is now known as Cardan’s Rule.
Cubic equations having been solved, biquadratics soon followed suit. As early as 1539 Cardan had solved certain particular cases, but it remained for his pupil, Lewis (Ludovici) Ferrari, to devise a general method. His solution, which is sometimes erroneously ascribed to Rafael Bombelli, was published in the Ars Magna. In this work, which is one of the most valuable contributions to the literature of algebra, Cardan shows that he was familiar with both real positive and negative roots of equations whether rational or irrational, but of imaginary roots he was quite ignorant, and he admits his inability to resolve the so-called “irreducible case” (see Equation). Fundamental theorems in the theory of equations are to be found in the same work. Clearer ideas of imaginary quantities and the “irreducible case” were subsequently published by Bombelli, in a work of which the dedication is dated 1572, though the book was not published until 1579.
Contemporaneously with the remarkable discoveries of the Italian mathematicians, algebra was increasing in popularity in Germany, France and England. Michael Stifel and Johann Scheubelius (Scheybl) (1494–1570) flourished in Germany, and although unacquainted with the work of Cardan and Tartalea, their writings are noteworthy for their perspicuity and the introduction of a more complete symbolism for quantities and operations. Stifel introduced the sign (+) for addition or a positive quantity, which was previously denoted by plus, più, or the letter p. Subtraction, previously written as minus, mene or the letter m, was symbolized by the sign (−) which is still in use. The square root he denoted by (√), whereas Paciolus, Cardan and others used the letter R.
The first treatise on algebra written in English was by Robert Recorde, who published his arithmetic in 1552, and his algebra entitled The Whetstone of Witte, which is the second part of Arithmetik, in 1557. This work, which is written in the form of a dialogue, closely resembles the works of Stifel and Scheubelius, the latter of whom he often quotes. It includes the properties of numbers; extraction of roots of arithmetical and algebraical quantities, solutions of simple and quadratic equations, and a fairly complete account of surds. He introduced the sign (=) for equality, and the terms binomial and residual. Of other writers who published works about the end of the 16th century, we may mention Jacques Peletier, or Jacobus Peletarius (De occulta parte Numerorum, quam Algebram vocant, 1558); Petrus Ramus (Arithmeticae Libri duo et totidem Algebrae, 1560), and Christoph Clavius, who wrote on algebra in 1580, though it was not published until 1608. At this time also flourished Simon Stevinus (Stevin) of Bruges, who published an arithmetic in 1585 and an algebra shortly afterwards. These works possess considerable originality, and contain many new improvements in algebraic notation; the unknown (res) is denoted by a small circle, in which he places an integer corresponding to the power. He introduced the terms multinomial, trinomial, quadrinomial, &c., and considerably simplified the notation for decimals.
About the beginning of the 17th century various mathematical works by Franciscus Vieta were published, which were afterwards collected by Franz van Schooten and republished in 1646 at Leiden. These works exhibit great originality and mark an important epoch in the history of algebra. Vieta, who does not avail himself of the discoveries of his predecessors—the negative roots of Cardan, the revised notation of Stifel and Stevin, &c.—introduced or popularized many new terms and symbols, some of which are still in use. He denotes quantities by the letters of the alphabet, retaining the vowels for the unknown and the consonants for the knowns; he introduced the vinculum and among others the terms coefficient, affirmative, negative, pure and adfected equations. He improved the methods for solving equations, and devised geometrical constructions with the aid of the conic sections. His method for determining approximate values of the roots of equations is far in advance of the Hindu method as applied by Cardan, and is identical in principle with the methods of Sir Isaac Newton and W. G. Horner.
We have next to consider the works of Albert Girard, a Flemish mathematician. This writer, after having published an edition of Stevin’s works in 1625, published in 1629 at Amsterdam a small tract on algebra which shows a considerable advance on the work of Vieta. Girard is inconsistent in his notation, sometimes following Vieta, sometimes Stevin; he introduced the new symbols ff for greater than and § for less than; he follows Vieta in using the plus (+) for addition, he denotes subtraction by Recorde’s symbol for equality (=), and he had no sign for equality but wrote the word out. He possessed clear ideas of indices and the generation of powers, of the negative roots of equations and their geometrical interpretation, and was the first to use the term imaginary roots. He also discovered how to sum the powers of the roots of an equation.
Passing over the invention of logarithms (q.v.) by John Napier, and their development by Henry Briggs and others, the next author of moment was an Englishman, Thomas Harriot, whose algebra (Artis analyticae praxis) was published posthumously by Walter Warner in 1631. Its great merit consists in the complete notation and symbolism, which avoided the cumbersome expressions of the earlier algebraists, and reduced the art to a form closely resembling that of to-day. He follows Vieta in assigning the vowels to the unknown quantities and the consonants to the knowns, but instead of using capitals, as with Vieta, he employed the small letters; equality he denoted by Recorde’s symbol, and he introduced the signs > and < for greater than and less than. His principal discovery is concerned with equations, which he showed to be derived from the continued multiplication of as many simple factors as the highest power of the unknown, and he was thus enabled to deduce relations between the coefficients and various functions of the roots. Mention may also be made of his chapter on inequalities, in which he proves that the arithmetic mean is always greater than the geometric mean.
William Oughtred, a contemporary of Harriot, published an algebra, Clavis mathematicae, simultaneously with Harriot’s treatise. His notation is based on that of Vieta, but he introduced the sign ✕ for multiplication, ∺ for continued proportion, ∷ for proportion, and denoted ratio by one dot. This last character has since been entirely restricted to multiplication, and ratio is now denoted by two dots (:). His symbols for greater than and less than (⫎ and ┐) have been completely superseded by Harriot’s signs.
So far the development of algebra and geometry had been mutually independent, except for a few isolated applications of geometrical constructions to the solution of algebraical problems. Certain minds had long suspected the advantages which would accrue from the unrestricted application of algebra to geometry, but it was not until the advent of the philosopher René Descartes that the co-ordination was effected. In his famous Geometria (1637), which is really a treatise on the algebraic representation of geometric theorems, he founded the modern theory of analytical geometry (see Geometry), and at the same time he rendered signal service to algebra, more especially in the theory of equations. His notation is based primarily on that of Harriot; but he differs from that writer in retaining the first letters of the alphabet for the known quantities and the final letters for the unknowns.
The 17th century is a famous epoch in the progress of science, and the mathematics in no way lagged behind. The discoveries of Johann Kepler and Bonaventura Cavalieri were the foundation upon which Sir Isaac Newton and Gottfried Wilhelm Leibnitz erected that wonderful edifice, the Infinitesimal Calculus (q.v.). Many new fields were opened up, but there was still continual progress in pure algebra. Continued fractions, one of the earliest examples of which is Lord Brouncker’s expression for the ratio of the circumference to the diameter of a circle (see Circle), were elaborately discussed by John Wallis and Leonhard Euler; the convergency of series treated by Newton, Euler and the Bernoullis; the binomial theorem, due originally to Newton and subsequently expanded by Euler and others, was used by Joseph Louis Lagrange as the basis of his Calcul des Fonctions. Diophantine problems were revived by Gaspar Bachet, Pierre Fermat and Euler; the modern theory of numbers was founded by Fermat and developed by Euler, Lagrange and others; and the theory of probability was attacked by Blaise Pascal and Fermat, their work being subsequently expanded by James Bernoulli, Abraham de Moivre, Pierre Simon Laplace and others. The germs of the theory of determinants are to be found in the works of Leibnitz; Étienne Bézout utilized them in 1764 for expressing the result obtained by the process of elimination known by his name, and since restated by Arthur Cayley.
In recent times many mathematicians have formulated other kinds of algebras, in which the operators do not obey the laws of ordinary algebra. This study was inaugurated by George Peacock, who was one of the earliest mathematicians to recognize the symbolic character of the fundamental principles of algebra. About the same time, D. F. Gregory published a paper “on the real nature of symbolical algebra.” In Germany the work of Martin Ohm (System der Mathematik, 1822) marks a step forward. Notable service was also rendered by Augustus de Morgan, who applied logical analysis to the laws of mathematics.
The geometrical interpretation of imaginary quantities had a far-reaching influence on the development of symbolic algebras. The attempts to elucidate this question by H. Kühn (1750–1751) and Jean Robert Argand (1806) were completed by Karl Friedrich Gauss, and the formulation of various systems of vector analysis by Sir William Rowan Hamilton, Hermann Grassmann and others, followed. These algebras were essentially geometrical, and it remained, more or less, for the American mathematician Benjamin Peirce to devise systems of pure symbolic algebras; in this work he was ably seconded by his son Charles S. Peirce. In England, multiple algebra was developed by James Joseph Sylvester, who, in company with Arthur Cayley, expanded the theory of matrices, the germs of which are to be found in the writings of Hamilton (see above, under (B); and Quaternions).
The preceding summary shows the specialized nature which algebra has assumed since the 17th century. To attempt a history of the development of the various topics in this article is inappropriate, and we refer the reader to the separate articles.
References.—The history of algebra is treated in all historical works on mathematics in general (see Mathematics: References). Greek algebra can be specially studied in T. L. Heath’s Diophantus. See also John Wallis, Opera Mathematica (1693–1699), and Charles Hutton, Mathematical and Philosophical Dictionary (1815), article “Algebra.” (C. E.*)