Translation:Demonstratio nova theorematis omnem functionem algebraicam rationalem integram unius variabilis in factores reales primi vel secundi gradus resolvi posse

New Proof of the Theorem that Every Algebraic Rational Function of One Variable Can Be Resolved into Factors of the First or Second Degree (1799)
by Carl Friedrich Gauss, translated from Latin by Wikisource


1.

Any given algebraic equation can be reduced to the form

$$x^m + Ax^{m-1} + Bx^{m-2} + \text{etc.} + M = 0$$

such that $m$ is a positive integer. If we denote the first part of this equation by $X$ and assume that the equation $X = 0$ is satisfied by several unequal values of $x$, say by setting $x = \alpha$, $x = \beta$, $x = \gamma$, etc., then the function $X$ will be divisible by the product of the factors $x - \alpha$, $x - \beta$, $x - \gamma$, etc. Conversely, if the product of several simple factors $x - \alpha$, $x - \beta$, $x - \gamma$, etc. divides the function $X$, the equation $X = 0$ will be satisfied by setting $x$ equal to any of the quantities $\alpha$, $\beta$, $\gamma$, etc. Finally, if $X$ is equal to the product of $m$ such simple factors (whether they are all different or some of them are identical), then no other simple factors besides these can divide the function $X$. Therefore, an equation of degree $m$ cannot have more than $m$ roots; but it is evident that an equation of degree $m$ can have fewer roots, even if $X$ is resolvable into $m$ simple factors: if some of these factors are identical, then the number of different ways the equation can be satisfied will necessarily be less than $m$. Nevertheless, for the sake of elegance, geometers prefer to say that the equation has $m$ roots even in this case, but that some of them are equal to each other: a liberty they could certainly take.
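The distinction drawn above between the degree of an equation and the number of different values that satisfy it can be illustrated with a short sketch. The polynomial chosen, the helper names `poly_mul` and `poly_eval`, and the search range are assumptions made for illustration only; they do not come from the original text.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_eval(p, x):
    """Evaluate a polynomial at x by Horner's rule."""
    acc = 0
    for c in p:
        acc = acc * x + c
    return acc


# Illustrative X = (x - 1)(x - 1)(x - 2): three simple factors, two identical.
X = poly_mul(poly_mul([1, -1], [1, -1]), [1, -2])
degree = len(X) - 1

# Integer values in a small (arbitrarily chosen) range that satisfy X = 0.
distinct_roots = [x for x in range(-10, 11) if poly_eval(X, x) == 0]

print(X, degree, distinct_roots)
```

Here the degree is 3, yet only two different values satisfy the equation, which is the situation the text describes as an equation with some roots equal to each other.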

2.

The things explained so far are sufficiently demonstrated in algebraic works and nowhere offend against geometric rigor. However, analysts seem to have adopted too hastily, and without a sufficiently solid proof, the theorem on which almost the entire theory of equations is built: that a function such as $X$ can always be resolved into $m$ simple factors, or, which comes to exactly the same thing, that an equation of degree $m$ does indeed have $m$ roots. But since even in quadratic equations we often encounter cases that contradict this theorem, algebraists were forced to introduce a certain imaginary quantity whose square is $-1$, and they then recognized that if quantities of the form $a + b\sqrt{-1}$ are admitted on the same footing as real ones, the theorem holds not only for quadratic but also for cubic and biquadratic equations. However, it is by no means permissible to infer from this that any equation of the fifth or higher degree can be satisfied by admitting quantities of the form $a + b\sqrt{-1}$, or, as it is often expressed (although I would prefer a less slippery phrase), that the roots of any such equation can be reduced to the form $a + b\sqrt{-1}$. This theorem does not differ in substance from the one stated in the title of this paper, and the aim of this dissertation is to provide a new rigorous demonstration of it.

Moreover, ever since analysts discovered that there are infinitely many equations that have no roots at all unless quantities of the form $a + b\sqrt{-1}$ are admitted, such fictitious quantities, regarded as a special kind of quantity and called imaginary to distinguish them from real ones, have been introduced into the whole of analysis; by what right, I do not dispute here. I will complete my demonstration without any aid from imaginary quantities, although I would be allowed the same freedom that all recent analysts have used.

3.

Although what is presented in most elementary books as the proof of our theorem is so trivial and deviates so much from geometric rigor that it is scarcely worth mentioning, I will touch on it briefly so that nothing seems to be lacking. To demonstrate that the equation

$$x^m + Ax^{m-1} + Bx^{m-2} + \text{etc.} + M = 0,$$

or $X = 0$, indeed has $m$ roots, they attempt to prove that $X$ can be resolved into $m$ simple factors. To achieve this, they assume $m$ simple factors $x - \alpha$, $x - \beta$, $x - \gamma$, etc., where $\alpha$, $\beta$, $\gamma$, etc. are still unknown, and they set the product of these equal to the function $X$. Then, by comparing coefficients, they deduce $m$ equations from which they claim that the unknowns $\alpha$, $\beta$, $\gamma$, etc. can be determined, since their number is also $m$. In particular, $m - 1$ unknowns can be eliminated, leading to an equation that contains only a single unknown, say $\alpha$. Leaving aside other criticisms that could be made of this argument, let us simply ask how we can be certain that the final equation indeed has any root. Why could it not be the case that no magnitude in the entire range of real and imaginary quantities satisfies either this final equation or the proposed one? Moreover, experts will easily see that this final equation must necessarily be entirely identical to the proposed one if the calculation is properly conducted; namely, after eliminating the unknowns $\beta$, $\gamma$, etc., the equation

$$\alpha^m + A\alpha^{m-1} + B\alpha^{m-2} + \text{etc.} + M = 0$$

should emerge. There is no need to elaborate further on this reasoning.

Some authors, who seem to have perceived the weakness of this method, take it as an axiom that any equation indeed has roots, whether possible or impossible. However, they do not seem to have explained clearly what is meant by possible and impossible quantities. If possible quantities are meant to denote the same as real ones, and impossible the same as imaginary ones, this axiom can by no means be admitted, but necessarily requires demonstration. Nevertheless, the terms do not seem to be intended in that sense; rather, the meaning of the axiom appears to be: ‘Although we are not yet certain that there necessarily exist $m$ real or imaginary quantities satisfying a given equation of degree $m$, we will assume it for a while; for if by chance it should happen that so many real and imaginary quantities cannot be found, then at least we have an escape, and we can say that the remaining ones are impossible.’ If someone prefers to use this phrase rather than simply saying that the equation in this case will not have so many roots, I have no objection; but if they then treat these impossible roots as if they were something true, and, for example, say that the sum of all the roots of the equation $x^m + Ax^{m-1} + \text{etc.} = 0$ is $-A$ even if some of them are impossible (an expression which properly means: even if some are missing), then I cannot approve of it. For impossible roots, accepted in this sense, are still roots, and then that axiom cannot be admitted without some kind of demonstration, as it is not unreasonable to ask whether equations can exist that do not even have impossible roots.[1]

4.

Before I review the demonstrations of our theorem by other geometers and point out what seems to me to be objectionable in each, I make the following observation: it is sufficient to show only that any equation of any degree

$$x^m + Ax^{m-1} + Bx^{m-2} + \text{etc.} + M = 0,$$

or $X = 0$ (where the coefficients $A$, $B$, etc. are assumed to be real), can be satisfied in at least one way by a value of $x$ of the form $a + b\sqrt{-1}$. For it is evident that $X$ will then be divisible by the real quadratic factor $x^2 - 2ax + a^2 + b^2$ if $b$ is not $0$, and by the real simple factor $x - a$ if $b = 0$. In both cases, the quotient will be real and of lower degree than $X$; and since, by the same reasoning, it must have a real factor of the first or second degree, it is clear that by continuing this operation, the function $X$ will eventually be resolved into real simple or double factors, or, if you prefer to use two simple imaginary factors instead of each real double factor, into $m$ simple factors.
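The reduction described above can be made concrete with a small sketch: a real polynomial that vanishes at $a + b\sqrt{-1}$ with $b \neq 0$ is divisible by the real quadratic $x^2 - 2ax + a^2 + b^2$, leaving a real quotient of lower degree. The sample polynomial and the helper `divide_by_quadratic` are hypothetical choices made for illustration, not part of the original text.

```python
def divide_by_quadratic(coeffs, p, q):
    """Divide a polynomial (coefficients highest degree first) by x^2 + p*x + q.

    Returns (quotient, remainder); the remainder has two coefficients.
    """
    work = list(coeffs)
    quotient = []
    for i in range(len(work) - 2):
        c = work[i]
        quotient.append(c)
        work[i + 1] -= c * p
        work[i + 2] -= c * q
    return quotient, work[-2:]


# Illustrative X = x^3 - x^2 + x - 1, which vanishes at x = a + b*sqrt(-1)
# with a = 0, b = 1.
X = [1, -1, 1, -1]
a, b = 0, 1

# Check that the complex value really satisfies X = 0.
z = complex(a, b)
value_at_z = sum(c * z ** k for k, c in enumerate(reversed(X)))

# Divide by the real quadratic factor x^2 - 2a x + (a^2 + b^2).
quotient, remainder = divide_by_quadratic(X, -2 * a, a * a + b * b)

print(value_at_z, quotient, remainder)
```

The quotient here is $x - 1$, a real factor of the first degree, so this particular $X$ is resolved into one real double factor and one real simple factor.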

5.

The first proof of the theorem is owed to the illustrious geometer d’Alembert and can be found in Recherches sur le calcul integral, Histoire de l’Acad. de Berlin, Année 1746, p. 182 ff. The same proof is also in Bougainville, Traité du calcul integral, à Paris 1754, p. 47 ff. The main points of his method are as follows.

First, he shows that if any function of the variable becomes either for or for and if it can acquire an infinitely small positive real value by assigning a real value to then this function can also obtain an infinitely small negative real value, either from a real value of or an imaginary value of the form Namely, denoting as the infinitely small value of and as the corresponding value of he asserts that can be expressed by a highly convergent series etc., where the exponents etc. are continuously increasing rational quantities, and thus become at least a certain distance from the positive starting point and make the terms in which they appear infinitely small. Now, if among all these exponents there is none that is a fraction with an even denominator, all the terms of the series become real for both positive and negative values of but if some fractions with even denominators are found among these exponents, it is established that, for negative values of the corresponding terms are expressed in the form However, due to the convergence of the infinite series in the former case, it suffices to retain only the first (i.e., the largest) term; in the latter case, it is unnecessary to go beyond the term that first introduces an imaginary part.

Through similar reasoning, it can be shown that if can obtain a negative infinitely small value from a real value of then that function can acquire a positive real infinitely small value from a real value of or from an imaginary value in the form

Hence he further concludes that a real finite value of can also be found, in the former case negative and in the latter case positive, which can be produced from an imaginary value of of the form

From this, it follows that if is a function of such that it obtains a real value from a real value of and also obtains an infinitely small real value, either greater or smaller than from a real value of then it can also receive an infinitely small real value, which is either smaller and greater than (respectively), by assigning to a value of the form This is easily derived from the above if is conceived to be replaced by and by

Finally, d’Alembert asserts that can traverse any interval between two real values (i.e., becoming equal to and and all intermediate real values), by assigning to values of the form that is, the function can increase or decrease by any finite real quantity (depending on whether or ), while remains always in the form For if a real quantity is given (which is supposed to lie between and ), such that cannot be made equal to by such a value of then necessarily a maximum value of must be given (when and a minimum when ), say which would be obtained from a value of in such a way that no value of in a similar form could be assigned that would bring the function closer to by the smallest excess. Now, if in the equation between and is substituted everywhere for and then the real part and the part involving the factor are equated, two equations will result from this (in which and occur mixed with constants) that, through elimination, can yield two others, in one of which and constants are found, and in the other, is free containing only and constants. Therefore, since has traversed all values from to by real values of according to the above, can still approach the value by assigning values to such as respectively. From this, it follows that i.e. it is still in the form contrary to the hypothesis.

Now, if is supposed to represent a function such as it is clear that there is no problem, and such real values can be assigned to so that traverses any interval between two real values. Therefore, can also obtain some value in the form from which becomes Q.E.D.[2]

6.

The objections that can be raised against d’Alembert’s demonstration mostly come down to the following.

1. d’Alembert raises no doubt about the existence of values of for which the given values of correspond but assumes it and only investigates the form of these values.

Although this objection is in itself very serious, it pertains here only to the form of expression, which can easily be corrected to completely invalidate it.

2. The assertion that can always be expressed by such a series as he posits is certainly false if is supposed to represent any transcendental function (as d’A. hints at in several places). This is evident, for example, if or However, if we restrict the demonstration to the case where is an algebraic function of (which is sufficient for the present matter), the proposition is certainly true. Nevertheless, d’A. provided no evidence to support his assumption; the illustrious Bougainville assumes that is an algebraic function of and recommends the use of Newton’s parallelogram series for finding it.

3. He uses infinitely small quantities more freely than can be justified with geometric rigor or at least would be granted by a careful analyst in our age (where they rightly face skepticism). He also did not explain the leap from the value of being infinitely small to it being finite sufficiently clearly. His conclusion that can obtain a finite value seems to be derived not so much from the possibility of an infinitely small value of as from the fact that, denoting as a very small quantity, due to the great convergence of the series, the closer the true value of is approached, or the more accurately the equation expressing the relation between and or and is satisfied, the more terms of the series are taken. Furthermore, the entire argument seems too vague to draw any rigorous conclusions from it: it should be noted that there are series that, no matter how small a value is assigned to the quantity according to which their powers progress, always diverge, so that if they continue far enough, you can reach terms greater than any given quantity[3]. This happens when the coefficients of the series constitute a hypergeometric progression. Therefore, it should have been necessarily demonstrated that such a hypergeometric series cannot arise in the present case.

However, it seems to me that d’A. did not rightly resort to infinite series here, and they are not suitable for establishing this fundamental theorem of the theory of equations.
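The third objection mentions series that diverge however small the quantity in whose powers they progress. A minimal sketch follows, assuming that "hypergeometric progression" is meant in the eighteenth-century sense of coefficients growing like $1, 1, 1\cdot 2, 1\cdot 2\cdot 3, \ldots$; the particular series $\sum n!\,x^n$, the value $x = 1/10$, and the bound $1000$ are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from math import factorial


def term(n, x):
    """n-th term of the illustrative series sum of n! * x**n."""
    return factorial(n) * x ** n


def first_term_exceeding(x, bound, max_terms=10_000):
    """Index of the first term whose size exceeds `bound`, or None if not found."""
    for n in range(max_terms):
        if abs(term(n, x)) > bound:
            return n
    return None


x = Fraction(1, 10)

# The terms shrink at first, which can give a false impression of convergence...
shrinks_at_first = term(10, x) < term(0, x)

# ...but far enough along they exceed any given bound.
n_big = first_term_exceeding(x, bound=1000)

print(shrinks_at_first, n_big)
```

Choosing a smaller $x$ only postpones the point at which the terms begin to grow; it does not remove it, since the ratio of consecutive terms is $(n+1)x$.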

4. From the assumption that can attain the value but not the value it does not necessarily follow that there must be a value between and which can reach but not exceed. Another case remains: namely, it could be that there is a limit between and that can be approached as closely as desired by but never actually reaches it. From the arguments provided by d’A., it only follows that can always surpass any value it has reached by a finite quantity, for example, when it has become it can still increase by some finite quantity and with this, a new increment may occur, then another increase etc., so that no increment should be considered final, but there can always be a new one added. Although the multitude of possible increments is not limited by any boundaries, it could still be the case that if the increments etc. continuously decrease, the sum etc. never reaches a certain limit, no matter how many terms are considered.

While this case cannot occur when represents a complete algebraic function of without a demonstration, this inability to occur must necessarily be considered a methodological flaw. However, when is a transcendental function or even a fractional algebraic function, this case can indeed occur, for example, whenever a certain value of corresponds to an infinitely large value of Then, the d’Alembertian method seems not without many difficulties and possibly in some cases impossible to reduce to unquestionable principles.
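The gap pointed out in the fourth objection, that a quantity may keep increasing without ever reaching a certain limit, can be seen in the simplest possible case. The increments $1/2, 1/4, 1/8, \ldots$ used below are an illustrative assumption; the text does not name a specific sequence.

```python
from fractions import Fraction


def partial_sums(count):
    """Partial sums of the increments 1/2, 1/4, 1/8, ... in exact arithmetic."""
    total = Fraction(0)
    sums = []
    for k in range(1, count + 1):
        total += Fraction(1, 2 ** k)
        sums.append(total)
    return sums


sums = partial_sums(50)

# Every new increment makes the sum strictly larger...
always_increasing = all(s < t for s, t in zip(sums, sums[1:]))

# ...yet no partial sum attains the limit 1.
never_reaches_one = all(s < 1 for s in sums)

print(always_increasing, never_reaches_one, 1 - sums[-1])
```

No increment is ever the last one, and still the value 1 is approached as closely as desired without being reached, so the existence of a greatest attained value cannot be inferred from the possibility of always increasing further.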

For these reasons, I cannot consider the d’Alembertian demonstration as satisfactory. However, despite this, I do not believe that the true essence of the demonstration is in any way compromised by all objections, and I think that based on the same foundation (though with a vastly different rationale and at least a more comprehensive perspective), not only a rigorous demonstration of our theorem can be built, but also everything that can be desired concerning the theory of transcendent equations. I will discuss this matter more extensively on another occasion; see meanwhile below, article 24.

7.

After d’Alembert, Euler published his investigations on the same subject in Recherches sur les racines imaginaires des equations, Hist. de l’Acad. de Berlin A. 1749, p. 223 sqq. He presented two methods, and the essence of the first is summarized in the following.

First, Euler aims to demonstrate that if is any power of then the function (where the coefficient of the second term is ) can always be resolved into two real factors, in which has up to dimensions. To achieve this, he considers two factors:

where the coefficients etc., etc. are still unknown. Their product is set equal to the function The comparison of coefficients yields equations, and it only needs to be shown that the unknowns etc., etc. (whose number is also ) can be assigned real values satisfying these equations. Euler asserts that if is considered as known initially, such that the number of unknowns is one less than the number of equations, then by properly combining these using algebraic methods, all etc., etc. can be rationally determined, without any extraction of roots, by and the known coefficients etc. Furthermore, all etc., etc. can be eliminated, resulting in the equation where is an integral function of and the known coefficients. It suffices here to know one property of this equation, namely, that the last term in (which does not involve the unknown ) must be negative. It follows that the equation must have at least one real root, meaning that and consequently etc., etc. can be determined in at least one real way. This property can be confirmed through the following reflections: When etc. is assumed to be a factor of the function will necessarily be the sum of roots of the equation and thus it must have as many values as there are ways to choose out of roots, which is given by the combinatorial calculation This number will always be oddly even (a not difficult demonstration is omitted here): if is assumed for this number, then will be odd; the equation will then be of the degree. Now, since the second term is missing in the equation the sum of all roots will be It is clear that if the sum of any roots is the sum of the remaining roots will be i.e., if is among the values of then will also be among them. Hence, Euler concludes that is the product of double factors of the form etc., representing etc., all roots of the equation Therefore, due to the multitude of odd factors, the last term in will be the square of the product etc., with a negative sign. 
Moreover, the square of this product can always be rationally determined from the coefficients etc., and will consequently be a real quantity. Thus, the square of this with a negative sign will certainly be a negative quantity. Q.E.D.
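The counting step in this argument, whose demonstration is omitted above, can at least be checked numerically. The sketch assumes that the number in question is the number of ways of choosing $m$ roots out of $2m$, with $m$ a power of 2; the symbols $m$ and $2m$ and the helper `is_oddly_even` are this sketch's own notation, since the symbols are missing from this copy of the text.

```python
from math import comb


def is_oddly_even(n):
    """True when n is twice an odd number."""
    return n % 4 == 2


# m runs over powers of 2; the count is the binomial coefficient C(2m, m).
counts = {m: comb(2 * m, m) for m in (1, 2, 4, 8, 16, 32, 64)}
all_oddly_even = all(is_oddly_even(v) for v in counts.values())

# For comparison, a value of m that is not a power of 2.
counterexample = comb(6, 3)

print(counts[2], counts[4], all_oddly_even, is_oddly_even(counterexample))
```

For $m = 3$ the count is 20, which is divisible by 4, so the restriction to powers of 2 is doing real work in the argument.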

Since these two real factors of are of degree and is a power of the number each factor can again be resolved into two real factors of dimension by the same reasoning. However, as through repeated halving of the number one necessarily eventually reaches two, it is evident that by the continuation of this operation, the function will ultimately be resolved into real second-degree factors.

If, on the other hand, a function is presented in which the second term is not lacking, say also denoting as the power of binary, this will, by the substitution be transformed into a similar function lacking a second term. Hence, it is easily concluded that such a function is also resolvable into real second-degree factors.

Finally, for a given function of degree where is not a binary power: let the nearest higher binary power be denoted as and multiply the given function by arbitrary real simple factors. From the resolvability of the product into real second-degree factors, it is straightforwardly derived that the given function must also be resolvable into real factors of the second or first degree.

8.

Against this demonstration, one can object:

1. The rule by which E. concludes that from equations, unknowns etc., etc. can all be rationally determined, is not general but often admits exceptions. For example, if in Art. 3 one attempts to express the remaining unknowns and coefficients rationally by considering some unknowns as known, they will easily find that this is impossible, and that the unknown quantities can only be determined by an equation of degree Although it can be immediately seen a priori that this must necessarily happen, it could be rightly doubted whether, even in the present case, for certain values of the situation is such that the unknowns etc., etc. cannot be determined by an equation possibly of a degree greater than For the case where the equation is of the fourth degree, E. extracts rational values of the coefficients through and the given coefficients; the same can indeed be done in all higher-degree equations, but it certainly requires a more extensive explanation. However, it seems worthwhile to delve more deeply and more generally into those formulas that rationally express etc. through etc.; I will undertake a more detailed discussion on this and other matters related to the theory of elimination (an argument by no means exhausted) on another occasion.

2. However, even if it is demonstrated that for an equation of any degree formulas can always be found that express etc., etc. rationally through etc., it is certain that for certain specific values of the coefficients etc., those formulas can become indeterminate, so that not only is it impossible to define those unknowns rationally from etc., but in some cases, no real values of etc., etc. correspond to any real value of For the confirmation of this matter, for brevity, I refer the reader to E.’s dissertation itself, where on p. 236 the equation of the fourth degree is more extensively explained. Everyone will immediately see that the formulas for the coefficients become indeterminate if and the value is assumed for and their values cannot be assigned without extracting roots, and even more, not real values, if the quantity is negative. Although in this case it is easy to see that can still have other real values for which real values of correspond, still, someone might fear that the solution of this difficulty (which E. did not touch at all) may require much more effort in higher-degree equations. Certainly, this matter should by no means be passed over in silence in an exact demonstration.

3. The illustrious E. tacitly assumes that the equation has roots, and he establishes that their sum is because the second term in is absent. My opinion on this assumption (which all authors use in this argument) was already declared in Art. 3 above. The proposition that the sum of all roots of an equation equals the negative of the coefficient of the first term does not seem to be applicable to equations except those which have roots. Since it must be proved by this very demonstration that the equation actually has roots, it does not seem permissible to assume their existence. Undoubtedly, those who have not yet penetrated the fallacy of this paralogism will respond, it is not demonstrated here that the equation can be satisfied (for this expression means having roots), but that it can only be satisfied by values of of the form the former is to be taken as an axiom. However, since other forms of quantities cannot be conceived beyond the real and imaginary it is not clear enough how what should be demonstrated differs from what is assumed as an axiom; indeed, even if it were possible to conceive other forms of quantities, such as etc., it should not be admitted without demonstration that the equation can be satisfied by some value of either real, or in the form or in the form or in etc. Therefore, that axiom cannot have any other meaning than this: Any equation can be satisfied either by a real value of the unknown, or by an imaginary value in the form or perhaps by a value in some hitherto unknown form, or by a value that is not contained in any form whatsoever. But how such quantities, about which one cannot even imagine an idea — true shadows of shadows — can be added or multiplied, is not understood with the clarity demanded in mathematics[4].

Now, I do not intend to render the conclusions that E. derived from his assumption at all suspect through these objections; rather, I am confident that they can be confirmed by a method neither difficult nor very different from the Eulerian one, in such a way that there should be no doubt left for anyone, even the slightest. I only criticize the form, which, although it can be of great utility in discovering new truths, seems to be not at all commendable in demonstrating before the public.

4. As for the demonstration of the assertion that the product etc., can be rationally determined from the coefficients in the illustrious E. has brought nothing at all. All that he explains on this matter in equations of fourth degree is as follows (where are the roots of the proposed equation ):

‘On m’objectera sans doute, que j’ai supposé ici, que la quantité etait une quantité réelle, et que son quarré était affirmatif; ce qui était encore douteux, vu que les racines etant imaginaires, il pourrait bien arriver, que le quarré de la quantité qui en est composée, fut négatif. Or je réponds à cela que ce cas ne saurait jamais avoir lieu; car quelque imaginaires que soient les racines on sait pourtant, qu’il doit y avoir [5]; ces quantites étant réelles. Mais puisque leur produit est déterminable comme on sait, par les quantités et sera par conséquent réel, tout comme nous avons vu, qu’il est effectivement et On reconnaı̂tra aisément de même, que dans les plus hautes équations cette même circonstance doit avoir lieu, et qu’on ne saurait me faire des objections de ce côté.’

However, E. did not add anywhere that the product etc., can be rationally determined by etc., although he seems to have always understood it implicitly, as without it the demonstration can have no force. Indeed, it is true in equations of the fourth degree that if the product is expanded, it yields However, it does not seem clear enough how, in all higher-degree equations, the product can be rationally determined by the coefficients. The distinguished de Foncenex, who first observed this (Miscell. phil. math. soc. Taurin. Vol. I, p. 117), rightly contends that without a rigorous demonstration of this proposition, the method loses all its force, and he admits that it seems quite difficult to him, describing the fruitless attempts he made in that direction[6]. However, this matter can be easily completed by the following method (of which I can only provide a summary here): Although it is not clear enough in equations of the fourth degree that the product can be determined by the coefficients it can be easily seen that the same product is also as well as and finally also Therefore, the product will be a quarter of the sum which, if expanded, can be foreseen a priori to be a rational integral function of the roots in which they all enter in the same way. Such functions can always be expressed rationally by the coefficients of the equation whose roots are — The same is also evident if the product is brought into this form:

The expanded product of this expression, involving all in the same way, can be easily foreseen. Knowledgeable individuals will simultaneously gather how this can be applied to higher-degree equations. I reserve the complete exposition of the demonstration, which brevity does not permit me to include here, along with a more extensive discussion of functions involving multiple variables, for another occasion.
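The symmetry argument for the fourth degree can be checked on a numerical example. The sketch assumes that the product in question is $(a+b)(a+c)(a+d)$ for roots $a, b, c, d$ whose sum is zero (the formulas themselves are missing from this copy); the sample roots and helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

# Illustrative roots with a + b + c + d = 0 (second term of the equation absent).
roots = [Fraction(1), Fraction(2), Fraction(3), Fraction(-6)]
a, b, c, d = roots


def product_from(index, rs):
    """Product of (r_i + r_j) over all j different from i."""
    return prod(rs[index] + r for j, r in enumerate(rs) if j != index)


def third_symmetric(rs):
    """Sum of the products of the roots taken three at a time."""
    return sum(prod(t) for t in combinations(rs, 3))


# The same product, built starting from each of the four roots in turn.
products = [product_from(i, roots) for i in range(4)]

# A quarter of their sum treats all four roots alike.
quarter_sum = sum(products) / 4

# Alternative form: half-differences of complementary pair sums.
alt = (a + b - c - d) * (a + c - b - d) * (a + d - b - c) / 8

print(products, quarter_sum, third_symmetric(roots), alt)
```

All four products agree, and they coincide with the sum of the roots taken three at a time, which is a coefficient of the equation up to sign. That is the sense in which the product is rationally determined by the coefficients in this case.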

Now, I observe that in addition to these four objections, there are still some other aspects in the demonstration of E. that could be criticized, which I pass over in silence lest I seem to be an overly severe critic, especially since the foregoing seems to sufficiently demonstrate that the demonstration, in the form in which it is proposed by E., cannot be considered complete.

After this demonstration, E. presents another way to reduce the theorem for equations whose degree is not a binary power to the resolution of such equations. However, since this method teaches nothing for equations whose degree is a binary power and, moreover, is equally susceptible to all the aforementioned objections (except the fourth) as the initial general demonstration, there is no need to elaborate on it here.

9.

In the same paper, on page 263, the illustrious E. endeavored to further confirm our theorem by another method, the essence of which is as follows: Given an equation an analytic expression representing its roots explicitly could not be found so far for exponents however, it seems certain (as E. asserts) that it can contain nothing else but arithmetic operations and root extractions, increasingly complicated as grows. If this is conceded, E. excellently demonstrates that, no matter how complicated the radical signs are among themselves, the formulas can always be represented by the form where are real quantities.

Against this reasoning, one can object that, after so many great efforts by geometers, there remains little hope of ever reaching a general solution for algebraic equations. It becomes more and more likely that such a resolution is entirely impossible and contradictory. This should not seem too paradoxical, especially since what is commonly called the solution of an equation is properly nothing other than its reduction to pure equations. For the solution of pure equations is not taught but assumed, and if you express the root of the equation as you have not solved it, nor have you done more than if you were to invent some symbol to denote the root of the equation and equate the root to it. It is true that pure equations, due to the ease of finding their roots by approximation and the elegant connection that all roots have with each other, excel above all others and are therefore not to be blamed for analysts denoting these roots by a specific symbol. However, it does not follow from this that the root of any equation can be expressed by these symbols. Or, in other words, it is assumed without sufficient reason that the solution of any equation can be reduced to the solution of pure equations. Perhaps it would not be so difficult to rigorously demonstrate the impossibility already for the fifth degree, about which I will present more extensive discussions elsewhere. Here, it suffices to note that the general solvability of equations, in the sense accepted here, is still highly doubtful, and therefore, the demonstration, whose entire validity depends on that assumption, currently carries no weight.

10.

Later, the distinguished de Foncenex, having noticed a deficiency in Euler’s initial demonstration (see objection 4 in article 8) that he could not rectify, attempted another approach, which he presented in the aforementioned commentary on page 120[7]. This approach is as follows:

Suppose we have the equation representing a function of degree in an unknown If is an odd number, then it is clear that this equation has a real root. However, if is even, the distinguished Foncenex attempts to prove in the following way that the equation has at least one root of the form Let where is an odd number, and suppose that is a divisor of the function Then each value of will be the sum of two roots of the equation (with the sign changed). Therefore, will have values, and if is assumed to be determined by the equation (where is a function involving and known coefficients in ), this will be of degree It can be easily seen that will be of the form where is an odd number. Now, unless is odd, assume again that is a divisor of By similar reasoning, will be determined by the equation where is a function of degree in Setting will be of the form where is an odd number. Now, unless is odd, assume again that is a divisor of and then will be determined by the equation which has degree where is of the form an odd number. It is evident that in the series of equations etc., the degree will be odd, and thus have a real root. For brevity, let us assume so that the equation has a real root It can be easily understood that the same reasoning holds for any other value of Then, the coefficient and the coefficients in (which can be easily seen to be integral functions of the coefficients in ) or are asserted by de Foncenex to be rationally determinable from and the coefficients of and are therefore real. It follows that the roots of the equation will be of the form They will also satisfy the equation i.e., this equation will have roots of the form Finally, by similar reasoning, it follows that even will be in the same form, and consequently, the root of the equation will also satisfy the given equation Hence, any equation will have at least one root in the form
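The descent on the power of 2 that drives this argument can be traced explicitly. The sketch assumes that each auxiliary equation has degree equal to the number of pairs of roots of the previous one, that is $m(m-1)/2$ for an equation of degree $m$; the function names and the starting degrees are illustrative.

```python
def split_power_of_two(m):
    """Write m = 2**n * i with i odd and return (n, i)."""
    n = 0
    while m % 2 == 0:
        m //= 2
        n += 1
    return n, m


def degree_chain(m):
    """Degrees m, m(m-1)/2, ... continued until an odd degree is reached."""
    chain = [m]
    while chain[-1] % 2 == 0:
        last = chain[-1]
        chain.append(last * (last - 1) // 2)
    return chain


chain = degree_chain(8)
exponents = [split_power_of_two(d)[0] for d in chain]

print(chain, exponents, degree_chain(4))
```

Each step lowers the exponent of 2 in the degree by exactly one, because $m - 1$ is odd when $m$ is even, so an odd degree is reached after as many steps as that exponent. For a quartic the chain is 4, 6, 15.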

11.

Objections 1, 2, 3, which I made against the first demonstration of Euler (art. 8), have the same force against this method. However, there is a difference, so that the second objection, to which Euler’s demonstration was only liable in certain special cases, must now apply to all cases. Specifically, it can be a priori demonstrated that even if a formula is given expressing the coefficient rationally in terms of and the coefficients in it must necessarily become indeterminate for multiple values of likewise, a formula expressing the coefficient in terms of must become indeterminate for certain values of and so on. This will be most clearly understood if we take the example of a quartic equation. Let us assume, therefore, that and let the roots of the equation be Then it is clear that the equation will be of the sixth degree, and its roots will be The equation will be of the fifteenth degree, and its values of will be

Now, in this equation, since its degree is odd, it will have to have a root, and it will indeed have the real root (which, with the sign of the first coefficient in changed, is equal and therefore not only real but also rational, if the coefficients in are rational). But it can be easily seen that if a formula is given that rationally expresses the value of in terms of the corresponding value of it must necessarily become indeterminate for For this value is a root of the equation and the three values of corresponding to it will be, for example, and all of which can be irrational. Clearly, a rational formula could not produce an irrational value of in this case, nor could it produce three distinct values. From this example, it is evident that the method of de Foncenex is by no means satisfactory, but if it is to be made complete from every aspect, a much deeper investigation into the theory of elimination is required.

12.

Finally, Lagrange dealt with our theorem in his work Sur la forme des racines imaginaires des équations, Nouv. Mém. de l’Acad. de Berlin 1772, p. 222 sqq. This great geometer endeavored above all to repair the deficiencies in Euler’s first demonstration, and in particular those that constitute the second and fourth objections above (art. 8). He investigated these matters so thoroughly that nothing more could be desired, except that certain doubts may seem to remain in the preceding discussion of the theory of elimination (on which this entire investigation rests). However, he did not touch upon the third objection at all; indeed, his entire inquiry is built on the assumption that an equation of degree $m$ actually has $m$ roots.

Therefore, with careful consideration of what has been presented so far, I hope that experts will find a new demonstration of this most important theorem, derived from entirely different principles, to be not unwelcome. I now proceed to present it.

13.

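Lemma. Let $m$ denote any positive integer. Then the function $\sin\varphi \cdot x^m - \sin m\varphi \cdot r^{m-1}x + \sin(m-1)\varphi \cdot r^m$ will be divisible by $x^2 - 2\cos\varphi \cdot rx + r^2$.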

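Proof. For $m = 1$ the function becomes $0$, and hence it is divisible by any factor. For $m = 2$ the quotient becomes $\sin\varphi$, and for any larger value of $m$ it will be $\sin\varphi \cdot x^{m-2} + \sin 2\varphi \cdot r x^{m-3} + \sin 3\varphi \cdot r^2 x^{m-4} + \dots + \sin(m-1)\varphi \cdot r^{m-2}$. It can easily be confirmed that multiplying this quotient by $x^2 - 2\cos\varphi \cdot rx + r^2$ gives a product equal to the given function.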
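The divisibility asserted in this lemma can be checked numerically. The following Python sketch is not part of Gauss's text. It assumes that the function is $\sin\varphi \cdot x^m - \sin m\varphi \cdot r^{m-1}x + \sin(m-1)\varphi \cdot r^m$ and the divisor $x^2 - 2\cos\varphi \cdot rx + r^2$; the helper names `poly_divmod`, `lemma_function` and `check_lemma`, and the sample values of $r$ and $\varphi$, are hypothetical.

```python
import math

def poly_divmod(num, den):
    """Long division of polynomials given as coefficient lists (highest power first)."""
    num = list(num)
    quot = []
    for i in range(len(num) - len(den) + 1):
        c = num[i] / den[0]
        quot.append(c)
        for j, d in enumerate(den):
            num[i + j] -= c * d
    return quot, num[len(num) - len(den) + 1:]

def lemma_function(m, r, phi):
    """Coefficients of sin(phi)*x^m - sin(m*phi)*r^(m-1)*x + sin((m-1)*phi)*r^m."""
    coeffs = [0.0] * (m + 1)
    coeffs[0] += math.sin(phi)                          # x^m term
    coeffs[m - 1] -= math.sin(m * phi) * r ** (m - 1)   # x^1 term
    coeffs[m] += math.sin((m - 1) * phi) * r ** m       # constant term
    return coeffs

def check_lemma(m, r, phi):
    """Return (largest remainder coefficient, largest deviation of the quotient
    from the assumed form with coefficients sin(k*phi) * r^(k-1), k = 1..m-1)."""
    divisor = [1.0, -2.0 * math.cos(phi) * r, r * r]
    quot, rem = poly_divmod(lemma_function(m, r, phi), divisor)
    expected = [math.sin(k * phi) * r ** (k - 1) for k in range(1, m)]
    rem_err = max(abs(c) for c in rem)
    quot_err = max(abs(q - e) for q, e in zip(quot, expected))
    return rem_err, quot_err

# Sample values chosen only for illustration.
for m in range(2, 8):
    rem_err, quot_err = check_lemma(m, r=1.3, phi=0.7)
    print(m, rem_err < 1e-9, quot_err < 1e-9)
```

A numerical check of this kind only illustrates the lemma for particular values; the proof itself is the multiplication described above.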

14.

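Lemma. If the quantity $r$ and the angle $\varphi$ are determined in such a way that we have the equations

$r^m\cos m\varphi + A r^{m-1}\cos(m-1)\varphi + B r^{m-2}\cos(m-2)\varphi + \dots + K r^2\cos 2\varphi + L r\cos\varphi + M = 0$ [1]

$r^m\sin m\varphi + A r^{m-1}\sin(m-1)\varphi + B r^{m-2}\sin(m-2)\varphi + \dots + K r^2\sin 2\varphi + L r\sin\varphi = 0$ [2]

then the function $X = x^m + Ax^{m-1} + Bx^{m-2} + \dots + Kx^2 + Lx + M$ will be divisible by the double factor $x^2 - 2\cos\varphi \cdot rx + r^2$, provided $r\sin\varphi$ is not $0$; if $r\sin\varphi = 0$, then the same function will be divisible by the simple factor $x - r\cos\varphi$.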

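Proof. I. From the preceding article, all of the following quantities will be divisible by $x^2 - 2\cos\varphi \cdot rx + r^2$:

$\sin\varphi \cdot r x^m - \sin m\varphi \cdot r^m x + \sin(m-1)\varphi \cdot r^{m+1}$

$A\sin\varphi \cdot r x^{m-1} - A\sin(m-1)\varphi \cdot r^{m-1} x + A\sin(m-2)\varphi \cdot r^m$

$B\sin\varphi \cdot r x^{m-2} - B\sin(m-2)\varphi \cdot r^{m-2} x + B\sin(m-3)\varphi \cdot r^{m-1}$

etc.

$K\sin\varphi \cdot r x^2 - K\sin 2\varphi \cdot r^2 x + K\sin\varphi \cdot r^3$

$L\sin\varphi \cdot r x - L\sin\varphi \cdot r x$

$M\sin\varphi \cdot r + M\sin(-\varphi) \cdot r$

Therefore, the sum of these quantities will also be divisible by $x^2 - 2\cos\varphi \cdot rx + r^2$. The first terms of these rows together make up the sum $\sin\varphi \cdot rX$; the sum of the second terms is $0$ by [2]; and it is easily seen that the sum of the third terms also vanishes, if [1] is multiplied by $\sin\varphi$ and [2] by $\cos\varphi$ and the products are subtracted. Hence, it follows that the function $\sin\varphi \cdot rX$ is divisible by $x^2 - 2\cos\varphi \cdot rx + r^2$, and therefore, unless $r\sin\varphi = 0$, so is the function $X$. Q.E.P.

II. If $r\sin\varphi = 0$, then either $r = 0$ or $\sin\varphi = 0$. In the former case, $M = 0$ by [1], and therefore $X$ is divisible by $x$, that is, by $x - r\cos\varphi$; in the latter case, $\cos\varphi = \pm 1$, $\cos 2\varphi = +1$, $\cos 3\varphi = \pm 1$, and generally $\cos n\varphi = \cos^n\varphi$. Therefore, by [1], $X = 0$ when $x = r\cos\varphi$, and hence the function $X$ is divisible by $x - r\cos\varphi$. Q.E.S.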
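Both cases of this lemma can be illustrated on a small example. The sketch below is not from the source; it assumes that [1] and [2] are the cosine and sine sums $\sum c_k r^k \cos k\varphi = 0$ and $\sum c_k r^k \sin k\varphi = 0$ over the coefficients $c_k$ of $X$, and it uses the hypothetical example $X = x^3 - 1$, chosen only because suitable values of $r$ and $\varphi$ are known in advance. The function names are likewise hypothetical.

```python
import math

def poly_divmod(num, den):
    """Long division of polynomials given as coefficient lists (highest power first)."""
    num = list(num)
    quot = []
    for i in range(len(num) - len(den) + 1):
        c = num[i] / den[0]
        quot.append(c)
        for j, d in enumerate(den):
            num[i + j] -= c * d
    return quot, num[len(num) - len(den) + 1:]

def eq1(coeffs, r, phi):
    """Assumed left side of [1]: sum of c_k * r^k * cos(k*phi)."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.cos((m - i) * phi) for i, c in enumerate(coeffs))

def eq2(coeffs, r, phi):
    """Assumed left side of [2]: sum of c_k * r^k * sin(k*phi)."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.sin((m - i) * phi) for i, c in enumerate(coeffs))

X = [1.0, 0.0, 0.0, -1.0]          # x^3 - 1, illustrative example only

# Case r*sin(phi) != 0: r = 1, phi = 120 degrees satisfy [1] and [2].
r, phi = 1.0, math.radians(120)
double_factor = [1.0, -2.0 * math.cos(phi) * r, r * r]
quot_double, rem_double = poly_divmod(X, double_factor)
print(eq1(X, r, phi), eq2(X, r, phi), rem_double)   # all close to zero

# Case r*sin(phi) == 0: r = 1, phi = 0 also satisfy [1] and [2].
r0, phi0 = 1.0, 0.0
simple_factor = [1.0, -r0 * math.cos(phi0)]
quot_simple, rem_simple = poly_divmod(X, simple_factor)
print(eq1(X, r0, phi0), eq2(X, r0, phi0), rem_simple)
```

In the first case the double factor is $x^2 + x + 1$; in the second the simple factor is $x - 1$.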

15.

The preceding theorem is often demonstrated with the aid of imaginary quantities, see Euler Introductio in Analysin Infinitorum Vol. I p.110; I deemed it worthwhile to show how it can be derived equally easily without their assistance. It is already evident that for the proof of our theorem nothing else is required than to show: Given any function $X$ of the form $x^m + Ax^{m-1} + Bx^{m-2} + \dots + Lx + M$, the quantities $r$ and $\varphi$ can be determined in such a way that equations [1] and [2] hold. From this, it will follow that $X$ has a real factor of the first or second degree; the division will then necessarily produce a real quotient of a lower degree, which, for the same reason, will also have a factor of the first or second degree. By continuing this operation, $X$ will eventually be resolved into simple or double real factors. Thus, the goal of the following discussion is to prove that theorem.

16.

Imagine an infinite fixed plane (the plane of the table, Fig. 1), and on this, an infinite fixed straight line $GC$ passing through the fixed point $C$. Assume any length as the unit so that all lines can be expressed by numbers. At any point of the plane, at a distance $r$ from the center $C$ and at an angle $\varphi$ measured from the line $CG$, erect a perpendicular equal to the value of the expression

$r^m\sin m\varphi + A r^{m-1}\sin(m-1)\varphi + \dots + K r^2\sin 2\varphi + L r\sin\varphi$

which, for brevity, I will always denote by $T$ in the following. I always consider the distance $r$ as positive, and for points on the other side of the axis, the angle $\varphi$ should be considered either as greater than two right angles or as negative (which here is equivalent). The ends of these perpendiculars (which should be taken above the plane for a positive value of $T$, below for a negative value of $T$, and on the plane itself when $T$ vanishes) will lie on a continuous curved surface, infinite in every direction, which, for brevity, I will call the first surface in the following. In exactly the same way, imagine another surface, whose height above any point of the plane is

$r^m\cos m\varphi + A r^{m-1}\cos(m-1)\varphi + \dots + K r^2\cos 2\varphi + L r\cos\varphi + M$

which I will denote by $U$ for brevity. This surface will also be continuous and infinite in every direction, and I will distinguish it from the former by the term second surface. Then it is evident that the whole matter turns on proving that at least one point exists that lies simultaneously in the plane, on the first surface, and on the second surface.
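The two heights can be evaluated directly from the coefficients. The sketch below is a modern gloss and not Gauss's method, since he deliberately avoids imaginary quantities here: it assumes that $T$ is the sine sum and $U$ the cosine sum over the coefficients of $X$, and compares them with the imaginary and real parts of $X$ evaluated at the complex number $r(\cos\varphi + \sqrt{-1}\sin\varphi)$. The coefficient list and the function names are hypothetical.

```python
import cmath
import math

def T(coeffs, r, phi):
    """Height of the first surface (assumed form): sum of c_k * r^k * sin(k*phi)."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.sin((m - i) * phi) for i, c in enumerate(coeffs))

def U(coeffs, r, phi):
    """Height of the second surface (assumed form): sum of c_k * r^k * cos(k*phi)."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.cos((m - i) * phi) for i, c in enumerate(coeffs))

def X_complex(coeffs, r, phi):
    """Horner evaluation of X at the complex point with modulus r and angle phi."""
    z = cmath.rect(r, phi)
    value = 0j
    for c in coeffs:
        value = value * z + c
    return value

coeffs = [1.0, 0.0, -2.0, 3.0, 10.0]   # illustrative polynomial, highest power first

for r, phi in [(0.5, 0.3), (1.7, 2.0), (3.0, -1.1)]:
    value = X_complex(coeffs, r, phi)
    print(abs(T(coeffs, r, phi) - value.imag) < 1e-9,
          abs(U(coeffs, r, phi) - value.real) < 1e-9)
```

A point lying in the plane and on both surfaces is therefore, in modern terms, a complex root of $X$; the text itself needs only the real quantities $T$ and $U$.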

17.

It can be easily seen that the first surface lies partly above and partly below the plane; for the distance $r$ from the center can be taken so large that the remaining terms in $T$ become negligible compared to the first term $r^m\sin m\varphi$; this term, however, can be either positive or negative for a properly determined angle $\varphi$. Therefore, the fixed plane will necessarily intersect the first surface; I will call this intersection of the plane with the first surface the first line, which will be determined by the equation $T = 0$. For the same reason, the plane will intersect the second surface; the intersection will constitute a curve determined by the equation $U = 0$, which I will call the second line. Strictly speaking, each curve will consist of several branches that can be entirely separate, but each will be a continuous line. Indeed, the first line will always be what is called a complex curve, and the axis $GC$ should be regarded as part of this curve; for whatever value is assigned to $r$, $T$ will always be $0$ when $\varphi$ is either $0$ or $180^\circ$. However, it is better to consider the complex of all branches passing through all points where $T = 0$ as one curve (according to the usage generally accepted in higher geometry), and similarly for all branches passing through all points where $U = 0$. It is now evident that the problem has been reduced to proving that at least one point exists in the plane where some branch of the first line intersects some branch of the second line. To achieve this, it will be necessary to examine the nature of these lines closely.

18.

First of all, I observe that both curves are algebraic, namely, if brought back to orthogonal coordinates, they are of order Starting with the abscissas from with toward and ordinates toward we have and thus, generally, for any

Therefore, both $T$ and $U$ will consist of several terms of the form $a x^\alpha y^\beta$, where $\alpha$ and $\beta$ denote non-negative integers whose sum is at most $m$. Moreover, it can be easily foreseen that all terms of $T$ involve the factor $y$, and therefore the first line is composed of a straight line (whose equation is $y = 0$) and a curve of order $m - 1$. However, it is not necessary to consider this distinction here.

A matter of greater significance is the investigation of whether the first and second lines have infinite branches, and how many of each. At an infinite distance from the point $C$, the first line, whose equation is $T = 0$, will merge with the line whose equation is $\sin m\varphi = 0$. The latter consists of $m$ straight lines intersecting at the point $C$, of which the first is the axis and the others are inclined to it at angles of $\frac{1}{m}180$, $\frac{2}{m}180$, $\frac{3}{m}180$, etc. degrees. Therefore, the first line has $2m$ infinite branches, which divide the circumference of a circle described with an infinitely large radius into $2m$ equal parts, in such a way that the circumference is intersected by the first branch at the intersection of the circle and the axis, by the second at a distance of $\frac{1}{m}180^\circ$, by the third at a distance of $\frac{2}{m}180^\circ$, and so on.

Similarly, the second line at an infinite distance from the center will have an asymptote expressed by the equation $\cos m\varphi = 0$. This asymptote is a complex of $m$ straight lines intersecting at the point $C$ at equal angles, such that the first forms with the axis an angle of $\frac{1}{m}90^\circ$, the second an angle of $\frac{3}{m}90^\circ$, the third an angle of $\frac{5}{m}90^\circ$, and so on. Therefore, the second line will also have $2m$ infinite branches, each occupying the middle position between the two nearest branches of the first line, so that they intersect the circumference of a circle described with an infinitely large radius at points that are $\frac{1}{m}90^\circ$, $\frac{3}{m}90^\circ$, $\frac{5}{m}90^\circ$, etc. away from the axis.

However, it is evident that the axis itself always constitutes two infinite branches of the first line, namely the first and This arrangement of the branches is clearly shown in Fig. 2, for the case where the branches of the second line are represented with dotted lines to distinguish them from the branches of the first line. The same applies to Fig. 4[8]. Since these conclusions are of utmost importance, and infinitely large quantities may offend some readers, I will demonstrate them without the support of the infinite in the following article.

19.

Theorem. With all the conditions as stated above, a circle can be described from the center $C$ on whose circumference there are $2m$ points where $T = 0$ and an equal number of points where $U = 0$, arranged such that each of the latter lies between two of the former.

Denote by $S$ the sum of all the coefficients $A$, $B$, etc., up to $K$, $L$, $M$, each taken positively (that is, in absolute value), and let $R$ be taken such that $R > S\sqrt{2}$ and $R > 1$[9]. Then I say that in a circle described with radius $R$ the conditions stated in the theorem necessarily hold. Specifically, for brevity, denote by (1) the point on its circumference that is $\frac{1}{m}45$ degrees away from its intersection with the left side of the axis, or for which $\varphi = \frac{1}{m}45^\circ$; similarly, by (3) the point that is $\frac{3}{m}45^\circ$ away from this intersection, or for which $\varphi = \frac{3}{m}45^\circ$; by (5) the point where $\varphi = \frac{5}{m}45^\circ$; and so on up to $(8m-1)$, which is $\frac{8m-1}{m}45$ degrees away from that intersection if you always progress in the same direction (or $\frac{1}{m}45^\circ$ from the opposite side), so that a total of $4m$ points are on the circumference, spaced at equal intervals. Then one point for which $T = 0$ will lie between $(8m-1)$ and (1); similarly, there will be single such points between (3) and (5), between (7) and (9), between (11) and (13), and so on, $2m$ points in total. Likewise, single points for which $U = 0$ will lie between (1) and (3), between (5) and (7), between (9) and (11), and so on, their total number also being $2m$. Finally, apart from these $4m$ points, there will be no other points on the entire circumference for which either $T$ or $U$ is $0$.

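Proof. I. At the point (1), we have $m\varphi = 45^\circ$, and thus

$T = R^{m-1}\left(R\sqrt{\tfrac{1}{2}} + A\sin(m-1)\varphi + \frac{B}{R}\sin(m-2)\varphi + \dots + \frac{L}{R^{m-2}}\sin\varphi\right)$

However, the sum $A\sin(m-1)\varphi + \frac{B}{R}\sin(m-2)\varphi + \dots$ cannot be greater than $S$, because $R > 1$. Therefore, it must necessarily be less than $R\sqrt{\tfrac{1}{2}}$. It follows that at this point the value of $T$ is certainly positive. The same bound applies wherever $\sin m\varphi$ is at least $\sqrt{\tfrac{1}{2}}$; hence, $T$ will have a positive value whenever $m\varphi$ lies between $45^\circ$ and $135^\circ$, i.e., from point (1) to (3) the value of $T$ will always be positive. By the same reasoning, $T$ will have a positive value from point (9) to (11), and generally from any point $(8k+1)$ to $(8k+3)$, where $k$ denotes any integer. Similarly, $T$ will have a negative value everywhere between (5) and (7), between (13) and (15), etc., and generally between $(8k+5)$ and $(8k+7)$, so it can never be $0$ in these intervals. But since the value is positive at (3) and negative at (5), it must be $0$ somewhere between (3) and (5); also somewhere between (7) and (9), between (11) and (13), etc., up to the interval between $(8m-1)$ and (1) inclusive, so that altogether $T = 0$ at $2m$ points. Q.E.D.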

II. That no other points with this property exist beyond these $2m$ points can be understood as follows. Since there are none between (1) and (3), between (5) and (7), etc., more such points could exist only if at least two of them were in some interval between (3) and (5), or between (7) and (9), etc. Then necessarily, in the same interval, $T$ would be either a maximum or a minimum, and thus $\frac{dT}{d\varphi} = 0$. But $\frac{dT}{d\varphi} = mR^{m-2}\left(R^2\cos m\varphi + \frac{m-1}{m}AR\cos(m-1)\varphi + \frac{m-2}{m}B\cos(m-2)\varphi + \dots\right)$, and $\cos m\varphi$ between (3) and (5) is always negative and, in absolute value, greater than $\sqrt{\tfrac{1}{2}}$. Hence, it is easily seen that in this entire interval $\frac{dT}{d\varphi}$ is a negative quantity, and similarly, everywhere positive between (7) and (9), negative between (11) and (13), etc., so that $\frac{dT}{d\varphi} = 0$ cannot occur in any of these intervals. Therefore, etc. Q.E.S.

III. In a wholly similar manner, it is demonstrated that $U$ has a negative value everywhere between (3) and (5), between (11) and (13), etc., and generally between $(8k+3)$ and $(8k+5)$; but a positive value between (7) and (9), between (15) and (17), etc., and generally between $(8k+7)$ and $(8k+9)$. Hence, it immediately follows that $U = 0$ must occur somewhere between (1) and (3), between (5) and (7), etc., i.e., at $2m$ points. However, in none of these intervals can $\frac{dU}{d\varphi} = 0$ occur (which is easily proved in the same way as above): therefore, there will be no more than those $2m$ points on the circumference of the circle where $U = 0$. Q.E.T. et Q.

Moreover, the part of this theorem according to which there are no more than $2m$ points where $T = 0$, nor more than $2m$ where $U = 0$, can also be demonstrated from the fact that the equations $T = 0$ and $U = 0$ represent curves of order $m$, which, according to higher geometry, cannot be cut in more than $2m$ points by a circle, a circle being a curve of the second order.
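The count and the alternation asserted in this article can be observed numerically on an example. The sketch below is not from the source. It assumes that $T$ and $U$ are the sine and cosine sums over the coefficients of $X$, that $S$ is the sum of the absolute values of $A, B, \dots, M$, and that $R$ exceeds both $S\sqrt{2}$ and $1$; the example polynomial, the sampling resolution and the function names are hypothetical choices.

```python
import math

def T(coeffs, r, phi):
    """Assumed form of T: sum of c_k * r^k * sin(k*phi), coeffs highest power first."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.sin((m - i) * phi) for i, c in enumerate(coeffs))

def U(coeffs, r, phi):
    """Assumed form of U: sum of c_k * r^k * cos(k*phi)."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.cos((m - i) * phi) for i, c in enumerate(coeffs))

def sign_changes(f, n_samples=20000):
    """Angles at which f changes sign once around the circle.
    Samples sit at half-steps so that phi = 0 and phi = 180 degrees,
    where T vanishes exactly, fall between samples."""
    step = 2 * math.pi / n_samples
    angles = [(j + 0.5) * step for j in range(n_samples)]
    values = [f(a) for a in angles]
    changes = []
    for j in range(n_samples):
        if values[j] * values[(j + 1) % n_samples] < 0:
            changes.append(angles[j] + step / 2)
    return changes

coeffs = [1.0, 0.0, -2.0, 3.0, 10.0]       # illustrative polynomial of degree m = 4
m = len(coeffs) - 1
S = sum(abs(c) for c in coeffs[1:])        # sum of |A|, |B|, ..., |M|
R = 1.01 * max(S * math.sqrt(2), 1.0)      # any radius above both bounds will do

t_zeros = sign_changes(lambda phi: T(coeffs, R, phi))
u_zeros = sign_changes(lambda phi: U(coeffs, R, phi))

# Merge both lists by angle and check that T-points and U-points alternate.
labels = [name for _, name in sorted([(a, "T") for a in t_zeros] +
                                     [(a, "U") for a in u_zeros])]
alternating = all(labels[i] != labels[(i + 1) % len(labels)]
                  for i in range(len(labels)))
print(len(t_zeros), len(u_zeros), alternating)
```

Sampling can only suggest the result for one polynomial and one radius; the theorem is what guarantees $2m$ points of each kind for every admissible $R$.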

20.

If another circle with a radius greater than $R$ is described from the same center, then it will be divided in the same way: between points (3) and (5), there will be one point where $T = 0$, likewise between (7) and (9), etc. It will be easily observed that the less the radius of this circle differs from the radius $R$, the closer such points between (3) and (5) must be on the circumferences of both circles. The same will occur if a circle with a radius somewhat smaller than $R$, but still greater than $S\sqrt{2}$ and $1$, is described. From this, it is easily understood that the circumference of the circle described with radius $R$ is actually cut at the point between (3) and (5) where $T = 0$ by some branch of the first line; the same holds for the other points where $T = 0$. Similarly, it is evident that the circumference of this circle is cut at all $2m$ points where $U = 0$ by some branch of the second line. These conclusions can also be expressed in the following way: When a circle of the appropriate size is described from the center $C$, $2m$ branches of the first line and $2m$ branches of the second line will enter it, in such a way that any two neighboring branches of the first line are separated by some branch of the second line. See Fig. 2, where the circle is now of finite size, and the numbers assigned to each branch are not to be confused with the numbers by which I designated specific limits in the previous article and in this one for the sake of brevity.

21.

Now, from this relative arrangement of the branches entering the circle, the intersection of some branch of the first line with a branch of the second line within the circle can be deduced in so many different ways that I hardly know which method to prefer. The following seems very clear: Let us designate (Fig. 2) by 0 the point on the circumference of the circle where it is cut by the left side of the axis (which itself is one of the $2m$ branches of the first line); by 1 the nearest point where a branch of the second line enters; by 2 the point next to this, where the second branch of the first line enters; and so on up to $4m - 1$, so that a branch of the first line enters the circle at every point marked with an even number, and a branch of the second line at every point marked with an odd number. It is well known from higher geometry that an algebraic curve (or each part of an algebraic curve, if it happens to be composed of several) either returns into itself or extends infinitely on both sides, so that if any branch of an algebraic curve enters a finite space, it must necessarily come out again somewhere from this space[10]. Hence, it is easily concluded that any point marked with an even number (or, for the sake of brevity, any even point) must be connected by a branch of the first line with another even point within the circle, and similarly, any point marked with an odd number must be connected with another such point by a branch of the second line. Although the way these pairs of points are connected can differ greatly according to the nature of the function $X$, so that it cannot be determined in general, it can be easily demonstrated that in every case an intersection of the first line with the second line occurs.

22.

The demonstration of this necessity seems most conveniently presented by reductio ad absurdum. Namely, let us assume that the connection of the even points in pairs and of the odd points in pairs can be arranged in such a way that no intersection of a branch of the first line with a branch of the second line arises from it. Since the axis is a part of the first line, clearly point 0 must be connected with point $2m$. Therefore, point 1 cannot be connected with any point beyond the axis, i.e., with no point expressed by a number greater than $2m$; otherwise the connecting line would necessarily cut the axis. So, if 1 is assumed to be connected with point $n$, then $n < 2m$. By similar reasoning, if 2 is connected with $n'$, then $n' < n$, because otherwise the branch from 2 to $n'$ would necessarily cut the branch from 1 to $n$. For the same reason, point 3 will be connected with some point between 4 and $n'$, and it is clear that if 3, 4, 5, etc., are assumed to be connected with $n''$, $n'''$, $n''''$, etc., then $n'''$ lies between 5 and $n''$, $n''''$ between 6 and $n'''$, etc. Hence, it is evident that one finally reaches some point $h$ connected with point $h + 2$, and then the branch entering the circle at point $h + 1$ will necessarily cut the branch connecting points $h$ and $h + 2$. However, since one of these two branches belongs to the first line and the other to the second, it is now clear that the assumption is contradictory, and therefore an intersection of the first line with the second line must necessarily occur somewhere.
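The combinatorial core of this argument can be checked exhaustively for small cases. The sketch below is not from the source and rests on a simplifying assumption: each branch inside the circle is modelled as a chord joining two of the $4m$ boundary points, and two chords are taken to cross exactly when their endpoints interleave on the circumference. The function names are hypothetical.

```python
def perfect_matchings(points):
    """Yield every way of joining the given points in pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail

def chords_cross(a, b):
    """Two chords with distinct endpoints cross iff exactly one endpoint of b
    lies strictly between the endpoints of a on the circle."""
    p, q = sorted(a)
    s, t = b
    return (p < s < q) != (p < t < q)

def always_intersects(m):
    """True if every pairing of the even points (first line) and every pairing
    of the odd points (second line) among 0..4m-1 contains a crossing
    between a first-line chord and a second-line chord."""
    evens = list(range(0, 4 * m, 2))
    odds = list(range(1, 4 * m, 2))
    for first_line in perfect_matchings(evens):
        for second_line in perfect_matchings(odds):
            if not any(chords_cross(a, b)
                       for a in first_line for b in second_line):
                return False
    return True

for m in (1, 2, 3):
    print(m, always_intersects(m))
```

The search covers all pairings, including those in which 0 is not joined to $2m$, so it checks slightly more than the text requires. It confirms the conclusion only for the small values of $m$ tried, whereas the argument above holds for every $m$.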

If this is combined with the preceding discussions, it follows from all that has been explained that the theorem that every rational integral algebraic function of one indeterminate can be resolved into real factors of the first or second degree has been rigorously demonstrated.

23.

Moreover, it can be easily deduced from the same principles that there are not only one but at least $m$ intersections of the first line with the second line, although it is also possible for the first line to be cut by several branches of the second line at the same point, in which case the function $X$ will have several equal factors. However, since it suffices here to have demonstrated the necessity of one intersection, I do not dwell further on this matter for the sake of brevity. For the same reason, I do not pursue other properties of these lines in more detail here, for example, that the intersection always occurs at right angles, or that, if several branches of each curve meet at the same point, the first line has as many branches there as the second line, and these are placed alternately and intersect at equal angles, etc.

Finally, I observe that it is not impossible for the preceding demonstration, which I built on geometric principles here, to be presented in a purely analytical form. However, I believed that the representation I explained here would be less abstract, and the essence of the proof could be put more clearly before the eyes than could be expected from an analytical demonstration.

As a bonus, I will suggest another method for proving our theorem, which, at first glance, will seem not only very different from the preceding demonstration but also from all the other demonstrations explained above, and yet it is fundamentally the same as the d’Alembertian method. I leave it to those familiar with the subject to compare it with the previous one and explore the parallelism between the two. It is attached solely for their benefit.

24.

Above the plane of Figure 4, relative to the axis and the fixed point I assume that the first and second surfaces are described in the same way as above. Take any point located on any branch of the first line, where (for example, any point lying on the axis), and unless at this point, proceed from this point in the first line towards the direction where the absolute magnitude of decreases. If, by chance, the absolute value of decreases in both directions at the point it is arbitrary where you proceed; but I will immediately explain what to do if increases in both directions. It is clear that as long as you always progress in the first line, you will necessarily reach a point where or one where the value of becomes a minimum, for example, the point In the former case, the sought point is found; in the latter, it can be demonstrated that in this point, multiple branches of the first line intersect (indeed, an equal number of branches), and their semiaxes are so arranged that if you deviate towards any of them (either here or there), the value of will continue to decrease. (For the sake of brevity, I must suppress the demonstration of this theorem, which, although not more difficult, is more extensive.) In this branch, you can then progress again until becomes (as happens in Fig. 4 at ) or again a minimum. Then, deviating again, you will necessarily reach a point where

Against this demonstration, a doubt could be raised about whether it is possible that no matter how far you progress, and even though the value of always decreases, these decrements continuously become slower, and nevertheless, that value never reaches a certain limit. This objection would correspond to the fourth in Article 6. But it would not be difficult to assign a limit, such that once you surpass it, the value of must necessarily not only change more rapidly but also not decrease any longer, so that before reaching this limit, the value must have necessarily been reached. However, I reserve the opportunity to elaborate more extensively on this and other points that I could only touch upon in this demonstration on another occasion.

We discovered the principles on which this demonstration is based in October 1797.

  1. I always understand the term imaginary quantity here to refer to a quantity in the form as long as is not equal to In this sense, this expression has always been accepted by all geometers of the first order, and I consider those who wanted to call the quantity imaginary only in the case where and impossible only when not worth listening to, as this distinction is neither necessary nor of any utility. If imaginary quantities are to be retained in analysis altogether (which seems more advisable than abolishing them, provided they are solidly established), then they must necessarily be regarded as equally possible as real quantities; hence, I would prefer to encompass real and imaginary quantities under the common designation of possible quantities: conversely, I would call a quantity impossible if it should satisfy conditions that cannot be satisfied even by admitting imaginary ones, yet in a way that this phrase means the same as saying that such a quantity does not exist in the entire range of magnitudes. From this standpoint, I would not concede the formation of a peculiar class of quantities. If someone says that an equilateral right-angled triangle is impossible, no one will deny it. But if he wants to consider such an impossible triangle as a new kind of triangles and apply properties of other triangles to it, would anyone take it seriously? This would be playing with words or rather abusing them. 
Although even eminent mathematicians have often applied truths that manifestly presuppose the possibility of certain quantities to cases where the possibility was still doubtful; and while I do not deny that such licenses usually pertain only to the form and semblance of reasoning, which the keen edge of true geometry can soon penetrate: yet it seems more advisable and more worthy of the sublimity of a science celebrated as the most perfect example of clarity and certainty, either to entirely prohibit such liberties, or at least to use them sparingly and only where those less practiced might find it difficult to perceive the matter without their aid, and where it could still be handled as rigorously, if perhaps less briefly. However, I do not deny that what I have said here against the abuse of impossibilities can be applied in some respects against imaginaries as well: yet I reserve the vindication of these and the fuller exposition of this whole matter for another occasion.
  2. It is worth noting that d’Alembert in his exposition of this demonstration used geometric considerations, viewing as the abscissa and as the ordinate of the curve (in the manner of all geometers of the first part of this century, among whom the notion of functions was less common). However, since all his reasoning, if you consider only their essence, relies on purely analytic principles, and imaginary curves and expressions of imaginary ordinates may seem harder and more likely to confuse the modern reader, I preferred to use a purely analytic form of representation here. I added this note to prevent anyone from suspecting that something essential had been changed by comparing d’Alembert’s demonstration itself with this concise exposition.
  3. By the way, on this occasion, I note incidentally that there are many series that initially seem to converge greatly, most notably those used by Euler in the latter part of Institutiones Calculi Differentialis Chapter VI. for approximating the sum of other series (for the remaining series on p. 475-478 can indeed converge), which, as far as I know, has not been noticed by anyone so far. Therefore, it would be highly desirable to clearly and rigorously demonstrate why such series, which converge very quickly at first, then more slowly, and finally more and more slowly, nevertheless provide an approximation to the true sum, as long as not too many terms are taken, and until such a sum can be safely considered exact.
  4. All of this will be much elucidated by another dissertation already sweating under the press, where in a completely different argument, but nevertheless analogous, I could have used a similar license with exactly the same right, as has been done here in equations by all analysts. Although the proofs of several truths could have been completed in a few words with the help of such fictions, which otherwise become very difficult and require the most subtle artifices, I preferred to abstain from them altogether and hoped to have satisfied a few if I followed the method of analysts.
  5. Euler mistakenly has whence he afterwards also wrongly states
  6. An error seems to have crept into this explanation, namely on p. 118, line 5. Instead of "characteris (on choisissait seulement celles où entrait p etc.)," one must necessarily read "une même racine quelconque de l’équation proposée," or something similar, as the former has no meaning.
  7. Explanations related to this commentary are found in the second volume of the same Miscellanea on p. 337. However, these are not relevant to the current discussion but pertain to the logarithms of negative quantities, which were discussed in the same work.
  8. Fig. 4 is constructed assuming in which case readers less accustomed to general and abstract discussions may find it challenging to visualize the respective positions of both curves concretely. The length of the line is assumed to be 10 (CN= 1.26255.)
  9. When $S > \sqrt{\tfrac{1}{2}}$, condition one implies condition two; when $S < \sqrt{\tfrac{1}{2}}$, condition two implies condition one.
  10. It seems to have been demonstrated quite well that an algebraic curve cannot suddenly break off anywhere (as happens, for example, in a transcendental curve whose equation is ), nor lose itself, as it were, after infinite spirals at some point (like the logarithmic spiral), and as far as I know, no one has cast doubt on this matter. However, if someone demands a demonstration that is not subject to any doubts, I will undertake it on another occasion. In the present case, it is evident that if a branch, for example, 2, did not come out from the circle anywhere (Fig. 3), you could enter the circle between and then move around the whole branch (which should get lost in the space of the circle), and finally be able to exit between and again, so that you never intersect the first line on the entire path. This is absurd because at the point where you entered the circle, you had the first surface above you, and in the exit, below; therefore, you must have necessarily intersected the first surface itself somewhere, i.e., at a point on the first line. However, from this reasoning based on the principles of the geometry of position, which are no less valid than the principles of the geometry of magnitudes, it follows only that if you enter a branch of the first line in the circle, you can exit somewhere else from the circle, always remaining on the first line, and not that your path is a continuous line in the sense in which it is understood in higher geometry. But here it suffices that the path is a continuous line in the common sense, i.e., not interrupted anywhere but cohering everywhere.