# The Meaning of Relativity/Lecture 1

THE MEANING OF RELATIVITY

LECTURE I

SPACE AND TIME IN PRE-RELATIVITY PHYSICS

The theory of relativity is intimately connected with the theory of space and time. I shall therefore begin with a brief investigation of the origin of our ideas of space and time, although in doing so I know that I introduce a controversial subject. The object of all science, whether natural science or psychology, is to co-ordinate our experiences and to bring them into a logical system. How are our customary ideas of space and time related to the character of our experiences?

The experiences of an individual appear to us arranged in a series of events; in this series the single events which we remember appear to be ordered according to the criterion of "earlier" and "later," which cannot be analysed further. There exists, therefore, for the individual, an I-time, or subjective time. This in itself is not measurable. I can, indeed, associate numbers with the events, in such a way that a greater number is associated with the later event than with an earlier one; but the nature of this association may be quite arbitrary. This association I can define by means of a clock by comparing the order of events furnished by the clock with the order of the given series of events. We understand by a clock something which provides a series of events which can be counted, and which has other properties of which we shall speak later.

By the aid of speech different individuals can, to a certain extent, compare their experiences. In this way it is shown that certain sense perceptions of different individuals correspond to each other, while for other sense perceptions no such correspondence can be established. We are accustomed to regard as real those sense perceptions which are common to different individuals, and which therefore are, in a measure, impersonal. The natural sciences, and in particular, the most fundamental of them, physics, deal with such sense perceptions. The conception of physical bodies, in particular of rigid bodies, is a relatively constant complex of such sense perceptions. A clock is also a body, or a system, in the same sense, with the additional property that the series of events which it counts is formed of elements all of which can be regarded as equal.

The only justification for our concepts and system of concepts is that they serve to represent the complex of our experiences; beyond this they have no legitimacy. I am convinced that the philosophers have had a harmful effect upon the progress of scientific thinking in removing certain fundamental concepts from the domain of empiricism, where they are under our control, to the intangible heights of the a priori. For even if it should appear that the universe of ideas cannot be deduced from experience by logical means, but is, in a sense, a creation of the human mind, without which no science is possible, nevertheless this universe of ideas is just as little independent of the nature of our experiences as clothes are of the form of the human body. This is particularly true of our concepts of time and space, which physicists have been obliged by the facts to bring down from the Olympus of the a priori in order to adjust them and put them in a serviceable condition.

We now come to our concepts and judgments of space. It is essential here also to pay strict attention to the relation of experience to our concepts. It seems to me that Poincaré clearly recognized the truth in the account he gave in his book, "La Science et l'Hypothèse." Among all the changes which we can perceive in a rigid body those are marked by their simplicity which can be made reversibly by an arbitrary motion of the body; Poincaré calls these, changes in position. By means of simple changes in position we can bring two bodies into contact. The theorems of congruence, fundamental in geometry, have to do with the laws that govern such changes in position. For the concept of space the following seems essential. We can form new bodies by bringing bodies ${\displaystyle B,C,...}$ up to body ${\displaystyle A}$; we say that we continue body ${\displaystyle A}$. We can continue body ${\displaystyle A}$ in such a way that it comes into contact with any other body, ${\displaystyle X}$. The ensemble of all continuations of body ${\displaystyle A}$ we can designate as the "space of the body ${\displaystyle A}$." Then it is true that all bodies are in the "space of the (arbitrarily chosen) body ${\displaystyle A}$." In this sense we cannot speak of space in the abstract, but only of the "space belonging to a body ${\displaystyle A}$." The earth's crust plays such a dominant rôle in our daily life in judging the relative positions of bodies that it has led to an abstract conception of space which certainly cannot be defended. In order to free ourselves from this fatal error we shall speak only of "bodies of reference," or "space of reference." It was only through the theory of general relativity that refinement of these concepts became necessary, as we shall see later.

I shall not go into detail concerning those properties of the space of reference which lead to our conceiving points as elements of space, and space as a continuum. Nor shall I attempt to analyse further the properties of space which justify the conception of continuous series of points, or lines. If these concepts are assumed, together with their relation to the solid bodies of experience, then it is easy to say what we mean by the three-dimensionality of space; to each point three numbers, ${\displaystyle x_{1},x_{2},x_{3}}$ (co-ordinates), may be associated, in such a way that this association is uniquely reciprocal, and that ${\displaystyle x_{1},x_{2},x_{3}}$ vary continuously when the point describes a continuous series of points (a line).

It is assumed in pre-relativity physics that the laws of the orientation of ideal rigid bodies are consistent with Euclidean geometry. What this means may be expressed as follows: Two points marked on a rigid body form an interval. Such an interval can be oriented at rest, relatively to our space of reference, in a multiplicity of ways. If, now, the points of this space can be referred to co-ordinates ${\displaystyle x_{1},x_{2},x_{3}}$, in such a way that the differences of the co-ordinates, ${\displaystyle \Delta x_{1},\Delta x_{2},\Delta x_{3}}$ of the two ends of the interval furnish the same sum of squares,

 ${\displaystyle s^{2}=\Delta x_{1}^{2}+\Delta x_{2}^{2}+\Delta x_{3}^{2}}$ (1)
for every orientation of the interval, then the space of reference is called Euclidean, and the co-ordinates Cartesian.[1] It is sufficient, indeed, to make this assumption in the limit for an infinitely small interval. Involved in this assumption there are some which are rather less special, to which we must call attention on account of their fundamental significance. In the first place, it is assumed that one can move an ideal rigid body in an arbitrary manner. In the second place, it is assumed that the behaviour of ideal rigid bodies towards orientation is independent of the material of the bodies and their changes of position, in the sense that if two intervals can once be brought into coincidence, they can always and everywhere be brought into coincidence. Both of these assumptions, which are of fundamental importance for geometry and especially for physical measurements, naturally arise from experience; in the theory of general relativity their validity needs to be assumed only for bodies and spaces of reference which are infinitely small compared to astronomical dimensions.

The quantity ${\displaystyle s}$ we call the length of the interval. In order that this may be uniquely determined it is necessary to fix arbitrarily the length of a definite interval; for example, we can put it equal to 1 (unit of length). Then the lengths of all other intervals may be determined. If we make the ${\displaystyle x_{\nu }}$ linearly dependent upon a parameter ${\displaystyle \lambda }$,

 ${\displaystyle x_{\nu }=a_{\nu }+\lambda b_{\nu },}$
we obtain a line which has all the properties of the straight lines of the Euclidean geometry. In particular, it easily follows that by laying off ${\displaystyle n}$ times the interval ${\displaystyle s}$ upon a straight line, an interval of length ${\displaystyle n\cdot s}$ is obtained. A length, therefore, means the result of a measurement carried out along a straight line by means of a unit measuring rod. It has a significance which is as independent of the system of co-ordinates as that of a straight line, as will appear in the sequel.

We come now to a train of thought which plays an analogous rôle in the theories of special and general relativity. We ask the question: besides the Cartesian co-ordinates which we have used are there other equivalent co-ordinates? An interval has a physical meaning which is independent of the choice of co-ordinates; and so has the spherical surface which we obtain as the locus of the end points of all equal intervals that we lay off from an arbitrary point of our space of reference. If ${\displaystyle x_{\nu }}$ as well as ${\displaystyle x_{\nu }'}$ (${\displaystyle \nu }$ from 1 to 3) are Cartesian co-ordinates of our space of reference, then the spherical surface will be expressed in our two systems of co-ordinates by the equations

 ${\displaystyle \sum \Delta x_{\nu }^{2}={\text{const.}}}$ (2)
 ${\displaystyle \sum \Delta x_{\nu }'^{2}={\text{const.}}}$ (2a)

How must the ${\displaystyle x_{\nu }'}$ be expressed in terms of the ${\displaystyle x_{\nu }}$ in order that equations (2) and (2a) may be equivalent to each other? Regarding the ${\displaystyle x_{\nu }'}$ expressed as functions of the ${\displaystyle x_{\nu }}$, we can write, by Taylor's theorem, for small values of the ${\displaystyle \Delta x_{\nu }}$,

 ${\displaystyle \Delta x_{\nu }'=\sum \limits _{\alpha }{\frac {\delta x_{\nu }'}{\delta x_{\alpha }}}\Delta x_{\alpha }+{\frac {1}{2}}\sum \limits _{\alpha \beta }{\frac {\delta ^{2}x_{\nu }'}{\delta x_{\alpha }\delta x_{\beta }}}\Delta x_{\alpha }\Delta x_{\beta }+\dots }$

If we substitute (2a) in this equation and compare with (1), we see that the ${\displaystyle x_{\nu }'}$ must be linear functions of the ${\displaystyle x_{\nu }}$. If we therefore put

 ${\displaystyle x_{\nu }'=a_{\nu }+\sum \limits _{\alpha }b_{\nu \alpha }x_{\alpha }}$ (3)
 ${\displaystyle {\text{or }}\Delta x_{\nu }'=\sum \limits _{\alpha }b_{\nu \alpha }\Delta x_{\alpha }}$ (3a)

then the equivalence of equations (2) and (2a) is expressed in the form

 ${\displaystyle \sum \Delta x_{\nu }'^{2}=\lambda \sum \Delta x_{\nu }^{2}{\text{ (}}\lambda {\text{ independent of }}\Delta x_{\nu }{\text{)}}}$ (2b)

It therefore follows that ${\displaystyle \lambda }$ must be a constant. If we put ${\displaystyle \lambda =1}$, (2b) and (3a) furnish the conditions

 ${\displaystyle \sum \limits _{\nu }b_{\nu \alpha }b_{\nu \beta }=\delta _{\alpha \beta }}$ (4)

in which ${\displaystyle \delta _{\alpha \beta }=1}$, or ${\displaystyle \delta _{\alpha \beta }=0}$, according as ${\displaystyle \alpha =\beta }$ or ${\displaystyle \alpha \not =\beta }$. The conditions (4) are called the conditions of orthogonality, and the transformations (3), (4), linear orthogonal transformations. If we stipulate that ${\displaystyle s^{2}=\sum \Delta x_{\nu }^{2}}$ shall be equal to the square of the length in every system of co-ordinates, and if we always measure with the same unit scale, then ${\displaystyle \lambda }$ must be equal to 1. Therefore the linear orthogonal transformations are the only ones by means of which we can pass from one Cartesian system of co-ordinates in our space of reference to another. We see that in applying such transformations the equations of a straight line become equations of a straight line. Reversing equations (3a) by multiplying both sides by ${\displaystyle b_{\nu \beta }}$ and summing for all the ${\displaystyle \nu }$'s, we obtain

 ${\displaystyle \sum b_{\nu \beta }\Delta x_{\nu }'=\sum \limits _{\nu \alpha }b_{\nu \alpha }b_{\nu \beta }\Delta x_{\alpha }=\sum \limits _{\alpha }\delta _{\alpha \beta }\Delta x_{\alpha }=\Delta x_{\beta }.}$ (5)

The same coefficients, ${\displaystyle b}$, also determine the inverse substitution of ${\displaystyle \Delta x_{\nu }}$. Geometrically, ${\displaystyle b_{\nu \alpha }}$ is the cosine of the angle between the ${\displaystyle x_{\nu }'}$ axis and the ${\displaystyle x_{\alpha }}$ axis.
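The conditions (4) and the inversion (5) can be checked numerically. The following is a minimal sketch, assuming a rotation of the axes about the ${\displaystyle x_{3}}$ axis as the example transformation; the function names and the sample interval are illustrative choices, not part of the lecture.

```python
import math

def rotation_z(theta):
    """Coefficients b[nu][alpha] (direction cosines) for axes rotated about x_3 by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(b, dx):
    """Equation (3a): dx'_nu = sum over alpha of b[nu][alpha] * dx[alpha]."""
    return [sum(b[nu][a] * dx[a] for a in range(3)) for nu in range(3)]

def inverse_transform(b, dxp):
    """Equation (5): dx_beta = sum over nu of b[nu][beta] * dx'_nu (same coefficients)."""
    return [sum(b[nu][beta] * dxp[nu] for nu in range(3)) for beta in range(3)]

def orthogonality_defect(b):
    """Largest deviation of sum_nu b[nu][alpha] b[nu][beta] from delta_{alpha beta}, eq. (4)."""
    worst = 0.0
    for a in range(3):
        for be in range(3):
            total = sum(b[nu][a] * b[nu][be] for nu in range(3))
            worst = max(worst, abs(total - (1.0 if a == be else 0.0)))
    return worst

b = rotation_z(0.7)
dx = [1.0, -2.0, 0.5]                  # an arbitrary interval
dxp = transform(b, dx)
s2 = sum(v * v for v in dx)            # sum of squares in K
s2p = sum(v * v for v in dxp)          # sum of squares in K'
dx_back = inverse_transform(b, dxp)    # should reproduce dx
```

Up to rounding, `s2` and `s2p` agree and `dx_back` reproduces `dx`, which is the content of (2b) with ${\displaystyle \lambda =1}$ and of (5).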

To sum up, we can say that in the Euclidean geometry there are (in a given space of reference) preferred systems of co-ordinates, the Cartesian systems, which transform into each other by linear orthogonal transformations. The distance ${\displaystyle s}$ between two points of our space of reference, measured by a measuring rod, is expressed in such co-ordinates in a particularly simple manner. The whole of geometry may be founded upon this conception of distance. In the present treatment, geometry is related to actual things (rigid bodies), and its theorems are statements concerning the behaviour of these things, which may prove to be true or false.

One is ordinarily accustomed to study geometry divorced from any relation between its concepts and experience. There are advantages in isolating that which is purely logical and independent of what is, in principle, incomplete empiricism. This is satisfactory to the pure mathematician. He is satisfied if he can deduce his theorems from axioms correctly, that is, without errors of logic. The question as to whether Euclidean geometry is true or not does not concern him. But for our purpose it is necessary to associate the fundamental concepts of geometry with natural objects; without such an association geometry is worthless for the physicist. The physicist is concerned with the question as to whether the theorems of geometry are true or not. That Euclidean geometry, from this point of view, affirms something more than the mere deductions derived logically from definitions may be seen from the following simple consideration.

Between ${\displaystyle n}$ points of space there are ${\displaystyle {\frac {n(n-1)}{2}}}$ distances, ${\displaystyle s_{\mu \nu }}$; between these and the ${\displaystyle 3n}$ co-ordinates we have the relations

 ${\displaystyle s_{\mu \nu }^{2}=(x_{1(\mu )}-x_{1(\nu )})^{2}+(x_{2(\mu )}-x_{2(\nu )})^{2}+...}$

From these ${\displaystyle {\frac {n(n-1)}{2}}}$ equations the ${\displaystyle 3n}$ co-ordinates may be eliminated, and from this elimination at least ${\displaystyle {\frac {n(n-1)}{2}}-3n}$ equations in the ${\displaystyle s_{\mu \nu }}$ will result.[2] Since the ${\displaystyle s_{\mu \nu }}$ are measurable quantities, and by definition are independent of each other, these relations between the ${\displaystyle s_{\mu \nu }}$ are not necessary a priori.
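One concrete relation of this kind can be exhibited for five points. The sketch below assumes the Cayley-Menger determinant, which the text does not name, as a known example of an equation that the ten mutual distances of five points must satisfy in three-dimensional Euclidean space; the point sets are arbitrary illustrative choices. For ${\displaystyle n=5}$ the count ${\displaystyle {\frac {n(n-1)}{2}}-3n}$ is negative, yet a relation already exists, which is consistent with the words "at least."

```python
from fractions import Fraction
from itertools import combinations

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def determinant(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def cayley_menger(points):
    """Bordered matrix of squared distances; depends on the points only through s_{mu nu}."""
    n = len(points)
    m = [[0] + [1] * n]
    for p in points:
        m.append([1] + [squared_distance(p, q) for q in points])
    return determinant(m)

# Five points of three-dimensional space: ten distances, one relation between them.
points_3d = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3), (1, 1, 1)]
n_distances = len(list(combinations(range(len(points_3d)), 2)))
relation_3d = cayley_menger(points_3d)

# Ten distances taken from a four-dimensional configuration violate the relation.
points_4d = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
relation_4d = cayley_menger(points_4d)
```

The second configuration shows that a set of ten numbers can fail the relation, so that its validity for measured distances is a matter of experience rather than of logic.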

From the foregoing it is evident that the equations of transformation (3), (4) have a fundamental significance in Euclidean geometry, in that they govern the transformation from one Cartesian system of co-ordinates to another. The Cartesian systems of co-ordinates are characterized by the property that in them the measurable distance between two points, ${\displaystyle s}$, is expressed by the equation

 ${\displaystyle s^{2}=\sum \Delta x_{\nu }^{2}.}$

If ${\displaystyle K_{(x_{\nu })}}$ and ${\displaystyle K_{(x_{\nu })}'}$ are two Cartesian systems of co-ordinates, then

 ${\displaystyle \sum \Delta x_{\nu }^{2}=\sum \Delta x_{\nu }'^{2}.}$

The right-hand side is identically equal to the left-hand side on account of the equations of the linear orthogonal transformation, and the right-hand side differs from the left-hand side only in that the ${\displaystyle x_{\nu }}$ are replaced by the ${\displaystyle x_{\nu }'}$. This is expressed by the statement that ${\displaystyle \sum \Delta x_{\nu }^{2}}$ is an invariant with respect to linear orthogonal transformations. It is evident that in the Euclidean geometry only such, and all such, quantities have an objective significance, independent of the particular choice of the Cartesian co-ordinates, as can be expressed by an invariant with respect to linear orthogonal transformations. This is the reason that the theory of invariants, which has to do with the laws that govern the form of invariants, is so important for analytical geometry.

As a second example of a geometrical invariant, consider a volume. This is expressed by

 ${\displaystyle V=\iiint dx_{1}dx_{2}dx_{3}.}$

By means of Jacobi's theorem we may write

 ${\displaystyle \iiint dx_{1}'dx_{2}'dx_{3}'=\iiint {\frac {\delta (x_{1}',x_{2}',x_{3}')}{\delta (x_{1},x_{2},x_{3})}}dx_{1}dx_{2}dx_{3}}$
where the integrand in the last integral is the functional determinant of the ${\displaystyle x_{\nu }'}$ with respect to the ${\displaystyle x_{\nu }}$, and this by (3) is equal to the determinant ${\displaystyle |b_{\mu \nu }|}$ of the coefficients of substitution, ${\displaystyle b_{\nu \alpha }}$. If we form the determinant of the ${\displaystyle \delta _{\mu \alpha }}$ from equation (4), we obtain, by means of the theorem of multiplication of determinants,
 ${\displaystyle 1=\left|\delta _{\alpha \beta }\right|=\left|\sum \limits _{\nu }b_{\nu \alpha }b_{\nu \beta }\right|=\left|b_{\mu \nu }\right|^{2};\left|b_{\mu \nu }\right|=\pm 1}$ (6)

If we limit ourselves to those transformations which have the determinant +1,[3] and only these arise from continuous variations of the systems of co-ordinates, then ${\displaystyle V}$ is an invariant.
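The two signs in (6) can be illustrated by a short sketch, assuming a rotation about the ${\displaystyle x_{3}}$ axis as an example of determinant +1 and a reversal of the ${\displaystyle x_{3}}$ axis as an example of determinant −1. Both matrices are illustrative choices.

```python
import math

def det3(b):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
            - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
            + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))

theta = 0.4
c, s = math.cos(theta), math.sin(theta)

# Reachable from the identity by letting theta grow continuously from zero.
rotation = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

# Reverses the x_3 axis; also satisfies (4) but cannot be reached continuously.
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

det_rotation = det3(rotation)      # cos^2 + sin^2, i.e. +1 up to rounding
det_reflection = det3(reflection)  # -1
```

Since the functional determinant multiplies the volume element, only the first kind of transformation leaves ${\displaystyle V}$ unchanged in sign as well as in magnitude.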

Invariants, however, are not the only forms by means of which we can give expression to the independence of the particular choice of the Cartesian co-ordinates. Vectors and tensors are other forms of expression. Let us express the fact that the point with the current co-ordinates ${\displaystyle x_{\nu }}$ lies upon a straight line. We have

 ${\displaystyle x_{\nu }-A_{\nu }=\lambda B_{\nu }{\text{ (}}\nu {\text{ from 1 to 3).}}}$

Without limiting the generality we can put

 ${\displaystyle \sum B_{\nu }^{2}=1.}$

If we multiply the equations by ${\displaystyle b_{\beta \nu }}$ (compare (3a) and (5)) and sum for all the ${\displaystyle \nu }$'s, we get

 ${\displaystyle x_{\beta }'-A_{\beta }'=\lambda B_{\beta }'}$
where we have written
 ${\displaystyle B_{\beta }'=\sum \limits _{\nu }b_{\beta \nu }B_{\nu }{\text{; }}A_{\beta }'=\sum \limits _{\nu }b_{\beta \nu }A_{\nu }.}$

These are the equations of straight lines with respect to a second Cartesian system of co-ordinates ${\displaystyle K'}$. They have the same form as the equations with respect to the original system of co-ordinates. It is therefore evident that straight lines have a significance which is independent of the system of co-ordinates. Formally, this depends upon the fact that the quantities ${\displaystyle (x_{\nu }-A_{\nu })-\lambda B_{\nu }}$ are transformed as the components of an interval, ${\displaystyle \Delta x_{\nu }}$. The ensemble of three quantities, defined for every system of Cartesian co-ordinates, and which transform as the components of an interval, is called a vector. If the three components of a vector vanish for one system of Cartesian co-ordinates, they vanish for all systems, because the equations of transformation are homogeneous. We can thus get the meaning of the concept of a vector without referring to a geometrical representation. This behaviour of the equations of a straight line can be expressed by saying that the equation of a straight line is co-variant with respect to linear orthogonal transformations.
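The co-variance of the equation of a straight line can be verified on sample values. This is a sketch under the assumption that the transformation is a pure rotation (the constants ${\displaystyle a_{\nu }}$ of (3) are taken as zero); the numbers chosen for ${\displaystyle A_{\nu }}$, ${\displaystyle B_{\nu }}$ and ${\displaystyle \lambda }$ are arbitrary.

```python
import math

def rotate(b, v):
    """v'_beta = sum over nu of b[beta][nu] * v[nu]."""
    return [sum(b[beta][nu] * v[nu] for nu in range(3)) for beta in range(3)]

theta = 1.1
c, s = math.cos(theta), math.sin(theta)
b = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

A = [1.0, 2.0, 3.0]                     # a point on the line
B = [2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0]  # direction, normalised so that sum B^2 = 1
A_p = rotate(b, A)
B_p = rotate(b, B)

def residual(lam):
    """Largest component of x' - A' - lam * B' for the point with parameter lam."""
    x = [A[nu] + lam * B[nu] for nu in range(3)]
    x_p = rotate(b, x)
    return max(abs(x_p[i] - A_p[i] - lam * B_p[i]) for i in range(3))

worst = max(residual(lam) for lam in (-2.0, 0.0, 0.5, 3.0))
norm_B_p = sum(v * v for v in B_p)      # normalisation is preserved
```

The residual vanishes up to rounding for every value of the parameter, so the transformed points satisfy an equation of the same form in ${\displaystyle K'}$.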

We shall now show briefly that there are geometrical entities which lead to the concept of tensors. Let ${\displaystyle P_{0}}$ be the centre of a surface of the second degree, ${\displaystyle P}$ any point on the surface, and ${\displaystyle \xi _{\nu }}$ the projections of the interval ${\displaystyle P_{0}P}$ upon the co-ordinate axes. Then the equation of the surface is

 ${\displaystyle \sum a_{\mu \nu }\xi _{\mu }\xi _{\nu }=1.}$
In this, and in analogous cases, we shall omit the sign of summation, and understand that the summation is to be carried out for those indices that appear twice. We thus write the equation of the surface
 ${\displaystyle a_{\mu \nu }\xi _{\mu }\xi _{\nu }=1.}$

The quantities ${\displaystyle a_{\mu \nu }}$ determine the surface completely, for a given position of the centre, with respect to the chosen system of Cartesian co-ordinates. From the known law of transformation for the ${\displaystyle \xi _{\nu }}$, (3a) for linear orthogonal transformations, we easily find the law of transformation for the ${\displaystyle a_{\mu \nu }}$[4]:

 ${\displaystyle a_{\sigma \tau }'=b_{\sigma \mu }b_{\tau \nu }a_{\mu \nu }.}$

This transformation is homogeneous and of the first degree in the ${\displaystyle a_{\mu \nu }}$. On account of this transformation, the ${\displaystyle a_{\mu \nu }}$ are called components of a tensor of the second rank (the latter on account of the double index). If all the components, ${\displaystyle a_{\mu \nu }}$, of a tensor with respect to any system of Cartesian co-ordinates vanish, they vanish with respect to every other Cartesian system. The form and the position of the surface of the second degree is described by this tensor (${\displaystyle a}$).
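The transformation law for the ${\displaystyle a_{\mu \nu }}$ may be tried on an ellipsoid. The sketch below assumes semi-axes 1, 2, 3 along the original co-ordinate axes and a rotation composed of two elementary rotations; these values and the helper names are illustrative only.

```python
import math

R = range(3)

def general_rotation(t1, t2):
    """Product of a rotation about x_3 and one about x_1; entries are b[sigma][mu]."""
    c1, s1, c2, s2 = math.cos(t1), math.sin(t1), math.cos(t2), math.sin(t2)
    rz = [[c1, s1, 0.0], [-s1, c1, 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, c2, s2], [0.0, -s2, c2]]
    return [[sum(rx[i][k] * rz[k][j] for k in R) for j in R] for i in R]

def transform_vector(b, xi):
    """xi'_sigma = b[sigma][mu] xi[mu], as in (3a)."""
    return [sum(b[s][m] * xi[m] for m in R) for s in R]

def transform_rank2(b, a):
    """a'[sigma][tau] = b[sigma][mu] b[tau][nu] a[mu][nu], summed over mu and nu."""
    return [[sum(b[s][m] * b[t][n] * a[m][n] for m in R for n in R) for t in R] for s in R]

def quadratic_form(a, xi):
    """a[mu][nu] xi[mu] xi[nu], summed over both indices."""
    return sum(a[m][n] * xi[m] * xi[n] for m in R for n in R)

a = [[1.0, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 1.0 / 9.0]]
xi = [0.0, 2.0, 0.0]                 # a point on the surface: 0.25 * 4 = 1

b = general_rotation(0.3, 1.2)
a_p = transform_rank2(b, a)
xi_p = transform_vector(b, xi)

value = quadratic_form(a, xi)        # equals 1 in K
value_p = quadratic_form(a_p, xi_p)  # equals 1 in K' up to rounding
```

The same point therefore lies on the surface in both systems, although the individual components ${\displaystyle a_{\mu \nu }}$ and ${\displaystyle \xi _{\nu }}$ have changed.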

Analytic tensors of higher rank (number of indices) may be defined. It is possible and advantageous to regard vectors as tensors of rank 1, and invariants (scalars) as tensors of rank 0. In this respect, the problem of the theory of invariants may be so formulated: according to what laws may new tensors be formed from given tensors? We shall consider these laws now, in order to be able to apply them later. We shall deal first only with the properties of tensors with respect to the transformation from one Cartesian system to another in the same space of reference, by means of linear orthogonal transformations. As the laws are wholly independent of the number of dimensions, we shall leave this number, ${\displaystyle n}$, indefinite at first.

Definition. If a figure is defined with respect to every system of Cartesian co-ordinates in a space of reference of ${\displaystyle n}$ dimensions by the ${\displaystyle n^{\alpha }}$ numbers ${\displaystyle A_{\mu \nu \rho }...}$ (${\displaystyle \alpha }$ = number of indices), then these numbers are the components of a tensor of rank ${\displaystyle \alpha }$ if the transformation law is

 ${\displaystyle A_{\mu '\nu '\rho '}'...=b_{\mu '\mu }b_{\nu '\nu }b_{\rho '\rho }...A_{\mu \nu \rho }...}$ (7)

Remark. From this definition it follows that

 ${\displaystyle A_{\mu \nu \rho }...B_{\mu }C_{\nu }D_{\rho }...}$ (8)

is an invariant, provided that ${\displaystyle (B),(C),(D)...}$ are vectors. Conversely, the tensor character of ${\displaystyle (A)}$ may be inferred, if it is known that the expression (8) leads to an invariant for an arbitrary choice of the vectors ${\displaystyle (B),(C),}$ etc.

Addition and Subtraction. By addition and subtraction of the corresponding components of tensors of the same rank, a tensor of equal rank results:

 ${\displaystyle A_{\mu \nu \rho }...\pm B_{\mu \nu \rho }...=C_{\mu \nu \rho }...}$ (9)

The proof follows from the definition of a tensor given above.

Multiplication. From a tensor of rank ${\displaystyle \alpha }$ and a tensor of rank ${\displaystyle \beta }$ we may obtain a tensor of rank ${\displaystyle \alpha +\beta }$ by multiplying all the components of the first tensor by all the components of the second tensor:

 ${\displaystyle T_{\mu \nu \rho }..._{\alpha \beta \gamma }...=A_{\mu \nu \rho }...B_{\alpha \beta \gamma }...}$ (10)

Contraction. A tensor of rank ${\displaystyle \alpha -2}$ may be obtained from one of rank ${\displaystyle \alpha }$ by putting two definite indices equal to each other and then summing for this single index:

 ${\displaystyle T_{\rho }...=A_{\mu \mu \rho }...(=\sum \limits _{\mu }A_{\mu \mu \rho }...)}$ (11)

The proof is

 ${\displaystyle A_{\mu \mu \rho }'=b_{\mu \alpha }b_{\mu \beta }b_{\rho \gamma }...A_{\alpha \beta \gamma }...=\delta _{\alpha \beta }b_{\rho \gamma }...A_{\alpha \beta \gamma }...=b_{\rho \gamma }...A_{\alpha \alpha \gamma }...}$
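The rule of contraction can be tested on a tensor of rank 3. In this sketch the components of the tensor are arbitrary numbers and the transformation is a rotation about the ${\displaystyle x_{3}}$ axis, both chosen only for illustration. It compares contracting after transforming with transforming after contracting.

```python
import math

R = range(3)

theta = 0.8
c, s = math.cos(theta), math.sin(theta)
b = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

# Arbitrary components A[mu][nu][rho]; no symmetry is assumed.
A = [[[float((i + 1) * (j + 2) + k * k - i * j * k) for k in R] for j in R] for i in R]

def transform_rank3(b, t):
    """Equation (7) for rank 3: t'[i][j][k] = b[i][l] b[j][m] b[k][n] t[l][m][n]."""
    return [[[sum(b[i][l] * b[j][m] * b[k][n] * t[l][m][n]
                  for l in R for m in R for n in R)
              for k in R] for j in R] for i in R]

def contract_first_two(t):
    """Equation (11): T[rho] = sum over mu of t[mu][mu][rho]."""
    return [sum(t[m][m][r] for m in R) for r in R]

def transform_vector(b, v):
    return [sum(b[r][g] * v[g] for g in R) for r in R]

contracted_then_transformed = transform_vector(b, contract_first_two(A))
transformed_then_contracted = contract_first_two(transform_rank3(b, A))

difference = max(abs(x - y) for x, y in
                 zip(contracted_then_transformed, transformed_then_contracted))
```

The two results agree up to rounding, which is what it means for the contracted quantities to be the components of a vector.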

In addition to these elementary rules of operation there is also the formation of tensors by differentiation ("Erweiterung"):

 ${\displaystyle T_{\mu \nu \rho ...\alpha }={\frac {\delta A_{\mu \nu \rho }...}{\delta x_{\alpha }}}}$ (12)

New tensors, in respect to linear orthogonal transformations, may be formed from tensors according to these rules of operation.

Symmetrical Properties of Tensors. Tensors are called symmetrical or skew-symmetrical in respect to two of their indices, ${\displaystyle \mu }$ and ${\displaystyle \nu }$, if both the components which result from interchanging the indices ${\displaystyle \mu }$ and ${\displaystyle \nu }$ are equal to each other or equal with opposite signs.

Condition for symmetry : ${\displaystyle A_{\mu \nu \rho }=A_{\nu \mu \rho }}$.
Condition for skew-symmetry : ${\displaystyle A_{\mu \nu \rho }=-A_{\nu \mu \rho }}$.

Theorem. The character of symmetry or skew-symmetry exists independently of the choice of co-ordinates, and in this lies its importance. The proof follows from the equation defining tensors.

Special Tensors.

I. The quantities ${\displaystyle \delta _{\rho \sigma }}$ (4) are tensor components (fundamental tensor).

Proof. If in the right-hand side of the equation of transformation ${\displaystyle A_{\mu \nu }'=b_{\mu \alpha }b_{\nu \beta }A_{\alpha \beta }}$, we substitute for ${\displaystyle A_{\alpha \beta }}$ the quantities ${\displaystyle \delta _{\alpha \beta }}$ (which are equal to 1 or 0 according as ${\displaystyle \alpha =\beta }$ or ${\displaystyle \alpha \neq \beta }$), we get

 ${\displaystyle A_{\mu \nu }'=b_{\mu \alpha }b_{\nu \alpha }=\delta _{\mu \nu }.}$

The justification for the last sign of equality becomes evident if one applies (4) to the inverse substitution (5).

II. There is a tensor ${\displaystyle (\delta _{\mu \nu \rho }...)}$ skew-symmetrical with respect to all pairs of indices, whose rank is equal to the number of dimensions, ${\displaystyle n}$, and whose components are equal to ${\displaystyle +1}$ or ${\displaystyle -1}$ according as ${\displaystyle \mu \nu \rho }$ is an even or odd permutation of 123 ...

The proof follows with the aid of the theorem proved above ${\displaystyle \left|b_{\rho \sigma }\right|=1}$.
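For ${\displaystyle n=3}$ the statement can be checked directly. The sketch below builds the skew-symmetrical ${\displaystyle \delta _{\mu \nu \rho }}$ with indices counted from 0 (a programming convention, not the lecture's) and applies the transformation law (7) with a rotation and, for contrast, with a reversal of one axis.

```python
import math
from itertools import permutations

R = range(3)

def levi_civita():
    """eps[m][n][r] = +1 / -1 for even / odd permutations of (0, 1, 2), otherwise 0."""
    eps = [[[0.0 for _ in R] for _ in R] for _ in R]
    for p in permutations(R):
        inversions = sum(1 for i in R for j in range(i + 1, 3) if p[i] > p[j])
        eps[p[0]][p[1]][p[2]] = -1.0 if inversions % 2 else 1.0
    return eps

def transform_rank3(b, t):
    """t'[i][j][k] = b[i][l] b[j][m] b[k][n] t[l][m][n], summed over l, m, n."""
    return [[[sum(b[i][l] * b[j][m] * b[k][n] * t[l][m][n]
                  for l in R for m in R for n in R)
              for k in R] for j in R] for i in R]

def max_difference(u, t, sign=1.0):
    """Largest component of u - sign * t."""
    return max(abs(u[i][j][k] - sign * t[i][j][k]) for i in R for j in R for k in R)

theta = 0.9
c, s = math.cos(theta), math.sin(theta)
rotation = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]            # determinant +1
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]  # determinant -1

eps = levi_civita()
unchanged = max_difference(transform_rank3(rotation, eps), eps)        # components keep their values
flipped = max_difference(transform_rank3(reflection, eps), eps, -1.0)  # components change sign
```

The components are reproduced only when the determinant is +1; under the reversal of an axis every component changes sign, which is why the restriction ${\displaystyle \left|b_{\rho \sigma }\right|=1}$ enters the proof.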

These few simple theorems form the apparatus from the theory of invariants for building the equations of pre-relativity physics and the theory of special relativity.

We have seen that in pre-relativity physics, in order to specify relations in space, a body of reference, or a space of reference, is required, and, in addition, a Cartesian system of co-ordinates. We can fuse both these concepts into a single one by thinking of a Cartesian system of co-ordinates as a cubical frame-work formed of rods each of unit length. The co-ordinates of the lattice points of this frame are integral numbers. It follows from the fundamental relation

 ${\displaystyle s^{2}=\Delta x_{1}^{2}+\Delta x_{2}^{2}+\Delta x_{3}^{2}}$

that the members of such a space-lattice are all of unit length. To specify relations in time, we require in addition a standard clock placed at the origin of our Cartesian system of co-ordinates or frame of reference. If an event takes place anywhere we can assign to it three co-ordinates, ${\displaystyle x_{\nu }}$, and a time ${\displaystyle t}$ as soon as we have specified the time of the clock at the origin which is simultaneous with the event. We therefore give an objective significance to the statement of the simultaneity of distant events, while previously we have been concerned only with the simultaneity of two experiences of an individual. The time so specified is at all events independent of the position of the system of co-ordinates in our space of reference, and is therefore an invariant with respect to the transformation (3).

It is postulated that the system of equations expressing the laws of pre-relativity physics is co-variant with respect to the transformation (3), as are the relations of Euclidean geometry. The isotropy and homogeneity of space is expressed in this way.[5] We shall now consider some of the more important equations of physics from this point of view.

The equations of motion of a material particle are

 ${\displaystyle m{\frac {d^{2}x_{\nu }}{dt^{2}}}=X_{\nu }}$ (14)

${\displaystyle (dx_{\nu })}$ is a vector; ${\displaystyle dt}$, and therefore also ${\displaystyle {\frac {1}{dt}}}$, an invariant; thus ${\displaystyle \left({\frac {dx_{\nu }}{dt}}\right)}$ is a vector; in the same way it may be shown that ${\displaystyle \left({\frac {d^{2}x_{\nu }}{dt^{2}}}\right)}$ is a vector. In general, the operation of differentiation with respect to time does not alter the tensor character. Since ${\displaystyle m}$ is an invariant (tensor of rank 0), ${\displaystyle \left(m{\frac {d^{2}x_{\nu }}{dt^{2}}}\right)}$ is a vector, or tensor of rank 1 (by the theorem of the multiplication of tensors). If the force ${\displaystyle (X_{\nu })}$ has a vector character, the same holds for the difference ${\displaystyle \left(m{\frac {d^{2}x_{\nu }}{dt^{2}}}-X_{\nu }\right)}$. These equations of motion are therefore valid in every other system of Cartesian co-ordinates in the space of reference. In the case where the forces are conservative we can easily recognize the vector character of ${\displaystyle (X_{\nu })}$. For a potential energy, ${\displaystyle \Phi }$, exists, which depends only upon the mutual distances of the particles, and is therefore an invariant. The vector character of the force, ${\displaystyle X_{\nu }=-{\frac {\delta \Phi }{\delta x_{\nu }}}}$, is then a consequence of our general theorem about the derivative of a tensor of rank 0.

Multiplying by the velocity, a tensor of rank 1, we obtain the tensor equation

 ${\displaystyle \left(m{\frac {d^{2}x_{\nu }}{dt^{2}}}-X_{\nu }\right){\frac {dx_{\mu }}{dt}}=0.}$

By contraction and multiplication by the scalar ${\displaystyle dt}$ we obtain the equation of kinetic energy

 ${\displaystyle d\left({\frac {mq^{2}}{2}}\right)=X_{\nu }dx_{\nu }.}$
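The intermediate steps may be written out as follows. This is a sketch which assumes that ${\displaystyle m}$ is constant and that ${\displaystyle q}$ denotes the magnitude of the velocity, so that ${\displaystyle q^{2}}$ is the sum of the squares of the velocity components; the summation convention for repeated indices is used.

```latex
% Contract the tensor equation (put \mu = \nu and sum), then multiply by dt:
\left(m\frac{d^{2}x_{\nu}}{dt^{2}} - X_{\nu}\right)\frac{dx_{\nu}}{dt}\,dt = 0 .

% The first term is a total differential (m assumed constant):
m\,\frac{d^{2}x_{\nu}}{dt^{2}}\,\frac{dx_{\nu}}{dt}\,dt
  = d\!\left[\frac{m}{2}\,\frac{dx_{\nu}}{dt}\,\frac{dx_{\nu}}{dt}\right]
  = d\!\left(\frac{mq^{2}}{2}\right),
\qquad q^{2} = \frac{dx_{\nu}}{dt}\,\frac{dx_{\nu}}{dt} .

% The second term is the work done in the displacement dx_\nu:
X_{\nu}\,\frac{dx_{\nu}}{dt}\,dt = X_{\nu}\,dx_{\nu} .
```

Both sides of the resulting equation are contractions of vectors and are therefore invariants.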

If ${\displaystyle \xi _{\nu }}$ denotes the difference of the co-ordinates of the material particle and a point fixed in space, then the ${\displaystyle \xi _{\nu }}$ have the character of vectors. We evidently have ${\displaystyle {\frac {d^{2}x_{\nu }}{dt^{2}}}={\frac {d^{2}\xi _{\nu }}{dt^{2}}}}$, so that the equations of motion of the particle may be written

 ${\displaystyle m{\frac {d^{2}\xi _{\nu }}{dt^{2}}}-X_{\nu }=0.}$

Multiplying this equation by ${\displaystyle \xi _{\mu }}$ we obtain a tensor equation

 ${\displaystyle \left(m{\frac {d^{2}\xi _{\nu }}{dt^{2}}}-X_{\nu }\right)\xi _{\mu }=0.}$

Contracting the tensor on the left and taking the time average we obtain the virial theorem, which we shall not consider further. By interchanging the indices and subsequent subtraction, we obtain, after a simple transformation, the theorem of moments,

 ${\displaystyle {\frac {d}{dt}}\left[m\left(\xi _{\mu }{\frac {d\xi _{\nu }}{dt}}-\xi _{\nu }{\frac {d\xi _{\mu }}{dt}}\right)\right]=\xi _{\mu }X_{\nu }-\xi _{\nu }X_{\mu }}$ (15)

It is evident in this way that the moment of a vector is not a vector but a tensor. On account of their skew-symmetrical character there are not nine, but only three independent equations of this system. The possibility of replacing skew-symmetrical tensors of the second rank in space of three dimensions by vectors depends upon the formation of the vector

 ${\displaystyle A_{\mu }={\frac {1}{2}}A_{\sigma \tau }\delta _{\sigma \tau \mu }.}$

If we multiply the skew-symmetrical tensor of rank 2 by the special skew-symmetrical tensor ${\displaystyle \delta }$ introduced above, and contract twice, a vector results whose components are numerically equal to those of the tensor. These are the so-called axial vectors which transform differently, from a right-handed system to a left-handed system, from the ${\displaystyle \Delta x_{\nu }}$. There is a gain in picturesqueness in regarding a skew-symmetrical tensor of rank 2 as a vector in space of three dimensions, but it does not represent the exact nature of the corresponding quantity so well as considering it a tensor.
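The passage from the skew-symmetrical tensor to the vector can be followed on numbers. The sketch below assumes, for illustration, the moment tensor ${\displaystyle \xi _{\mu }p_{\nu }-\xi _{\nu }p_{\mu }}$ of an arbitrary vector ${\displaystyle (p_{\nu })}$ and compares the doubly contracted product with the familiar vector product; indices are counted from 0 as a programming convention.

```python
from itertools import permutations

R = range(3)

def levi_civita():
    """eps[s][t][m] = +1 / -1 for even / odd permutations of (0, 1, 2), otherwise 0."""
    eps = [[[0.0 for _ in R] for _ in R] for _ in R]
    for p in permutations(R):
        inversions = sum(1 for i in R for j in range(i + 1, 3) if p[i] > p[j])
        eps[p[0]][p[1]][p[2]] = -1.0 if inversions % 2 else 1.0
    return eps

xi = [1.0, 2.0, 3.0]   # position relative to the fixed point
p = [-1.0, 0.5, 4.0]   # any vector, e.g. a momentum or a force

# Skew-symmetrical tensor of rank 2: nine components, three of them independent.
M = [[xi[m] * p[n] - xi[n] * p[m] for n in R] for m in R]

eps = levi_civita()

# A[mu] = (1/2) M[sigma][tau] eps[sigma][tau][mu], contracted over sigma and tau.
A = [0.5 * sum(M[s][t] * eps[s][t][m] for s in R for t in R) for m in R]

# Vector product written out by components, for comparison.
cross = [xi[1] * p[2] - xi[2] * p[1],
         xi[2] * p[0] - xi[0] * p[2],
         xi[0] * p[1] - xi[1] * p[0]]

difference = max(abs(x - y) for x, y in zip(A, cross))
```

Each component of `A` coincides with one of the independent components of `M`, for example `A[2]` with `M[0][1]`, in agreement with the remark that the components of the vector are numerically equal to those of the tensor.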

We consider next the equations of motion of a continuous medium. Let ${\displaystyle \rho }$ be the density, ${\displaystyle u_{\nu }}$ the velocity components considered as functions of the co-ordinates and the time, ${\displaystyle X_{\nu }}$ the volume forces per unit of mass, and ${\displaystyle p_{\nu \sigma }}$ the stresses upon a surface perpendicular to the ${\displaystyle \sigma }$-axis in the direction of increasing ${\displaystyle x_{\nu }}$. Then the equations of motion are, by Newton's law,

 ${\displaystyle \rho {\frac {du_{\nu }}{dt}}=-{\frac {\delta p_{\nu \sigma }}{\delta x_{\sigma }}}+\rho X_{\nu }}$

in which ${\displaystyle {\frac {du_{\nu }}{dt}}}$ is the acceleration of the particle which at time ${\displaystyle t}$ has the co-ordinates ${\displaystyle x_{\nu }}$. If we express this acceleration by partial differential coefficients, we obtain, after dividing by ${\displaystyle \rho }$,

 ${\displaystyle {\frac {\delta u_{\nu }}{\delta t}}+{\frac {\delta u_{\nu }}{\delta x_{\sigma }}}u_{\sigma }=-{\frac {1}{\rho }}{\frac {\delta p_{\nu \sigma }}{\delta x_{\sigma }}}+X_{\nu }}$ (16)
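The passage from the acceleration of a particle to the left-hand side of (16) may be illustrated numerically. The sketch below assumes a hypothetical velocity field, chosen only for illustration, and compares two finite-difference estimates: the change of velocity following the particle, and the sum of the local time derivative and the contracted term ${\displaystyle {\frac {\delta u_{\nu }}{\delta x_{\sigma }}}u_{\sigma }}$.

```python
def velocity(x, t):
    """Hypothetical velocity field u_nu(x, t); an assumption for illustration."""
    return [x[1] * t, 1.0, 0.0]


def acceleration_partial(x, t, h=1e-5):
    """Left side of (16): local time derivative plus (du_nu/dx_sigma) u_sigma,
    both estimated with central differences."""
    u = velocity(x, t)
    result = []
    for nu in range(3):
        du_dt = (velocity(x, t + h)[nu] - velocity(x, t - h)[nu]) / (2 * h)
        convective = 0.0
        for s in range(3):
            x_plus, x_minus = list(x), list(x)
            x_plus[s] += h
            x_minus[s] -= h
            du_dx = (velocity(x_plus, t)[nu] - velocity(x_minus, t)[nu]) / (2 * h)
            convective += du_dx * u[s]
        result.append(du_dt + convective)
    return result


def acceleration_following_particle(x, t, h=1e-5):
    """du_nu/dt for the particle at x at time t: the field is sampled at the
    positions the particle occupies shortly before and after t."""
    u = velocity(x, t)
    x_after = [x[i] + u[i] * h for i in range(3)]
    x_before = [x[i] - u[i] * h for i in range(3)]
    u_after = velocity(x_after, t + h)
    u_before = velocity(x_before, t - h)
    return [(u_after[i] - u_before[i]) / (2 * h) for i in range(3)]


point, time = [0.5, 2.0, -1.0], 3.0
print(acceleration_partial(point, time))
print(acceleration_following_particle(point, time))
```

For this particular field the first component of the acceleration is ${\displaystyle x_{2}+t}$, and both estimates agree with it to within the error of the finite differences.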

We must show that this equation holds independently of the special choice of the Cartesian system of co-ordinates. ${\displaystyle (u_{\nu })}$ is a vector, and therefore ${\displaystyle {\frac {\delta u_{\nu }}{\delta t}}}$ is also a vector, ${\displaystyle {\frac {\delta u_{\nu }}{\delta x_{\sigma }}}}$ is a tensor of rank 2, ${\displaystyle {\frac {\delta u_{\nu }}{\delta x_{\sigma }}}u_{\tau }}$ is a tensor of rank 3. The second term on the left results from contraction in the indices ${\displaystyle \sigma ,\tau }$. The vector character of the second term on the right is obvious. In order that the first term on the right may also be a vector it is necessary for ${\displaystyle p_{\nu \sigma }}$ to be a tensor. Then by differentiation and contraction ${\displaystyle {\frac {\delta p_{\nu \sigma }}{\delta x_{\sigma }}}}$ results, and is therefore a vector, as it also is after multiplication by the reciprocal scalar ${\displaystyle {\frac {1}{\rho }}}$. That ${\displaystyle p_{\nu \sigma }}$ is a tensor, and therefore transforms according to the equation

 ${\displaystyle p_{\mu \nu }'=b_{\mu \alpha }b_{\nu \beta }p_{\alpha \beta },}$

is proved in mechanics by integrating this equation over an infinitely small tetrahedron. It is also proved there, by application of the theorem of moments to an infinitely small parallelopipedon, that ${\displaystyle p_{\nu \sigma }=p_{\sigma \nu }}$, and hence that the tensor of the stress is a symmetrical tensor. From what has been said it follows that, with the aid of the rules given above, the equation is co-variant with respect to orthogonal transformations in space (rotational transformations); and the rules according to which the quantities in the equation must be transformed in order that the equation may be co-variant also become evident.
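The transformation law for the stress tensor can be tried out on numbers. The following minimal sketch assumes a rotation about the ${\displaystyle x_{3}}$-axis as the orthogonal transformation and arbitrary sample stress components; it applies ${\displaystyle p_{\mu \nu }'=b_{\mu \alpha }b_{\nu \beta }p_{\alpha \beta }}$ and checks that the symmetry and the contraction ${\displaystyle p_{\mu \mu }}$ are preserved.

```python
import math


def rotation_about_x3(angle):
    """Coefficients b_{mu alpha} of a rotation about the x_3 axis (assumed example)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def transform_rank2(b, p):
    """p'_{mu nu} = b_{mu alpha} b_{nu beta} p_{alpha beta}."""
    return [
        [sum(b[m][a] * b[n][k] * p[a][k] for a in range(3) for k in range(3))
         for n in range(3)]
        for m in range(3)
    ]


def trace(p):
    """Contraction p_{mu mu}, a scalar."""
    return sum(p[m][m] for m in range(3))


# Sample symmetrical stress components (assumed values).
p = [[2.0, 0.5, -1.0],
     [0.5, 3.0, 0.25],
     [-1.0, 0.25, 1.5]]
b = rotation_about_x3(0.7)
p_new = transform_rank2(b, p)
print(p_new)
print(trace(p), trace(p_new))
```

A symmetrical tensor remains symmetrical in the new system, and a stress of the form ${\displaystyle p\delta _{\mu \nu }}$ keeps the same components in every Cartesian system.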

The co-variance of the equation of continuity,

 ${\displaystyle {\frac {\delta \rho }{\delta t}}+{\frac {\delta (\rho u_{\nu })}{\delta x_{\nu }}}=0}$ (17)

requires, from the foregoing, no particular discussion.

We shall also test for co-variance the equations which express the dependence of the stress components upon the properties of the matter, and set up these equations for the case of a compressible viscous fluid with the aid of the conditions of co-variance. If we neglect the viscosity, the pressure, ${\displaystyle p}$, will be a scalar, and will depend only upon the density and the temperature of the fluid. The contribution to the stress tensor is then evidently

 ${\displaystyle p\delta _{\mu \nu }}$

in which ${\displaystyle \delta _{\mu \nu }}$ is the special symmetrical tensor. This term will also be present in the case of a viscous fluid. But in this case there will also be pressure terms, which depend upon the space derivatives of the ${\displaystyle u_{\nu }}$. We shall assume that this dependence is a linear one. Since these terms must be symmetrical tensors, the only ones which enter will be

 ${\displaystyle \alpha \left({\frac {\delta u_{\mu }}{\delta x_{\nu }}}+{\frac {\delta u_{\nu }}{\delta x_{\mu }}}\right)+\beta \delta _{\mu \nu }{\frac {\delta u_{\alpha }}{\delta x_{\alpha }}}}$

(for ${\displaystyle {\frac {\delta u_{\alpha }}{\delta x_{\alpha }}}}$ is a scalar). For physical reasons (no slipping) it is assumed that for symmetrical dilatations in all directions, i.e. when

 ${\displaystyle {\frac {\delta u_{1}}{\delta x_{1}}}={\frac {\delta u_{2}}{\delta x_{2}}}={\frac {\delta u_{3}}{\delta x_{3}}}{\text{; }}{\frac {\delta u_{1}}{\delta x_{2}}}{\text{, etc.}}=0,}$

there are no frictional forces present, from which it follows that ${\displaystyle \beta =-{\frac {2}{3}}\alpha }$. If only ${\displaystyle {\frac {\delta u_{1}}{\delta x_{3}}}}$ is different from zero, let ${\displaystyle p_{31}=-\eta {\frac {\delta u_{1}}{\delta x_{3}}}}$, by which ${\displaystyle \alpha }$ is determined. We then obtain for the complete stress tensor,

 ${\displaystyle p_{\mu \nu }=p\delta _{\mu \nu }-\eta \left[\left({\frac {\delta u_{\mu }}{\delta x_{\nu }}}+{\frac {\delta u_{\nu }}{\delta x_{\mu }}}\right)-{\frac {2}{3}}\left({\frac {\delta u_{1}}{\delta x_{1}}}+{\frac {\delta u_{2}}{\delta x_{2}}}+{\frac {\delta u_{3}}{\delta x_{3}}}\right)\delta _{\mu \nu }\right]}$ (18)
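The two conditions used to fix ${\displaystyle \alpha }$ and ${\displaystyle \beta }$ can be verified directly from (18). The sketch below assumes zero-based indices, so that `grad_u[0][2]` stands for ${\displaystyle {\frac {\delta u_{1}}{\delta x_{3}}}}$, and uses arbitrary sample values for the pressure, the viscosity, and the velocity gradients.

```python
def stress_tensor(p, eta, grad_u):
    """Equation (18). grad_u[mu][nu] holds d u_mu / d x_nu (zero-based indices)."""
    div_u = sum(grad_u[a][a] for a in range(3))
    result = []
    for m in range(3):
        row = []
        for n in range(3):
            delta = 1.0 if m == n else 0.0
            viscous = (grad_u[m][n] + grad_u[n][m]) - (2.0 / 3.0) * div_u * delta
            row.append(p * delta - eta * viscous)
        result.append(row)
    return result


# Symmetrical dilatation in all directions: no frictional forces remain.
k = 0.5
dilatation = [[k if m == n else 0.0 for n in range(3)] for m in range(3)]
print(stress_tensor(p=2.0, eta=0.1, grad_u=dilatation))

# Only d u_1 / d x_3 differs from zero: p_31 = -eta * d u_1 / d x_3.
shear = [[0.0, 0.0, 4.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(stress_tensor(p=2.0, eta=0.1, grad_u=shear)[2][0])
```

In the first case only the scalar pressure term survives; in the second the component ${\displaystyle p_{31}}$ equals ${\displaystyle -\eta {\frac {\delta u_{1}}{\delta x_{3}}}}$, and by symmetry so does ${\displaystyle p_{13}}$.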

The heuristic value of the theory of invariants, which arises from the isotropy of space (equivalence of all directions), becomes evident from this example.

We consider, finally, Maxwell's equations in the form which is the foundation of the electron theory of Lorentz.

 ${\displaystyle \left.{\begin{aligned}{\frac {\delta h_{3}}{\delta x_{2}}}-{\frac {\delta h_{2}}{\delta x_{3}}}&={\frac {1}{c}}{\frac {\delta e_{1}}{\delta t}}+{\frac {1}{c}}i_{1}\\{\frac {\delta h_{1}}{\delta x_{3}}}-{\frac {\delta h_{3}}{\delta x_{1}}}&={\frac {1}{c}}{\frac {\delta e_{2}}{\delta t}}+{\frac {1}{c}}i_{2}\\&\cdots \\{\frac {\delta e_{1}}{\delta x_{1}}}+{\frac {\delta e_{2}}{\delta x_{2}}}&+{\frac {\delta e_{3}}{\delta x_{3}}}=\rho \end{aligned}}\right\}}$ (19)
 ${\displaystyle \left.{\begin{aligned}{\frac {\delta e_{3}}{\delta x_{2}}}-{\frac {\delta e_{2}}{\delta x_{3}}}&=-{\frac {1}{c}}{\frac {\delta h_{1}}{\delta t}}\\{\frac {\delta e_{1}}{\delta x_{3}}}-{\frac {\delta e_{3}}{\delta x_{1}}}&=-{\frac {1}{c}}{\frac {\delta h_{2}}{\delta t}}\\&\cdots \\{\frac {\delta h_{1}}{\delta x_{1}}}+{\frac {\delta h_{2}}{\delta x_{2}}}&+{\frac {\delta h_{3}}{\delta x_{3}}}=0\end{aligned}}\right\}}$ (20)

${\displaystyle \mathbf {i} }$ is a vector, because the current density is defined as the density of electricity multiplied by the vector velocity of the electricity. According to the first three equations it is evident that ${\displaystyle \mathbf {e} }$ is also to be regarded as a vector. Then ${\displaystyle \mathbf {h} }$ cannot be regarded as a vector.[6] The equations may, however, easily be interpreted if ${\displaystyle \mathbf {h} }$ is regarded as a skew-symmetrical tensor of the second rank. In this sense, we write ${\displaystyle h_{23},h_{31},h_{12}}$ in place of ${\displaystyle h_{1},h_{2},h_{3}}$ respectively. Paying attention to the skew-symmetry of ${\displaystyle h_{\mu \nu }}$, the first three equations of (19) and (20) may be written in the form

 ${\displaystyle {\frac {\delta h_{\mu \nu }}{\delta x_{\nu }}}={\frac {1}{c}}{\frac {\delta e_{\mu }}{\delta t}}+{\frac {1}{c}}i_{\mu }}$ (19a)
 ${\displaystyle {\frac {\delta e_{\mu }}{\delta x_{\nu }}}-{\frac {\delta e_{\nu }}{\delta x_{\mu }}}=+{\frac {1}{c}}{\frac {\delta h_{\mu \nu }}{\delta t}}}$ (20a)

In contrast to ${\displaystyle \mathbf {e} }$, ${\displaystyle \mathbf {h} }$ appears as a quantity which has the same type of symmetry as an angular velocity. The divergence equations then take the form

 ${\displaystyle {\frac {\delta e_{\nu }}{\delta x_{\nu }}}=\rho }$ (19b)
 ${\displaystyle {\frac {\delta h_{\mu \nu }}{\delta x_{\rho }}}+{\frac {\delta h_{\nu \rho }}{\delta x_{\mu }}}+{\frac {\delta h_{\rho \mu }}{\delta x_{\nu }}}=0}$ (20b)

The last equation is a skew-symmetrical tensor equation of the third rank (the skew-symmetry of the left-hand side with respect to every pair of indices may easily be proved, if attention is paid to the skew-symmetry of ${\displaystyle h_{\mu \nu }}$). This notation is more natural than the usual one, because, in contrast to the latter, it is applicable to Cartesian left-handed systems as well as to right-handed systems without change of sign.
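That the tensor notation reproduces the component equations may be confirmed with a short computation. The sketch below assumes zero-based indices and an arbitrary table `D[k][r]` of sample values standing for ${\displaystyle {\frac {\delta h_{k}}{\delta x_{r}}}}$; the helper names are illustrative. It forms the derivatives of ${\displaystyle h_{\mu \nu }}$ under the replacement ${\displaystyle h_{23},h_{31},h_{12}}$ for ${\displaystyle h_{1},h_{2},h_{3}}$, and compares the left-hand sides of (19a) and (20b) with those of (19) and (20).

```python
def levi_civita(i, j, k):
    """+1 for even permutations of (0, 1, 2), -1 for odd ones, 0 otherwise."""
    return (i - j) * (j - k) * (k - i) // 2


def h_tensor_gradient(D):
    """dH[m][n][r] = d h_{mn} / d x_r, where h_{mn} = delta_{mnk} h_k
    (so h_23 = h_1, h_31 = h_2, h_12 = h_3) and D[k][r] = d h_k / d x_r."""
    return [
        [[sum(levi_civita(m, n, k) * D[k][r] for k in range(3)) for r in range(3)]
         for n in range(3)]
        for m in range(3)
    ]


def lhs_19a(dH):
    """d h_{mu nu} / d x_nu, summed over nu."""
    return [sum(dH[m][n][n] for n in range(3)) for m in range(3)]


def lhs_20b(dH, m, n, r):
    """Cyclic sum on the left side of (20b)."""
    return dH[m][n][r] + dH[n][r][m] + dH[r][m][n]


def curl_components(D):
    """Left sides of the first three equations of (19), written out."""
    return [D[2][1] - D[1][2], D[0][2] - D[2][0], D[1][0] - D[0][1]]


D = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]  # sample derivative values (assumed)
dH = h_tensor_gradient(D)
print(lhs_19a(dH), curl_components(D))
print(lhs_20b(dH, 0, 1, 2), D[0][0] + D[1][1] + D[2][2])
```

The cyclic sum in (20b) changes sign when two indices are interchanged and vanishes when two indices coincide, so that it yields only one independent equation, the fourth equation of (20).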

1. This relation must hold for an arbitrary choice of the origin and of the direction (ratios ${\displaystyle \Delta x_{1}:\Delta x_{2}:\Delta x_{3}}$) of the interval.
2. In reality there are ${\displaystyle {\frac {n(n-1)}{2}}-3n+6}$
3. There are thus two kinds of Cartesian systems which are designated as "right-handed" and "left-handed" systems. The difference between these is familiar to every physicist and engineer. It is interesting to note that these two kinds of systems cannot be defined geometrically, but only the contrast between them.
4. The equation ${\displaystyle a_{\sigma \tau }'\xi _{\sigma }'\xi _{\tau }'=1}$ may, by (5), be replaced by ${\displaystyle a_{\sigma \tau }'b_{\mu \sigma }b_{\nu \tau }\xi _{\sigma }\xi _{\tau }=1}$, from which the result stated immediately follows.
5. The laws of physics could be expressed, even in case there were a unique direction in space, in such a way as to be co-variant with respect to the transformation (3); but such an expression would in this case be unsuitable. If there were a unique direction in space it would simplify the description of natural phenomena to orient the system of co-ordinates in a definite way in this direction. But if, on the other hand, there is no unique direction in space it is not logical to formulate the laws of nature in such a way as to conceal the equivalence of systems of co-ordinates that are oriented differently. We shall meet with this point of view again in the theories of special and general relativity.
6. These considerations will make the reader familiar with tensor operations without the special difficulties of the four-dimensional treatment; corresponding considerations in the theory of special relativity (Minkowski's interpretation of the field) will then offer fewer difficulties.