The Meaning of Relativity/Lecture 1

The Meaning of Relativity
by Albert Einstein
Lecture I. Space and Time in Pre-Relativity Physics

THE MEANING OF RELATIVITY


LECTURE I

SPACE AND TIME IN PRE-RELATIVITY PHYSICS

The theory of relativity is intimately connected with the theory of space and time. I shall therefore begin with a brief investigation of the origin of our ideas of space and time, although in doing so I know that I introduce a controversial subject. The object of all science, whether natural science or psychology, is to co-ordinate our experiences and to bring them into a logical system. How are our customary ideas of space and time related to the character of our experiences?

The experiences of an individual appear to us arranged in a series of events; in this series the single events which we remember appear to be ordered according to the criterion of "earlier" and "later," which cannot be analysed further. There exists, therefore, for the individual, an I-time, or subjective time. This in itself is not measurable. I can, indeed, associate numbers with the events, in such a way that a greater number is associated with the later event than with an earlier one; but the nature of this association may be quite arbitrary. This association I can define by means of a clock by comparing the order of events furnished by the clock with the order of the given series of events. We understand by a clock something which provides a series of events which can be counted, and which has other properties of which we shall speak later.

By the aid of speech different individuals can, to a certain extent, compare their experiences. In this way it is shown that certain sense perceptions of different individuals correspond to each other, while for other sense perceptions no such correspondence can be established. We are accustomed to regard as real those sense perceptions which are common to different individuals, and which therefore are, in a measure, impersonal. The natural sciences, and in particular, the most fundamental of them, physics, deal with such sense perceptions. The conception of physical bodies, in particular of rigid bodies, is a relatively constant complex of such sense perceptions. A clock is also a body, or a system, in the same sense, with the additional property that the series of events which it counts is formed of elements all of which can be regarded as equal.

The only justification for our concepts and system of concepts is that they serve to represent the complex of our experiences; beyond this they have no legitimacy. I am convinced that the philosophers have had a harmful effect upon the progress of scientific thinking in removing certain fundamental concepts from the domain of empiricism, where they are under our control, to the intangible heights of the a priori. For even if it should appear that the universe of ideas cannot be deduced from experience by logical means, but is, in a sense, a creation of the human mind, without which no science is possible, nevertheless this universe of ideas is just as little independent of the nature of our experiences as clothes are of the form of the human body. This is particularly true of our concepts of time and space, which physicists have been obliged by the facts to bring down from the Olympus of the a priori in order to adjust them and put them in a serviceable condition.

We now come to our concepts and judgments of space. It is essential here also to pay strict attention to the relation of experience to our concepts. It seems to me that Poincaré clearly recognized the truth in the account he gave in his book, "La Science et l'Hypothèse." Among all the changes which we can perceive in a rigid body those are marked by their simplicity which can be made reversibly by an arbitrary motion of the body; Poincaré calls these changes in position. By means of simple changes in position we can bring two bodies into contact. The theorems of congruence, fundamental in geometry, have to do with the laws that govern such changes in position. For the concept of space the following seems essential. We can form new bodies by bringing bodies $B$, $C$, ... up to body $A$; we say that we continue body $A$. We can continue body $A$ in such a way that it comes into contact with any other body, $X$. The ensemble of all continuations of body $A$ we can designate as the "space of the body $A$." Then it is true that all bodies are in the "space of the (arbitrarily chosen) body $A$." In this sense we cannot speak of space in the abstract, but only of the "space belonging to a body $A$." The earth's crust plays such a dominant rôle in our daily life in judging the relative positions of bodies that it has led to an abstract conception of space which certainly cannot be defended. In order to free ourselves from this fatal error we shall speak only of "bodies of reference," or "space of reference." It was only through the theory of general relativity that refinement of these concepts became necessary, as we shall see later.

I shall not go into detail concerning those properties of the space of reference which lead to our conceiving points as elements of space, and space as a continuum. Nor shall I attempt to analyse further the properties of space which justify the conception of continuous series of points, or lines. If these concepts are assumed, together with their relation to the solid bodies of experience, then it is easy to say what we mean by the three-dimensionality of space; to each point three numbers, $x_1$, $x_2$, $x_3$ (co-ordinates), may be associated, in such a way that this association is uniquely reciprocal, and that $x_1$, $x_2$, $x_3$ vary continuously when the point describes a continuous series of points (a line).

It is assumed in pre-relativity physics that the laws of the orientation of ideal rigid bodies are consistent with Euclidean geometry. What this means may be expressed as follows: Two points marked on a rigid body form an interval. Such an interval can be oriented at rest, relatively to our space of reference, in a multiplicity of ways. If, now, the points of this space can be referred to co-ordinates $x_1$, $x_2$, $x_3$, in such a way that the differences of the co-ordinates, $\Delta x_1$, $\Delta x_2$, $\Delta x_3$, of the two ends of the interval furnish the same sum of squares,

$$s^2 = \Delta x_1^2 + \Delta x_2^2 + \Delta x_3^2 \tag{1}$$
for every orientation of the interval, then the space of reference is called Euclidean, and the co-ordinates Cartesian.[1] It is sufficient, indeed, to make this assumption in the limit for an infinitely small interval. Involved in this assumption there are some which are rather less special, to which we must call attention on account of their fundamental significance. In the first place, it is assumed that one can move an ideal rigid body in an arbitrary manner. In the second place, it is assumed that the behaviour of ideal rigid bodies towards orientation is independent of the material of the bodies and their changes of position, in the sense that if two intervals can once be brought into coincidence, they can always and everywhere be brought into coincidence. Both of these assumptions, which are of fundamental importance for geometry and especially for physical measurements, naturally arise from experience; in the theory of general relativity their validity needs to be assumed only for bodies and spaces of reference which are infinitely small compared to astronomical dimensions.

The quantity $s$ we call the length of the interval. In order that this may be uniquely determined it is necessary to fix arbitrarily the length of a definite interval; for example, we can put it equal to 1 (unit of length). Then the lengths of all other intervals may be determined. If we make the $x_\nu$ linearly dependent upon a parameter $\lambda$,

$$x_\nu = a_\nu + \lambda b_\nu,$$

we obtain a line which has all the properties of the straight lines of the Euclidean geometry. In particular, it easily follows that by laying off $n$ times the interval $s$ upon a straight line, an interval of length $n \cdot s$ is obtained. A length, therefore, means the result of a measurement carried out along a straight line by means of a unit measuring rod. It has a significance which is as independent of the system of co-ordinates as that of a straight line, as will appear in the sequel.

We come now to a train of thought which plays an analogous rôle in the theories of special and general relativity. We ask the question: besides the Cartesian co-ordinates which we have used are there other equivalent co-ordinates? An interval has a physical meaning which is independent of the choice of co-ordinates; and so has the spherical surface which we obtain as the locus of the end points of all equal intervals that we lay off from an arbitrary point of our space of reference. If $x_\nu$ as well as $x'_\nu$ ($\nu$ from 1 to 3) are Cartesian co-ordinates of our space of reference, then the spherical surface will be expressed in our two systems of co-ordinates by the equations

$$\sum \Delta x_\nu^2 = \text{const.} \tag{2}$$

$$\sum \Delta x'^2_\nu = \text{const.} \tag{2a}$$

How must the $x'_\nu$ be expressed in terms of the $x_\nu$ in order that equations (2) and (2a) may be equivalent to each other? Regarding the $x'_\nu$ expressed as functions of the $x_\nu$, we can write, by Taylor's theorem, for small values of the $\Delta x_\nu$,

$$\Delta x'_\nu = \sum_\alpha \frac{\partial x'_\nu}{\partial x_\alpha}\,\Delta x_\alpha + \frac{1}{2}\sum_{\alpha\beta} \frac{\partial^2 x'_\nu}{\partial x_\alpha\,\partial x_\beta}\,\Delta x_\alpha\,\Delta x_\beta + \ldots$$

If we substitute (2a) in this equation and compare with (1), we see that the $x'_\nu$ must be linear functions of the $x_\nu$. If we therefore put

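$$x'_\nu = a_\nu + \sum_\alpha b_{\nu\alpha}\, x_\alpha \tag{3}$$

or

$$\Delta x'_\nu = \sum_\alpha b_{\nu\alpha}\, \Delta x_\alpha \tag{3a}$$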

then the equivalence of equations (2) and (2a) is expressed in the form

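$$\sum \Delta x'^2_\nu = \lambda \sum \Delta x_\nu^2 \qquad (\lambda \text{ independent of } \Delta x_\nu) \tag{2b}$$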

It therefore follows that $\lambda$ must be a constant. If we put $\lambda = 1$, (2b) and (3a) furnish the conditions

$$\sum_\nu b_{\nu\alpha}\, b_{\nu\beta} = \delta_{\alpha\beta} \tag{4}$$

in which $\delta_{\alpha\beta} = 1$, or $\delta_{\alpha\beta} = 0$, according as $\alpha = \beta$ or $\alpha \neq \beta$. The conditions (4) are called the conditions of orthogonality, and the transformations (3), (4), linear orthogonal transformations. If we stipulate that $s^2 = \sum \Delta x_\nu^2$ shall be equal to the square of the length in every system of co-ordinates, and if we always measure with the same unit scale, then $\lambda$ must be equal to 1. Therefore the linear orthogonal transformations are the only ones by means of which we can pass from one Cartesian system of co-ordinates in our space of reference to another. We see that in applying such transformations the equations of a straight line become equations of a straight line. Reversing equations (3a) by multiplying both sides by $b_{\nu\beta}$ and summing for all the $\nu$'s, we obtain

$$\sum_\nu b_{\nu\beta}\, \Delta x'_\nu = \sum_{\nu\alpha} b_{\nu\alpha}\, b_{\nu\beta}\, \Delta x_\alpha = \sum_\alpha \delta_{\alpha\beta}\, \Delta x_\alpha = \Delta x_\beta \tag{5}$$

The same coefficients, $b$, also determine the inverse substitution of $\Delta x_\nu$. Geometrically, $b_{\nu\alpha}$ is the cosine of the angle between the $x'_\nu$ axis and the $x_\alpha$ axis.
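As an illustrative sketch, the orthogonality conditions (4), the invariance of the sum of squares (1), and the inversion (5) can be checked numerically. The rotation about the $x_3$ axis, the angle, the sample interval, and all function names below are assumptions chosen for the example; any other orthogonal set of coefficients would serve equally well.

```python
import math

def rotation_about_x3(theta):
    """Coefficients b[nu][alpha] for a rotation of the axes about x_3 (illustrative choice)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(b, dx):
    """Equation (3a): dx'_nu = sum over alpha of b[nu][alpha] * dx[alpha]."""
    return [sum(b[nu][al] * dx[al] for al in range(3)) for nu in range(3)]

def inverse_transform(b, dx_prime):
    """Equation (5): dx_beta = sum over nu of b[nu][beta] * dx'[nu]."""
    return [sum(b[nu][be] * dx_prime[nu] for nu in range(3)) for be in range(3)]

def orthogonality_defect(b):
    """Largest deviation of sum_nu b[nu][al]*b[nu][be] from delta_{al be}, condition (4)."""
    return max(
        abs(sum(b[nu][al] * b[nu][be] for nu in range(3)) - (1.0 if al == be else 0.0))
        for al in range(3) for be in range(3)
    )

def squared_length(dx):
    """s^2 as the sum of the squared co-ordinate differences, equation (1)."""
    return sum(component * component for component in dx)

b = rotation_about_x3(0.7)   # arbitrary angle in radians
dx = [1.0, -2.0, 0.5]        # arbitrary interval components
dx_prime = transform(b, dx)
dx_back = inverse_transform(b, dx_prime)

print(orthogonality_defect(b))                        # close to zero
print(squared_length(dx), squared_length(dx_prime))   # equal up to rounding
print(dx_back)                                        # recovers dx up to rounding
```

The inverse uses the same array `b` with its two indices interchanged, which is the content of the remark that the same coefficients determine the inverse substitution.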

To sum up, we can say that in the Euclidean geometry there are (in a given space of reference) preferred systems of co-ordinates, the Cartesian systems, which transform into each other by linear orthogonal transformations. The distance between two points of our space of reference, measured by a measuring rod, is expressed in such co-ordinates in a particularly simple manner. The whole of geometry may be founded upon this conception of distance. In the present treatment, geometry is related to actual things (rigid bodies), and its theorems are statements concerning the behaviour of these things, which may prove to be true or false.

One is ordinarily accustomed to study geometry divorced from any relation between its concepts and experience. There are advantages in isolating that which is purely logical and independent of what is, in principle, incomplete empiricism. This is satisfactory to the pure mathematician. He is satisfied if he can deduce his theorems from axioms correctly, that is, without errors of logic. The question as to whether Euclidean geometry is true or not does not concern him. But for our purpose it is necessary to associate the fundamental concepts of geometry with natural objects; without such an association geometry is worthless for the physicist. The physicist is concerned with the question as to whether the theorems of geometry are true or not. That Euclidean geometry, from this point of view, affirms something more than the mere deductions derived logically from definitions may be seen from the following simple consideration.

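Between $n$ points of space there are $\frac{n(n-1)}{2}$ distances, $s_{\mu\nu}$; between these and the $3n$ co-ordinates we have the relations

$$s_{\mu\nu}^2 = \left(x_{1(\mu)} - x_{1(\nu)}\right)^2 + \left(x_{2(\mu)} - x_{2(\nu)}\right)^2 + \left(x_{3(\mu)} - x_{3(\nu)}\right)^2.$$

From these $\frac{n(n-1)}{2}$ equations the $3n$ co-ordinates may be eliminated, and from this elimination at least $\frac{n(n-1)}{2} - 3n$ equations in the $s_{\mu\nu}$ will result.[2] Since the $s_{\mu\nu}$ are measurable quantities, and by definition are independent of each other, these relations between the $s_{\mu\nu}$ are not necessary a priori.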
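The counting in this argument can be tabulated with a short sketch. The function name is hypothetical, and the exact count rests on an assumption stated here rather than taken from the passage above: for $n \geq 3$ points in general position, six of the $3n$ co-ordinates only fix the position and orientation of the whole configuration, so that $\frac{n(n-1)}{2} - 3n + 6$ relations among the distances remain.

```python
def count_distance_relations(n):
    """Count distances, co-ordinates and relations among distances for n points in 3D."""
    distances = n * (n - 1) // 2          # one distance per pair of points
    coordinates = 3 * n                   # three co-ordinates per point
    lower_bound = distances - coordinates # the "at least" count of the text
    # Assumption: 6 co-ordinates describe a rigid motion (3 translations, 3 rotations)
    # and cannot be determined from the distances, valid for n >= 3.
    exact = distances - coordinates + 6
    return distances, coordinates, lower_bound, exact

for n in (4, 5, 10):
    print(n, count_distance_relations(n))
```

For four points no relation results, so six arbitrary distances (within the usual inequalities) can be realized; with five points the first genuine relation among the ten distances appears.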

From the foregoing it is evident that the equations of transformation (3), (4) have a fundamental significance in Euclidean geometry, in that they govern the transformation from one Cartesian system of co-ordinates to another. The Cartesian systems of co-ordinates are characterized by the property that in them the measurable distance between two points, $s$, is expressed by the equation

$$s^2 = \sum \Delta x_\nu^2.$$

If $K_{(x_\nu)}$ and $K'_{(x'_\nu)}$ are two Cartesian systems of co-ordinates, then

$$\sum \Delta x_\nu^2 = \sum \Delta x'^2_\nu.$$

The right-hand side is identically equal to the left-hand side on account of the equations of the linear orthogonal transformation, and the right-hand side differs from the left-hand side only in that the $x_\nu$ are replaced by the $x'_\nu$. This is expressed by the statement that $\sum \Delta x_\nu^2$ is an invariant with respect to linear orthogonal transformations. It is evident that in the Euclidean geometry only such, and all such, quantities have an objective significance, independent of the particular choice of the Cartesian co-ordinates, as can be expressed by an invariant with respect to linear orthogonal transformations. This is the reason that the theory of invariants, which has to do with the laws that govern the form of invariants, is so important for analytical geometry.

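As a second example of a geometrical invariant, consider a volume. This is expressed by

$$V = \iiint dx_1\, dx_2\, dx_3.$$

By means of Jacobi's theorem we may write

$$\iiint dx'_1\, dx'_2\, dx'_3 = \iiint \frac{\partial(x'_1, x'_2, x'_3)}{\partial(x_1, x_2, x_3)}\, dx_1\, dx_2\, dx_3$$

where the integrand in the last integral is the functional determinant of the $x'_\nu$ with respect to the $x_\nu$, and this by (3) is equal to the determinant $|b_{\mu\nu}|$ of the coefficients of substitution, $b_{\nu\alpha}$. If we form the determinant of the $\delta_{\mu\alpha}$ from equation (4), we obtain, by means of the theorem of multiplication of determinants,

$$1 = |\delta_{\alpha\beta}| = \Bigl|\sum_\nu b_{\nu\alpha}\, b_{\nu\beta}\Bigr| = |b_{\mu\nu}|^2; \qquad |b_{\mu\nu}| = \pm 1 \tag{6}$$

If we limit ourselves to those transformations which have the determinant +1,[3] and only these arise from continuous variations of the systems of co-ordinates, then $V$ is an invariant.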
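The two signs of the determinant in (6) can be seen in a small numerical sketch. The particular rotation, the reflection of the $x_3$ axis, and the helper names are assumptions made for the illustration.

```python
import math

def det3(b):
    """Determinant of a 3x3 array by cofactor expansion along the first row."""
    return (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
            - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
            + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))

def matmul(p, q):
    """Product of two 3x3 arrays, i.e. two substitutions carried out in succession."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def rotation_about_x3(theta):
    """Orthogonal coefficients reachable continuously from the identity."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

# Reversal of the x_3 axis: orthogonal, but not reachable by continuous variation.
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

det_rotation = det3(rotation_about_x3(1.1))
det_reflection = det3(reflection)
det_combined = det3(matmul(reflection, rotation_about_x3(1.1)))

print(det_rotation, det_reflection, det_combined)
```

The rotation gives +1 up to rounding for every angle, while any substitution containing a single axis reversal gives -1, in agreement with the restriction to determinant +1 for the invariance of the volume.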

Invariants, however, are not the only forms by means of which we can give expression to the independence of the particular choice of the Cartesian co-ordinates. Vectors and tensors are other forms of expression. Let us express the fact that the point with the current co-ordinates $x_\nu$ lies upon a straight line. We have

$$x_\nu - A_\nu = \lambda B_\nu \qquad (\nu \text{ from 1 to 3}).$$

Without limiting the generality we can put

$$\sum B_\nu^2 = 1.$$

If we multiply the equations by $b_{\beta\nu}$ (compare (3a) and (5)) and sum for all the $\nu$'s, we get

$$x'_\beta - A'_\beta = \lambda B'_\beta$$

where we have written

$$B'_\beta = \sum_\nu b_{\beta\nu}\, B_\nu; \qquad A'_\beta = \sum_\nu b_{\beta\nu}\, A_\nu.$$

These are the equations of straight lines with respect to a second Cartesian system of co-ordinates $K'$. They have the same form as the equations with respect to the original system of co-ordinates. It is therefore evident that straight lines have a significance which is independent of the system of co-ordinates. Formally, this depends upon the fact that the quantities $(x_\nu - A_\nu) - \lambda B_\nu$ are transformed as the components of an interval, $\Delta x_\nu$. The ensemble of three quantities, defined for every system of Cartesian co-ordinates, and which transform as the components of an interval, is called a vector. If the three components of a vector vanish for one system of Cartesian co-ordinates, they vanish for all systems, because the equations of transformation are homogeneous. We can thus get the meaning of the concept of a vector without referring to a geometrical representation. This behaviour of the equations of a straight line can be expressed by saying that the equation of a straight line is co-variant with respect to linear orthogonal transformations.

We shall now show briefly that there are geometrical entities which lead to the concept of tensors. Let $P_0$ be the centre of a surface of the second degree, $P$ any point on the surface, and $\xi_\nu$ the projections of the interval $P_0P$ upon the co-ordinate axes. Then the equation of the surface is

$$\sum a_{\mu\nu}\, \xi_\mu\, \xi_\nu = 1.$$

In this, and in analogous cases, we shall omit the sign of summation, and understand that the summation is to be carried out for those indices that appear twice. We thus write the equation of the surface

$$a_{\mu\nu}\, \xi_\mu\, \xi_\nu = 1.$$

The quantities $a_{\mu\nu}$ determine the surface completely, for a given position of the centre, with respect to the chosen system of Cartesian co-ordinates. From the known law of transformation for the $\xi_\nu$ (3a) for linear orthogonal transformations, we easily find the law of transformation for the $a_{\mu\nu}$[4]:

$$a'_{\sigma\tau} = b_{\sigma\mu}\, b_{\tau\nu}\, a_{\mu\nu}.$$

This transformation is homogeneous and of the first degree in the $a_{\mu\nu}$. On account of this transformation, the $a_{\mu\nu}$ are called components of a tensor of the second rank (the latter on account of the double index). If all the components, $a_{\mu\nu}$, of a tensor with respect to any system of Cartesian co-ordinates vanish, they vanish with respect to every other Cartesian system. The form and the position of the surface of the second degree is described by this tensor ($a$).
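A numerical sketch may make the law of transformation for a tensor of the second rank concrete. It assumes the law $a'_{\sigma\tau} = b_{\sigma\mu} b_{\tau\nu} a_{\mu\nu}$ together with (3a) for the $\xi_\nu$; the coefficients of the surface, the sample point, the two rotation angles, and the function names are all invented for the illustration.

```python
import math

def rotation(theta, phi):
    """An orthogonal b: rotation about x_3 by theta, then about x_1 by phi."""
    c, s = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    r3 = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
    r1 = [[1.0, 0.0, 0.0], [0.0, cp, sp], [0.0, -sp, cp]]
    return [[sum(r1[i][k] * r3[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transform_vector(b, xi):
    """xi'_sigma = b[sigma][mu] * xi[mu], summed over mu (equation (3a))."""
    return [sum(b[s][m] * xi[m] for m in range(3)) for s in range(3)]

def transform_rank2(b, a):
    """a'_{sigma tau} = b[sigma][mu] * b[tau][nu] * a[mu][nu], summed over mu and nu."""
    return [[sum(b[s][m] * b[t][n] * a[m][n] for m in range(3) for n in range(3))
             for t in range(3)] for s in range(3)]

def quadratic_form(a, xi):
    """Left-hand side of the surface equation, a_{mu nu} xi_mu xi_nu."""
    return sum(a[m][n] * xi[m] * xi[n] for m in range(3) for n in range(3))

a = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.3],
     [0.0, 0.3, 3.0]]          # hypothetical surface coefficients
xi = [0.4, -0.2, 0.1]          # hypothetical projections of an interval

b = rotation(0.6, -1.3)
a_prime = transform_rank2(b, a)
xi_prime = transform_vector(b, xi)

print(quadratic_form(a, xi), quadratic_form(a_prime, xi_prime))  # equal up to rounding
```

Because the value of the quadratic form is unchanged, a point that satisfies the equation of the surface in one system satisfies the transformed equation in the other.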

Analytic tensors of higher rank (number of indices) may be defined. It is possible and advantageous to regard vectors as tensors of rank 1, and invariants (scalars) as tensors of rank 0. In this respect, the problem of the theory of invariants may be so formulated: according to what laws may new tensors be formed from given tensors? We shall consider these laws now, in order to be able to apply them later. We shall deal first only with the properties of tensors with respect to the transformation from one Cartesian system to another in the same space of reference, by means of linear orthogonal transformations. As the laws are wholly independent of the number of dimensions, we shall leave this number, $n$, indefinite at first.

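Definition. If a figure is defined with respect to every system of Cartesian co-ordinates in a space of reference of $n$ dimensions by the $n^\alpha$ numbers $A_{\mu\nu\rho\ldots}$ ($\alpha$ = number of indices), then these numbers are the components of a tensor of rank $\alpha$ if the transformation law is

$$A'_{\mu'\nu'\rho'\ldots} = b_{\mu'\mu}\, b_{\nu'\nu}\, b_{\rho'\rho} \ldots A_{\mu\nu\rho\ldots} \tag{7}$$

Remark. From this definition it follows that

$$A_{\mu\nu\rho\ldots}\, B_\mu\, C_\nu\, D_\rho \ldots \tag{8}$$

is an invariant, provided that $(B)$, $(C)$, $(D)$ ... are vectors. Conversely, the tensor character of $(A)$ may be inferred, if it is known that the expression (8) leads to an invariant for an arbitrary choice of the vectors $(B)$, $(C)$, etc.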

Addition and Subtraction. By addition and subtraction of the corresponding components of tensors of the same rank, a tensor of equal rank results:

$$A_{\mu\nu\rho\ldots} \pm B_{\mu\nu\rho\ldots} = C_{\mu\nu\rho\ldots} \tag{9}$$

The proof follows from the definition of a tensor given above.

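Multiplication. From a tensor of rank $\alpha$ and a tensor of rank $\beta$ we may obtain a tensor of rank $\alpha + \beta$ by multiplying all the components of the first tensor by all the components of the second tensor:

$$T_{\mu\nu\rho\ldots\alpha\beta\gamma\ldots} = A_{\mu\nu\rho\ldots}\, B_{\alpha\beta\gamma\ldots} \tag{10}$$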

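Contraction. A tensor of rank $\alpha - 2$ may be obtained from one of rank $\alpha$ by putting two definite indices equal to each other and then summing for this single index:

$$T_{\rho\ldots} = A_{\mu\mu\rho\ldots} \quad \Bigl(= \sum_\mu A_{\mu\mu\rho\ldots}\Bigr) \tag{11}$$

The proof is

$$A'_{\mu\mu\rho\ldots} = b_{\mu\alpha}\, b_{\mu\beta}\, b_{\rho\gamma} \ldots A_{\alpha\beta\gamma\ldots} = \delta_{\alpha\beta}\, b_{\rho\gamma} \ldots A_{\alpha\beta\gamma\ldots} = b_{\rho\gamma} \ldots A_{\alpha\alpha\gamma\ldots}.$$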

In addition to these elementary rules of operation there is also the formation of tensors by differentiation ("erweiterung"):

$$T_{\mu\nu\rho\ldots\alpha} = \frac{\partial A_{\mu\nu\rho\ldots}}{\partial x_\alpha} \tag{12}$$

New tensors, in respect to linear orthogonal transformations, may be formed from tensors according to these rules of operation.
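The rules of multiplication and contraction can be tried out on the simplest case, two vectors. The sketch below assumes the transformation law (7) for rank 1 and rank 2; the vectors, the rotation, and the function names are hypothetical. It checks that the product of the components transforms as a tensor of rank 2 and that its contraction is an invariant.

```python
import math

def rotation_about_x3(theta):
    """Orthogonal coefficients b[nu][alpha] (illustrative choice)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(b, v):
    """Law (7) for rank 1: v'_mu = b[mu][alpha] * v[alpha]."""
    return [sum(b[m][a] * v[a] for a in range(3)) for m in range(3)]

def transform_rank2(b, t):
    """Law (7) for rank 2: t'_{mu nu} = b[mu][alpha] * b[nu][beta] * t[alpha][beta]."""
    return [[sum(b[m][a] * b[n][be] * t[a][be] for a in range(3) for be in range(3))
             for n in range(3)] for m in range(3)]

def outer(B, C):
    """Multiplication (10): T_{mu nu} = B_mu * C_nu."""
    return [[B[m] * C[n] for n in range(3)] for m in range(3)]

def contract(T):
    """Contraction (11) of a rank-2 tensor: T_{mu mu}, summed over mu."""
    return sum(T[m][m] for m in range(3))

b = rotation_about_x3(0.9)
B = [1.0, 2.0, -1.0]
C = [0.5, -3.0, 2.0]

T = outer(B, C)
T_prime = transform_rank2(b, T)
T_from_vectors = outer(transform_vector(b, B), transform_vector(b, C))

print(contract(T), contract(T_prime))   # the same scalar up to rounding
```

Transforming the product agrees with multiplying the transformed vectors, and the contraction is the familiar scalar product of the two vectors.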

Symmetrical Properties of Tensors. Tensors are called symmetrical or skew-symmetrical in respect to two of their indices, $\mu$ and $\nu$, if both the components which result from interchanging the indices $\mu$ and $\nu$ are equal to each other or equal with opposite signs.

Condition for symmetry: $A_{\mu\nu\rho} = A_{\nu\mu\rho}$.
Condition for skew-symmetry: $A_{\mu\nu\rho} = -A_{\nu\mu\rho}$.

Theorem. The character of symmetry or skew-symmetry exists independently of the choice of co-ordinates, and in this lies its importance. The proof follows from the equation defining tensors.

Special Tensors.

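I. The quantities $\delta_{\rho\sigma}$ (4) are tensor components (fundamental tensor).

Proof. If in the right-hand side of the equation of transformation $A'_{\mu\nu} = b_{\mu\alpha}\, b_{\nu\beta}\, A_{\alpha\beta}$, we substitute for $A_{\alpha\beta}$ the quantities $\delta_{\alpha\beta}$ (which are equal to 1 or 0 according as $\alpha = \beta$ or $\alpha \neq \beta$), we get

$$A'_{\mu\nu} = b_{\mu\alpha}\, b_{\nu\alpha} = \delta_{\mu\nu}.$$

The justification for the last sign of equality becomes evident if one applies (4) to the inverse substitution (5).

II. There is a tensor ($\delta_{\mu\nu\rho\ldots}$) skew-symmetrical with respect to all pairs of indices, whose rank is equal to the number of dimensions, $n$, and whose components are equal to $+1$ or $-1$ according as $\mu\nu\rho\ldots$ is an even or odd permutation of $123\ldots$

The proof follows with the aid of the theorem proved above, $|b_{\rho\sigma}| = 1$.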

These few simple theorems form the apparatus from the theory of invariants for building the equations of pre-relativity physics and the theory of special relativity.

We have seen that in pre-relativity physics, in order to specify relations in space, a body of reference, or a space of reference, is required, and, in addition, a Cartesian system of co-ordinates. We can fuse both these concepts into a single one by thinking of a Cartesian system of co-ordinates as a cubical frame-work formed of rods each of unit length. The co-ordinates of the lattice points of this frame are integral numbers. It follows from the fundamental relation

$$s^2 = \Delta x_1^2 + \Delta x_2^2 + \Delta x_3^2 \tag{13}$$

that the members of such a space-lattice are all of unit length. To specify relations in time, we require in addition a standard clock placed at the origin of our Cartesian system of co-ordinates or frame of reference. If an event takes place anywhere we can assign to it three co-ordinates, $x_\nu$, and a time $t$, as soon as we have specified the time of the clock at the origin which is simultaneous with the event. We therefore give an objective significance to the statement of the simultaneity of distant events, while previously we have been concerned only with the simultaneity of two experiences of an individual. The time so specified is at all events independent of the position of the system of co-ordinates in our space of reference, and is therefore an invariant with respect to the transformation (3).

It is postulated that the system of equations expressing the laws of pre-relativity physics is co-variant with respect to the transformation (3), as are the relations of Euclidean geometry. The isotropy and homogeneity of space is expressed in this way.[5] We shall now consider some of the more important equations of physics from this point of view.

The equations of motion of a material particle are

$$m \frac{d^2 x_\nu}{dt^2} = X_\nu \tag{14}$$

$(dx_\nu)$ is a vector; $dt$, and therefore also $\frac{1}{dt}$, an invariant; thus $\left(\frac{dx_\nu}{dt}\right)$ is a vector; in the same way it may be shown that $\left(\frac{d^2 x_\nu}{dt^2}\right)$ is a vector. In general, the operation of differentiation with respect to time does not alter the tensor character. Since $m$ is an invariant (tensor of rank 0), $\left(m \frac{d^2 x_\nu}{dt^2}\right)$ is a vector, or tensor of rank 1 (by the theorem of the multiplication of tensors). If the force $(X_\nu)$ has a vector character, the same holds for the difference $\left(m \frac{d^2 x_\nu}{dt^2} - X_\nu\right)$. These equations of motion are therefore valid in every other system of Cartesian co-ordinates in the space of reference. In the case where the forces are conservative we can easily recognize the vector character of $(X_\nu)$. For a potential energy, $\Phi$, exists, which depends only upon the mutual distances of the particles, and is therefore an invariant. The vector character of the force, $X_\nu = -\frac{\partial \Phi}{\partial x_\nu}$, is then a consequence of our general theorem about the derivative of a tensor of rank 0.

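Multiplying by the velocity, a tensor of rank 1, we obtain the tensor equation

$$\left(m \frac{d^2 x_\nu}{dt^2} - X_\nu\right) \frac{dx_\mu}{dt} = 0.$$

By contraction and multiplication by the scalar $dt$ we obtain the equation of kinetic energy

$$d\left(\frac{m q^2}{2}\right) = X_\nu\, dx_\nu,$$

in which $q$ is the magnitude of the velocity. If $\xi_\nu$ denotes the difference of the co-ordinates of the material particle and a point fixed in space, then the $\xi_\nu$ have the character of vectors. We evidently have $\frac{d^2 x_\nu}{dt^2} = \frac{d^2 \xi_\nu}{dt^2}$, so that the equations of motion of the particle may be written

$$m \frac{d^2 \xi_\nu}{dt^2} - X_\nu = 0.$$

Multiplying this equation by $\xi_\mu$ we obtain a tensor equation

$$\left(m \frac{d^2 \xi_\nu}{dt^2} - X_\nu\right) \xi_\mu = 0.$$

Contracting the tensor on the left and taking the time average we obtain the virial theorem, which we shall not consider further. By interchanging the indices and subsequent subtraction, we obtain, after a simple transformation, the theorem of moments,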

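$$\frac{d}{dt}\left[m\left(\xi_\mu \frac{d\xi_\nu}{dt} - \xi_\nu \frac{d\xi_\mu}{dt}\right)\right] = \xi_\mu X_\nu - \xi_\nu X_\mu \tag{15}$$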

It is evident in this way that the moment of a vector is not a vector but a tensor. On account of their skew-symmetrical character there are not nine, but only three independent equations of this system. The possibility of replacing skew-symmetrical tensors of the second rank in space of three dimensions by vectors depends upon the formation of the vector

$$A_\mu = \frac{1}{2} A_{\sigma\tau}\, \delta_{\sigma\tau\mu}.$$

If we multiply the skew-symmetrical tensor of rank 2 by the special skew-symmetrical tensor $\delta$ introduced above, and contract twice, a vector results whose components are numerically equal to those of the tensor. These are the so-called axial vectors which transform differently, from a right-handed system to a left-handed system, from the $\Delta x_\nu$. There is a gain in picturesqueness in regarding a skew-symmetrical tensor of rank 2 as a vector in space of three dimensions, but it does not represent the exact nature of the corresponding quantity so well as considering it a tensor.
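The passage from a skew-symmetrical tensor to an axial vector can be illustrated with the moment $\xi_\mu X_\nu - \xi_\nu X_\mu$ on the right-hand side of (15). The sketch assumes the formation $A_\mu = \frac{1}{2} A_{\sigma\tau} \delta_{\sigma\tau\mu}$ with indices running over 0, 1, 2; the sample values and function names are hypothetical.

```python
def levi_civita(i, j, k):
    """Special skew-symmetrical tensor: +1, -1 or 0 for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def moment_tensor(xi, X):
    """Skew-symmetrical tensor xi_mu * X_nu - xi_nu * X_mu."""
    return [[xi[m] * X[n] - xi[n] * X[m] for n in range(3)] for m in range(3)]

def axial_vector(A):
    """A_mu = (1/2) * A[sigma][tau] * delta_{sigma tau mu}, contracted twice."""
    return [0.5 * sum(A[s][t] * levi_civita(s, t, m) for s in range(3) for t in range(3))
            for m in range(3)]

xi = [1.0, 2.0, 3.0]   # hypothetical lever arm
X = [4.0, 5.0, 6.0]    # hypothetical force

vector = axial_vector(moment_tensor(xi, X))

# Reversal of all three axes (determinant -1): every interval component changes sign.
xi_reversed = [-c for c in xi]
X_reversed = [-c for c in X]
vector_reversed = axial_vector(moment_tensor(xi_reversed, X_reversed))

print(vector)            # [-3.0, 6.0, -3.0], the usual vector product of xi and X
print(vector_reversed)   # identical, whereas xi and X themselves changed sign
```

The three numbers coincide with the components $A_{23}$, $A_{31}$, $A_{12}$ of the tensor. Under the reversal of all axes they keep their values while the components of an interval change sign, which is the different behaviour of axial vectors referred to above.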

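We consider next the equations of motion of a continuous medium. Let $\rho$ be the density, $u_\nu$ the velocity components considered as functions of the co-ordinates and the time, $X_\nu$ the volume forces per unit of mass, and $p_{\nu\sigma}$ the stresses upon a surface perpendicular to the $\sigma$-axis in the direction of increasing $x_\nu$. Then the equations of motion are, by Newton's law,

$$\rho \frac{du_\nu}{dt} = -\frac{\partial p_{\nu\sigma}}{\partial x_\sigma} + \rho X_\nu$$

in which $\frac{du_\nu}{dt}$ is the acceleration of the particle which at time $t$ has the co-ordinates $x_\nu$. If we express this acceleration by partial differential coefficients, we obtain, after dividing by $\rho$,

$$\frac{\partial u_\nu}{\partial t} + \frac{\partial u_\nu}{\partial x_\sigma} u_\sigma = -\frac{1}{\rho} \frac{\partial p_{\nu\sigma}}{\partial x_\sigma} + X_\nu \tag{16}$$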

We must show that this equation holds independently of the special choice of the Cartesian system of co-ordinates. $(u_\nu)$ is a vector, and therefore $\frac{\partial u_\nu}{\partial t}$ is also a vector. $\frac{\partial u_\nu}{\partial x_\sigma}$ is a tensor of rank 2, $\frac{\partial u_\nu}{\partial x_\sigma} u_\tau$ is a tensor of rank 3. The second term on the left results from contraction in the indices $\sigma$, $\tau$. The vector character of the second term on the right is obvious. In order that the first term on the right may also be a vector it is necessary for $p_{\nu\sigma}$ to be a tensor. Then by differentiation and contraction $\frac{\partial p_{\nu\sigma}}{\partial x_\sigma}$ results, and is therefore a vector, as it also is after multiplication by the reciprocal scalar $\frac{1}{\rho}$. That $p_{\nu\sigma}$ is a tensor, and therefore transforms according to the equation

$$p'_{\mu\nu} = b_{\mu\alpha}\, b_{\nu\beta}\, p_{\alpha\beta}$$

is proved in mechanics by integrating this equation over an infinitely small tetrahedron. It is also proved there by application of the theorem of moments to an infinitely small parallelopipedon, that $p_{\nu\sigma} = p_{\sigma\nu}$, and hence that the tensor of the stress is a symmetrical tensor. From what has been said it follows that, with the aid of the rules given above, the equation is co-variant with respect to orthogonal transformations in space (rotational transformations); and the rules according to which the quantities in the equation must be transformed in order that the equation may be co-variant also become evident.

The co-variance of the equation of continuity,

$$\frac{\partial \rho}{\partial t} + \frac{\partial (\rho u_\nu)}{\partial x_\nu} = 0 \tag{17}$$

requires, from the foregoing, no particular discussion.

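We shall also test for co-variance the equations which express the dependence of the stress components upon the properties of the matter, and set up these equations for the case of a compressible viscous fluid with the aid of the conditions of co-variance. If we neglect the viscosity, the pressure, $p$, will be a scalar, and will depend only upon the density and the temperature of the fluid. The contribution to the stress tensor is then evidently

$$p\, \delta_{\mu\nu}$$

in which $\delta_{\mu\nu}$ is the special symmetrical tensor. This term will also be present in the case of a viscous fluid. But in this case there will also be pressure terms, which depend upon the space derivatives of the $u_\nu$. We shall assume that this dependence is a linear one. Since these terms must be symmetrical tensors, the only ones which enter will be

$$\alpha \left(\frac{\partial u_\mu}{\partial x_\nu} + \frac{\partial u_\nu}{\partial x_\mu}\right) + \beta\, \delta_{\mu\nu} \frac{\partial u_\alpha}{\partial x_\alpha}$$

(for $\frac{\partial u_\alpha}{\partial x_\alpha}$ is a scalar). For physical reasons (no slipping) it is assumed that for symmetrical dilatations in all directions, i.e. when

$$\frac{\partial u_1}{\partial x_1} = \frac{\partial u_2}{\partial x_2} = \frac{\partial u_3}{\partial x_3}; \qquad \frac{\partial u_1}{\partial x_2}, \text{ etc.}, = 0,$$

there are no frictional forces present, from which it follows that $\beta = -\frac{2}{3}\alpha$. If only $\frac{\partial u_1}{\partial x_3}$ is different from zero, let $p_{31} = -\eta \frac{\partial u_1}{\partial x_3}$, by which $\alpha$ is determined. We then obtain for the complete stress tensor,

$$p_{\mu\nu} = p\, \delta_{\mu\nu} - \eta \left[\left(\frac{\partial u_\mu}{\partial x_\nu} + \frac{\partial u_\nu}{\partial x_\mu}\right) - \frac{2}{3}\left(\frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3}\right) \delta_{\mu\nu}\right] \tag{18}$$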

The heuristic value of the theory of invariants, which arises from the isotropy of space (equivalence of all directions), becomes evident from this example.
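The two physical conditions used to fix the constants can be verified on the stress tensor (18) with a short sketch. The numerical values of the pressure, the viscosity constant $\eta$, and the velocity gradients are invented for the example, and the array `grad_u[mu][nu]` is assumed to hold $\partial u_\mu / \partial x_\nu$.

```python
def stress_tensor(p, eta, grad_u):
    """Equation (18); grad_u[mu][nu] is assumed to hold d u_mu / d x_nu."""
    divergence = sum(grad_u[a][a] for a in range(3))
    return [[(p if mu == nu else 0.0)
             - eta * ((grad_u[mu][nu] + grad_u[nu][mu])
                      - (2.0 / 3.0) * divergence * (1.0 if mu == nu else 0.0))
             for nu in range(3)] for mu in range(3)]

p, eta = 2.0, 0.1   # hypothetical pressure and viscosity constant

# Symmetrical dilatation in all directions: equal diagonal gradients, no others.
k = 0.4
dilatation = [[k if i == j else 0.0 for j in range(3)] for i in range(3)]
stress_dilatation = stress_tensor(p, eta, dilatation)

# Pure shear: only d u_1 / d x_3 differs from zero (indices 0 and 2 in the array).
gamma = 0.5
shear = [[0.0, 0.0, gamma], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
stress_shear = stress_tensor(p, eta, shear)

print(stress_dilatation)   # p on the diagonal, zero elsewhere, up to rounding
print(stress_shear[2][0])  # -eta * gamma, the component p_31
```

In the first case the frictional terms cancel and only the scalar pressure remains; in the second the component $p_{31}$ reduces to $-\eta\, \partial u_1 / \partial x_3$, and the tensor stays symmetrical.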

We consider, finally, Maxwell's equations in the form which is the foundation of the electron theory of Lorentz.

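$$\begin{aligned}
\frac{\partial h_3}{\partial x_2} - \frac{\partial h_2}{\partial x_3} &= \frac{1}{c}\frac{\partial e_1}{\partial t} + \frac{1}{c}\, i_1 \\
\frac{\partial h_1}{\partial x_3} - \frac{\partial h_3}{\partial x_1} &= \frac{1}{c}\frac{\partial e_2}{\partial t} + \frac{1}{c}\, i_2 \\
\frac{\partial h_2}{\partial x_1} - \frac{\partial h_1}{\partial x_2} &= \frac{1}{c}\frac{\partial e_3}{\partial t} + \frac{1}{c}\, i_3 \\
\frac{\partial e_1}{\partial x_1} + \frac{\partial e_2}{\partial x_2} + \frac{\partial e_3}{\partial x_3} &= \rho
\end{aligned} \tag{19}$$

$$\begin{aligned}
\frac{\partial e_3}{\partial x_2} - \frac{\partial e_2}{\partial x_3} &= -\frac{1}{c}\frac{\partial h_1}{\partial t} \\
\frac{\partial e_1}{\partial x_3} - \frac{\partial e_3}{\partial x_1} &= -\frac{1}{c}\frac{\partial h_2}{\partial t} \\
\frac{\partial e_2}{\partial x_1} - \frac{\partial e_1}{\partial x_2} &= -\frac{1}{c}\frac{\partial h_3}{\partial t} \\
\frac{\partial h_1}{\partial x_1} + \frac{\partial h_2}{\partial x_2} + \frac{\partial h_3}{\partial x_3} &= 0
\end{aligned} \tag{20}$$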

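$\mathbf{i}$ is a vector, because the current density is defined as the density of electricity multiplied by the vector velocity of the electricity. According to the first three equations it is evident that $\mathbf{e}$ is also to be regarded as a vector. Then $\mathbf{h}$ cannot be regarded as a vector.[6] The equations may, however, easily be interpreted if $\mathbf{h}$ is regarded as a skew-symmetrical tensor of the second rank. In this sense, we write $h_{23}$, $h_{31}$, $h_{12}$ in place of $h_1$, $h_2$, $h_3$ respectively. Paying attention to the skew-symmetry of $h_{\mu\nu}$, the first three equations of (19) and (20) may be written in the form

$$\frac{\partial h_{\mu\nu}}{\partial x_\nu} = \frac{1}{c}\frac{\partial e_\mu}{\partial t} + \frac{1}{c}\, i_\mu \tag{19a}$$

$$\frac{\partial e_\mu}{\partial x_\nu} - \frac{\partial e_\nu}{\partial x_\mu} = +\frac{1}{c}\frac{\partial h_{\mu\nu}}{\partial t} \tag{20a}$$

In contrast to $\mathbf{e}$, $\mathbf{h}$ appears as a quantity which has the same type of symmetry as an angular velocity. The divergence equations then take the form

$$\frac{\partial e_\nu}{\partial x_\nu} = \rho \tag{19b}$$

$$\frac{\partial h_{\mu\nu}}{\partial x_\rho} + \frac{\partial h_{\nu\rho}}{\partial x_\mu} + \frac{\partial h_{\rho\mu}}{\partial x_\nu} = 0 \tag{20b}$$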

The last equation is a skew-symmetrical tensor equation of the third rank (the skew-symmetry of the left-hand side with respect to every pair of indices may easily be proved, if attention is paid to the skew-symmetry of $h_{\mu\nu}$). This notation is more natural than the usual one, because, in contrast to the latter, it is applicable to Cartesian left-handed systems as well as to right-handed systems without change of sign.
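The index bookkeeping behind (19a) and (20b) can be checked with a sketch. It assumes the identification $h_{23}, h_{31}, h_{12} = h_1, h_2, h_3$, written compactly as $h_{\mu\nu} = \delta_{\mu\nu k} h_k$, and uses a hypothetical test field that is linear in the co-ordinates, $h_k = M_{kj} x_j$, so that every derivative $\partial h_k / \partial x_j$ is simply the constant `M[k][j]`. Indices run over 0, 1, 2 in the code.

```python
def levi_civita(i, j, k):
    """Special skew-symmetrical tensor: +1, -1 or 0 for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

# Hypothetical constant gradients: M[k][j] stands for d h_k / d x_j.
M = [[0.3, -1.2, 0.7],
     [2.0, 0.5, -0.4],
     [-0.9, 1.1, 0.8]]

def curl_component(M, mu):
    """Left-hand side of the first three equations (19) in the usual notation."""
    a, b = (mu + 1) % 3, (mu + 2) % 3
    return M[b][a] - M[a][b]

def d_h_tensor(M, mu, nu, rho):
    """d h_{mu nu} / d x_rho with h_{mu nu} = delta_{mu nu k} h_k."""
    return sum(levi_civita(mu, nu, k) * M[k][rho] for k in range(3))

def tensor_divergence(M, mu):
    """Left-hand side of (19a): d h_{mu nu} / d x_nu, summed over nu."""
    return sum(d_h_tensor(M, mu, nu, nu) for nu in range(3))

def cyclic_sum(M, mu, nu, rho):
    """Left-hand side of (20b)."""
    return (d_h_tensor(M, mu, nu, rho) + d_h_tensor(M, nu, rho, mu)
            + d_h_tensor(M, rho, mu, nu))

divergence = M[0][0] + M[1][1] + M[2][2]

for mu in range(3):
    print(curl_component(M, mu), tensor_divergence(M, mu))   # pairwise equal
print(divergence, cyclic_sum(M, 0, 1, 2), cyclic_sum(M, 1, 0, 2))
```

The contraction in (19a) reproduces the three components written out in (19), and the cyclic sum in (20b) equals the ordinary divergence for the index combination 1, 2, 3, changes sign when two indices are interchanged, and vanishes identically when two indices coincide.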

  1. This relation must hold for an arbitrary choice of the origin and of the direction (ratios $\Delta x_1 : \Delta x_2 : \Delta x_3$) of the interval.
  2. In reality there are $\frac{n(n-1)}{2} - 3n + 6$ equations.
  3. There are thus two kinds of Cartesian systems which are designated as "right-handed" and "left-handed" systems. The difference between these is familiar to every physicist and engineer. It is interesting to note that these two kinds of systems cannot be defined geometrically, but only the contrast between them.
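  4. The equation $a_{\mu\nu}\, \xi_\mu\, \xi_\nu = 1$ may, by (5), be replaced by $a_{\mu\nu}\, b_{\sigma\mu}\, b_{\tau\nu}\, \xi'_\sigma\, \xi'_\tau = 1$, from which the result stated immediately follows.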
  5. The laws of physics could be expressed, even in case there were a unique direction in space, in such a way as to be co-variant with respect to the transformation (3); but such an expression would in this case be unsuitable. If there were a unique direction in space it would simplify the description of natural phenomena to orient the system of co-ordinates in a definite way in this direction. But if, on the other hand, there is no unique direction in space it is not logical to formulate the laws of nature in such a way as to conceal the equivalence of systems of co-ordinates that are oriented differently. We shall meet with this point of view again in the theories of special and general relativity.
  6. These considerations will make the reader familiar with tensor operations without the special difficulties of the four-dimensional treatment; corresponding considerations in the theory of special relativity (Minkowski's interpretation of the field) will then offer fewer difficulties.