Elementary Principles in Statistical Mechanics/Chapter XIV

From Wikisource

CHAPTER XIV.

DISCUSSION OF THERMODYNAMIC ANALOGIES.

If we wish to find in rational mechanics an a priori foundation for the principles of thermodynamics, we must seek mechanical definitions of temperature and entropy. The quantities thus defined must satisfy (under conditions and with limitations which again must be specified in the language of mechanics) the differential equation



d\epsilon = T d\eta - A_1 da_1 - A_2 da_2 - \mathrm{etc.}
,
(482)
where \epsilon, T, and \eta denote the energy, temperature, and entropy of the system considered, and A_1 da_1 etc., the mechanical work (in the narrower sense in which the term is used in thermodynamics, i. e., with exclusion of thermal action) done upon external bodies.

This implies that we are able to distinguish in mechanical terms the thermal action of one system on another from that which we call mechanical in the narrower sense, if not indeed in every case in which the two may be combined, at least so as to specify cases of thermal action and cases of mechanical action.

Such a differential equation moreover implies a finite equation between \epsilon, \eta, and a_1, a_2, etc., which may be regarded as fundamental in regard to those properties of the system which we call thermodynamic, or which may be called so from analogy. This fundamental thermodynamic equation is determined by the fundamental mechanical equation which expresses the energy of the system as function of its momenta and coördinates with those external coördinates (a_1, a_2, etc.) which appear in the differential expression of the work done on external bodies. We have to show the mathematical operations by which the fundamental thermodynamic equation, which in general is an equation of few variables, is derived from the fundamental mechanical equation, which in the case of the bodies of nature is one of an enormous number of variables.

We have also to enunciate in mechanical terms, and to prove, what we call the tendency of heat to pass from a system of higher temperature to one of lower, and to show that this tendency vanishes with respect to systems of the same temperature.

At least, we have to show by a priori reasoning that for such systems as the material bodies which nature presents to us, these relations hold with such approximation that they are sensibly true for human faculties of observation. This indeed is all that is really necessary to establish the science of thermodynamics on an a priori basis. Yet we will naturally desire to find the exact expression of those principles of which the laws of thermodynamics are the approximate expression. A very little study of the statistical properties of conservative systems of a finite number of degrees of freedom is sufficient to make it appear, more or less distinctly, that the general laws of thermodynamics are the limit toward which the exact laws of such systems approximate, when their number of degrees of freedom is indefinitely increased. And the problem of finding the exact relations, as distinguished from the approximate, for systems of a great number of degrees of freedom, is practically the same as that of finding the relations which hold for any number of degrees of freedom, as distinguished from those which have been established on an empirical basis for systems of a great number of degrees of freedom.

The enunciation and proof of these exact laws, for systems of any finite number of degrees of freedom, has been a principal object of the preceding discussion. But it should be distinctly stated that, if the results obtained when the numbers of degrees of freedom are enormous coincide sensibly with the general laws of thermodynamics, however interesting and significant this coincidence may be, we are still far from having explained the phenomena of nature with respect to these laws. For, as compared with the case of nature, the systems which we have considered are of an ideal simplicity. Although our only assumption is that we are considering conservative systems of a finite number of degrees of freedom, it would seem that this is assuming far too much, so far as the bodies of nature are concerned. The phenomena of radiant heat, which certainly should not be neglected in any complete system of thermodynamics, and the electrical phenomena associated with the combination of atoms, seem to show that the hypothesis of systems of a finite number of degrees of freedom is inadequate for the explanation of the properties of bodies.

Nor do the results of such assumptions in every detail appear to agree with experience. We should expect, for example, that a diatomic gas, so far as it could be treated independently of the phenomena of radiation, or of any sort of electrical manifestations, would have six degrees of freedom for each molecule. But the behavior of such a gas seems to indicate not more than five.

But although these difficulties, long recognized by physicists,[1] seem to prevent, in the present state of science, any satisfactory explanation of the phenomena of thermodynamics as presented to us in nature, the ideal case of systems of a finite number of degrees of freedom remains as a subject which is certainly not devoid of a theoretical interest, and which may serve to point the way to the solution of the far more difficult problems presented to us by nature. And if the study of the statistical properties of such systems gives us an exact expression of laws which in the limiting case take the form of the received laws of thermodynamics, its interest is so much the greater.

Now we have defined what we have called the modulus (\Theta) of an ensemble of systems canonically distributed in phase, and what we have called the index of probability (\eta) of any phase in such an ensemble. It has been shown that between the modulus (\Theta), the external coördinates (a_1, etc.), and the average values in the ensemble of the energy (\epsilon), the index of probability (\eta), and the external forces (A_1, etc.) exerted by the systems, the following differential equation will hold:



d\overline\epsilon = - \Theta d\overline\eta - \overline A_1 da_1 - \overline A_2 da_2 - \mathrm{etc.}

(483)
This equation, if we neglect the sign of averages, is identical in form with the thermodynamic equation (482), the modulus (\Theta) corresponding to temperature, and the index of probability of phase with its sign reversed corresponding to entropy.[2]

We have also shown that the average square of the anomalies of \epsilon, that is, of the deviations of the individual values from the average, is in general of the same order of magnitude as the reciprocal of the number of degrees of freedom, and therefore to human observation the individual values are indistinguishable from the average values when the number of degrees of freedom is very great.[3] In this case also the anomalies of \eta are practically insensible. The same is true of the anomalies of the external forces (A_1, etc.), so far as these are the result of the anomalies of energy, so that when these forces are sensibly determined by the energy and the external coördinates, and the number of degrees of freedom is very great, the anomalies of these forces are insensible.

The mathematical operations by which the finite equation between \overline\epsilon, \overline\eta, and a_1, etc., is deduced from that which gives the energy (\epsilon) of a system in terms of the momenta (p_1\ldots p_n) and coördinates both internal (q_1\ldots q_n) and external (a_1, etc.), are indicated by the equation



e^{-\frac{\psi}{\Theta}} =
\mathop{\int\ldots\int}^{\rm all}_{\rm phases}\, e^{-\frac{\epsilon}{\Theta}} \, dq_1\ldots dq_n \, dp_1 \ldots dp_n
,
(484)
where


\psi = \Theta \overline\eta + \overline\epsilon
.
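
As a minimal numerical sketch of the operation indicated by (484), the following evaluates the phase integral for a single harmonic oscillator, taking the integrand to be the canonical weight e^{-\epsilon/\Theta}. The oscillator, the names `mass` and `stiffness`, and the midpoint rule are assumptions made for illustration only; they do not appear in the text.

```python
import math

def gaussian_integral(scale, half_width=12.0, steps=4000):
    """Midpoint-rule estimate of the integral of exp(-x^2 / (2 scale^2)) over the real line."""
    limit = half_width * scale          # the tails beyond this are negligible
    h = 2.0 * limit / steps
    total = 0.0
    for i in range(steps):
        x = -limit + (i + 0.5) * h
        total += math.exp(-x * x / (2.0 * scale * scale))
    return total * h

def psi_oscillator(theta, mass=1.0, stiffness=1.0):
    """psi from exp(-psi/theta) = double integral of exp(-eps/theta) dq dp.

    Assumed energy: eps = p^2 / (2 mass) + stiffness * q^2 / 2 (one degree of freedom).
    """
    # The integrand factorizes into a Gaussian in p and a Gaussian in q.
    integral_p = gaussian_integral(math.sqrt(mass * theta))
    integral_q = gaussian_integral(math.sqrt(theta / stiffness))
    return -theta * math.log(integral_p * integral_q)

# Closed form for comparison: psi = -theta * log(2 pi theta sqrt(mass / stiffness)).
psi_numeric = psi_oscillator(1.0)
psi_exact = -math.log(2.0 * math.pi)
```

For this assumed system the enormous number of variables of the general case reduces to two, which is what makes the direct integration feasible.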

We have also shown that when systems of different ensembles are brought into conditions analogous to thermal contact, the average result is a passage of energy from the ensemble of the greater modulus to that of the less,[4] or in case of equal moduli, that we have a condition of statistical equilibrium in regard to the distribution of energy.[5]

Propositions have also been demonstrated analogous to those in thermodynamics relating to a Carnot's cycle,[6] or to the tendency of entropy to increase,[7] especially when bodies of different temperature are brought into contact.[8]

We have thus precisely defined quantities, and rigorously demonstrated propositions, which hold for any number of degrees of freedom, and which, when the number of degrees of freedom (n) is enormously great, would appear to human faculties as the quantities and propositions of empirical thermodynamics.

It is evident, however, that there may be more than one quantity defined for finite values of n, which approach the same limit, when n is increased indefinitely, and more than one proposition relating to finite values of n, which approach the same limiting form for n = \infty. There may be therefore, and there are, other quantities which may be thought to have some claim to be regarded as temperature and entropy with respect to systems of a finite number of degrees of freedom.

The definitions and propositions which we have been considering relate essentially to what we have called a canonical ensemble of systems. This may appear a less natural and simple conception than what we have called a microcanonical ensemble of systems, in which all have the same energy and which in many cases represents simply the time-ensemble, or ensemble of phases through which a single system passes in the course of time.

It may therefore seem desirable to find definitions and propositions relating to these microcanonical ensembles, which shall correspond to what in thermodynamics are based on experience. Now the differential equation



d\epsilon = e^{-\phi} V \, d\log V - \overline{A_1}|_{\epsilon} \, da_1 - \overline{A_2}|_{\epsilon} \, da_2 - \mathrm{etc.}
,
(485)
which has been demonstrated in Chapter X, and which relates to a microcanonical ensemble, \overline{A_1}|_{\epsilon} denoting the average value of A_1 in such an ensemble, corresponds precisely to the thermodynamic equation, except for the sign of average applied to the external forces. But as these forces are not entirely determined by the energy with the external coördinates, the use of average values is entirely germane to the subject, and affords the readiest means of getting perfectly determined quantities. These averages, which are taken for a microcanonical ensemble, may seem from some points of view a more simple and natural conception than those which relate to a canonical ensemble. Moreover, the energy, and the quantity corresponding to entropy, are free from the sign of average in this equation.

The quantity in the equation which corresponds to entropy is \log V, the quantity V being defined as the extension-in-phase within which the energy is less than a certain limiting value (\epsilon). This is certainly a more simple conception than the average value in a canonical ensemble of the index of probability of phase. \operatorname{Log} V has the property that when it is constant



d\epsilon = - \overline{A_1}|_{\epsilon} \, da_1 - \overline{A_2}|_{\epsilon} \, da_2 - \mathrm{etc.}
,
(486)
which closely corresponds to the thermodynamic property of entropy, that when it is constant


d\epsilon = - A_1 \, da_1 - A_2 \, da_2 - \mathrm{etc.}
.
(487)
The quantity in the equation which corresponds to temperature is e^{-\phi}V, or d\epsilon/d\log V. In a canonical ensemble, the average value of this quantity is equal to the modulus, as has been shown by different methods in Chapters IX and X.

In Chapter X it has also been shown that if the systems of a microcanonical ensemble consist of parts with separate energies, the average value of e^{-\phi}V for any part is equal to its average value for any other part, and to the uniform value of the same expression for the whole ensemble. This corresponds to the theorem in the theory of heat that in case of thermal equilibrium the temperatures of the parts of a body are equal to one another and to that of the whole body. Since the energies of the parts of a body cannot be supposed to remain absolutely constant, even where this is the case with respect to the whole body, it is evident that if we regard the temperature as a function of the energy, the taking of average or of probable values, or some other statistical process, must be used with reference to the parts, in order to get a perfectly definite value corresponding to the notion of temperature.

It is worthy of notice in this connection that the average value of the kinetic energy, either in a microcanonical ensemble, or in a canonical, divided by one half the number of degrees of freedom, is equal to e^{-\phi}V, or to its average value, and that this is true not only of the whole system which is distributed either microcanonically or canonically, but also of any part, although the corresponding theorem relating to temperature hardly belongs to empirical thermodynamics, since neither the (inner) kinetic energy of a body, nor its number of degrees of freedom is immediately cognizable to our faculties, and we meet the gravest difficulties when we endeavor to apply the theorem to the theory of gases, except in the simplest case, that of the gases known as monatomic.

But the correspondence between e^{-\phi}V or d\epsilon/d\log V and temperature is imperfect. If two isolated systems have such energies that



\frac{d\epsilon_1}{d\log V_1} = \frac{d\epsilon_2}{d\log V_2}
,
and the two systems are regarded as combined to form a third system with energy


\epsilon_{12} = \epsilon_1 + \epsilon_2
,
we shall not have in general


\frac{d\epsilon_{12}}{d\log V_{12}} = \frac{d\epsilon_1}{d\log V_1} = \frac{d\epsilon_2}{d\log V_2}
,
as analogy with temperature would require. In fact, we have seen that


\frac{d\epsilon_{12}}{d\log V_{12}} = \overline{\frac{d\epsilon_1}{d\log V_1}}\bigg|_{\epsilon_{12}} = \overline{\frac{d\epsilon_2}{d\log V_2}}\bigg|_{\epsilon_{12}}
,
where the second and third members of the equation denote average values in an ensemble in which the compound system is microcanonically distributed in phase. Let us suppose the two original systems to be identical in nature. Then


\epsilon_1 = \epsilon_2 = \overline{\epsilon_1}|_{\epsilon_{12}} = \overline{\epsilon_2}|_{\epsilon_{12}}
.
The equation in question would require that


\frac{d\epsilon_{1}}{d\log V_{1}} = \overline{\frac{d\epsilon_1}{d\log V_1}}\bigg|_{\epsilon_{12}}
,
i. e., that we get the same result, whether we take the value of d\epsilon_1/d\log V_1 determined for the average value of \epsilon_1 in the ensemble, or take the average value of d\epsilon_1/d\log V_1. This will be the case where d\epsilon_1/d\log V_1 is a linear function of \epsilon_1. Evidently this does not constitute the most general case. Therefore the equation in question cannot be true in general. It is true, however, in some very important particular cases, as when the energy is a quadratic function of the p's and q's, or of the p's alone.[9] When the equation holds, the case is analogous to that of bodies in thermodynamics for which the specific heat for constant volume is constant.

Another quantity which is closely related to temperature is d\phi/d\epsilon. It has been shown in Chapter IX that in a canonical ensemble, if n>2, the average value of d\phi/d\epsilon is 1/\Theta, and that the most common value of the energy in the ensemble is that for which d\phi/d\epsilon = 1/\Theta. The first of these properties may be compared with that of d\epsilon/d\log V, which has been seen to have the average value \Theta in a canonical ensemble, without restriction in regard to the number of degrees of freedom.

With respect to microcanonical ensembles also, d\phi/d\epsilon has a property similar to what has been mentioned with respect to d\epsilon/d\log V. That is, if a system microcanonically distributed in phase consists of two parts with separate energies, and each with more than two degrees of freedom, the average values in the ensemble of d\phi/d\epsilon for the two parts are equal to one another and to the value of the same expression for the whole. In our usual notations



\overline{\frac{d\phi_1}{d\epsilon_1}}\bigg|_{\epsilon_{12}} =
\overline{\frac{d\phi_2}{d\epsilon_2}}\bigg|_{\epsilon_{12}} =
\frac{d\phi_{12}}{d\epsilon_{12}}
if n_1 > 2, and n_2 > 2.

This analogy with temperature has the same incompleteness which was noticed with respect to d\epsilon/d\log V, viz., if two systems have such energies (\epsilon_1 and \epsilon_2) that



\frac{d\phi_1}{d\epsilon_1} = \frac{d\phi_2}{d\epsilon_2}
,
and they are combined to form a third system with energy


\epsilon_{12} = \epsilon_1 + \epsilon_2
,
we shall not have in general


\frac{d\phi_{12}}{d\epsilon_{12}} = \frac{d\phi_1}{d\epsilon_1} = \frac{d\phi_2}{d\epsilon_2}
.
Thus, if the energy is a quadratic function of the p's and q's, we have[10]


\frac{d\phi_1}{d\epsilon_1} = \frac{n_1 - 1}{\epsilon_1}, \qquad
\frac{d\phi_2}{d\epsilon_2} = \frac{n_2 - 1}{\epsilon_2}
,


\frac{d\phi_{12}}{d\epsilon_{12}} = \frac{n_{12} - 1}{\epsilon_{12}} = \frac{n_1 + n_2 - 1}{\epsilon_1 + \epsilon_2}
,
where n_1, n_2, n_{12}, are the numbers of degrees of freedom of the separate and combined systems. But


\frac{d\phi_1}{d\epsilon_1} = \frac{d\phi_2}{d\epsilon_2} = \frac{n_1 + n_2 - 2}{\epsilon_1 + \epsilon_2}
.
If the energy is a quadratic function of the p's alone, the case would be the same except that we should have \tfrac 12 n_1, \tfrac 12 n_2, \tfrac 12 n_{12}, instead of n_1, n_2, n_{12}. In these particular cases, the analogy between d\epsilon/d\log V and temperature would be complete, as has already been remarked. We should have



\frac{d\epsilon_{1}}{d\log V_{1}} = \frac{\epsilon_1}{n_1},
\qquad
\frac{d\epsilon_{2}}{d\log V_{2}} = \frac{\epsilon_2}{n_2}
,


\frac{d\epsilon_{12}}{d\log V_{12}} = \frac{\epsilon_{12}}{n_{12}} = \frac{d\epsilon_{1}}{d\log V_{1}} = \frac{d\epsilon_{2}}{d\log V_{2}}
,
when the energy is a quadratic function of the p's and q's, and similar equations with \tfrac 12 n_1, \tfrac 12 n_2, \tfrac 12 n_{12}, instead of n_1, n_2, n_{12}, when the energy is a quadratic function of the p's alone.
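
The contrast between the two analogues of temperature in the quadratic case can be checked with exact arithmetic. The sketch below assumes V proportional to \epsilon^n (energy quadratic in the p's and q's), so that d\epsilon/d\log V = \epsilon/n and d\phi/d\epsilon = (n-1)/\epsilon; the function names and the particular numbers are illustrative assumptions.

```python
from fractions import Fraction

def temperature_like(energy, n):
    """d(eps)/d(log V) = eps / n when V is proportional to eps^n."""
    return Fraction(energy) / n

def inverse_temperature_like(energy, n):
    """d(phi)/d(eps) = (n - 1) / eps when e^phi = dV/d(eps) is proportional to eps^(n - 1)."""
    return Fraction(n - 1) / Fraction(energy)

n1, n2 = 3, 5

# Equal values of d(eps)/d(log V): energies in the ratio n1 : n2.
e1, e2 = Fraction(3), Fraction(5)
whole_matches_parts = (
    temperature_like(e1, n1)
    == temperature_like(e2, n2)
    == temperature_like(e1 + e2, n1 + n2)
)

# Equal values of d(phi)/d(eps): energies in the ratio (n1 - 1) : (n2 - 1).
f1, f2 = Fraction(2), Fraction(4)
parts = inverse_temperature_like(f1, n1)            # (n1 + n2 - 2) / (f1 + f2)
whole = inverse_temperature_like(f1 + f2, n1 + n2)  # (n1 + n2 - 1) / (f1 + f2)
```

With these numbers the parts share the value 1 of d\phi/d\epsilon while the combined system has 7/6, whereas d\epsilon/d\log V is the same for the parts and for the whole.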

More characteristic of d\phi/d\epsilon are its properties relating to most probable values of energy. If a system having two parts with separate energies and each with more than two degrees of freedom is microcanonically distributed in phase, the most probable division of energy between the parts, in a system taken at random from the ensemble, satisfies the equation



\frac{d\phi_1}{d\epsilon_1} = \frac{d\phi_2}{d\epsilon_2}
,
(488)
which corresponds to the thermodynamic theorem that the distribution of energy between the parts of a system, in case of thermal equilibrium, is such that the temperatures of the parts are equal.

To prove the theorem, we observe that the fractional part of the whole number of systems which have the energy of one part (\epsilon_1) between the limits \epsilon_1' and \epsilon_1'' is expressed by



e^{-\phi_{12}} \int_{\epsilon_1'}^{\epsilon_1''} e^{\phi_1 + \phi_2} \, d\epsilon_1
,
where the variables are connected by the equation


\epsilon_1 + \epsilon_2 = \mathrm{constant} = \epsilon_{12}
.
The greatest value of this expression, for a constant infinitesimal value of the difference \epsilon_1'' - \epsilon_1', determines a value of \epsilon_1, which we may call its most probable value. This depends on the greatest possible value of \phi_1 + \phi_2. Now if n_1 > 2, and n_2 > 2, we shall have \phi_1 = -\infty for the least possible value of \epsilon_1, and \phi_2 = -\infty for the least possible value of \epsilon_2. Between these limits \phi_1 and \phi_2 will be finite and continuous. Hence \phi_1 + \phi_2 will have a maximum satisfying the equation (488).
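
The maximum asserted in this proof can be located numerically in a simple case. The sketch below assumes \phi_i = (n_i - 1)\log\epsilon_i up to a constant, which is the form for energies quadratic in the p's and q's; the grid search and the function name are illustrative choices, not part of the argument.

```python
import math

def most_probable_energy(n1, n2, total_energy, steps=6000):
    """Grid search for the eps_1 that maximizes phi_1 + phi_2 at fixed eps_1 + eps_2.

    Assumes phi_i = (n_i - 1) * log(eps_i) up to a constant (quadratic energies).
    """
    best_eps, best_value = None, -math.inf
    for k in range(1, steps):  # interior points only; phi is -infinity at the ends
        eps1 = total_energy * k / steps
        eps2 = total_energy - eps1
        value = (n1 - 1) * math.log(eps1) + (n2 - 1) * math.log(eps2)
        if value > best_value:
            best_eps, best_value = eps1, value
    return best_eps

eps1 = most_probable_energy(n1=3, n2=5, total_energy=6.0)
# At the maximum the two slopes (n_i - 1) / eps_i agree, as equation (488) requires.
slope_1 = (3 - 1) / eps1
slope_2 = (5 - 1) / (6.0 - eps1)
```

The search is restricted to interior points because, for n_1 > 2 and n_2 > 2, the sum \phi_1 + \phi_2 falls to -\infty at both ends of the interval.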

But if n_1 \leq 2, or n_2 \leq 2, d\phi_1/d\epsilon_1 or d\phi_2/d\epsilon_2 may be negative, or zero, for all values of \epsilon_1 or \epsilon_2, and can hardly be regarded as having properties analogous to temperature.

It is also worthy of notice that if a system which is microcanonically distributed in phase has three parts with separate energies, and each with more than two degrees of freedom, the most probable division of energy between these parts satisfies the equation



\frac{d\phi_1}{d\epsilon_1} = \frac{d\phi_2}{d\epsilon_2} = \frac{d\phi_3}{d\epsilon_3}
.
That is, this equation gives the most probable set of values of \epsilon_1, \epsilon_2, and \epsilon_3. But it does not give the most probable value of \epsilon_1, or of \epsilon_2, or of \epsilon_3. Thus, if the energies are quadratic functions of the p's and q's, the most probable division of energy is given by the equation


\frac{n_1 - 1}{\epsilon_1} = \frac{n_2 - 1}{\epsilon_2} = \frac{n_3 - 1}{\epsilon_3}
.
But the most probable value of \epsilon_1 is given by


\frac{n_1 - 1}{\epsilon_1} = \frac{n_2 + n_3 - 1}{\epsilon_2 + \epsilon_3}
,
while the preceding equations give


\frac{n_1 - 1}{\epsilon_1} = \frac{n_2 + n_3 - 2}{\epsilon_2 + \epsilon_3}
.

These distinctions vanish for very great values of n_1, n_2, n_3. For small values of these numbers, they are important. Such facts seem to indicate that the consideration of the most probable division of energy among the parts of a system does not afford a convenient foundation for the study of thermodynamic analogies in the case of systems of a small number of degrees of freedom. The fact that a certain division of energy is the most probable has really no especial physical importance, except when the ensemble of possible divisions are grouped so closely together that the most probable division may fairly represent the whole. This is in general the case, to a very close approximation, when n is enormously great; it entirely fails when n is small.
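
A small exact computation may make the distinction concrete. It assumes energies quadratic in the p's and q's, so that the joint maximum puts \epsilon_i in proportion to n_i - 1, while the most probable \epsilon_1 taken alone treats the second and third parts as one system of n_2 + n_3 degrees of freedom. The function names and the numbers are illustrative assumptions.

```python
from fractions import Fraction

def most_probable_set(ns, total_energy):
    """Joint maximum of phi_1 + phi_2 + phi_3: eps_i proportional to n_i - 1."""
    weight = sum(n - 1 for n in ns)
    return [Fraction(total_energy) * (n - 1) / weight for n in ns]

def most_probable_first(ns, total_energy):
    """Most probable eps_1 alone, treating parts 2 and 3 as a single system."""
    n1, n_rest = ns[0], sum(ns[1:])
    # Solves (n1 - 1) / eps_1 = (n_rest - 1) / (E - eps_1) for eps_1.
    return Fraction(total_energy) * (n1 - 1) / (n1 + n_rest - 2)

small = (3, 3, 3)
large = (3000, 3000, 3000)

joint_small = most_probable_set(small, 12)[0]   # 4
alone_small = most_probable_first(small, 12)    # 24/7
joint_large = most_probable_set(large, 12)[0]
alone_large = most_probable_first(large, 12)
```

For three degrees of freedom in each part the two answers differ by 4/7 of an energy unit out of 12; for three thousand in each part the difference is below one thousandth.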

If we regard d\phi/d\epsilon as corresponding to the reciprocal of temperature, or, in other words, d\epsilon/d\phi as corresponding to temperature, \phi will correspond to entropy. It has been defined as \log(dV/d\epsilon). In the considerations on which its definition is founded, it is therefore very similar to \log V. We have seen that d\phi/d\log V approaches the value unity when n is very great.[11]

To form a differential equation on the model of the thermodynamic equation (482), in which d\epsilon/d\phi shall take the place of temperature, and \phi of entropy, we may write



d\epsilon = \bigg(\frac{d\epsilon}{d\phi}\bigg)_a\, d\phi 
+ \bigg(\frac{d\epsilon}{da_1}\bigg)_{\phi,a}\, da_1
+ \bigg(\frac{d\epsilon}{da_2}\bigg)_{\phi,a}\, da_2
+ \mathrm{etc.}
,
(489)
or


d\phi = \frac{d\phi}{d\epsilon}\,d\epsilon
+ \frac{d\phi}{da_1}\,da_1
+ \frac{d\phi}{da_2}\,da_2
+ \mathrm{etc.}

(490)
With respect to the differential coefficients in the last equation, which corresponds exactly to (482) solved with respect to d\eta, we have seen that their average values in a canonical ensemble are equal to 1/\Theta, and the averages of A_1/\Theta, A_2/\Theta, etc.[12] We have also seen that d\epsilon/d\phi (or d\phi/d\epsilon) has relations to the most probable values of energy in parts of a microcanonical ensemble. That (d\epsilon/da_1)_{\phi,a}, etc., have properties somewhat analogous, may be shown as follows.

In a physical experiment, we measure a force by balancing it against another. If we should ask what force applied to increase or diminish a_1 would balance the action of the systems, it would be one which varies with the different systems. But we may ask what single force will make a given value of a_1 the most probable, and we shall find that under certain conditions (d\epsilon/da_1)_{\phi,a} represents that force.

To make the problem definite, let us consider a system consisting of the original system together with another having the coördinates a_1, a_2, etc., and forces A_1', A_2' etc., tending to increase those coördinates. These are in addition to the forces A_1, A_2, etc., exerted by the original system, and are derived from a force-function (-\epsilon_q') by the equations



A_1' = -\frac{d\epsilon_q'}{da_1}
, \quad
A_2' = -\frac{d\epsilon_q'}{da_2}
, \quad \mathrm{etc.}
For the energy of the whole system we may write


E = \epsilon + \epsilon_q' + \tfrac 12 m_1 \dot a_1^2 + \tfrac 12 m_2 \dot a_2^2 + \mathrm{etc.}
,
and for the extension-in-phase of the whole system within any limits


\int \ldots \int dp_1 \ldots dq_n \, da_1 \, m_1 \, d\dot a_1 \, da_2 \, m_2 \, d\dot a_2 \ldots
or


\int e^\phi \, d\epsilon \, da_1 \, m_1 \, d\dot a_1 \, da_2 \, m_2 \, d\dot a_2 \ldots
,
or again


\int e^\phi \, dE \, da_1 \, m_1 \, d\dot a_1 \, da_2 \, m_2 \, d\dot a_2 \ldots
,
since d\epsilon = dE, when a_1, \dot a_1, a_2, \dot a_2, etc., are constant. If the limits are expressed by E and E + dE, a_1 and a_1 + da_1, \dot a_1 and \dot a_1 + d\dot a_1, etc., the integral reduces to


e^\phi \, dE \, da_1 \, m_1 \, d\dot a_1 \, da_2 \, m_2 \, d\dot a_2 \ldots
The values of a_1, \dot a_1, a_2, \dot a_2, etc., which make this expression a maximum for constant values of the energy of the whole system and of the differentials dE, da_1, d\dot a_1, etc., are what may be called the most probable values of a_1, \dot a_1, etc., in an ensemble in which the whole system is distributed microcanonically. To determine these values we have


de^\phi = 0
,
when


d(\epsilon + \epsilon_q' + \tfrac 12 m_1 \dot a_1^2 + \tfrac 12 m_2 \dot a_2^2 + \mathrm{etc.}) = 0
.
That is,


d\phi = 0
,
when


\bigg(\frac{d\epsilon}{d\phi}\bigg)_a\, d\phi
+ \bigg(\frac{d\epsilon}{da_1}\bigg)_{\phi,a}\, da_1
- A_1' da_1 + \mathrm{etc.} + m_1 \dot a_1 d\dot a_1 + \mathrm{etc.} = 0
.
This requires


\dot a_1 = 0, \quad \dot a_2 = 0, \quad \mathrm{etc.}
,
and


\bigg(\frac{d\epsilon}{da_1}\bigg)_{\phi,a} = A_1', \quad
\bigg(\frac{d\epsilon}{da_2}\bigg)_{\phi,a} = A_2', \quad
\mathrm{etc.}
This shows that for any given values of E, a_1, a_2, etc., \Big(\frac{d\epsilon}{da_1}\Big)_{\phi,a}, \Big(\frac{d\epsilon}{da_2}\Big)_{\phi,a}, etc., represent the forces (in the generalized sense) which the external bodies would have to exert to make these values of a_1, a_2, etc., the most probable under the conditions specified. When the differences of the external forces which are exerted by the different systems are negligible, (d\epsilon/da_1)_{\phi,a}, etc., represent these forces.

It is certainly in the quantities relating to a canonical ensemble, \overline\epsilon, \Theta, \overline\eta, \overline A_1, etc., a_1, etc. that we find the most complete correspondence with the quantities of the thermodynamic equation (482). Yet the conception itself of the canonical ensemble may seem to some artificial, and hardly germane to a natural exposition of the subject; and the quantities \epsilon, \frac{d\epsilon}{d\log V}, \log V, \overline{A_1}|_{\epsilon}, etc., a_1, etc., or \epsilon, \frac{d\epsilon}{d\phi}, \phi, \Big(\frac{d\epsilon}{da_1}\Big)_{\phi,a}, etc., a_1, etc., which are closely related to ensembles of constant energy, and to average and most probable values in such ensembles, and most of which are defined without reference to any ensemble, may appear the most natural analogues of the thermodynamic quantities.

In regard to the naturalness of seeking analogies with the thermodynamic behavior of bodies in canonical or microcanonical ensembles of systems, much will depend upon how we approach the subject, especially upon the question whether we regard energy or temperature as an independent variable.

It is very natural to take energy for an independent variable rather than temperature, because ordinary mechanics furnishes us with a perfectly defined conception of energy, whereas the idea of something relating to a mechanical system and corresponding to temperature is a notion but vaguely defined. Now if the state of a system is given by its energy and the external coördinates, it is incompletely defined, although its partial definition is perfectly clear as far as it goes. The ensemble of phases microcanonically distributed, with the given values of the energy and the external coördinates, will represent the imperfectly defined system better than any other ensemble or single phase. When we approach the subject from this side, our theorems will naturally relate to average values, or most probable values, in such ensembles.

In this case, the choice between the variables of (485) or of (489) will be determined partly by the relative importance which is attached to average and probable values. It would seem that in general average values are the most important, and that they lend themselves better to analytical transformations. This consideration would give the preference to the system of variables in which \log V is the analogue of entropy. Moreover, if we make \phi the analogue of entropy, we are embarrassed by the necessity of making numerous exceptions for systems of one or two degrees of freedom.

On the other hand, the definition of \phi may be regarded as a little more simple than that of \log V, and if our choice is determined by the simplicity of the definitions of the analogues of entropy and temperature, it would seem that the \phi system should have the preference. In our definition of these quantities, V was defined first, and e^\phi derived from V by differentiation. This gives the relation of the quantities in the most simple analytical form. Yet so far as the notions are concerned, it is perhaps more natural to regard \phi as derived from e^\phi by integration. At all events, e^\phi may be defined independently of V, and its definition may be regarded as more simple as not requiring the determination of the zero from which V is measured, which sometimes involves questions of a delicate nature. In fact, the quantity e^\phi may exist, when the definition of V becomes illusory for practical purposes, as the integral by which it is determined becomes infinite.

The case is entirely different, when we regard the temperature as an independent variable, and we have to consider a system which is described as having a certain temperature and certain values for the external coördinates. Here also the state of the system is not completely defined, and will be better represented by an ensemble of phases than by any single phase. What is the nature of such an ensemble as will best represent the imperfectly defined state?

When we wish to give a body a certain temperature, we place it in a bath of the proper temperature, and when we regard what we call thermal equilibrium as established, we say that the body has the same temperature as the bath. Perhaps we place a second body of standard character, which we call a thermometer, in the bath, and say that the first body, the bath, and the thermometer, have all the same temperature.

But the body under such circumstances, as well as the bath, and the thermometer, even if they were entirely isolated from external influences (which it is convenient to suppose in a theoretical discussion), would be continually changing in phase, and in energy as well as in other respects, although our means of observation are not fine enough to perceive these variations.

The series of phases through which the whole system runs in the course of time may not be entirely determined by the energy, but may depend on the initial phase in other respects. In such cases the ensemble obtained by the microcanonical distribution of the whole system, which includes all possible time-ensembles combined in the proportion which seems least arbitrary, will represent better than any one time-ensemble the effect of the bath. Indeed a single time-ensemble, when it is not also a microcanonical ensemble, is too ill-defined a notion to serve the purposes of a general discussion. We will therefore direct our attention, when we suppose the body placed in a bath, to the microcanonical ensemble of phases thus obtained.

If we now suppose the quantity of the substance forming the bath to be increased, the anomalies of the separate energies of the body and of the thermometer in the microcanonical ensemble will be increased, but not without limit. The anomalies of the energy of the bath, considered in comparison with its whole energy, diminish indefinitely as the quantity of the bath is increased, and become in a sense negligible, when the quantity of the bath is sufficiently increased. The ensemble of phases of the body, and of the thermometer, approach a standard form as the quantity of the bath is indefinitely increased. This limiting form is easily shown to be what we have described as the canonical distribution.

Let us write \epsilon for the energy of the whole system consisting of the body first mentioned, the bath, and the thermometer (if any), and let us first suppose this system to be distributed canonically with the modulus \Theta. We have by (205)



\overline{(\epsilon - \overline \epsilon)^2} = \Theta^2 \frac{d\overline\epsilon}{d\Theta}
,
and since


\overline\epsilon_p = \frac n2 \Theta
,


\frac{d\overline\epsilon}{d\Theta} = \frac n2 \frac{d\overline\epsilon}{d\overline\epsilon_p}
.
If we write \Delta \epsilon for the anomaly of mean square, we have


(\Delta \epsilon)^2 = \overline{(\epsilon - \overline \epsilon)^2}
.
If we set


\Delta \Theta = \frac{d\Theta}{d\overline \epsilon}\Delta \epsilon
,
\Delta \Theta will represent approximately the increase of \Theta which would produce an increase in the average value of the energy equal to its anomaly of mean square. Now these equations give


(\Delta \Theta)^2 = \frac{2\Theta^2}{n} \frac{d\overline\epsilon_p}{d\overline\epsilon}
,
which shows that we may diminish \Delta \Theta indefinitely by increasing the quantity of the bath.
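
The step from the preceding equations to the expression for (\Delta \Theta)^2 is left to the reader in the text. A sketch of the intermediate algebra, using only the definition of \Delta \Theta, the value of the mean-square anomaly given by (205), and the relation between d\overline\epsilon/d\Theta and d\overline\epsilon/d\overline\epsilon_p, may be written as follows:

```latex
\begin{aligned}
(\Delta\Theta)^2
  &= \left(\frac{d\Theta}{d\overline\epsilon}\right)^2 (\Delta\epsilon)^2
   = \left(\frac{d\Theta}{d\overline\epsilon}\right)^2 \Theta^2 \frac{d\overline\epsilon}{d\Theta} \\
  &= \Theta^2 \frac{d\Theta}{d\overline\epsilon}
   = \frac{2\Theta^2}{n} \frac{d\overline\epsilon_p}{d\overline\epsilon}
\end{aligned}
```

As an illustration only, and on the assumption (not made in the text) that the energy is entirely kinetic, so that d\overline\epsilon_p/d\overline\epsilon = 1, this reduces to \Delta\Theta/\Theta = \sqrt{2/n}, which plainly diminishes as the number of degrees of freedom n grows with the quantity of the bath.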

Now our canonical ensemble consists of an infinity of microcanonical ensembles, which differ only in consequence of the different values of the energy which is constant in each. If we consider separately the phases of the first body which occur in the canonical ensemble of the whole system, these phases will form a canonical ensemble of the same modulus. This canonical ensemble of phases of the first body will consist of parts which belong to the different microcanonical ensembles into which the canonical ensemble of the whole system is divided.

Let us now imagine that the modulus of the principal canonical ensemble is increased by 2\Delta \Theta, and its average energy by 2\Delta \epsilon. The modulus of the canonical ensemble of the phases of the first body considered separately will be increased by 2\Delta \Theta. We may regard the infinity of microcanonical ensembles into which we have divided the principal canonical ensemble as each having its energy increased by 2\Delta \epsilon. Let us see how the ensembles of phases of the first body contained in these microcanonical ensembles are affected. We may assume that they will all be affected in about the same way, as all the differences which come into account may be treated as small. Therefore, the canonical ensemble formed by taking them together will also be affected in the same way. But we know how this is affected. It is by the increase of its modulus by 2\Delta \Theta, a quantity which vanishes when the quantity of the bath is indefinitely increased.

In the case of an infinite bath, therefore, the increase of the energy of one of the microcanonical ensembles by 2\Delta \epsilon, produces a vanishing effect on the distribution in energy of the phases of the first body which it contains. But 2\Delta \epsilon is more than the average difference of energy between the microcanonical ensembles. The distribution in energy of these phases is therefore the same in the different microcanonical ensembles, and must therefore be canonical, like that of the ensemble which they form when taken together.[13] As a general theorem, the conclusion may be expressed in the words:—If a system of a great number of degrees of freedom is microcanonically distributed in phase, any very small part of it may be regarded as canonically distributed.[14]

It would seem, therefore, that a canonical ensemble of phases is what best represents, with the precision necessary for exact mathematical reasoning, the notion of a body with a given temperature, if we conceive of the temperature as the state produced by such processes as we actually use in physics to produce a given temperature. Since the anomalies of the body increase with the quantity of the bath, we can only get rid of all that is arbitrary in the ensemble of phases which is to represent the notion of a body of a given temperature by making the bath infinite, which brings us to the canonical distribution.
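
The theorem, and the remark that the anomalies of the body increase with the quantity of the bath but not without limit, can be checked in closed form for one simple case. The sketch below assumes, purely for illustration and beyond anything stated in the text, that the energy of the whole system is a sum of n quadratic terms (as for kinetic energy alone). Under that assumption the share of the energy held by a part of m degrees of freedom in a microcanonical ensemble follows a Beta law with parameters m/2 and (n-m)/2, while in a canonical ensemble of modulus \Theta the energy of the same part follows a Gamma law of shape m/2 and scale \Theta. The function names are hypothetical.

```python
from fractions import Fraction

def body_energy_variance_microcanonical(m, n, theta):
    """Variance of the energy of an m-degree part of an n-degree system
    whose total energy is fixed at E = n*theta/2 (microcanonical).
    ASSUMPTION: purely quadratic energy, so part/E ~ Beta(m/2, (n-m)/2)."""
    a = Fraction(m, 2)
    b = Fraction(n - m, 2)
    total = Fraction(n, 2) * theta
    # Variance of a Beta(a, b) variable, scaled by the square of the total energy.
    return total**2 * a * b / ((a + b)**2 * (a + b + 1))

def body_energy_variance_canonical(m, theta):
    """Variance of the energy of the same part when it is canonically
    distributed with modulus theta (Gamma law, shape m/2, scale theta)."""
    return Fraction(m, 2) * theta**2

theta = Fraction(1)
m = 6  # degrees of freedom of the body
for n in (12, 120, 1200, 12000):  # degrees of freedom of body plus bath
    ratio = (body_energy_variance_microcanonical(m, n, theta)
             / body_energy_variance_canonical(m, theta))
    print(n, float(ratio))
```

In this special case the ratio of the two variances is exactly (n-m)/(n+2). It increases with n, stays below unity, and approaches unity as the bath becomes infinite, in agreement with the limiting canonical form described above.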

A comparison of temperature and entropy with their analogues in statistical mechanics would be incomplete without a consideration of their differences with respect to units and zeros, and the numbers used for their numerical specification. If we apply the notions of statistical mechanics to such bodies as we usually consider in thermodynamics, for which the kinetic energy is of the same order of magnitude as the unit of energy, but the number of degrees of freedom is enormous, the values of \Theta, d\epsilon/d\log V, and d\epsilon/d\phi will be of the same order of magnitude as 1/n, and the variable part of \overline \eta, \log V, and \phi will be of the same order of magnitude as n.[15] If these quantities, therefore, represent in any sense the notions of temperature and entropy, they will nevertheless not be measured in units of the usual order of magnitude,—a fact which must be borne in mind in determining what magnitudes may be regarded as insensible to human observation.

Now nothing prevents our supposing energy and time in our statistical formulae to be measured in such units as may be convenient for physical purposes. But when these units have been chosen, the numerical values of \Theta, d\epsilon/d\log V, d\epsilon/d\phi, \overline \eta, \log V, \phi, are entirely determined,[16] and in order to compare them with temperature and entropy, the numerical values of which depend upon an arbitrary unit, we must multiply all values of \Theta, d\epsilon/d\log V, d\epsilon/d\phi by a constant (K), and divide all values of \overline \eta, \log V, and \phi by the same constant. This constant is the same for all bodies, and depends only on the units of temperature and energy which we employ. For ordinary units it is of the same order of magnitude as the numbers of atoms in ordinary bodies.
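
The reason for multiplying the one set of quantities and dividing the other by the same constant may be seen in a single line. Taking the \phi system as a representative (the same reasoning applies to the other two pairs), and writing T and \eta for the thermodynamic temperature and entropy as in (482), the rescaling leaves the product which enters the differential equation unchanged:

```latex
T = K \frac{d\epsilon}{d\phi}, \qquad
\eta = \frac{\phi}{K} + \mathrm{const.}, \qquad
T \, d\eta = \left( K \frac{d\epsilon}{d\phi} \right) d\!\left( \frac{\phi}{K} \right)
           = \frac{d\epsilon}{d\phi} \, d\phi
```

This is a sketch of the bookkeeping only. The additive constant is the one referred to in the note on units of time.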

We are not able to determine the numerical value of K as it depends on the number of molecules in the bodies with which we experiment. To fix our ideas, however, we may seek an expression for this value, based upon very probable assumptions, which will show how we would naturally proceed to its evaluation, if our powers of observation were fine enough to take cognizance of individual molecules.

If the unit of mass of a monatomic gas contains \nu atoms, and it may be treated as a system of 3\nu degrees of freedom, which seems to be the case, we have for canonical distribution



\overline\epsilon_p = \tfrac 32 \nu \Theta
,


\frac{d\overline\epsilon_p}{d\Theta} = \tfrac 32 \nu
.
(491)
If we write T for temperature, and c_v for the specific heat of the gas for constant volume (or rather the limit toward which this specific heat tends, as rarefaction is indefinitely increased), we have


\frac{d\epsilon_p}{dT} = c_v
,
(492)
since we may regard the energy as entirely kinetic. We may set the \epsilon_p of this equation equal to the \overline\epsilon_p of the preceding, where indeed the individual values of which the average is taken would appear to human observation as identical. This gives


\frac{d\Theta}{dT} = \frac{2c_v}{3\nu}
,
whence


\frac 1K = \frac{2c_v}{3\nu}
.
(493)
a value recognized by physicists as a constant independent of the kind of monatomic gas considered.
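
To show how equation (493) would be used if the number of atoms were known, the following sketch evaluates 1/K for helium. All numerical values are modern reference figures in SI units, introduced here as assumptions for illustration; none of them appears in the text, and the function name is hypothetical.

```python
# Modern reference values, NOT taken from the text (assumptions for illustration).
AVOGADRO = 6.02214076e23         # molecules per mole
HELIUM_MOLAR_MASS = 4.002602e-3  # kilograms per mole
HELIUM_CV = 3116.0               # J/(kg K), approximate specific heat at constant volume

def inverse_K(c_v, atoms_per_unit_mass):
    """Equation (493): 1/K = 2 c_v / (3 nu), with c_v and nu both
    referred to the same unit of mass of a monatomic gas."""
    return 2.0 * c_v / (3.0 * atoms_per_unit_mass)

# Number of atoms in the unit of mass (one kilogram of helium).
nu = AVOGADRO / HELIUM_MOLAR_MASS
one_over_K = inverse_K(HELIUM_CV, nu)
print(f"nu = {nu:.4e} atoms/kg, 1/K = {one_over_K:.4e} J/K")
```

With these figures 1/K comes out near 1.38 × 10^-23 joules per kelvin, the quantity now called the Boltzmann constant, and K itself is of the order of 10^22, consistent with the remark that for ordinary units it is comparable to the number of atoms in ordinary bodies.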

We may also express the value of K in a somewhat different form, which corresponds to the indirect method by which physicists are accustomed to determine the quantity c_v. The kinetic energy due to the motions of the centers of mass of the molecules of a mass of gas sufficiently expanded is easily shown to be equal to



\tfrac 32 pv
,
where p and v denote the pressure and volume. The average value of the same energy in a canonical ensemble of such a mass of gas is


\tfrac 32 \Theta \nu
,
where \nu denotes the number of molecules in the gas. Equating these values, we have


pv = \Theta \nu
,
(494)
whence


\frac 1K = \frac \Theta T = \frac{pv}{\nu T}
.
(495)
Now the laws of Boyle, Charles, and Avogadro may be expressed by the equation


pv = A \nu T
,
(496)
where A is a constant depending only on the units in which energy and temperature are measured. 1/K, therefore, might be called the constant of the law of Boyle, Charles, and Avogadro as expressed with reference to the true number of molecules in a gaseous body.

Since such numbers are unknown to us, it is more convenient to express the law with reference to relative values. If we denote by M the so-called molecular weight of a gas, that is, a number taken from a table of numbers proportional to the weights of various molecules and atoms, but having one of the values, perhaps the atomic weight of hydrogen, arbitrarily made unity, the law of Boyle, Charles, and Avogadro may be written in the more practical form



p v = A' T \frac mM,

(497)
where A' is a constant and m the weight of gas considered. It is evident that 1/K is equal to the product of the constant of the law in this form and the (true) weight of an atom of hydrogen, or such other atom or molecule as may be given the value unity in the table of molecular weights.
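
The relation between 1/K and the constant A' may be made explicit in a few lines. In the sketch below the symbol m_H is introduced as an assumption of notation (it does not occur in the text) for the true weight of the atom or molecule to which the value unity is assigned in the table of molecular weights, so that a weight m of gas of molecular weight M contains \nu = m/(M m_H) molecules:

```latex
\nu = \frac{m}{M \, m_H}, \qquad
p v = \frac{1}{K} \, \nu \, T = \frac{1}{K \, m_H} \, T \, \frac{m}{M}, \qquad
A' = \frac{1}{K \, m_H}, \qquad
\frac{1}{K} = A' \, m_H
```

The second equation is (495) with the value of \nu substituted, and comparison with (497) gives the third.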

In the following chapter we shall consider the necessary modifications in the theory of equilibrium, when the quantity of matter contained in a system is to be regarded as variable, or, if the system contains more than one kind of matter, when the quantities of the several kinds of matter in the system are to be regarded as independently variable. This will give us yet another set of variables in the statistical equation, corresponding to those of the amplified form of the thermodynamic equation.


  1. See Boltzmann, Sitzb. der Wiener Akad., Bd. LXIII., S. 418, (1871).
  2. See Chapter IV, pages 44, 45.
  3. See Chapter VII, pages 73-75.
  4. See Chapter XIII, page 160.
  5. See Chapter IV, pages 35-37.
  6. See Chapter XIII, pages 162, 163.
  7. See Chapter XII, pages 143-151.
  8. See Chapter XIII, page 159.
  9. This last case is important on account of its relation to the theory of gases, although it must in strictness be regarded as a limit of possible cases, rather than as a case which is itself possible.
  10. See foot-note on page 93. We have here made the least value of the energy consistent with the values of the external coördinates zero instead of \epsilon_a, as is evidently allowable when the external coördinates are supposed invariable.
  11. See Chapter X, pages 120, 121.
  12. See Chapter IX, equations (321), (327).
  13. In order to appreciate the above reasoning, it should be understood that the differences of energy which occur in the canonical ensemble of phases of the first body are not here regarded as vanishing quantities. To fix one's ideas, one may imagine that he has the fineness of perception to make these differences seem large. The difference between the part of these phases which belong to one microcanonical ensemble of the whole system and the part which belongs to another would still be imperceptible, when the quantity of the bath is sufficiently increased.
  14. It is assumed—and without this assumption the theorem would have no distinct meaning—that the part of the ensemble considered may be regarded as having separate energy.
  15. See equations (124), (288), (289), and (314); also page 106.
  16. The unit of time only affects the last three quantities, and these only by an additive constant, which disappears (with the additive constant of entropy), when differences of entropy are compared with their statistical analogues. See page 19.