Entropy has only been defined unambiguously for states of thermal equilibrium, so we shall begin further discussion in that context. Although the definition of Clausius, Eq.(2) above, is basically an experimental one, many writers from Boltzmann [9] to the present [10] have struggled to define and understand a theoretical expression for S. In retrospect it is easy to see why this was a problem, for it seemed natural to think of S as a physical quantity of the same kind as the mechanical variables pressure, volume, etc. It is indeed a physical quantity in this context, but of a quite different kind.
Not until the work of Shannon [11] in 1948 did it become evident that the notion of entropy far transcended the original idea of Clausius, and that it is fundamentally a functional of probabilities. When applied to a definite physical problem, it becomes a physical quantity. The hints were there all the time, of course -- first with the connection of S with heat, a disorganized form of energy implying some uncertainty, and then with recognition by Boltzmann and Maxwell that probability must become an essential ingredient of the many-body problem. This is, indeed, the principal message of Eq.(1), or its classical counterpart.
Almost ten years later, stimulated by Shannon's insight, Jaynes put forth the principle of maximum entropy (PME) for any reasoning scenario characterized primarily by insufficient information [12]. The optimal probability assignment describing that situation is the one which maximizes the information-theoretic entropy subject to constraints imposed by the information that is available. When applied to a thermodynamic system the PME constitutes a restatement of Gibbs' observation above, with one all-important addition -- we are now given the precise mathematical machinery required to carry out the calculations. When the PME is applied to the theoretical entropy of Eq.(1) it provides the full justification for the method of ensembles, but without the need for ensembles! If the constraints are measured values of the macroscopic variables defining the equilibrium thermodynamic system, then S becomes the thermodynamic entropy, and its role is to bridge the gap between the inaccessible microscopic detail and the observed macroscopic behavior. Thus, contrary to expressed opinion [13], the origin of entropy is far from explicitly dynamical.
It may be useful to digress for a moment here and recall exactly what it is the PME accomplishes. Consider an experiment, of the `random' type, for which there are $m$ possible results at each trial, and thus for which there are $m^{n}$ conceivable outcomes in $n$ trials. Each outcome yields a set of sample numbers $\{n_{1},\ldots,n_{m}\}$, along with frequencies $f_{i}=n_{i}/n$. If in $n$ trials the $i$th result occurs $n_{i}$ times, then out of the $m^{n}$ possible outcomes the number of those yielding a particular set of frequencies $\{f_{i}\}$ is given by the multinomial coefficient, or multiplicity factor
\[
W=\frac{n!}{n_{1}!\,n_{2}!\cdots n_{m}!}\,.
\]
We now ask for that set $\{f_{i}\}$ that can be realized in the greatest number of ways, which means maximizing $W$ subject to any constraints we may have upon the problem. At a minimum one must require that $\sum_{i}n_{i}=n$, or $\sum_{i}f_{i}=1$. For large $n$ it is useful to note that an equivalent procedure is to maximize $n^{-1}\ln W$, for Stirling's formula then encourages us to consider the quantity
\[
\frac{1}{n}\ln W\simeq-\sum_{i=1}^{m}f_{i}\ln f_{i}\,.
\]
Let us emphasize what the result of this variational problem yields: for large n we obtain that set of frequencies that can be realized in the greatest number of ways, a course that common sense tells us to pursue in any event.
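As a concrete check of the limit just invoked, the following short Python sketch (purely illustrative; the choice $m=3$ and the particular frequencies are arbitrary and not taken from the text) compares $n^{-1}\ln W$ with $-\sum_{i}f_{i}\ln f_{i}$ as $n$ grows:
\begin{verbatim}
# Compare (1/n) ln W with -sum_i f_i ln f_i for fixed frequencies and growing n.
from math import lgamma, log

def log_multiplicity(counts):
    """ln W = ln n! - sum_i ln n_i!, computed via ln x! = lgamma(x + 1)."""
    n = sum(counts)
    return lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)

def entropy(freqs):
    """-sum_i f_i ln f_i (zero frequencies contribute nothing)."""
    return -sum(f * log(f) for f in freqs if f > 0)

for n in (30, 300, 3000):
    counts = [n // 2, n // 3, n - n // 2 - n // 3]   # frequencies ~ (1/2, 1/3, 1/6)
    freqs = [c / n for c in counts]
    print(n, log_multiplicity(counts) / n, entropy(freqs))
\end{verbatim}
The two numbers printed for each $n$ agree ever more closely, as Stirling's formula promises.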
This scenario can be turned around, of course, and we can ask a different question: after a large number of trials, what are the probabilities $\{p_{i}\}$ for the various outcomes on the next trial? The PME directs us to the set $\{p_{i}\}$ that maximizes the entropy
\[
H=-\sum_{i=1}^{m}p_{i}\ln p_{i}\,,
\]
subject to constraints in the form of the data provided by the previous trials, as well as the normalization $\sum_{i}p_{i}=1$. Not surprisingly, the frequencies and probabilities in this example are numerically identical, despite the fact that they address different questions.
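To make this prescription tangible, here is a minimal numerical sketch of the PME with a single datum as constraint; the setting (a six-sided die whose previous throws averaged 4.5) and all numbers are hypothetical, chosen only for illustration. The maximum-entropy assignment has the exponential form $p_{i}\propto e^{-\lambda i}$, and the Lagrange multiplier $\lambda$ is fixed by the constraint:
\begin{verbatim}
# Maximum-entropy assignment for a die, given only the average 4.5 of past throws.
from math import exp, log

faces = [1, 2, 3, 4, 5, 6]
target_mean = 4.5                 # the (hypothetical) datum

def mean_for(lam):
    """Mean face value of the distribution p_i proportional to exp(-lam * i)."""
    w = [exp(-lam * x) for x in faces]
    return sum(x * wi for x, wi in zip(faces, w)) / sum(w)

# mean_for is monotonically decreasing in lam, so bisection finds the multiplier.
lo, hi = -5.0, 5.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if mean_for(mid) > target_mean:
        lo = mid                  # mean too large: the root lies at larger lam
    else:
        hi = mid

lam = 0.5 * (lo + hi)
w = [exp(-lam * x) for x in faces]
p = [wi / sum(w) for wi in w]
print("lambda =", lam)
print("p =", p)
print("entropy =", -sum(pi * log(pi) for pi in p))
\end{verbatim}
The resulting distribution leans toward the higher faces, exactly as the datum demands, while remaining maximally noncommittal in every other respect.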
The same reasoning is naturally applied to a macroscopic physical system about which we know only its average energy (or temperature), which appears constant for some time. The probability distribution maximizing the entropy subject to this constraint and that of normalization is just that of the canonical ensemble:
\[
p_{i}=\frac{e^{-\beta E_{i}}}{Z(\beta)}\,,\qquad Z(\beta)=\sum_{i}e^{-\beta E_{i}}\,,
\]
where $\beta$ is the Lagrange multiplier associated with the energy. We shall return to further discussion of this well-known result presently, emphasizing the context in which it was derived.
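For the record, the variational step behind this result can be sketched as the standard textbook manipulation: introducing Lagrange multipliers $\lambda_{0}$ and $\beta$ for normalization and for the average energy $\langle E\rangle=\sum_{i}p_{i}E_{i}$, one requires
\[
\delta\Bigl[-\sum_{i}p_{i}\ln p_{i}-\lambda_{0}\sum_{i}p_{i}-\beta\sum_{i}p_{i}E_{i}\Bigr]=0
\quad\Longrightarrow\quad
\ln p_{i}=-1-\lambda_{0}-\beta E_{i}\,,
\]
which is just the exponential form quoted above; $\beta$ is then determined implicitly by the datum through $\langle E\rangle=-\partial_{\beta}\ln Z(\beta)$.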
Both experimental and theoretical definitions lead inescapably to the conclusion that entropy is an anthropomorphic concept -- not in the sense that it is somehow unphysical, but only that it is determined by the particular set of macroscopic variables defining an experiment or observation. For example, if one defines a thermodynamic state of the human body in terms of temperature, density, and volume alone, then it would not be surprising to measure entropies differing little from those of a similar body of water. But the states of living matter are a good deal more complicated and must be defined in terms of many additional variables, which is why their study has not proved a simple matter. Thus, in any given experiment the experimenter defines the entropy in terms of those macroscopic variables he or she wishes to control -- the choice may be different for other experiments and circumstances. Once the anthropomorphic character of entropy is appreciated, it is not so difficult to digest the next logical observation, that to each physical system there correspond many thermodynamic systems, defined once more by the macroscopic variables one chooses to monitor. Of course, this is just the way we defined a thermodynamic state of a physical system above, and hence merely a consistent extension of the viewpoint.
Some further thought reveals that this picture of the physical entropy could have been expected, and is completely consistent with the way theoretical entropy was introduced above, following Shannon and Jaynes. To go back even further -- to Jeffreys, Keynes, and Laplace -- we take all probabilities to be conditional on one or more hypotheses, rather than absolute, so that they are not to be interpreted as real physical objects. They are subjective only to the extent that they depend on the evidence or state of knowledge employed in their formulation, but are completely objective in that any rational observer would then make the same probability assignment. Therefore, any probability distribution, and the entropy of that distribution, are to a large extent themselves anthropomorphic, so it should not be surprising that the specific application to statistical mechanics and thermodynamics has this flavor as well. One sometimes sees the phrase `statistical system' used to characterize the many-body problem. But we argue that there is no such thing -- only a physical system over which we have no microscopic control, but with many possible thermodynamic manifestations.
Incidentally, a persistent source of confusion related to the physical-reality myth of probabilities concerns the notion of fluctuations. One can estimate the quality of predictions implied by a probability assignment through calculation of the standard deviation, yielding an expression for fluctuations of a statistical variable about its mean, and then test these predictions by measuring frequencies in an experiment. It cannot be emphasized enough, however, that such statistical fluctuations are nothing more than what we just described, and say nothing about possible physical fluctuations of the corresponding physical variable. These can be predicted in terms of time-averaged quantities, of course, but their existence can only be verified through measurement [14].
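As a small illustration of the distinction just drawn, the following sketch (with an entirely hypothetical four-level spectrum and an arbitrary value of $\beta$) computes the standard deviation of the energy implied by a canonical probability assignment; it quantifies the spread of our predictions, and by itself says nothing about how the energy of any single system varies in time:
\begin{verbatim}
# Statistical fluctuation of E predicted by a canonical probability assignment.
from math import exp, sqrt

levels = [0.0, 1.0, 2.0, 3.0]     # hypothetical energy levels (arbitrary units)
beta = 1.0                        # hypothetical value of the Lagrange multiplier

weights = [exp(-beta * e) for e in levels]
Z = sum(weights)
p = [w / Z for w in weights]

mean_E = sum(pi * e for pi, e in zip(p, levels))
var_E = sum(pi * (e - mean_E) ** 2 for pi, e in zip(p, levels))
print("predicted <E> =", mean_E, "  predicted standard deviation =", sqrt(var_E))
\end{verbatim}
Whether the physical energy of the system actually fluctuates by this amount is a separate question, to be settled only by measurement, as noted above.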
As discussed here, the entropy concept corresponds closely to that of the equilibrium state, and we shall address those attempts to extend it to a time-dependent quantity and nonequilibrium states presently.