PRINTED FROM OXFORD HANDBOOKS ONLINE (www.oxfordhandbooks.com). © Oxford University Press, 2022. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in Oxford Handbooks Online for personal use (for details see Privacy Policy and Legal Notice).

date: 01 July 2022

# Maxwell’s Demon

## Abstract and Keywords

In his 1867 thought experiment, “Maxwell’s Demon,” James Clerk Maxwell attempted to show that thermodynamics is not strictly reducible to mechanics. Maxwellian Demons are mechanical devices that carry out measurements on a thermodynamic system, manipulate the system so as to extract work from it, and erase all records of the measurement outcomes. If successful, they decrease the total entropy of the universe, thereby violating the Second Law of Thermodynamics. According to the prevalent contemporary approach, the Demon fails, since measurement or erasure necessarily results in entropy increase. This article provides a brief overview of statistical mechanics focusing on the notions of macrovariables and macrostates, and describes in mechanical terms information processes such as measurement and erasure. It is shown that Maxwellian Demons are compatible with the principles of mechanics. After a detour through reduction, probability, information, and computation in statistical mechanics, it is seen that Maxwell was right after all.

# 1. Introduction

In 1867, James Clerk Maxwell wrote to Peter Guthrie Tait that there was a “hole” in the Second Law of Thermodynamics:1

“Now let A & B be two vessels divided by a diaphragm and let them contain elastic molecules in a state of agitation which strike each other and the sides…. Now conceive a finite being who knows the paths and velocities of all the molecules by simple inspection but who can do no work except open and close a hole in the diaphragm by means of a slide without mass. Let him first observe the molecules in A and when he sees one coming the square of whose vel.[ocity] is less than the mean sq. [square] velocity of the molecules in B let him open the hole and let it go into B. Next let him watch for a molecule of B, the square of whose velocity is greater than the mean sq.[square] vel. [velocity] in A, and when it comes to the hole let him draw the slide and let it go into A, keeping the slide shut for all other molecules. Then the number of molecules in A and B are the same as at first, but the energy in A is increased and that in B diminished, that is, the hot system has got hotter and the cold colder and yet no work has been done, only the intelligence of a very observant and neat-fingered being has been employed. Or, in short, if the heat is the motion of finite portions of matter and if we can apply tools to such portions of matter so as to deal with them separately, then we can take advantage of the different motion of different proportions to restore a uniform hot system to unequal temperatures or to motions of large masses. Only we can’t, not being clever enough.”

(Knott 1911, pp. 213–214)
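The sorting rule in Maxwell's letter can be illustrated with a small numerical sketch. It is not part of Maxwell's text or of any model discussed in this article. The function name `run_demon`, the Gaussian velocity components, the vessel size, and the number of cycles are all illustrative assumptions, and the "Demon" here is just a selection rule applied to two lists of squared speeds.

```python
import random


def mean(values):
    return sum(values) / len(values)


def run_demon(n_per_vessel=500, cycles=100, seed=0):
    """Return (before, after); each is a (mean v^2 in A, mean v^2 in B) pair."""
    rng = random.Random(seed)

    def sample_v2():
        # Squared speed of one molecule: three squared Gaussian components
        # (an assumed stand-in for an equilibrium velocity distribution).
        return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))

    a = [sample_v2() for _ in range(n_per_vessel)]
    b = [sample_v2() for _ in range(n_per_vessel)]
    before = (mean(a), mean(b))

    for _ in range(cycles):
        mean_a, mean_b = mean(a), mean(b)
        # Molecules in A slower than B's mean square velocity may pass to B;
        # molecules in B faster than A's mean square velocity may pass to A.
        slow_in_a = [i for i, v2 in enumerate(a) if v2 < mean_b]
        fast_in_b = [j for j, v2 in enumerate(b) if v2 > mean_a]
        if not slow_in_a or not fast_in_b:
            break
        i = rng.choice(slow_in_a)
        j = rng.choice(fast_in_b)
        # One molecule goes each way, so the numbers in A and B stay equal.
        a[i], b[j] = b[j], a[i]

    return before, (mean(a), mean(b))


before, after = run_demon()
print("mean v^2 in A:", before[0], "->", after[0])
print("mean v^2 in B:", before[1], "->", after[1])
```

Under these assumptions the mean squared speed in A rises and that in B falls while the total is conserved, which is the redistribution Maxwell describes. The sketch says nothing about the physical cost of the selection itself, which is the question at issue in what follows.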

As is well known, increasing the temperature difference between A and B in Maxwell’s example implies that the entropy of the gas is decreased. Even if Maxwell is right and only a very intelligent and delicate Demon can decrease the entropy without investing work, this is still a violation of the Second Law—provided, of course, that the Demon is not supernatural, a point that Maxwell emphasized.2
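A short worked calculation makes the entropy claim explicit. It rests on assumptions that are not in Maxwell's letter: each vessel holds the same number of molecules of an ideal gas at fixed volume, so both have the same constant heat capacity $C_V$; both start at temperature $T$; and after the Demon's sorting each vessel is again describable by a single temperature, $T+\delta$ in A and $T-\delta$ in B, with $0 < \delta < T$ (energy conservation fixes the symmetric form).

$$
\Delta S \;=\; C_V \ln\frac{T+\delta}{T} \;+\; C_V \ln\frac{T-\delta}{T}
\;=\; C_V \ln\!\left(1 - \frac{\delta^{2}}{T^{2}}\right) \;<\; 0 .
$$

Since the argument of the logarithm is smaller than one for any nonzero $\delta$, the thermodynamic entropy of the gas decreases whenever a temperature difference is created from a uniform state.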

What does Maxwell’s thought experiment show? Maxwell famously thought that the chief end of his Demon was “To show that the 2nd Law of Thermodynamics has only statistical certainty” (ibid., pp. 214–215). Had this been the lesson of the Demon, it would not have attracted as much interest as it did. For, according to the contemporary mainstream view in statistical mechanics, the Second Law has only statistical certainty: the decrease of entropy is not strictly impossible but is extremely improbable. The fact that entropy may decrease is not taken to be a strict violation of the Second Law. Rather, the Second Law is now understood probabilistically as meaning that entropy decrease is highly unlikely. Obviously, if the Demon is a counterexample for the contemporary theory, it is a thought experiment that challenges the probabilistic version of the Second Law.

Indeed, we take it that Maxwell’s Demon is a mechanical system that brings about a decrease of entropy with certainty, or at least with a probability much greater than that predicted by standard statistical mechanics, and for this reason it entails that mechanics is compatible with a violation of the phenomenal Second Law of Thermodynamics: the Demon is a perpetuum mobile of the second kind. It seems to us that Maxwell’s thought experiment shows that the Second Law is not universally valid in this strong sense. Our view goes against entrenched beliefs in the field: the Second Law of Thermodynamics is generally considered unshakable. Eddington (1935, p. 81), for example, thought that “The law that entropy always increases—the second law of thermodynamics—holds, I think, the supreme position among the laws of Nature …. [I]f your theory is found to be against the second law of thermodynamics I can give you no hope; there is nothing for it but to collapse in deepest humiliation.” Einstein (1970, p. 33) (and many others) expressed a similar view. Since Maxwell’s Demon seems to threaten this law, it has been widely discussed in the literature in the past 150 years or so. Numerous and various attempts have been made to “exorcise” the Demon; that is, to show that it is incompatible with the fundamental theories of physics.3

In the contemporary literature, the key to exorcising the Demon is taken to be associated with the “intelligence” that Maxwell ascribed to it (see quotation at the start of this section), where this intelligence is understood in terms of the acquisition and manipulation of information carried out by the Demon. Accordingly, a Maxwellian Demon is understood as a physical device that acquires information by measurements, uses the information to extract work from the energy stored in a thermodynamic system, and, finally, erases this information as part of the preparation for the next cycle of operation. If the process is to be a perpetuum mobile (of the second kind), the amount of work invested in the entire process (consisting of the measurement, the extraction of the work from the system, and the erasure) should be less than the work produced, so that the total entropy of the universe decreases during this process.

For these reasons, the mainstream literature about Maxwell’s Demon, which consists of numerous attempts to prove that it is inconsistent with one or another fundamental physical principle,4 is divided into three approaches.

1. The first focuses on the use of information to extract work, by, for example, criticizing the idealization of frictionless processes or emphasizing the role of fluctuations (e.g. Smoluchowski 1912, 1914; Feynman 1963; Norton 2013). In the contemporary literature (see Leff and Rex 2003) it is usually assumed that friction can ideally be made as low as one wants and therefore this idealization is acceptable.

2. The second approach follows Szilard’s (1929) argument and focuses on the increase of the entropy in the measurements carried out by the Demon.

3. The third approach follows Landauer’s (1961) principle, according to which entropy increases in the process of erasing information. This idea was first applied by Bennett (1973, 1982), who suggested that erasure is a necessary part of the Demonic operation, since it is part of resetting the system and preparing it for the next cycle. The idea was later generalized and expanded by Fahn (1996). The contemporary mainstream view follows this third family of approaches. While the Landauer-Bennett approach was formulated in the classical context, some writers (e.g., Zurek 1990) attempt to underpin the entropy increase that accompanies the erasure with quantum mechanical interactions, such as the decoherence interaction of the measuring and recording devices with the environment (see other approaches of this sort in Leff and Rex 2003). Following the vast literature on this third approach, the discussions of the Demon became significant not only because of the Demon’s challenge to the Second Law of Thermodynamics, but also because the research in this field revealed the physical significance of information processing in classical and quantum physics. For this reason our discussion here directly addresses the physical understanding of measurement and erasure, which are not only necessary for understanding Maxwell’s Demon, but also significant in their own right.
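The bookkeeping behind the third approach can be written out for a single cycle. This is the form in which the argument is usually stated in the literature, not a result established in this article. It assumes a one-bit Demon of the Szilard type operating with a single heat bath at temperature $T$, with $k_B$ denoting Boltzmann's constant; the symbols $W_{\text{extracted}}$, $W_{\text{erasure}}$, and $W_{\text{net}}$ are introduced here only for illustration.

$$
W_{\text{extracted}} \;\le\; k_B T \ln 2, \qquad
W_{\text{erasure}} \;\ge\; k_B T \ln 2, \qquad
W_{\text{net}} \;=\; W_{\text{extracted}} - W_{\text{erasure}} \;\le\; 0 .
$$

At $T = 300\,\mathrm{K}$ the quantity $k_B T \ln 2$ is roughly $2.9 \times 10^{-21}\,\mathrm{J}$. On the second approach the same cost is assigned to the measurement step rather than to the erasure. Whether either inequality on the cost side is a theorem of mechanics is precisely what the analysis of measurement and erasure is meant to test.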

In what follows, we analyze the main concepts that appear in these three approaches to exorcising the Demon, namely the Second Law of Thermodynamics, measurement, and erasure in classical mechanics. Our conclusion agrees with Maxwell’s original approach, according to which Demons are compatible with all known principles of classical (and quantum) mechanics. We shall further argue, perhaps pace Maxwell, that Demons are ubiquitous.

This last point is important. From the discussion so far we can see that there are two major questions about Maxwell’s Demon5: (I) Is the Demon compatible with the principles of fundamental physics? (II) Are there, or can one construct, Demons in our world? Some writers think that the answer to question (II) is negative and that this strongly indicates that the answer to question (I) should be negative as well. In contrast, we will show that the answer to question (I) is affirmative, and that there are some reasons to think that the answer to question (II) is affirmative as well.

To answer question (I) it is necessary to examine the principles of the fundamental theory that underlies thermodynamic phenomena. In our present discussion we will focus on classical mechanics, for clarity of presentation; but, of course, one needs to carry out this task with one’s best fundamental theory. Call the best theory, which supposedly underlies the Second Law of Thermodynamics, T. If one could prove a theorem in T according to which the probabilistic version of the Second Law of Thermodynamics is universally true, then it would follow that Maxwell’s Demon is incompatible with T. As is well known, so far no such theorem has been proved in either classical or quantum statistical mechanics (see overview in Sklar 1993, Frigg 2008). Although there are some important and interesting theorems about entropy increase (such as Lanford’s theorem; see Uffink and Valente 2010, 2015), these theorems are applicable only under highly restricted conditions and so fall short of what is needed to exorcise the Demon. If, however, one could construct in T a model of a Maxwellian Demon (without violating any theorem of T), then it would follow that Maxwell’s Demon is compatible with T and, ipso facto, that a universal theorem exorcising the Demon is not provable in T. Recently part of such a model was proposed by Hemmo and Shenker (2010, 2012), and we shall address it below. In our view these constructions answer question (I) in the affirmative.

Since, as we said, most of the literature answers question (I) in the negative, it seems pointless to discuss question (II). Moreover it is still widely believed that all the phenomena we experience invariably accord with the Second Law of Thermodynamics. Thus, according to the conventional wisdom, even if the answer to question (I) were affirmative, nonetheless in our world the answer to question (II) would still be negative. In that case it would be very interesting to know why, given that a Maxwellian Demon is possible, we don’t witness and don’t seem to be able to construct Demons. Maxwell’s answer to this question was that we are not clever enough (see quotation at start of the section). We show in this article that this was perhaps Maxwell’s only mistake about his Demon: We shall argue that it is plausible that we ourselves are Maxwellian Demons of some sort. The question of whether this amounts to a violation of the Second Law is subtle and involves interpreting the law, since in thermodynamics the Second Law does not refer to measurements explicitly. We return to this point in section 7.

We proceed as follows. In sections 2 and 3 we briefly explain the main principles of statistical mechanics that are needed for the analysis of Maxwell’s Demon, namely the concepts of Macrostates and Macrovariables, and the emergence of probability. We then use this explanation in section 4 as a basis for our description of what the Second Law of Thermodynamics looks like in statistical mechanics. It turns out that the notion of measurement, which, as we have seen, is central to discussions of the Demon, is also crucial for understanding the Second Law in statistical mechanics. We explain measurement in section 4. In section 5 we focus on the notion of erasure. Sections 6 and 7 combine all the above ideas and demonstrate that Maxwellian Demons do not violate any principle of classical mechanics. Section 8 explains briefly why this lesson carries over to quantum mechanics.

# 2. Macrostates and Macrovariables

According to mechanics, at every moment the universe is in some well-defined state, called a microstate, which evolves in time in accordance with the fundamental equations of motion. This is true for both classical mechanics and quantum mechanics, but the nature of the microstate and of the evolution differs between the theories. In what follows, for the sake of simplicity, we focus on classical mechanics; the generalization to quantum mechanics is addressed in section 8.

(A terminological remark regarding the term microstate is needed to avoid confusion. There are various notions of “microscopic” in the literature: the term sometimes means small, or part of a whole. But in statistical mechanics, it is customary to use the term “microstate” to denote the complete description of the mechanical state of the world,6 as opposed to its partial descriptions, called macrovariable or macrostate, discussed later in this section. “Complete” here is, of course, relative to the theory in question.)

According to mechanics, to predict the future state of the world, one needs to know its present microstate and calculate its evolution according to the laws of motion.

At first sight we might conclude … that, as the number of particles increases, so also must the complexity and intricacy of the properties of the mechanical system, and that no trace of regularity can be found in the behaviour of a macroscopic body. This is not so, however, … when the number of particles is very large, new types of regularity appear.

(Landau and Lifshitz 1980, p. 1)

This was one7 of the great discoveries of the creators of statistical mechanics, especially J. C. Maxwell, L. Boltzmann, and J. W. Gibbs. They discovered that, in order to provide a successful and informative account of thermodynamic phenomena, one does not need to follow the exact detailed microscopic behavior of the world.8 As the number of particles and the complexity of systems increase, those systems exhibit some regularities that can be described without following their detailed, complex trajectories. One only needs partial information about the complex microstate of a system to predict interesting features of its behavior. For example, the average kinetic energy of the particles making up an ideal gas at a given moment is a very partial description of the microstate of this gas (the full description is a precise specification of the particular positions and momenta of all the individual particles of the gas), yet this partial description is highly informative as a basis for prediction, since it accounts for the thermodynamic notion of temperature in certain circumstances. Such informative partial descriptions of the microstates of complex many-particle systems are sometimes called macrovariables in the literature (see e.g. Lebowitz 1993; Goldstein and Lebowitz 2004; Frigg 2008, p. 104) and sometimes macrostates (e.g. Ehrenfest and Ehrenfest 1912; Sklar 1993). While the notions of macrovariable and macrostate are significantly different, they are closely connected in important cases, and for this reason they are sometimes conflated. And since they are central in statistical mechanics, and carry with them important metaphysical and epistemological ideas, let us explain these two notions in some detail.

There are two senses of the term macrovariable in statistical mechanics. First, a macrovariable is an aspect or a property or a partial description of a given microstate (e.g., the average kinetic energy of the particles making up a gas at a given moment is a partial description or an aspect of its microstate). The second sense stems from the fact that the aspect described by a macrovariable is shared by several microstates (microstates with different positions and momenta of the particles can have the same average kinetic energy). In this second sense, a macrovariable is a set of microstates: it denotes all the microstates that share the same aspect or partial description. Macrovariables and their dual nature (aspect and set) were Maxwell’s, Boltzmann’s, and Gibbs’s great discovery (although these thinkers understood these notions in different ways9). Since both of these senses are indispensable in statistical mechanics, all the contemporary writers in the field mention them in one way or another (e.g. Lebowitz 1993; Callender 1999; Albert 2000; Goldstein and Lebowitz 2004; Earman 2006; Frigg 2008; Wallace 2011).
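The two senses can be made concrete with a toy system. The sketch below is only an illustration under stated assumptions: four particles, three hypothetical energy levels, and a microstate represented as a tuple of single-particle energies (a stand-in for a full specification of positions and momenta). The names `average_energy` and `states_sharing` are ours.

```python
from itertools import product

LEVELS = (0, 1, 2)   # hypothetical single-particle energy values
N_PARTICLES = 4      # hypothetical toy system size


def average_energy(microstate):
    """First sense: an aspect (partial description) of one given microstate."""
    return sum(microstate) / len(microstate)


def states_sharing(value):
    """Second sense: the set of all microstates that share that aspect."""
    return [m for m in product(LEVELS, repeat=N_PARTICLES)
            if average_energy(m) == value]


actual = (0, 1, 2, 1)             # the microstate the system happens to be in
value = average_energy(actual)    # aspect of the actual microstate
same_aspect = states_sharing(value)

print(value, len(same_aspect))    # prints: 1.0 19
```

The value `1.0` describes the actual microstate, while the 19-element list collects the actual microstate together with the 18 counterfactual ones that the same description cannot tell apart from it.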

The first sense of a macrovariable, that of an aspect or a property of a microstate, is indispensable because it can pertain to the actual microstate of the system. After all, statistical mechanics aims to describe the actual state of a given thermodynamic system and, on its basis, to predict its evolution. So, when we describe the ideal gas in front of us as, say, having a certain average kinetic energy, we talk about the macrovariable of the particular microstate the gas happens to be in, at that moment.

The second sense of a macrovariable, as a set of microstates, is indispensable for understanding two crucial ideas in statistical mechanics, namely probabilities and entropy. We describe the notion of probability in detail in section 3, but here is the idea in a nutshell, so as to understand the notion of macrovariables. A central idea in statistical mechanics is that certain aspects of the mechanical microstates (that is, certain macrovariables in the first sense of the term) exhibit probabilistic regularities, and that these probabilistic regularities underlie the phenomena described by thermodynamics. An important example is the discovery that the phenomenal Second Law of thermodynamics is satisfied not absolutely but only with very high probability. But how can this be? Classical mechanics, which governs the evolution of microstates, is deterministic, and so the question of whether a system in a certain microstate will satisfy the Second Law or not is a yes-no question, and not a matter of probability.10 The only way that one can ascribe probability to the evolution of a system in classical mechanics is by realizing that, if the system’s actual microstate is described only partially—that is, in terms of some of its aspects (or macrovariables in the first sense)—then this same description is shared by many microstates, and one cannot tell which microstate among all these possible microstates (that exhibit the same macrovariable in the first sense of the term) is the actual one. All the microstates that share the same macrovariable (in the first sense of the term) form an equivalence set relative to that macrovariable. Maxwell, Boltzmann and Gibbs discovered (in different ways and senses) that (in general) most of the microstates (in such an equivalence set) will satisfy the Second Law, but some won’t; and this is the sense in which the phenomenal Second Law of thermodynamics is probabilistic, despite the determinism at the fundamental mechanical level. 
(Later on we shall provide more details about the emergence of this probabilistic behavior.) And so, one can say (by appealing to the dual nature of a macrovariable as an aspect and as a set of microstates) that a macrovariable describes the actual microstate of a system at a time, and also grounds a probabilistic prediction of its evolution. Therefore, in order to understand probabilities, one needs to realize that macrovariables denote sets of microstates.
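How a deterministic dynamics can yield "most but not all" can be seen in a toy model. The sketch below uses Kac's ring model, which is not discussed in this article and is swapped in purely for illustration. The ring size, the marked edges, and the use of "the black count moves closer to N/2 in one step" as a stand-in for entropy-increasing behavior are all assumptions of the sketch.

```python
from itertools import combinations

N = 12               # sites on the ring (hypothetical toy size)
MARKED = {0, 4, 8}   # hypothetical marked edges; edge i joins site i to site i + 1


def step(state):
    """One deterministic step: each ball moves on one site and
    changes colour (1 = black, 0 = white) if it crosses a marked edge."""
    new = [0] * N
    for i, colour in enumerate(state):
        new[(i + 1) % N] = (colour ^ 1) if i in MARKED else colour
    return tuple(new)


def black_count(state):
    """The macrovariable: how many balls are black."""
    return sum(state)


def equivalence_set(k):
    """All microstates that share the macrovariable value k."""
    return [tuple(1 if i in blacks else 0 for i in range(N))
            for blacks in combinations(range(N), k)]


def imbalance(state):
    """Distance of the black count from the balanced value N / 2."""
    return abs(black_count(state) - N / 2)


states = equivalence_set(10)
toward = [s for s in states if imbalance(step(s)) < imbalance(s)]
away = [s for s in states if imbalance(step(s)) > imbalance(s)]

print(len(states), len(toward), len(away))   # prints: 66 63 3
```

For each individual microstate the question of whether the imbalance shrinks has a definite yes-or-no answer. The fraction 63/66 appears only when one considers the whole set of microstates that share the initial macrovariable value.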

As we said, the second sense of macrovariables, as sets, is also indispensable for understanding the notion of entropy. In thermodynamics, entropy denotes the part of a system’s energy that cannot be harvested to produce work. (The Second Law says that the entropy of an isolated system cannot decrease, and so with time less of its energy is exploitable.) In mechanics, to harvest a system’s energy one must manipulate the system’s actual microstate; and to do so one must know what the microstate is. Therefore, in statistical mechanics, entropy is associated with the amount of information one has about the system’s microstate. This information is, as usual, understood in terms of the size of sets: if all one knows is that the microstate belongs to a certain set, then the smaller the set, the larger is the amount of information about that microstate. Because of this, entropy is associated with the size of sets.11
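The link between set size, information, and entropy can be sketched numerically. The assumptions here go beyond the text: a system of twenty two-state particles, a macrovariable given by the number of particles in the "up" state, entropy taken as the natural logarithm of the set size (Boltzmann's convention with the constant set to 1), and information measured in bits relative to the full space of microstates. The function names are ours.

```python
from math import comb, log, log2

N = 20   # hypothetical number of two-state particles


def set_size(k):
    """Number of microstates with exactly k particles 'up'."""
    return comb(N, k)


def boltzmann_entropy(k):
    """Entropy as the log of the set size (Boltzmann's constant set to 1)."""
    return log(set_size(k))


def information_bits(k):
    """Bits gained about the microstate by learning that it lies in the set."""
    return log2(2 ** N / set_size(k))


for k in (0, 5, 10):
    print(k, set_size(k), round(boltzmann_entropy(k), 3), round(information_bits(k), 3))
```

In this toy case the smallest set (`k = 0`, a single microstate) has zero entropy and carries the full 20 bits, while the largest set (`k = 10`) has the highest entropy and carries the least information about which microstate is actual.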

This second sense of macrovariables, which is necessary for the notions of probability and entropy, is often conflated with the notion of macrostates, to which we now turn. To understand macrostates it is important to start by noticing the following formal fact: the microstate of a system has many possible partial descriptions, many possible macrovariables; but only some of them appear in statistical mechanics (e.g., average kinetic energy). And what makes these macrovariables special or preferred (and this is an important point) is the fact that they feature in our experience (either directly or indirectly), whereas other macrovariables don’t. We experience certain magnitudes or properties, but not other magnitudes or properties, and these are associated with certain aspects of the microstates of systems, certain macrovariables, but not others. And since a macrovariable can be understood as denoting a set of microstates (that share an aspect), one can say (formally) that we experience certain sets and not others. Of course, this is only a formal claim: in fact, what we sense is an aspect of the actual particular microstate at each time; but since we only sense that aspect, we cannot tell which, among the microstates that share this aspect, is the actual one. In other words, all the microstates in the set are equivalent or indistinguishable relative to that aspect. And here comes the notion of macrostate: we shall use the term macrostate to denote a set of microstates that are indistinguishable by an observer. 
An observer who measures a property of a thermodynamic system will receive as outcome a thermodynamic macrovariable (in the first sense, pertaining to the actual microstate of the system), and since this macrovariable is shared by a set of microstates (the second sense of macrovariable), this observer will not be able to know which of these microstates is the actual microstate; and since these microstates will thus be indistinguishable by the observer, they will form a macrostate.

And so, to fix our terminology, when we talk about a set of microstates that share an aspect of the actual microstate, we shall use the term macrovariable, and when we talk about a set of microstates that are indistinguishable by an observer, we shall use the term macrostate.

In interesting cases, such as those that feature in the statistical mechanical account of thermodynamics, the same sets are both macrovariables and macrostates. For example, when our thermometer says that the gas has a certain temperature, it provides information about an aspect of the actual microstates during the time of measurement: that aspect is the value of the macrovariable of average kinetic energy. This value, in turn, is shared by a set of counterfactual microstates, and these microstates are indistinguishable from the actual microstate by the observer (or by the thermometer), and therefore they also form a macrostate.
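The coincidence of the two kinds of set in the thermometer example, and the way it can fail, can be sketched in a toy setting. Everything in the sketch is an assumption made for illustration: the four-particle system, the two hypothetical instruments `fine_reading` and `coarse_reading`, and above all the modeling of an observer as a function from microstates to readings, which is only a placeholder for the physical coupling between observer and system.

```python
from itertools import product

LEVELS = (0, 1, 2)   # hypothetical single-particle energy values
ALL_MICROSTATES = list(product(LEVELS, repeat=4))


def total_energy(microstate):
    """The macrovariable (proportional to average kinetic energy here)."""
    return sum(microstate)


def fine_reading(microstate):
    """Hypothetical instrument that resolves the total energy exactly."""
    return total_energy(microstate)


def coarse_reading(microstate):
    """Hypothetical instrument that resolves the total energy only in bins of 3."""
    return total_energy(microstate) // 3


def macrovariable_set(actual):
    """Microstates sharing the macrovariable value of the actual microstate."""
    value = total_energy(actual)
    return {m for m in ALL_MICROSTATES if total_energy(m) == value}


def macrostate(actual, reading):
    """Microstates the observer cannot tell apart from the actual one."""
    shown = reading(actual)
    return {m for m in ALL_MICROSTATES if reading(m) == shown}


actual = (0, 1, 2, 1)
print(len(macrovariable_set(actual)))            # prints: 19
print(len(macrostate(actual, fine_reading)))     # prints: 19
print(len(macrostate(actual, coarse_reading)))   # prints: 51
```

With the fine instrument the macrostate and the macrovariable set are the same 19 microstates, as in the thermometer case. With the coarse instrument the macrostate is a larger set that contains the macrovariable set, so which sets count as macrostates depends on the observer and not on the system alone.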

To those familiar with the standard literature on statistical mechanics, this analysis of the term macrostate may seem prima facie different from standard presentations. The only difference, however, is that here we take extra care to distinguish between macrostates and macrovariables and to spell out their interrelation; if one applies both of these notions carefully, the result meshes with the results of statistical mechanics (as in the example of average kinetic energy). (For the meaning of the term macrostate in Boltzmann’s theory, see Uffink 2004.)

For example, we said that understanding macrovariables as sets is important in order to understand the notions of probability and entropy in statistical mechanics. Usually, however, one talks about the probability and entropy of macrostates. And indeed, this talk is correct: importantly, statistical mechanics addresses cases in which the sets are both Macrostates and Macrovariables, and so the notions of probability and entropy apply to the same set both in its role as macrovariable and in its role as macrostate. We shall see later that this realization is significant.

Notice, by the way, that since macrovariable and macrostate are distinct notions (on our account), one can think of sets of microstates that are both macrovariables and macrostates (as in the above example), or only macrovariables, or only macrostates, or neither. We don’t address the three latter cases here (see Hemmo and Shenker 2015b for a detailed discussion).

Thinking about macrostates as sets of microstates that are indistinguishable by an observer raises a problem, one that, as we then show, cannot be solved by invoking the notion of macrovariable in its two senses. Here is the problem. The term indistinguishability seems prima facie epistemic and therefore not part of the underlying mechanics, and in this sense not completely “physical.” This problem makes the notion of the entropy of a macrostate seem epistemological, while in thermodynamics, and even in statistical mechanics, entropy is thought of as a physical property of the system in its actual microstate. Another closely related problem is that while entropy is thought of as a property of the actual state of the system, realizing that it is a property of a set of points in the state space brings to mind that all but one of the microstates in this set are counterfactual. Similar problems arise with respect to the usual understanding of the notion of probability (but not with respect to the conception of probability presented in section 3). This tension is usually addressed by invoking notions such as ignorance or indifference with respect to the question of which, among the possible microstates, is the actual one; but these notions are obviously epistemic. To eliminate the prima facie epistemic nature of the theory and make it a fully “physical” theory, in the sense that it will be based only on terms in the underlying mechanics, it is necessary to bring in some mechanical facts that will fully account for the status and role of the sets in the theory, and will replace the epistemic terms. Since according to mechanics whatever happens in the world is a consequence of the microstate (and microtrajectory) of the world, the mechanical facts we are looking for must be encoded in the microstate of the world. 
This problem (of the epistemic nature of sets of microstates) and this solution (of replacing these epistemic terms by mechanical ones) are crucial points that are sometimes overlooked in the literature on the foundations of statistical mechanics, and therefore we emphasize them.

Importantly, the solution for this problem cannot be achieved by invoking the notion of macrovariable with its two senses. The reason is that although in important cases, and especially in statistical mechanics, the same sets are both macrovariables and macrostates, these two notions are conceptually different. The problem of the seemingly epistemic nature of sets does not arise with respect to macrovariables, since an observer interacts with some aspect of the actual microstate; this is the physics of what actually takes place in the world; and the set of microstates that share this aspect is a theoretical construct. But a macrostate is a set of microstates that are indistinguishable by an observer; this is the only criterion for including the microstates in the set (in particular, they don’t have to share a macrovariable). When we speak about the probability or the entropy of macrostates, we speak about the probability and entropy of sets of microstates that share the fact of being indistinguishable by an observer. And the question is, What are the physical facts that ground this notion of indistinguishability?

Moreover, even if (as in statistical mechanics) the set that forms a macrostate is also a macrovariable, this will not help to make the case fully physical. For if a macrovariable is a mode of description of the actual microstate of the system (see Ben Menahem 2001), then, if the theory is to be fully mechanical, one needs to explain what is the physical underpinning of modes of description, and, as we show in the next paragraph, this is another way of phrasing the same question. Thus, the task is to explain how a theory about sets of counterfactual states accounts for the actual evolution of the system. This is the task we undertake here.

To illustrate the problem, consider a physicist, call him Ludwig, who has complete knowledge of the mechanics of a thermodynamic system G. Ludwig knows the exact actual microstate of G and its equations of motion, and can calculate the exact actual microevolution of G. (We think of Ludwig as embodying the mechanical theory itself, and therefore epistemic questions of self-reference, infinite information content, computational abilities, and the like don’t arise.) Is this knowledge of the mechanics of G enough to allow Ludwig to come up with the predictions of thermodynamics for G? In a sense, the answer is trivially affirmative since, according to mechanics, Ludwig knows everything there is to know about G. However—and this is the great discovery of statistical mechanics mentioned above—the thermodynamic description of G is a partial description of the microevolution of G (to continue the previous example, the average kinetic energy and its behavior over time is just a partial description or an aspect of the microevolution of a gas). And the problem is that there are infinitely many partial descriptions of the actual microevolution of G, while the thermodynamic partial description is only one of them. All of these partial descriptions are in principle known to Ludwig, together with all the regularities that they may exhibit; and the challenge is this: can Ludwig single out the thermodynamic partial description as a special one, out of all the possible partial descriptions? Is there something unique to this partial description that makes it preferred over the others and gives it a special status? To single out the thermodynamic partial description, Ludwig needs a guide. And the question is this: Can Ludwig’s guide come from the mechanics of the world? 
If thermodynamics is to be underpinned by mechanics, the answer has to be in the affirmative; but it turns out that the details here are highly nontrivial (and do not appear in the standard presentations of statistical mechanics, and so we propose to complete the picture here).

Let us first briefly rule out one solution. The criterion for choosing the thermodynamic macrovariables over others (or the thermodynamic partition of sets over alternatives) is not the fact that these macrovariables (or sets) exhibit regularity. The reason is that one cannot rule out the possibility that other regularities would appear under some other macrovariables (or sets).12 Two examples here (we shall not go into their details) are the spin echo experiments (Hahn 1950, 1953) and recently explored features of quantum heat engines (Scully 2001, Scully et al. 2003), in which the regularities are not the usual thermodynamic ones. So the criterion for choosing the thermodynamic macrovariables over others is not the existence of regularity, but something else.

What, then, can single out the thermodynamic macrovariables (and hence also the thermodynamic regularities) among all the possible macrovariables of G? It seems to us that it is, quite simply, the fact that these macrovariables (and not others) appear in our experience. And to single out this property, Ludwig will need to look at us and at our interactions with G, and not only at G. Ludwig can of course do this: assuming that we are natural systems (recall in this context that Maxwell’s Demon is presumably natural), Ludwig can know everything about us in the same way that he can know everything about G, and he can know everything about our physical interaction with G. This interaction will give rise to our experience, and will reveal the special status of the thermodynamic macrovariables. That’s all there is to it, and we shall presently see the outline of how this is done.

To reiterate the point from a slightly different angle, note that Ludwig can derive the thermodynamic regularities that G satisfies even if he looks only at the phase space of G: that is, he can express the thermodynamic regularities of G in terms of mechanical macrovariables pertaining to G alone. But here there are two points. First, as we just said, presumably there are other mechanical macrovariables of G alone that satisfy some other nonthermodynamic regularities. Second, and independently, by looking only at G, Ludwig will not know which macroscopic regularities are manifested in our experience. The first point shows that looking at G indeed gives complete information about the behavior of the macrovariables of G, while the second point shows that looking only at G is insufficient to arrive at the conclusion that the thermodynamic laws are empirical laws in our world. For this second point, Ludwig needs access to the way in which G interacts with us. This means that despite the complete information that G carries about its macrovariables and their evolution in time, Ludwig cannot arrive at the conclusion that the laws of thermodynamics will appear to us as empirical laws without looking at the physical structure of the coupling of G with us.13

This account seems, at first sight, to differ from standard textbook formulations of statistical mechanics; but this impression is misleading, and in fact the account just detailed can be integrated into the textbooks. In textbooks, macrostates are characterized without reference to an observer: they are presented as expressing physical properties of the actual microstates (that is, macrovariables) that are shared by all the microstates in a set (that is, macrostates). But the textbook accounts already rely on our thermodynamic experience, only without being explicit about it. They take it as an uncontroversial starting point that the thermodynamic macrovariables are preferred, without explaining the origin of this preferred status. Here we emphasize the point that the standard accounts feel no need to address. Since we are interested in providing a complete mechanical foundation for statistical mechanics that contains no nonmechanical epistemic notions, we do not accept the choice of the thermodynamic macrovariables as given. We now turn to explaining in more detail the physical origin of the choice of the thermodynamic macrovariables in statistical mechanics.

Importantly, we want all the information that Ludwig needs, about G as well as about us, to be described in terms of the actual microstate and microevolution of the world, since this exhausts all the information available to Ludwig, and all that there is according to the fundamental theory.

Figure 1 Macrostates and Macrovariables

Suppose that (classical14) mechanics is the complete description of the universe. The only raw materials we can work with in mechanics are the microstates and their evolutions over time. This means that everything that one can say about subsystems of the universe must be phrased in these terms. Figure 1 illustrates a universe with two sets of degrees of freedom: G is the thermodynamic system and its environment, and O is an observer. Both O and G are completely described in classical mechanical terms. In particular, O is a purely physical system obeying the principles of mechanics, just like G.15 The interactions between O and G16 are such that the following correlations hold between their microstates. The set A1 of microstates of O is correlated with the set B1 of microstates of G, and the set A2 of O is correlated with the set B2 of G. The microstates of O in A1, for example, share a certain physical property that gives rise to O’s experience (which is identical for all the microstates in A1), according to which G is in a microstate in the set B1. In this case the microstates of G that are in B1 may share two sorts of physical properties: (i) First, they may share a macrovariable of the microstates of G. In this case, this macrovariable can be specified without reference to O. Examples are the spatial distribution of the molecules making up a gas, the average kinetic energy of the gas molecules, and so on. (ii) Second, they may only share the physical property of being correlated to the set A1 of microstates of O, as in Figure 1. The thermodynamic macrovariables in statistical mechanics have both sorts of properties. Importantly, it is the second sort of physical property that gives preference to the thermodynamic macrovariables; as we said above, there are infinitely many other ways to partition the phase space into sets and define other macrovariables, and so the first sort of property cannot be the reason for our choice. 
(Moreover, as mentioned, the fact that the thermodynamic macrovariables exhibit regularities is not unique, either.) But once we make our choice on the basis of (ii)—actually, it is made automatically in our experience, due to our physical structure—we can use (i) to describe the set Ai in terms of observerless macrovariables, thus giving the false impression that one need not mention O in order to account for the macrostates of G. This is why statistical mechanics describes our experience in terms that appear not to mention human observers.

An example of the importance of distinguishing between the two sorts of properties—(i) and (ii)—of statistical mechanical sets is Boltzmann’s equation and (so-called) H theorem (see Uffink 2004; Uffink and Valente 2010)17. All the sets that appear in Boltzmann’s equation correspond to shared macrovariables18; but only one set, the equilibrium macrostate, consisting of microstates that have the Maxwell-Boltzmann (MB) energy distribution, is also correlated with our experience. The MB set is selected as important due to its appearance in our experience: it is the counterpart of thermodynamic equilibrium. Had the MB set not featured in our experience, there would have been no reason to look for the theorem, nor to think that the theorem was important if it had been discovered by accident (and we certainly would not continue studying it as an important stage in the development of physics when it was discovered that it isn’t a theorem after all).

Given this construction, our ideal physicist Ludwig can derive thermodynamics, provided he knows which degrees of freedom are the observer O and which are G, and what the structure of correlations is in O+G. To see this, suppose that everything in the world is physical and, more specifically, mechanical (as a working hypothesis). Then Ludwig can partition O’s phase space into the sets Ai that correspond (by our assumption) to O’s experience with respect to what O takes to be the macrostates of G. Moreover, Ludwig can derive the accessible region of O+G from their parameters and constraints, hence has perfect knowledge of their possible correlations, and so he can deduce the sets Bi of G that correspond to the sets Ai of O. Thus Ludwig can derive the partition of G’s phase space into the thermodynamic macrostates, and, from this, the entire mechanical underpinning of thermodynamics. This idea gives a complete mechanical account of the fact that the thermodynamic regularities are expressible in terms of mechanical macrovariables. Moreover, we have a complete mechanical account of the fact that the thermodynamic macrovariables figure in certain sequences of measurement outcomes that we carry out on thermodynamic systems, and from this one can understand on the basis of mechanics why the thermodynamic laws appear to us as empirical laws.
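Ludwig’s deduction of the sets Bi from the sets Ai can be sketched on a toy model. The following is only an illustration under stated assumptions: the microstate labels (`o1`, `g1`, and so on), the `experience` map, and the list of accessible joint microstates are all hypothetical and chosen by hand; nothing here is meant as a realistic model of O or G.

```python
from collections import defaultdict

# Hypothetical toy data: which microstate of O gives rise to which experience set A_i.
experience = {"o1": "A1", "o2": "A1", "o3": "A2", "o4": "A2"}

# Hypothetical accessible region of O+G, listed as joint microstates (o, g).
accessible = [
    ("o1", "g1"), ("o2", "g2"), ("o1", "g3"),
    ("o3", "g4"), ("o4", "g5"), ("o4", "g6"),
]

def induced_partition(accessible, experience):
    """Group the microstates of G by the experience set A_i they are correlated with."""
    sets_b = defaultdict(set)
    for o, g in accessible:
        sets_b[experience[o]].add(g)
    return dict(sets_b)

# The set B_i of G is keyed here by the label of the correlated set A_i of O.
partition = induced_partition(accessible, experience)
for label in sorted(partition):
    print(label, "->", sorted(partition[label]))
```

In this sketch the partition of G’s microstates is fixed entirely by the correlations with O, which is the sense of property (ii) above; any observerless description of the resulting sets would have to be found afterward.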

Since the O+G universe is in a particular microstate at each given moment, according to classical mechanics, O is correlated with a particular microstate of G at each moment. By assumption, the content of O’s experience is fixed by a certain physical aspect (macrovariable) of the particular microstate of O; suppose that this aspect is Ak. We assume that this content is reliable, that is, that G is indeed in a microstate that has the aspect (macrovariable) Bk. In this way, O’s experience is fixed by the actual particular microstate that belongs to the set Ak+Bk, rather than by the entire set. That is, with respect to each set Bi of G, one can say that O experiences the common physical property shared by the microstates in that given set. Talking about sets of microstates rather than individual microstates in statistical mechanics is shorthand for talking about aspects of the actual microstate in virtue of which it is equivalent to other microstates, that is, in virtue of which it belongs to the sets Ai and Bi. Thus we see that our approach gives a complete account of how macrostates arise in statistical mechanics on the basis of the microscopic structure of classical mechanics, without appealing to counterfactuals.

# 3. Probability in Statistical Mechanics

It is well known that one cannot base the laws of thermodynamics on the partial descriptions given by the thermodynamic macrostates unless probability is introduced into the theory. However, the issue of how to understand probability in classical statistical mechanics is quite subtle, since the theory is deterministic. There are various approaches to probability in statistical mechanics that are more or less in parallel to the various philosophical interpretations of probability (various objectivist and various subjectivist views; see Gillies 2000; Mellor 2005; in the context of statistical mechanics, see Sklar 1993; von Plato 1994). We propose an objectivist view of the probabilities in statistical mechanics in which the ignorance involved in the notion of macrostates receives an explicit objectivist explanation. Our main goal is to ground the probabilities of statistical mechanics in physics. Since we have already given a physical underpinning of macrostates, we shall base our physical account of these probabilities on the fundamental notions of macrostates and dynamics, and the interplay between them.

Figure 2 Dynamical evolution of the blob over macrostates

Consider Figure 2, in which G1G2 are the degrees of freedom of G, and O is perpendicular to the page. Suppose that we start with a system G that is in the macrostate M0. The evolution of each microstate in M0 takes G to a microstate within the region B(t1) at t1 and B(t2) at t2. We call such regions of endpoints the dynamical blobs B(t) at t1 and at t2.19 In general B(t1) (and similarly B(t2)) may overlap with more than one macrostate. If we measure G at t1, we shall find it to be in either M1 or M2, depending on the actual trajectory of O+G.

Consider how this interplay gives rise to (for example) the thermodynamic Law of Approach to Equilibrium, according to which isolated systems evolve spontaneously so that their entropy increases until it reaches a maximum. The macrostates M0, M1, and M2 express the correlations between the microstates of G and the microstates of O. Suppose that at time t0 G starts in some non-equilibrium macrostate M0. According to the probabilistic counterpart of the laws of thermodynamics, such a system is highly likely to arrive at the equilibrium macrostate M2 at t2. Moreover, in statistical mechanics it is meaningful to talk about intermediate states as well in the approach to equilibrium, so that we expect G to evolve in a way that it is likely to be in macrostate M1 at t1 and reach the equilibrium macrostate M2 at t2. This means that there is a high probability that the actual microstate of G that started out in M0 at t0 will evolve through M1 at t1 and reach M2 at t2. Given that the underlying mechanics is deterministic, this evolution is predetermined. What could probability mean in this setting?

At time t0 the observer O can only see that the microstate of G is within macrostate M0, but does not know which microstate in M0 is the actual one. Therefore even if O knows the details of the dynamics and has all the required computational capabilities, the best O can do is calculate the entire bundle of trajectories, all of which start from M0, and look at the regions consisting of all the endpoints of these trajectories at times t1 and t2. Admittedly, in practice we cannot calculate the evolution of blobs, and we don’t know the details of the partition of the phase space into macrostates. There are shortcuts that allow us to avoid these calculations and still give the right predictions. The import of Gibbs’s approach is that it provides such shortcuts (see Hemmo and Shenker 2012, Ch. 11).

A theorem in classical mechanics called Liouville’s theorem states that the size or volume (formally, the Lebesgue measure) of B(t) is conserved under the dynamics for all times, although its shape may radically change over time, depending on the details of the dynamics. In the general case not all the points that started out in M0 at t0 will evolve to M1 at t1 and to M2 at t2. At t1, for example, some of the points may still be in M0. This means that a prediction at the initial time of the future macrostate, given that G starts in M0, can only be probabilistic. Given the notions of macrostates and blobs, it is natural to associate the transition probability that the macrostate of G will be in M1 at t1, given that it was in M0 at t0, with the relative size of the overlap between the blob B(t1) at time t1 and the macrostate M1, that is:

$$P\big(M_1 \text{ at } t_1 \mid M_0 \text{ at } t_0\big) = \mu\big(B(t_1) \cap M_1\big)$$
(1)

And more generally:

$$P\big(M_i \text{ at } t_j \mid M_0 \text{ at } t_0\big) = \mu\big(B(t_j) \cap M_i\big)$$
(2)

This probability rule means that, in general, the probability of a macrostate has nothing to do with its entropy. If probability is to be correlated with entropy, the system has to have a very special kind of dynamics. Perhaps this happens to be the dynamics in actual systems that we observe, but there are no a priori grounds to predict that it will be the case. For more details on the measure of entropy, see Hemmo and Shenker 2012, chapter 7.
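The Probability Rule, and the claim that probability need not track entropy, can be illustrated on a small discrete phase space. This is a minimal sketch under assumptions that are not in the text: the twelve equal-weight cells, the particular macrostates, and the one-step dynamics `step` are hypothetical, and the counting measure normalized by the size of the blob stands in for the measure µ.

```python
from fractions import Fraction

# Hypothetical toy phase space of G: 12 equal-weight cells.
M0 = set(range(0, 4))    # initial macrostate, 4 cells
M1 = set(range(4, 6))    # small macrostate, 2 cells
M2 = set(range(6, 12))   # large macrostate, 6 cells
macrostates = {"M0": M0, "M1": M1, "M2": M2}

# Hypothetical dynamics from t0 to t1: a one-to-one map on cells,
# so the number of cells in any region is conserved (a discrete stand-in
# for Liouville's theorem).
step = {0: 4, 1: 5, 2: 6, 3: 0, 4: 7, 5: 8,
        6: 9, 7: 10, 8: 11, 9: 1, 10: 2, 11: 3}

def blob(initial, dynamics):
    """Set of endpoints of all trajectories that start in `initial`."""
    return {dynamics[cell] for cell in initial}

def transition_probabilities(initial, dynamics, macrostates):
    """Relative size of the overlap of the blob with each macrostate."""
    b = blob(initial, dynamics)
    return {name: Fraction(len(b & cells), len(b))
            for name, cells in macrostates.items()}

probs = transition_probabilities(M0, step, macrostates)
for name in sorted(probs):
    print(name, "size", len(macrostates[name]), "probability", probs[name])
```

With these hand-picked dynamics the smallest macrostate M1 receives the largest transition probability, while the largest macrostate M2 receives a smaller one. A different choice of `step` would give a different ordering, which is the point: the rule fixes probabilities by overlap, not by the size of the target macrostate.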

How can one determine the measure µ in the Probability Rule? So far we have used all the tools that mechanics can supply: microstates and their dynamical evolution, and macrostates and their dynamical evolution. Yet, nothing dictates the measure µ. In the literature one encounters various considerations for choosing µ20 but, while some of these considerations fit some of our intuitions, from a purely physical point of view, none of these considerations is a priori compelling.21 In this sense there are no conclusive dynamical or a priori considerations that give rise to a particular choice of µ. How then can one choose µ? A guide in the right direction is the idea that if probabilistic statements are to have empirical significance, the choice of µ should be tested by our experience of the relative frequencies of transitions from M0 through M1 to M2. In our view, the choice of µ can ultimately be justified only on the basis of generalizations of these frequencies, together with some pragmatic considerations. The measure µ of the intersection of the blob B(t1) with the macrostate M1 should reflect the relative frequency of transition from M0 to M1, to a good approximation. There may be infinitely many such measures, in which case we choose the one that is most convenient. In this way we obtain a measure that is empirically significant.

One might think that it is desirable to derive the probability measure µ from first principles, such as dynamical or a priori considerations; but this desideratum is unattainable, since, as we said, there are no compelling principles that will do the job. The best one can do in statistical mechanics is to choose a probability measure over the phase space that fits the relative frequencies of the macrostates as they are observed in our experience. This seems much weaker than the above desideratum, but it turns out that the hope that statistical mechanics can yield the above desideratum is indefensible.

It is sometimes said that statistical mechanics aims at deriving the thermodynamic regularities from the principles of mechanics together with probability theory. This gives the impression that, although inductive generalizations are needed for discovering the laws of mechanics, once we have these laws, the thermodynamic regularities can be formulated without further recourse to inductive generalizations, this time, specifically concerning the thermodynamic regularities. Roughly, that is, we hope that once the laws of mechanics are in place, then the laws of thermodynamics would follow from these laws, together with probability theory. We have explained why this aim cannot be achieved and is in fact hopeless as a matter of principle. The inductive generalizations that ground the laws of mechanics need to be supplemented by inductive generalizations of relative frequencies of macrostates to provide a solid empirical foundation for statistical mechanics.

Most discussions of probability in statistical mechanics attempt to use first principles to explain why we observe the relative frequencies characteristic of thermodynamic systems. The desideratum is to derive a probability distribution over initial conditions from dynamical considerations and some a priori symmetries, such as the principle of indifference. The main problem with this approach is that the notion of a probability distribution over initial conditions, if taken to be fundamental rather than derivative, is not physical. This desideratum cannot be attained from first principles (for more details, see Hemmo and Shenker 2012, chapters 6 and 8). However, one can derive the standard uniform probability distribution over initial conditions using our Probability Rule (2) if the measure µ happens to be the Lebesgue measure. This completes our brief construction of statistical mechanics. We are now in a position to consider the two questions raised by Maxwell’s Demon.

# 4. Measurement

The discussion of Maxwell’s Demon in the literature focuses on the measurements and erasure that need be carried out by the Demon (see Leff and Rex 2003). All the other operations of the Demon are taken to be ideally “without friction or inertia,” as Maxwell claimed (Knott 1911, pp. 214–215). The main idea in the literature is that measurement and erasure are dissipative, and if one takes into account the total entropy balance in the Demon’s operation, it turns out that the Demon does not decrease the total entropy of the universe.

Szilard (1929) famously assumed that a decrease in entropy during measurement must be compensated for by an increase elsewhere in the universe, and thereby turned the Second Law into a tautology, which cannot be tested empirically.22 Earman and Norton (1998) reject Szilard’s argument as circularly assuming the truth of the Second Law challenged by the Demon. We agree with Earman and Norton’s conclusion that Szilard’s argument is circular, but this is not our main point here. We wish to complement their argument by providing a direct analysis of measurement on the basis of the principles of mechanics, showing that measurement need not be dissipative.

Szilard (1929) proposed that measurement is accompanied by dissipation, whereas, according to the contemporary view, measurement is not dissipative just because it is logically reversible (e.g. Bennett 1982), and therefore one should look for the dissipation in the erasure. In the approach we outline here, we show that measurement need not be dissipative regardless of its logical reversibility. We shall deal with logical (ir)reversibility in the context of erasure in the next section.

In the formulation of the Law of Approach to Equilibrium and the account of probability, we implicitly appealed to measurements of macrostates. To complete this account and address the issue of the Demon, we need to underpin the notion of measurement in mechanical terms. Consider again our account of the approach to equilibrium. When we observe the evolution of a thermodynamic system toward equilibrium, we are in fact measuring the actual macrostates of the system. For example, in Figure 2, we carry out a measurement on G at time t1 to find out whether its actual trajectory passes through M0 or M1. A measurement is necessary here because the blob at t1 splits between M0 and M1; since O does not know the microstate of G in the blob, O does not know which of these two macrostates will be the case.

Suppose that the outcome of the measurement at t1 is M1. This means that, after the measurement, the microstates of O and G end up in the following regions of the phase space: the microstate of G remains in the intersection of B(t1) and M1 (since presumably it does not change in an ideal measurement), and the microstate of O has evolved from the region o0 to the region o1 (see Figure 3). All the microstates in o1 are such that O has the experience that G is in M1. In other words, the outcome of the measurement reveals to O that the actual microstate of G at t1 is somewhere within M1, and not M0. Consequently O discards (in the sense that O ignores henceforth) the region of the blob B(t1) that does not overlap with M1. We call this stage the collapse of the blob into the detected macrostate. (This collapse is compatible with Liouville’s theorem. We expand on this point in the next few paragraphs.) The collapse here expresses the transition of the microscopic state of O given by the equations of motion from one set of states o0 to the set o1 in which O’s experience changes. (This collapse is radically different from the quantum mechanical collapse of the quantum state, which describes the transition of the microstate of O+G in measurement and does not conform to the equations of motion; see more on quantum mechanics in the last section.)

But the collapse is not the end of the story in a mechanical account of a measurement. The reason is that if that were all, it would mean that O could tell not only that G is in M1, but also that G is in the part of M1 that overlaps with B(t1), and this, in turn, would mean that the situation is as if O has calculated the evolution of the dynamical blob from t0 to t1. However, such a calculation is normally not carried out and is in general not feasible due to its complexity. And if O does not calculate the evolution of the blob from M0, but only measures the macrostate of G at t1, all O can say is that G is in some microstate in M1. That is, O can only assign to G the entire macrostate M1. Thus, the outcome of the measurement is not the region of overlap between B(t1) and M1, but rather the entire M1. We call this idea the expansion stage of the measurement. This expansion is illustrated in Figure 3.
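The two stages of a measurement, collapse and expansion, can be written out for a toy case. The sketch below assumes a hypothetical discrete phase space for G, a hand-picked blob `blob_t1`, and a hand-picked actual microstate `actual`; the function name `measure` is likewise only illustrative.

```python
# Hypothetical macrostates of G on a discrete phase space.
macrostates = {
    "M0": frozenset({0, 1, 2, 3}),
    "M1": frozenset({4, 5, 6, 7, 8, 9}),
}

# Hypothetical blob B(t1): endpoints of the trajectories that started in M0.
blob_t1 = frozenset({3, 4, 5, 6})

# Hypothetical actual microstate of G at t1 (it must lie inside the blob).
actual = 5

def measure(actual, blob, macrostates):
    """Return the detected macrostate, the collapsed blob, and the expanded region."""
    for name, cells in macrostates.items():
        if actual in cells:
            collapsed = blob & cells   # collapse: discard the part of the blob outside the outcome
            expanded = cells           # expansion: O can only assign the entire macrostate
            return name, collapsed, expanded
    raise ValueError("actual microstate lies in no macrostate")

outcome, collapsed, expanded = measure(actual, blob_t1, macrostates)
print(outcome, sorted(collapsed), sorted(expanded))
```

Note that the actual microstate is untouched by `measure`; only the region that O assigns to G changes, which is why neither stage is a dynamical evolution of G.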

At first sight Liouville’s theorem seems to be violated by the collapse and the expansion of the blob.23 Recall that Liouville’s theorem states that the Lebesgue measure of the dynamical blob B(t) remains invariant under the dynamics at all times, while the collapse seems to reduce B(t) to its intersection with M1 and the expansion seems to extend B(t) to fill M1. If we take the Lebesgue measure to be represented by an area in Figure 2, it is clear that neither of these preserves it.

But the fact is that Liouville’s theorem is not violated, since both the collapse and the expansion are perspectival descriptions of O’s experience. By this we mean the following. At the end of the measurement, the actual phase point of O+G belongs to B(t1). Let us suppose that the projection of the actual point on the G degrees of freedom is in the region M1. Take now the projection of this actual phase point on the O degree of freedom and suppose that it is in the o1 region. This projection is the microstate of O in o1, which fixes O’s experience (we assume that O’s experience supervenes on O’s microstate). The content of O’s experience is given by the set of all points in G that stand in the same correlation with O, namely, that are correlated with o1.24 In this case, this set is the macrostate M1 in the G degrees of freedom.

The measurement interaction results in a change of the initial microstate of O, so that the final microstate of O becomes correlated with the entire region M1.25 As we said, the collapse and expansion of the blob are not evolutions in time, but rather a description at a time of the contents of O’s experience. What evolves in time is the actual phase point of O+G and, ipso facto, its projections onto the O and the G degrees of freedom. Since the collapse and expansion are about the perspective of O and not evolutions in time, they cannot violate Liouville’s theorem.

Figure 3 Time evolution of the observer: detection, collapse and expansion

The logarithm of the Lebesgue measure of a macrostate is usually taken to be the entropy of the macrostate. Figure 3, in which the Lebesgue measure is represented by an area, illustrates the idea that entropy increases during a thermodynamic system’s approach to equilibrium. The entropy of G in Figure 3 increases when the system evolves from M0 to M1, and then to M2.

The increase of entropy in the approach to equilibrium, as illustrated in Figure 3, can only be an effect of the detection of the actual macrostate in measurement: a measurement at t1 determined that the macrostate is M1 (and not M0) and another measurement at t2 determined that the macrostate is M2 (and not M1); the very description of the actual macroscopic history of the system assumes that measurements have been carried out.

Figure 4 Time evolution of the blob in measurement

Figure 5 Time evolution of the observer in measurement: final outcome

A slight change in the structure of the OG correlations, and hence of the thermodynamic macrostates, shows that G’s entropy can decrease during measurement. In the case shown in Figures 4 and 5, each of the macrostates M1 and M2 has a smaller Lebesgue measure than the initial macrostate M0, although their union is larger than M0. The structure of the macrostates in Figures 4 and 5 is a case of a system that is far from equilibrium, as are many systems in our experience.

As illustrated in Figures 4 and 5, nothing in the principles of classical mechanics stands in the way of the measurement’s resulting in a decrease of entropy. This decrease of entropy is a net result of the measurement, just like the net increase of entropy in the case of the approach to equilibrium. In both cases the change of entropy is brought about by the change in the macrostates due to the measurement.
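The entropy bookkeeping behind this case is simple arithmetic once entropy is taken as the logarithm of the measure of the detected macrostate. The numbers below are hypothetical areas, chosen only so that each of M1 and M2 is smaller than M0 while their union is larger; the constant `k` is set to 1 for convenience.

```python
import math

def entropy(measure, k=1.0):
    """Entropy of a macrostate, taken as k times the log of its measure."""
    return k * math.log(measure)

# Hypothetical measures: M1 and M2 are each smaller than M0,
# but together they are large enough to contain the whole blob from M0.
measure = {"M0": 4.0, "M1": 3.0, "M2": 3.0}

# Entropy change of G for each possible measurement outcome.
delta_s = {name: entropy(measure[name]) - entropy(measure["M0"])
           for name in ("M1", "M2")}

for name, ds in delta_s.items():
    print(f"outcome {name}: entropy change {ds:+.3f}")
```

Whichever outcome occurs in this toy case, the entropy assigned to G after the measurement is lower than before, although the blob itself has not shrunk.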

In the literature (e.g., Bennett 1982), both M1 and M2 appear at the postmeasurement state, and in this sense the measurement has no outcome. Consequently, this approach cannot account for state preparations. It cannot even account for the Second Law of Thermodynamics.

# 5. Erasure

In classical mechanics, an erasure is a macroscopic evolution that is logically irreversible in the sense that one cannot infer the initial macrostate from the final macrostate.26 Landauer (1961) claimed that an erasure is necessarily accompanied by at least k log 2 dissipation per lost bit of information. Bennett (1982) applied this idea to the problem of Maxwell’s Demon. We show in this section that erasure need not be dissipative and therefore cannot lead to an exorcism of the Demon.

Figure 6 Blending and entropy increase

A necessary and sufficient condition for an erasure is what we call blending—that is, the macroscopic evolution is such that one cannot infer the initial macrostate from the final macrostate. This notion has some similarity with Landauer’s (1961) notion of diffusion. Usually one thinks of erasure as in the setup sketched in Figure 6 below.

Here the blending is achieved by mapping the two macrostates M1 and M2 onto M0. Liouville’s theorem is satisfied by this mapping, since the Lebesgue measure of M0 is equal to or larger than the sum of the Lebesgue measures of M1 and M2. Likewise, Liouville’s theorem is satisfied in all the cases we analyze in this section. Here the two bundles of trajectories that arrive at M0 from M1 and M2 diffuse or blend in M0, in the sense that they are no longer distinguishable by the observer O. That is, O cannot say, given M0, whether the initial macrostate of G was M1 or M2. This process is indeed dissipative but, as we show in this section, this is not due to the blending as such but rather to the structure of the macrostates.

Figure 7 Blending without dissipation

To see this, consider Figure 7. In this figure all four macrostates have the same Lebesgue measure, and the trajectories that start in the macrostate M1 evolve in such a way that the dynamical blob partly overlaps with macrostates M3 and M4. In this special case, designed for simplicity, the blob overlaps with exactly one half of M3 and one half of M4. Similarly, in Figure 7 (right side) we see the evolution of the trajectories that start in M2: they also evolve so that their dynamical blob overlaps with exactly the remaining one half of M3 and the remaining one half of M4.

Since M4 (and similarly M3) contains endpoints that started out in both M1 and M2, that is, since the blobs that started in M1 and M2 blend within M3 and M4, O cannot infer from the final macrostate M4 (or M3, depending on the actual outcome of the erasure), which of the macrostates, M1 or M2, was the case before the erasure. In this special case the final macrostate detected by O is either M3 or M4, and since the Lebesgue measure of both M3 and M4 is equal to the Lebesgue measure of the initial macrostate (M1 or M2), G’s entropy did not change during the erasure. Here we constructed a nondissipative erasure of what Bennett (2003) called known data, since this is natural in the context of the Demon, where the measurement (which has an outcome) precedes the erasure. However, a similar construction with no dissipation can be given for an erasure of random data (see Hemmo and Shenker 2012, chapter 12).

Figure 8 Blending and entropy decrease
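The nondissipative blending of Figure 7 can be checked on a discrete stand-in. The sketch assumes sixteen equal-weight cells, four macrostates of four cells each, and a hand-picked one-to-one map `erase`; all of these are hypothetical choices made to mirror the half-and-half overlaps described above, with the cell count standing in for the Lebesgue measure.

```python
import math

# Hypothetical discrete phase space: four macrostates of four equal-weight cells each.
macrostates = {
    "M1": {0, 1, 2, 3},
    "M2": {4, 5, 6, 7},
    "M3": {8, 9, 10, 11},
    "M4": {12, 13, 14, 15},
}

# Hypothetical erasure dynamics on M1 and M2 (one-to-one, so cell count is conserved).
erase = {0: 8, 1: 9, 2: 12, 3: 13,     # M1 -> one half of M3 and one half of M4
         4: 10, 5: 11, 6: 14, 7: 15}   # M2 -> the remaining halves

def blob(initial):
    """Endpoints of all trajectories that start in the macrostate `initial`."""
    return {erase[cell] for cell in macrostates[initial]}

def possible_origins(final):
    """Initial macrostates that O cannot rule out after detecting `final`."""
    return {m for m in ("M1", "M2") if blob(m) & macrostates[final]}

def entropy_change(initial, final):
    """Log of the final macrostate's size minus log of the initial one's."""
    return math.log(len(macrostates[final])) - math.log(len(macrostates[initial]))

for final in ("M3", "M4"):
    print(final, sorted(possible_origins(final)), entropy_change("M1", final))
```

Both final macrostates leave both initial macrostates open, so the evolution is an erasure in the sense of blending, and the entropy change is zero for every combination of initial and final macrostate in this toy case.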

The case in which the correlation between O and G is such that O can distinguish between the macrostates M3 and M4 is special; in general, this correlation can be either finer or coarser. An example of an erasure with a coarser correlation is illustrated in Figure 4. An example of an erasure with a finer correlation is given in Figure 8, in which the regions M3, M4, M5, and M6 are macrostates. The crucial point is that there is no intrinsic connection between blending, which depends on the structure of trajectories in the blobs, and the entropy, which is fixed by the measure of the macrostates. Notice that in the cases of erasure in which the entropy does not increase, blending results in a final macrostate of G that is unpredictable (see Albert 2000, chapter 5).

In familiar thermodynamic situations there seem to be fixed limitations on the observation capabilities of human observers and, in this sense, one can perhaps introduce a maximally fine-grained partition to thermodynamic macrostates (see Earman 2006), which results in some specific entropy of post-erasure macrostates. But the details of these macrostates are a contingent matter of fact. In particular, the principles of mechanics entail no specific relation between the pre-erasure and post-erasure entropy of the universe. In any case, our analysis of erasure demonstrates that, contrary to the conventional wisdom, classical mechanics does not entail that an erasure is necessarily dissipative.

# 6. Maxwell’s Demon

A Maxwellian Demon is a macroscopic evolution that consists of measurement, mechanical operations that depend on the measurement outcome and produce work, and an erasure, such that at the end of the evolution the total entropy of the universe is lower than at its beginning. In essence, a combination of a measurement that decreases the total entropy of the universe and an erasure that does not increase the total entropy of the universe gives rise to a Maxwellian Demon. Take, for example, the measurement in Figure 5, which decreases the entropy of O+G. We can think of G as consisting of a gas G’ on which O operates and an environment E, where the measurement of O on G decreases only the entropy of G’, while the entropies of O and E remain invariant. Finally, take the erasures in Figures 7 or 8 that do not increase the entropy of O+G, and that result in blending such that the unpredictability of the final macrostate of G is realized in E. (For more details, see Hemmo and Shenker 2012, chapter 13 and appendix A; and also Hemmo and Shenker 2010, 2011.)27 This phase space construction of a Demon shows in the most general way that the principles of mechanics are compatible with Maxwellian Demons. We have thus answered question (I) concerning the Demon in the Introduction of this article. In the literature, there are many attempts to construct concrete devices that are meant to show that Demons are impossible (see Leff and Rex 2003). But obviously, given our general argument, these examples cannot be generalized.

# 7. Are tennis players Maxwellian Demons?

We will now attempt to answer question (II) in the Introduction of this article, namely, Are there, or can one construct, Demons in our world? We will argue that there are good reasons to think that the decrease of entropy in measurement is ubiquitous in our experience rather than exceptional. We exploit the energy in systems around us as a matter of course in our experience, and we do that precisely by measuring the macrostates of these systems. In many cases, our measurements result in decreasing the entropies of the systems we measure (and in decreasing the total entropy of the universe). Since the decrease of entropy in measurement is the key step in realizing a Maxwellian Demon, we will argue that our analysis suggests that we may be some sort of Maxwellian Demons.

To see this, consider a tennis match. (We address the adequacy of this example later.) The rules of the sport and our bodily limitations dictate the accessible region in the phase space of the ball and, in particular, the range of velocities of the ball’s center of mass. The first serve by one of the players can be seen as a preparation of the ball in an initial macrostate. The position of the ball (more precisely, of its center of mass) at any given time is known to the second player to a good approximation. But the velocity (both speed and direction) of the ball is unknown to that second player, and this is the fact that makes tennis a nontrivial sport. The size of the initial macrostate (prepared by the first serve) reflects the set of possible velocities. Given the rules of the sport and our bodily limitations, the dynamics of the ball ensures that the velocity is slow enough relative to our capability of detecting the ball’s velocity and position and controlling it at roughly every moment. As the ball approaches the second player, she carries out a sequence of measurements that results in gradually narrowing the set of possible velocities of the ball, and this means that the sizes of the macrostates of the ball in the velocity degree of freedom decrease and, with them, the entropy of the ball. Here the player detects the velocity of the ball by merely looking at it. From the analysis of measurement and from the mechanical details of the game it seems to us that there should be no minimal amount of work that must be invested in these measurements.
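The narrowing of the velocity macrostate can be made concrete with some simple bookkeeping. The following is a rough sketch under several assumptions that are not part of the argument above: the velocity is reduced to a one-dimensional speed, each look at the ball is idealized as halving the interval of speeds compatible with what the player has seen, the entropy of a macrostate is taken to be the logarithm of its width in units of Boltzmann's constant, and the numbers and the function names `entropy` and `narrow` are hypothetical.

```python
import math


def entropy(width):
    """Entropy (in units of k_B) of a velocity macrostate of the given width."""
    return math.log(width)


def narrow(interval, true_speed):
    """One idealized look at the ball: keep the half of the interval containing the true speed."""
    low, high = interval
    mid = (low + high) / 2
    return (low, mid) if true_speed < mid else (mid, high)


interval = (10.0, 50.0)  # hypothetical range of speeds after the serve, in m/s
true_speed = 33.0        # hypothetical actual speed, not known to the receiver

history = [interval]
for _ in range(4):       # four successive measurements as the ball approaches
    interval = narrow(interval, true_speed)
    history.append(interval)

entropies = [entropy(high - low) for low, high in history]
for (low, high), s in zip(history, entropies):
    print(f"speed in [{low}, {high}] m/s, entropy = {s:.3f} k_B")
```

Each step in this sketch lowers the entropy of the velocity degree of freedom by ln 2. The code only tracks the sizes of the macrostates; it says nothing about the mechanical cost of the measurements, which is the question the surrounding argument addresses.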

It might strike you that a tennis match is not a good example of a thermodynamic setup, for two reasons. First, a tennis match is an open system. However, we proved that a measurement can result in a net decrease of total entropy, and it is this decrease that we want to focus on in describing a tennis match. If one wishes to argue that an open system is irrelevant for testing the validity of the Second Law, one implicitly assumes that any decrease of entropy in the match must be compensated for by an increase of entropy elsewhere. This is the essence of Szilard’s (1929) argument, which we already said is circular. If one does not assume in advance that the Second Law is universal, then it is enough to show that the entropy decreases in an open subsystem of the universe and that there is no apparent compensation elsewhere. This is what we want to demonstrate here.

The second reason why a tennis match may seem inadequate for thermodynamic treatment is that the match is a composite system for which the relevant degrees of freedom are few and controllable (e.g., the position and velocity of the center of mass of the ball). Two responses can be given to this argument. First, these few degrees of freedom may be thought of as a subsystem for which our previous answer holds. That is, one cannot simply assume that an increase of entropy in the other degrees of freedom (say, the temperature of the tennis ball) must occur in order to compensate for the decrease of entropy in those few degrees of freedom relevant for the match. Second, similar objections have been raised against Szilard’s engine, which employs a single particle. Such objections have been rejected for the reason that there is no extra-theoretical criterion that distinguishes between mechanical systems that are subject to the laws of thermodynamics and those that are not.

The entire match takes place in states of the ball that are extremely far from equilibrium. However, it is not very clear in this scenario what exactly the equilibrium state of the ball would be: perhaps it is the dust left from the ball (and the players) after some centuries. This understanding of equilibrium brings us to the radical difference between a tennis match, in which we can decrease the entropy of the ball, and an experiment on a gas in a box, in which we cannot decrease the entropy of the gas without substantial investment of work. A gas confined to a box (or the remaining dust of the ball) contains about 10²³ particles, which are analogous to little tennis balls moving extremely fast in all directions. It seems to us that the reason why, in practice, we cannot reduce the entropy of such a gas by measurements but can reduce the entropy of the ball in a tennis game is not grounded in some fundamental prohibition in the laws of mechanics. Rather, it is simply that the behavior of a gas in a box makes for an extremely challenging game. The famous spin echo experiments show, for example, that we might be able to overcome such challenges by some ingenious trick that would allow us to control the microstate of the gas by macroscopic manipulations without knowing all its details (see Hemmo and Shenker 2012, Sec. 6.7). In these experiments, which have become standard technology at the present time, the entropy of a collection of spin particles is systematically decreased by macroscopic manipulations, although only for quite short time intervals (see Hahn 1950, 1953; Blatt 1959; Ridderbos and Redhead 1998; Hemmo and Shenker 2005).

It follows from our analysis here that the distinction between systems that form challenging games and satisfy the (probabilistic version of the) Second Law of Thermodynamics, as opposed to systems that do not, is merely pragmatic, depending very much on the way we are physically structured. If we had many hands and very quick eyes we might be able to win very challenging tennis matches, in which case the above analysis of the usual single-ball tennis match would hold equally. We take it that this was exactly Maxwell’s idea in devising his thought experiment of the Demon and emphasizing that the Demon is not supernatural. Maxwell believed that we cannot decrease the entropy of systems in our environment, simply because we are not “clever enough” (see Knott 1911, pp. 213–214). It is this claim that we are challenging by focusing on tennis matches. It seems to us that, in our measurements of systems around us that as a matter of fact are far from equilibrium, the right thing to say is that sometimes we routinely decrease the entropy of the universe. And since this is the key point in constructing a Maxwellian Demon, it may be that we are such Demons. This may also give an affirmative answer to question (II) in the Introduction.

Our plausibility argument obviously does not undermine the validity of the Second Law in the circumstances in which it is usually applied, which (as we showed in Section 3) always involve measurements. In these cases, the measurements and the structure of the macrostates and trajectories together result in an increase of entropy. In general, whether a measurement increases or decreases the entropy of a system depends on the harmony between the dynamics and the partition to macrostates.

# 8. A note on quantum mechanical Demons

So far we have answered the two questions at the beginning of this article concerning Maxwell’s Demon in the affirmative. But our discussion was framed in the context of classical statistical mechanics. Does anything change if we replace the underlying classical theory with quantum mechanics? The short answer is No, and the reasons are as follows. The first thing that one needs to know is how quantum mechanics would underpin the thermodynamic phenomena. Very broadly there are two alternatives. One is a quantum statistical mechanical theory along the lines of the classical theory. One can show, for example, that degenerate observables in quantum mechanics are macrostates, provided one also takes into account the ignorance of observers concerning the quantum state of the system (see Hemmo and Shenker 2015b). It seems that this is the only feasible account of macrostates in the standard formulation of quantum mechanics. If this is true, then the notion of a macrostate in quantum statistical mechanics remains essentially classical, and in particular the statistical mechanical probabilities crucially involve classical ignorance probabilities. It turns out that this is just enough to show that our classical phase space construction of a Demon can be translated into quantum mechanics. To be sure, we are assuming here that there are quantum mechanical microscopic evolutions that include measurement and erasure, which decrease the total entropy of the universe. Indeed we have shown that there are such microevolutions in quantum mechanics (see Hemmo and Shenker 2012, Appendix B.3). The other alternative is that the quantum mechanical account of thermodynamic phenomena is entirely microscopic (unlike the classical case), in which the thermodynamic magnitudes will be given directly by some quantum mechanical observables operating on quantum states. In such a theory there would be no macrostates whatsoever. 
But then in such a theory our microscopic quantum Demon just is a Maxwellian Demon.28

# 9. Conclusion

Maxwell proposed his thought experiment of the Demon “[t]o show that the 2nd Law of Thermodynamics has only statistical certainty” (Knott 1911, pp. 214–215). At the same time he also seemed to believe that the only reason that we cannot violate the statistical certainty of the Second Law is “not being clever enough” (Knott 1911, pp. 213–214). Indeed, this second statement is stronger than the first, regardless of whether or not Maxwell realized this.

We have shown here that Maxwellian Demons are consistent with the principles of statistical mechanics, which means that the probabilistic version of the Second Law of Thermodynamics cannot be a theorem in classical (or quantum) mechanics. This answers our question (I) at the beginning of this article. Regarding question (II), namely, whether there are Demons in our world, Lanford’s theorem might be seen as a possible answer should it turn out that the special conditions of the theorem are satisfied in our world. We tried to show that there are good reasons to think that, as a matter of fact, some sorts of systems (perhaps human beings) seem to be capable of exploiting the energy in their environment to produce work without increasing the entropy of the universe, and therefore are Maxwellian Demons. Both Lanford’s theorem and our argument concerning the decrease of entropy in measurement require certain harmonies between the dynamics and the partition into macrostates (although of different sorts). Both sorts of harmonies are compatible with the principles of mechanics, and whether or not they hold and under which circumstances are questions of fact.

According to the conventional wisdom, Maxwellian Demons do not exist in our world as a matter of fact. There are several ways of defending this position against our argument. The first and obvious reply would be that our construction of the Demon is wrong in that it contradicts some principle of classical or quantum mechanics. This alleged principle should rule out the sort of harmony between the dynamics and the partition into macrostates required by our construction of the Demon. However, we believe that our construction does not contradict any principle of mechanics (classical or quantum). The second reply would be to say that as a matter of fact, given the right structure of macrostates in measurement or erasure, the dynamics invariably results in an increase in the entropy of the universe. This reply seems to mean that the macrostates at the end of measurement or erasure are larger than those at the beginning of these processes. We do not see how this claim can be justified from first principles, and we do not see why it should be the case. One may try to ground this second reply in the observation that as a matter of fact there are no Demons in our world or in some other constraint that needs to be specified. However, in Section 7 we challenged this claim and argued that the arguments of this sort appearing in the literature are circular (see Earman and Norton 1998, 1999). Finally, a third reply would be to say that the underlying classical or quantum mechanical theories are false and that the true theory should be inconsistent with Demons. Here we should note that our construction of the Demon is not specific to the details of the underlying mechanical theories. Our notion of a microstate as a complete description of a system is general and its details should be fixed by one’s best theory.
Our proof applies to any theory that accounts for thermodynamics in terms of macrostates and is a consequence of focusing on the interface between the microstructure and the partition of the state space into thermodynamic macrostates.29

As far as we understand the current knowledge, the right thing to say is that “Maxwell’s Demon lives on. After more than 130 years of uncertain life and at least two pronouncements of death, this fanciful character seems more vibrant than ever” (Leff and Rex 2003, p. 2).

# Acknowledgements

We wish to thank an anonymous reviewer for helpful comments on an earlier version of this article. This research has been supported by the Israel Science Foundation, grant number 713/10, the German-Israel Foundation, grant number 1054/09, and a grant from Lockheed Martin Inc.

## References

Ainsworth, P. M. (2011) “What Chains Does Liouville’s Theorem Put on Maxwell’s Demon?” Philosophy of Science 78, 149–164.

Albert, D. (2000) Time and Chance. Cambridge, MA: Harvard University Press.

Ben Menahem, Y. (2001) “Direction and Description,” Studies in History and Philosophy of Modern Physics 32(4), 621–635.

Bennett, C. (1973) “Logical Reversibility of Computation,” IBM Journal of Research and Development 17, 525–532.

Bennett, C. (1982) “The Thermodynamics of Computation: A Review,” International Journal of Theoretical Physics 21, 905–940.

Bennett, C. (2003) “Notes on Landauer’s Principle, Reversible Computation, and Maxwell’s Demon,” Studies in History and Philosophy of Modern Physics 34(3), 501–510.

Blatt, J. M. (1959) “An Alternative Approach to the Ergodic Problem,” Progress of Theoretical Physics 22(6), 745–756.

Brillouin, L. (1962) Science and Information Theory. London: Academic Press.

Callender, C. (1999) “Reducing Thermodynamics to Statistical Mechanics: The Case of Entropy,” Journal of Philosophy XCVI, 348–373.

Dürr, D., Goldstein, S. and Zanghi, N. (1992) “Quantum Equilibrium and the Origin of Absolute Uncertainty,” Journal of Statistical Physics 67(5/6), 843–907.

Earman, J. (2006) “The Past Hypothesis: Not Even False,” Studies in History and Philosophy of Modern Physics 37, 399–430.

Earman, J. and Norton, J. (1998) “Exorcist XIV: The Wrath of Maxwell’s Demon. Part I. From Maxwell to Szilard,” Studies in History and Philosophy of Modern Physics 29(4), 435–471.

Earman, J. and Norton, J. (1999) “Exorcist XIV: The Wrath of Maxwell’s Demon. Part II. From Szilard to Landauer and Beyond,” Studies in History and Philosophy of Modern Physics 30(1), 1–40.

Eddington, A. (1935) The Nature of the Physical World. London: Everyman’s Library.

Ehrenfest, P. and Ehrenfest, T. (1912) “The Conceptual Foundations of the Statistical Approach in Mechanics,” Leipzig, 1912; New York: Dover, 1990.

Einstein, A. (1970) “Autobiographical Notes,” in P. A. Schilpp (ed.), Albert Einstein: Philosopher-Scientist, vol. 2, Cambridge: Cambridge University Press.

Fahn, P. N. (1996) “Maxwell’s Demon and the Entropy Cost of Information,” Foundations of Physics 26, 71–93.

Feynman, R. (1963) The Feynman Lectures on Physics, Redwood City, CA: Addison Wesley.

Frigg, R. (2008) “A Field Guide to Recent Work on the Foundations of Statistical Mechanics,” in D. Rickles (ed.), The Ashgate Companion to Contemporary Philosophy of Physics. London: Ashgate, 2008, pp. 99–196.

Gillies, D. (2000) Philosophical Theories of Probability, London: Routledge.

Goldstein, S. (2012) “Typicality and Notions of Probability in Physics,” in Y. Ben-Menahem and M. Hemmo (eds.), Probability in Physics. The Frontiers Collection, Berlin Heidelberg: Springer-Verlag, pp. 59–72.

Goldstein, S. and Lebowitz, J. (2004) “On the (Boltzmann) Entropy of Nonequilibrium Systems,” Physica D 193, 53–66.

Hahn, E. L. (1950) “Spin Echoes,” Physical Review 80, 580–594.

Hahn, E. L. (1953) “Free Nuclear Induction,” Physics Today 6(11), 4–9.

Hemmo, M. and Shenker, O. (2005) “Quantum Decoherence and the Approach to Equilibrium II,” Studies in History and Philosophy of Modern Physics 36, 626–648.

Hemmo, M. and Shenker, O. (2010) “Maxwell’s Demon,” The Journal of Philosophy 107, 389–411.

Hemmo, M. and Shenker, O. (2011) “Szilard’s Perpetuum Mobile,” Philosophy of Science 78, 264–283.

Hemmo, M. and Shenker, O. (2012) The Road to Maxwell’s Demon. Cambridge: Cambridge University Press.

Hemmo, M. and Shenker, O. (2015a) “The Emergence of Macroscopic Regularity,” forthcoming in Mind and Society.

Hemmo, M. and Shenker, O. (2015b) “Quantum Statistical Mechanics and Classical Ignorance,” forthcoming in Studies in History and Philosophy of Modern Physics.

Jauch, J. M. and Baron, J. G. (1972) “Entropy, Information and Szilard’s paradox,” Helvetica Physica Acta 45, 220–232.

Knott, C. G. (1911) Life and Scientific Work of Peter Guthrie Tait, Cambridge: Cambridge University Press.

Landau, L. D. and Lifshitz, E. M. (1980) Statistical Physics Part 1, Course in Theoretical Physics vol. 5. 3rd ed. Trans: J. B. Sykes and M. J. Kearsley. Oxford: Butterworth-Heinemann.

Landauer, R. (1961) “Irreversibility and Heat Generation in the Computing Process,” IBM Journal of Research and Development 5(3), 183–191.

Lebowitz, J. (1993) “Boltzmann’s Entropy and Time’s Arrow,” Physics Today, September 1993, 32–38.

Leff, H. S. and Rex, A. (2003) Maxwell’s Demon 2: Entropy, Classical and Quantum Information, Computing, Bristol, UK: Institute of Physics Publishing.

Loewer, B. (2001) “Determinism and Chance,” Studies in History and Philosophy of Modern Physics 32, 609–620.

Malament, D. and Zabell, S. (1980) “Why Gibbs Phase Averages Work: The Role of Ergodic Theory,” Philosophy of Science 47, 339–349.

Maroney (2009) “Information Processing and Thermodynamic Entropy,” The Stanford Encyclopedia of Philosophy (Winter 2008 Edition), Edward N. Zalta (ed.), <http://plato.stanford.edu/entries/information-entropy/>

Mellor, D. H. (2005) Probability: A Philosophical Introduction, London: Routledge.

Norton, J. (2013) “All Shook Up: Fluctuations, Maxwell’s Demon and the Thermodynamics of Computation,” Entropy 15, 4432–4483.

Pitowsky, I. (2012) “Typicality and the Role of the Lebesgue Measure in Statistical Mechanics,” in Y. Ben-Menahem and M. Hemmo (eds.), Probability in Physics. The Frontiers Collection, Berlin Heidelberg: Springer-Verlag, pp. 41–58.

Ridderbos, K. and Redhead, M. (1998) “The Spin Echo Experiments and the Second Law of Thermodynamics,” Foundations of Physics 28(8), 1237–1270.

Scully, M. (2001) “Extracting Work from a Single Thermal Bath via Quantum Negentropy,” Physical Review Letters 87(22), 220601.

Scully, M. O., Zubairy, M. S., Agarwal, G. S., and Walther, H. (2003) “Extracting Work from a Single Heat Bath via Vanishing Quantum Coherence,” Science 299, 862–864.

Sklar, L. (1993) Physics and Chance. Cambridge, UK: Cambridge University Press.

Smoluchowski, M. von (1912) “Experimentell nachweisbare der üblichen Thermodynamik widersprechende Molekularphänomene,” Physik. Z. 13, 1069–1080.

Smoluchowski, M. von (1914) “Gültigkeitsgrenzen des zweiten Hauptsatzes der Wärmetheorie,” in Vorträge über die kinetische Theorie der Materie und der Elektrizität (“Limits on the Validity of the Second Law of Thermodynamics,” in Lectures on the Kinetic Theory of Matter and Electricity), Leipzig, Teubner, pp. 89–121.

Szilard, L. (1929) “On the Decrease of Entropy of a Thermodynamic System by the Intervention of an Intelligent Being,” in H. S. Leff and A. Rex (eds.), Maxwell’s Demon 2: Entropy, Classical and Quantum Information, Computing. Bristol, UK: Institute of Physics Publishing, 2003, pp. 110–119.

Uffink, J. (2001) “Bluff Your Way in the Second Law of Thermodynamics,” Studies in History and Philosophy of Modern Physics 32, 305–394.

Uffink, J. (2004) “Boltzmann’s Work in Statistical Physics,” The Stanford Encyclopedia of Philosophy (Winter 2008 Edition), Edward N. Zalta (ed.), <http://plato.stanford.edu/archives/win2008/entries/statphys-Boltzmann/>

Uffink, J. (2007) “Compendium to the Foundations of Classical Statistical Physics,” in J. Butterfield and J. Earman (eds.), Handbook for the Philosophy of Physics, Part B, pp. 923–1074.

Uffink, J. and Valente, G. (2010) “Time’s Arrow and Lanford’s Theorem,” Seminaire Poincare XV Le Temps, 141–173.

Uffink, J. and Valente, G. (2015) “Lanford’s Theorem and the Emergence of Irreversibility,” Foundations of Physics 45, 404–438.

von Plato, J. (1994) Creating Modern Probability, Cambridge, UK: Cambridge University Press.

Wallace, D. (2011) “The Logic of the Past Hypothesis,” philsci-archive.pitt.edu/8894/1/pastlogic_2011.pdf

Werndl, C. (2013) “Justifying Typicality Measures of Boltzmannian Statistical Mechanics and Dynamical Systems,” Studies in History and Philosophy of Modern Physics, forthcoming.

Zurek, W. (1990) “Algorithmic Information Content, Church-Turing Thesis, Physical Entropy and Maxwell’s Demon,” in W. Zurek (ed.), Complexity, Entropy and the Physics of Information, Redwood City, CA: Addison Wesley.