I am fond of any piece of writing that discusses thermodynamics, statistical mechanics and entropy. I have only recently discovered the work of Arieh Ben-Naim and decided to start delving into his approach to these subjects with one of his latest books, “Entropy and the Second Law, interpretation and misss-interpretationsss”, published in 2012 by World Scientific.
To summarise my experience: I had truly joyful moments reading presentations of the material that I had never seen anywhere else. Unfortunately, these were mixed with more confusing (for me at least) passages where I would either not understand the “pedagogy” or simply disagree with the point of view emphasised by the author.
There are 5 chapters in the book and I shall comment on them separately as their aims and content are somewhat different.
Chapter 1 introduces the concept of entropy, first macroscopically via Clausius’ formulation of the 2nd law of thermodynamics and then microscopically via the so-called Boltzmann formula S = kB ln W.
It then delves into the most persistent interpretations that have been proposed to understand the concept of entropy, discusses how they emerged as qualitative representations of entropy and why most of them fail either by being sometimes inconsistent descriptors of the states or simply by being too vague.
This chapter is overall well written and understandable by many.
Chapter 2 is the most original of the book, as it attempts to introduce the concept of the Shannon Measure of Information (SMI), the central concept of the book, via simple games. In particular, when a ball is hidden with uniform probability in one of W boxes, the SMI log2 W is directly an estimate of the minimum number of binary (YES-NO) questions that must be asked to know for certain where the ball is hiding, starting with no information on where it is. This part is extremely clear and enlightening in many respects.
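To make the counting concrete, here is a minimal Python sketch of the game (it is not taken from the book, and the function names are purely illustrative). It assumes the player always asks a question that splits the remaining boxes into two halves that are as equal as possible, and it counts how many YES-NO questions are needed in the worst case.

```python
import math


def questions_to_find(n_boxes, hidden_index):
    """Count the YES-NO questions a halving strategy needs to locate the ball."""
    low, high = 0, n_boxes  # the ball is known to be in boxes low .. high-1
    questions = 0
    while high - low > 1:
        mid = (low + high) // 2
        questions += 1  # ask: "is the ball in a box numbered below mid?"
        if hidden_index < mid:
            high = mid
        else:
            low = mid
    return questions


def worst_case_questions(n_boxes):
    """Largest number of questions needed over all possible hiding places."""
    return max(questions_to_find(n_boxes, i) for i in range(n_boxes))


for w in (8, 1024, 5):
    print(w, worst_case_questions(w), math.log2(w))
```

When W is a power of two, every hiding place takes exactly log2 W questions; otherwise the worst case is log2 W rounded up, which is why the SMI is best read as an estimate of the number of questions.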
I felt that this clarity faded away, however, as the chapter progressed towards the more general definition of the SMI. This is done by looking at the simplest case, W = 2, where the ball is now hidden in one of two boxes (let’s say red and blue) with some probability p, known to the player, of being in the blue box. The author then looks for an estimate of the number of questions that must be asked to win a game in which certain knowledge is confirmed by an additional binary question. A functional form is found as a function of p and it is nice…the only problem, at least from a pedagogical standpoint, is that it has nothing to do with the SMI introduced just afterwards in its usual form SMI(p) = -p log2 p – (1-p) log2 (1-p). When I read this passage, the question I had in mind was “would it be possible to continue the first little game analogy further?”.
I think it is in fact possible, and the way I would do it is by interpreting the probability p within a bigger system comprising M boxes. There are Mp blue boxes and M(1-p) red boxes (in principle M must be large enough to ensure an integer number of red and blue boxes). We are now almost back to the first game, provided we drop a ball uniformly at random in one of the M available boxes. The first question is then crucial: “is the ball in a blue box?”. There are two possibilities:
NO: in which case it is in one of the red boxes and there remain log2 [M(1-p)] binary questions to be asked to know for certain where the ball is.
YES: in which case it is in one of the blue boxes and there remain log2 (Mp) binary questions to be asked to know for certain where the ball is.
Now, we do not want an estimate that depends on the outcome of the first question, so we try instead to estimate a satisfactory value given what we know about the problem. If we were to play the same game infinitely many times, we would see that the ball lands in a blue box with relative frequency p and in a red one with relative frequency (1-p). It follows that the average number of questions we would have to ask (over infinitely many games) would be <Nquestion> = (1-p) log2 [M(1-p)] + p log2 (Mp). This average can be calculated because we know that, in a smart enough strategy, there is a distinction to be made between blue and red boxes and their corresponding probabilities. Without that knowledge, however, certain knowledge is only ever achieved at best within log2 M binary questions. Missing knowledge (or SMI) is therefore definable as the difference between the minimum number of questions needed to acquire certain knowledge of a situation and the expected minimum number of questions based on our knowledge of the probabilities, i.e. SMI = log2 M – <Nquestion> = -(1-p) log2 (1-p) – p log2 p. Note that the result is independent of M, since log2 M = p log2 M + (1-p) log2 M cancels term by term against the M-dependence of <Nquestion>; M has essentially nothing to do with the knowledge provided by being given the probabilities.
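The independence from M is easy to check numerically. The short Python sketch below (my own illustration, with hypothetical function names) evaluates log2 M – <Nquestion> for several values of M and compares it with the usual two-outcome SMI formula. It assumes 0 < p < 1 and does not enforce that Mp is an integer.

```python
import math


def average_remaining_questions(p, n_boxes):
    """<Nquestion>: expected number of questions left after asking 'blue or red?'."""
    n_blue = p * n_boxes          # number of blue boxes, Mp
    n_red = (1 - p) * n_boxes     # number of red boxes, M(1-p)
    return p * math.log2(n_blue) + (1 - p) * math.log2(n_red)


def smi_from_game(p, n_boxes):
    """Missing knowledge defined as log2 M minus the expected remaining questions."""
    return math.log2(n_boxes) - average_remaining_questions(p, n_boxes)


def smi_formula(p):
    """Usual two-outcome Shannon measure of information, in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


p = 0.25
for m in (4, 8, 1024):
    print(m, smi_from_game(p, m), smi_formula(p))
```

For p = 0.25 every value of M gives the same number, about 0.81 bits, in agreement with the closed formula.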
The rest of the chapter focuses on the mathematics and interpretation of mutual and conditional information and is again clearly written.
Chapter 3 was essentially the one I was waiting to reach but, unfortunately, it did not meet my expectations. The promise of bridging in some novel way the SMI as introduced in chapter 2 and the entropy of statistical mechanics sounded really great, and I may have expected too much from it.
On the very positive side, I found the idea of splitting the SMI of an ideal gas into each of its possible components (locational, momenta, mutual etc…) very good and indeed enlightening. However, I found the logic in each individual part difficult to follow. The passages on the locational and momentum parts of the SMI start off with a maximisation procedure of the SMI under certain constraints which, at this stage, come literally out of nowhere. They do not follow in any way from what has been said in the preceding chapters.
The maximisation step is only explained afterwards, towards the end of the chapter, when the identification with entropy is made (but the constraints still remain a mystery since at that stage there is no clear definition of what a Gibbs ensemble is…in fact there won’t be any in the whole book). Even in the last part of the chapter, the maximisation of the SMI required by the author for identification with entropy (together with some prefactor) is only definitional; it is not clear where it comes from or why it is needed; we are just told to accept it. While the author claims to differ from the point of view of Jaynes in that the latter seeks the least biased distribution, at least the logic followed by Jaynes is clear from beginning to end.
For these reasons I fail to see where the “very remarkable achievement” of the chapter, claimed by the author on page 121, lies. There may be a “huge conceptual leap” to make, as proclaimed in the same section, but it is neither new nor huge for people familiar with the work of Jaynes and his followers.
I really enjoyed the attempt at interpreting the contributions from indistinguishability and from the quantum uncertainty relations in terms of mutual information, between particles for the first and between momenta and locations for the second.
I do believe, however, that the treatment of the quantum mutual information part is far too heuristic, if not almost wrong. For one thing, the sought Sackur-Tetrode formula is solely concerned with the semi-classical limit in which the summation over states can be replaced by an integral or, equivalently, position and momentum operators can be said to approximately commute (not even mentioning that quantum statistics should also not apply). One must actually look in the footnotes to find it mentioned that this formula, even for an ideal gas, is only valid in some limits (not all of them being mentioned, by the way). Secondly, the mutual information term, which in principle involves a complicated integral of the joint probability distribution of x and p, is simply assumed to be log2 h. Although this makes intuitive sense, it is not supported by any mathematical argument. Here it just feels like it is put there to agree with the final formula that it pretends to derive.
Overall, I think that this chapter is uneven in quality with great ideas but not well conveyed or derived. I think that the programme it proposes is doable but in its current form it is just a raw version of what it could be.
I felt partly let down after reading chapter 3 but chapter 4 gave me good reasons to stick with the book. There, the splitting of the entropy into various contributions used in chapter 3 bears fruit. The focus of the chapter is mainly on mixing and it shows quite remarkably the various contributions to the so-called “entropy of mixing”. In particular, in the usual thought experiment for mixing, the sole contribution to an increase in entropy by an amount of kB ln 2 per particle is the entropy of expansion of each of the gases. The possible wrong conclusions that one could draw about mixing are then debunked one at a time with very nice examples, some of which, I am ashamed to say, were new to me. Looking at all the examples presented in this chapter, I think that what most people have in mind as “entropy of mixing” would in fact be the example of deassimilation of a “gas” of alanine molecules.
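As a quick sanity check of the kB ln 2 figure, here is a minimal Python sketch (mine, not the book's) that treats the usual mixing experiment as two independent isothermal ideal-gas expansions, each from V to 2V. The function names and the particle numbers are illustrative assumptions, and the two gases are taken to be different species.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def expansion_entropy(n_particles, v_initial, v_final):
    """Entropy change of an ideal gas expanding isothermally from v_initial to v_final."""
    return n_particles * K_B * math.log(v_final / v_initial)


def mixing_entropy_two_gases(n_a, n_b, v_a, v_b):
    """Entropy change when two different ideal gases each expand into the total volume."""
    v_total = v_a + v_b
    return (expansion_entropy(n_a, v_a, v_total)
            + expansion_entropy(n_b, v_b, v_total))


n = 6.022e23  # illustrative particle number for each gas
delta_s = mixing_entropy_two_gases(n, n, 1.0, 1.0)
per_particle_in_units_of_kb = delta_s / (2 * n) / K_B
print(per_particle_in_units_of_kb, math.log(2))
```

Nothing in the calculation refers to the presence of the other gas: the whole kB ln 2 per particle comes from each species doubling its accessible volume.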
Overall, I really enjoyed this chapter. I am personally still suspicious of some claims but they do not affect the quality of the chapter. If anything, I just regret that the author seems to like “picking on the dead”. Some of the criticisms made of Gibbs, for example, are unwarranted and do not seem to take into account the actual complexity of Gibbs’ thoughts, or what later commentators like E. T. Jaynes or Daan Frenkel have written on the topic. Since I share the view of these later names, I would dare to claim that this book perpetuates the myth that the necessity of the -ln N! term in the entropy has anything to do with quantum mechanics!
Chapter 5 is devoted to the discussion of the “why” of the second law of thermodynamics. While the previous chapters focused on “what” entropy is, this one focuses on “why” the entropy of an isolated system increases.
I may have misunderstood the chapter in its entirety but it is the one with which I have the most problems.
- The first problem I have is logical. What the whole chapter shows is that, if one takes a system at equilibrium, then the equilibrium macrostates that we are bound to observe are those with the highest configurational entropies. This is illustrated with the usual example of particle number fluctuations in a box whereby, as the system size (N,V) increases, the number of particles in each half of the box becomes sharply peaked around N/2 while, in principle, nothing prevents them from all being in a corner of the box. The issue with this argument is that it only explains how the statistical equilibrium of the Gibbs ensembles, which seems to allow every possible macrostate in principle, still agrees with the blatant macroscopic evidence that at equilibrium we only observe a single “macrostate” (like quasi-uniform density for example). This point has logically nothing to do with the 2nd law of thermodynamics and therefore cannot shed any light on why the entropy of an isolated system increases.
- In fact, the theory developed in this book is simply incapable of explaining the 2nd law of thermodynamics by construction. The reason is that the author makes sure that he only talks about entropy in equilibrium situations and emphasises that, for him, the SMI has no bearing on thermodynamics outside of fully equilibrated situations for a given set of constraints. There is thus no way for it to explain (a) why a transformation between two equilibrium states with different constraints is spontaneous if and only if the entropy (of the “universe”) increases or (b) how it does so as a function of time. The fact that the author dismisses the “how” question seems just as absurd as the usual dismissal of the “why” question by most physicists; it is purely rhetorical.
- There is thus nothing “obvious” anymore in the asymmetry between microscopic mechanics and macroscopic laws. The mystery remains untouched. When watching a movie of a breaking egg in reverse, we see that something is odd. However, were we to zoom in on what each particle is doing, we would not see any problem at all! How this happens is where the mystery lies. As the author rightly claims, this most likely has to do with probability theory. But as far as I am concerned, probability theory is not part of mechanics. Moreover, in spite of the author’s “acceptance”, which I share, that we have to use probabilities, there remain certain schools of thought that wish to derive probabilistic results from deterministic and reversible microscopic mechanics.
- “Using probabilities” is not enough. One needs to propose how these probabilities evolve with time from a given initial distribution to a final one. This is for example what the Liouville and Fokker-Planck equations do for Hamiltonian mechanics and stochastic processes respectively. Only by studying the properties of these equations can we hope to understand how a probability distribution reaches its equilibrium form (the one that maximises the SMI in chapter 3) from an initial, out-of-equilibrium one. In fact, we know that for finite-state Markov chains under certain conditions (such as detailed balance, to mention the most drastic one), the probability distribution converges towards a unique stationary distribution, which is usually the equilibrium distribution. Monte Carlo calculations that are performed nowadays on a daily basis across the world rely on this property. This is the reason why Oliver Penrose had to postulate, in his book on the foundations of statistical mechanics, that the probability distributions evolve as Markov processes. It is much harder to show that this works for “reasonable” dynamical equations for continuous distributions. As far as I know, the mathematician Cédric Villani was awarded the Fields Medal in 2010 in part for his work on the convergence towards equilibrium of solutions to the Boltzmann equation in a non-perturbative way.
- Finally, showing convergence of the probabilities towards a unique distribution, although great, is not enough to gain insight into the 2nd law of thermodynamics, for what is needed is to discuss how the entropy changes with time between the initial and the final state. This necessarily requires a definition of entropy that goes beyond the somewhat restricted definition used by the author. Although this restriction adds clarity to the discourse in chapters 1 to 4, it simply prevents any discussion of the subject of chapter 5, as I have said above. That is the reason why Jaynes and later on Roger Balian have used the SMI and identified it with entropy, at least as long as a probability density is definable for the macro-variables under study. I personally fail to see how extending the definition of entropy to systems out of equilibrium (but still with clear global constraints like (E, N, V)) is a problem. The author has no problem using the Boltzmann statistical entropy and showing that the variation of entropy is non-zero even when mixing two identical substances. This is in blatant contradiction with thermodynamics, yet the point is still valid within the extension of the entropy concept provided by the Boltzmann formula. The same could be done with the SMI for non-equilibrium distributions, of which equilibrium cases would be a special instance.
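The kind of statement I have in mind in the last two points can be illustrated with a toy model. The Python sketch below is an assumption-laden example of my own, not something found in the book: a “lazy” random walk on a ring of 4 states, whose symmetric transition matrix satisfies detailed balance with respect to the uniform distribution. Starting from a fully known state, it follows both the probability distribution and its SMI step by step.

```python
import math


def smi(p):
    """Shannon measure of information of a discrete distribution, in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def lazy_ring_walk(n_states):
    """Symmetric transition matrix: stay with prob. 1/2, hop to either neighbour with 1/4."""
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        matrix[i][i] = 0.5
        matrix[i][(i + 1) % n_states] += 0.25
        matrix[i][(i - 1) % n_states] += 0.25
    return matrix


def step(p, matrix):
    """One step of the master equation: p_j(t+1) = sum_i p_i(t) * P_ij."""
    n = len(p)
    return [sum(p[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def relax(p_initial, matrix, n_steps):
    """Evolve the distribution and record its SMI at every step."""
    p = list(p_initial)
    history = [smi(p)]
    for _ in range(n_steps):
        p = step(p, matrix)
        history.append(smi(p))
    return p, history


# Start from certain knowledge: the system is in state 0.
p_final, smi_history = relax([1.0, 0.0, 0.0, 0.0], lazy_ring_walk(4), 50)
print(p_final)
print(smi_history[:5], smi_history[-1])
```

In this toy model the distribution relaxes to the uniform one and the SMI never decreases along the way, ending at log2 4 = 2 bits. The monotonic growth relies on the transition matrix being doubly stochastic, which is a special property of this example and not of Markov chains in general.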
As far as I have understood the book, I think it is worth reading for the first 4 chapters. Although perfectible, each of these chapters has at least one pearl lying somewhere that will bring insight into equilibrium statistical mechanics even to the trained reader. I am much less happy with chapter 5 which, I believe, misses the point entirely, as explained above. So, if you want to understand what entropy is from the point of view of the SMI, this can be a valuable read; but if you seek an in-depth understanding of the second law, this book won’t be much help.