I finally learned the solution to a little puzzle that’s been bothering me for a while.
The setup of the puzzle is as follows. Let $G$ be a weighted undirected graph, i.e. to each edge $i \leftrightarrow j$ is associated a non-negative real number $a_{ij}$, and let $A = (a_{ij})$ be the corresponding weighted adjacency matrix. If $A$ is stochastic, one can interpret the weights $a_{ij}$ as transition probabilities between the vertices which describe a Markov chain. (The undirected condition then means that the transition probability between two states doesn’t depend on the order in which the transition occurs.) So one can talk about random walks on such a graph, and between any two vertices the most likely walk is the one which maximizes the product of the weights of the corresponding edges.
Suppose you don’t want to maximize a product associated to the edges, but to minimize a sum. For example, if the vertices of $G$ are locations to which you want to travel, then maybe you want the most likely random walk to also be the shortest one. If $E_{ij}$ is the distance between vertex $i$ and vertex $j$, then a natural way to do this is to set
$$a_{ij} = e^{-\beta E_{ij}}$$
where $\beta$ is some positive constant. Then the weight of a path is a monotonically decreasing function of its total length, and (fudging the stochastic constraint a bit) the most likely path between two vertices, at least if $\beta$ is sufficiently large, is going to be the shortest one. In fact, the larger $\beta$ is, the more likely you are to always be on the shortest path, since the contribution from any longer paths becomes vanishingly small. As $\beta \to \infty$, the ring in which the entries of the adjacency matrix live stops being $\mathbb{R}$ and becomes (a version of) the tropical semiring.
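The limiting behavior can be checked numerically. The following is a minimal sketch, assuming a small made-up distance matrix `E` (not from any real data) and hypothetical helper names `soft_length` and `tropical_length`. It compares $-\frac{1}{\beta} \log (A^k)_{ij}$, computed with ordinary matrix powers, against the shortest total length over walks with exactly $k$ edges, computed with min-plus (tropical) matrix powers.

```python
import numpy as np

# Hypothetical 4-vertex graph: E[i, j] is the distance between i and j,
# and np.inf marks a missing edge. The numbers are made up for illustration.
INF = np.inf
E = np.array([
    [INF, 1.0, 4.0, INF],
    [1.0, INF, 1.0, 5.0],
    [4.0, 1.0, INF, 1.0],
    [INF, 5.0, 1.0, INF],
])

def soft_length(E, beta, steps, i, j):
    """Return -(1/beta) * log((A^steps)[i, j]) where A = exp(-beta * E)."""
    A = np.exp(-beta * E)  # exp(-inf) == 0, so missing edges get weight 0
    P = np.linalg.matrix_power(A, steps)
    return -np.log(P[i, j]) / beta

def tropical_length(E, steps, i, j):
    """Shortest total length over walks with exactly `steps` edges."""
    D = E.copy()
    for _ in range(steps - 1):
        # Min-plus matrix product: D[i, j] = min_k (D[i, k] + E[k, j])
        D = np.min(D[:, :, None] + E[None, :, :], axis=1)
    return D[i, j]

# As beta grows, the "soft" length approaches the tropical (shortest) length.
for beta in (0.1, 1.0, 5.0, 20.0):
    print(beta, soft_length(E, beta, 3, 0, 3), tropical_length(E, 3, 0, 3))
```

For small $\beta$ the longer walks still contribute noticeably, so the first quantity sits well below the shortest length; for large $\beta$ the two agree to many digits.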
That’s pretty cool, but it’s not what’s been puzzling me. What’s been puzzling me is that matrix entries in powers of $A$ look an awful lot like partition functions in statistical mechanics, with $\beta$ playing the role of the inverse temperature and the $E_{ij}$ playing the role of energies. So, for a while now, I’ve been wondering whether they actually are partition functions of systems I can construct starting from the matrix $A$. It turns out that the answer is yes: the corresponding systems are called one-dimensional vertex models, and in the literature the connection to matrix entries is called the transfer matrix method. I learned this from an expository article by Vaughan Jones, “In and around the origin of quantum groups,” and today I’d like to briefly explain how it works.
Partition functions
Suppose the states of a system form a (preferably finite) set $S$. Each state $s \in S$ has some energy $E_s$, and in the typical setup in statistical mechanics, the system is constantly changing state in such a way that the probability of a given state occurring depends only on the energy. Furthermore, essentially the only other thing we know about the system is its average energy
$$\langle E \rangle = \sum_{s \in S} p_s E_s$$
where $p_s$ is the probability of state $s$ occurring. (This average is determined by the behavior of a large external system, which, as it turns out, can be described entirely by its temperature.) Without knowing anything else about the system, what is a sensible choice for the probabilities $p_s$? The key is to choose a distribution which maximizes entropy. This corresponds to a distribution which does not reflect any “extra” knowledge about the system.
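Theorem: There exists a constant $\beta$ which depends on the average energy such that the distribution $p_s = \frac{e^{-\beta E_s}}{Z(\beta)}$ is entropy maximal among all distributions with that average energy, where $Z(\beta) = \sum_{s \in S} e^{-\beta E_s}$ is the partition function.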
This is beautifully explained in this expository note by Keith Conrad. Terence Tao also describes the more typical physically-motivated route to this distribution, but I can’t resist any definition that has an accompanying uniqueness statement.
(I also can’t resist mentioning that many important functions in pure mathematics can be thought of as partition functions. For example, the system whose states are the positive integers $n$ and whose energies are $E_n = \log n$ has partition function the Riemann zeta function $\zeta(\beta)$! This system is known as the primon gas or free Riemann gas. It is also possible to write down some knot invariants as partition functions.)
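To see the zeta function appear concretely, here is a small sketch with a hypothetical helper `primon_partition_function` that truncates the sum $\sum_n e^{-\beta \log n}$ at `n_max` states. The truncation is an assumption made for illustration, since the actual system has infinitely many states.

```python
import math

def primon_partition_function(beta, n_max):
    """Truncated partition function: sum of exp(-beta * log n) for n = 1..n_max."""
    return sum(math.exp(-beta * math.log(n)) for n in range(1, n_max + 1))

# At beta = 2 the full sum is zeta(2) = pi^2 / 6.
approx = primon_partition_function(2.0, 100_000)
print(approx, math.pi ** 2 / 6)
```

Since $e^{-\beta \log n} = n^{-\beta}$, each term is exactly a term of the Dirichlet series for $\zeta(\beta)$, and the truncated sum converges for $\beta > 1$.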
By comparison to an ideal gas, we find that $\beta = \frac{1}{k_B T}$ where $k_B$ is the Boltzmann constant and $T$ is the temperature of the large external system, with which the system must be in thermal equilibrium for these equations to be valid. This is physically sensible: as $T \to \infty$, $\beta \to 0$ and every state is equally likely. Similarly, as $T \to 0$, $\beta \to \infty$ and the state(s) of minimal energy become asymptotically more and more likely.
Note that the average energy of the system is $-\frac{\partial}{\partial \beta} \log Z(\beta)$, which really is determined by $\beta$. In fact, one can compute the moments of the distribution of energies from the partition function, so the partition function encodes essentially all of the important information about the system.
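As a sanity check on these formulas, the sketch below builds the Boltzmann distribution for a made-up list of energies and compares the directly computed average energy $\sum_s p_s E_s$ with a finite-difference estimate of $-\frac{\partial}{\partial \beta} \log Z(\beta)$. The energies, the step size `h`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def partition_function(energies, beta):
    """Z(beta) = sum over states of exp(-beta * E_s)."""
    return np.sum(np.exp(-beta * energies))

def boltzmann(energies, beta):
    """Probabilities p_s = exp(-beta * E_s) / Z(beta)."""
    weights = np.exp(-beta * energies)
    return weights / weights.sum()

def average_energy(energies, beta):
    """Average energy computed directly as sum of p_s * E_s."""
    return np.dot(boltzmann(energies, beta), energies)

def average_energy_from_Z(energies, beta, h=1e-6):
    """Central finite-difference estimate of -d/d(beta) log Z(beta)."""
    log_z_plus = np.log(partition_function(energies, beta + h))
    log_z_minus = np.log(partition_function(energies, beta - h))
    return -(log_z_plus - log_z_minus) / (2 * h)

energies = np.array([0.0, 1.0, 2.0, 5.0])  # made-up energy levels

print(average_energy(energies, 0.7), average_energy_from_Z(energies, 0.7))
print(boltzmann(energies, 0.0))   # beta -> 0: every state equally likely
print(boltzmann(energies, 50.0))  # beta large: the minimal-energy state dominates
```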
One-dimensional vertex models
We want to consider (finite approximations to) the following system. A state of the system consists of an assignment of a “spin” to each edge in the graph with vertex set $\mathbb{Z}$ and edges between two consecutive integers. A spin is a vertex of a finite weighted undirected graph $G$, and the interaction between two spins $i, j$ is measured by an energy $E_{ij}$ which determines the weight $e^{-\beta E_{ij}}$ of the edge from $i$ to $j$ at a particular temperature. The motivating example here is the Ising model, which is a simplified model of the spins of electrons in iron, and which models ferromagnetism. (In one dimension, it doesn’t matter whether we assign spins to the edges or to the vertices, but it matters in higher dimensions. This type of model is called a vertex model because the interactions occur at the vertices.) If the spins assigned to each edge are $\dots, s_{-1}, s_0, s_1, \dots$ then the energy of a particular state is the sum of the interaction energies between consecutive spins, or
$$\sum_{k \in \mathbb{Z}} E_{s_k s_{k+1}}$$
so the partition function is
$$\sum_{s} \exp \left( -\beta \sum_{k \in \mathbb{Z}} E_{s_k s_{k+1}} \right)$$
where the outer sum is over all possible assignments of spins to edges. Since we don’t want to deal with infinite sums directly, we’ll instead consider a finite approximation to the above system in which there are $n$ edges to assign spins to and take the limit as $n \to \infty$.
The key observation here is that states in the $n$-edge model are the same as walks of length $n-1$ on the graph $G$, with the probability of each state occurring proportional to the product of the weights of the edges (of the walk, not of the model). Thus, for example, if we specify boundary conditions saying that the first spin should be $i$ and the last spin should be $j$, then the partition function of the corresponding system is $(A^{n-1})_{ij}$, where $A$ is the matrix with entries $a_{ij} = e^{-\beta E_{ij}}$. And if we specify periodic boundary conditions saying that the first spin and the last spin should agree, then the partition function of the corresponding system is $\operatorname{tr}(A^{n-1})$. So these are the systems we want!
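The identification of states with walks can be verified by brute force on a small example. The sketch below assumes a made-up symmetric interaction matrix `E` on three spin values and hypothetical function names. It enumerates every assignment of $n$ spins, sums the Boltzmann weights, and compares the result with the matching entry (or trace) of the $(n-1)$-st power of the matrix with entries $e^{-\beta E_{ij}}$.

```python
import itertools
import numpy as np

# Made-up symmetric interaction energies between three spin values.
E = np.array([
    [0.0, 1.0, 2.0],
    [1.0, 0.5, 1.5],
    [2.0, 1.5, 0.0],
])

def transfer_matrix(E, beta):
    """Matrix with entries exp(-beta * E[i, j])."""
    return np.exp(-beta * E)

def brute_force_Z(E, beta, n, first=None, last=None):
    """Sum exp(-beta * energy) over all assignments of n spins,
    optionally fixing the first and/or last spin."""
    q = E.shape[0]
    total = 0.0
    for spins in itertools.product(range(q), repeat=n):
        if first is not None and spins[0] != first:
            continue
        if last is not None and spins[-1] != last:
            continue
        # Energy is the sum of interactions between consecutive spins.
        energy = sum(E[spins[k], spins[k + 1]] for k in range(n - 1))
        total += np.exp(-beta * energy)
    return total

beta, n = 0.7, 5
P = np.linalg.matrix_power(transfer_matrix(E, beta), n - 1)

# Fixed boundary conditions: first spin 0, last spin 2.
print(brute_force_Z(E, beta, n, first=0, last=2), P[0, 2])

# Periodic boundary conditions: first and last spin agree.
periodic = sum(brute_force_Z(E, beta, n, first=i, last=i) for i in range(3))
print(periodic, np.trace(P))
```

The brute-force sum has $3^n$ terms, while the matrix power costs only a handful of small matrix multiplications, which is the practical appeal of the transfer matrix method.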
Now that we know something about the finite approximations to a one-dimensional vertex model, what can we say about the actual model? That is, what does the limit as $n \to \infty$ look like? One way to think about this limit is to measure the average energy per edge (say with periodic boundary conditions), so we want to compute
$$\lim_{n \to \infty} -\frac{1}{n} \frac{\partial}{\partial \beta} \log \operatorname{tr}(A^{n-1}).$$
Since $A$ is a matrix with positive real entries, the Perron-Frobenius theorem guarantees that for any particular value of $\beta$ there is a unique positive real eigenvalue of largest absolute value, so in the limit $\operatorname{tr}(A^{n-1}) \sim \lambda^{n-1}$ where $\lambda$ is this eigenvalue. It is not hard to see that $\lambda$ must also change smoothly with $\beta$, so the average energy per edge approaches
$$-\frac{\partial}{\partial \beta} \log \lambda.$$
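This limit is easy to probe numerically. The following sketch assumes a made-up symmetric interaction matrix `E` (so that `numpy.linalg.eigvalsh` applies) and uses central finite differences with an arbitrarily chosen step `h` to estimate both $-\frac{1}{n} \frac{\partial}{\partial \beta} \log \operatorname{tr}(A^{n-1})$ for finite $n$ and the limiting value $-\frac{\partial}{\partial \beta} \log \lambda$. All names are hypothetical.

```python
import numpy as np

# Made-up symmetric interaction energies between three spin values.
E = np.array([
    [0.0, 1.0, 2.0],
    [1.0, 0.5, 1.5],
    [2.0, 1.5, 0.0],
])

def log_perron_eigenvalue(beta):
    """log of the largest eigenvalue of A = exp(-beta * E)."""
    return np.log(np.linalg.eigvalsh(np.exp(-beta * E)).max())

def log_trace_power(beta, n):
    """log tr(A^(n-1)), computed from eigenvalues and rescaled by the
    largest one so that large powers do not overflow."""
    eigs = np.linalg.eigvalsh(np.exp(-beta * E))
    lam = eigs.max()
    return (n - 1) * np.log(lam) + np.log(np.sum((eigs / lam) ** (n - 1)))

def energy_per_edge(beta, n, h=1e-5):
    """Finite-difference estimate of -(1/n) d/d(beta) log tr(A^(n-1))."""
    diff = log_trace_power(beta + h, n) - log_trace_power(beta - h, n)
    return -diff / (2 * h) / n

def limiting_energy_per_edge(beta, h=1e-5):
    """Finite-difference estimate of -d/d(beta) log(lambda)."""
    diff = log_perron_eigenvalue(beta + h) - log_perron_eigenvalue(beta - h)
    return -diff / (2 * h)

for n in (5, 50, 500, 2000):
    print(n, energy_per_edge(1.0, n), limiting_energy_per_edge(1.0))
```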
The fact that this average energy (indeed, the higher moments of the energy distribution as well) changes smoothly with $\beta$ implies that phase transitions do not occur in the one-dimensional vertex model. One must pass to the two-dimensional case for phase transitions to occur, and I would like to say something about this computation as soon as I can find a good source for it.
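Example. In the one-dimensional Ising model, $G$ has two vertices, spin up $\uparrow$ and spin down $\downarrow$. The energies are given by $E_{\uparrow \uparrow} = E_{\downarrow \downarrow} = -1$ and $E_{\uparrow \downarrow} = E_{\downarrow \uparrow} = 1$; in other words, it is energetically favorable for spins to align and unfavorable for spins not to align. (This agrees with the experimental behavior of magnets: at low temperatures, their spins align and so they are magnetic, while at sufficiently high temperatures they demagnetize.) This gives
$$A = \begin{pmatrix} e^{\beta} & e^{-\beta} \\ e^{-\beta} & e^{\beta} \end{pmatrix}.$$
The Perron-Frobenius eigenvalue of $A$ is the larger of the two roots of its characteristic polynomial $\lambda^2 - 2 e^{\beta} \lambda + e^{2\beta} - e^{-2\beta}$, which is
$$\lambda = e^{\beta} + e^{-\beta} = 2 \cosh \beta$$
and hence the average energy per edge is given by
$$-\frac{\partial}{\partial \beta} \log \lambda = -\tanh \beta.$$
At low temperatures, where $\beta$ is large, the average energy per edge approaches $-1$, corresponding to almost complete alignment. At high temperatures, where $\beta$ is small, the average energy approaches $0$, corresponding to random alignment.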
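The closed forms for this example can be checked numerically. The sketch below assumes the normalization in which aligned neighboring spins have energy $-1$ and opposite spins have energy $+1$, so that the transfer matrix has $e^{\beta}$ on the diagonal and $e^{-\beta}$ off it. The function names and the finite-difference step `h` are hypothetical choices.

```python
import numpy as np

def ising_transfer_matrix(beta):
    """Transfer matrix assuming energy -1 for aligned and +1 for opposite spins."""
    return np.array([
        [np.exp(beta), np.exp(-beta)],
        [np.exp(-beta), np.exp(beta)],
    ])

def perron_eigenvalue(beta):
    """Largest eigenvalue of the (symmetric) transfer matrix."""
    return np.linalg.eigvalsh(ising_transfer_matrix(beta)).max()

def energy_per_edge(beta, h=1e-6):
    """Central finite-difference estimate of -d/d(beta) log(lambda)."""
    diff = np.log(perron_eigenvalue(beta + h)) - np.log(perron_eigenvalue(beta - h))
    return -diff / (2 * h)

for beta in (0.001, 0.5, 1.0, 10.0):
    print(beta, perron_eigenvalue(beta), 2 * np.cosh(beta),
          energy_per_edge(beta), -np.tanh(beta))
```

Under this assumption the numerical eigenvalue should track $2 \cosh \beta$ and the numerical energy should track $-\tanh \beta$, moving from near $0$ at small $\beta$ toward $-1$ at large $\beta$.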