Parallel Tempering ================== Consider the problem of sampling a posterior, in some :math:`d`-dimensional space, in the form .. math:: P(x) = L(x) \pi(x) where :math:`L` is the likelihood and :math:`\pi` is some prior distribution. A temperated posterior, for some :math:`T>1` is defined as .. math:: P(x, \beta) = L(x)^{\beta} \pi(x) ~~~~~~~ \beta = \frac1T For :math:`\beta > 1` this is a shallower distribution in which it is easier to explore tails. Parallel tempering is a MCMC technique that consists in sampling different temperated posteriors (including the :math:`\beta=1` target posterior) and exchange points from their chains so to improve the effectiveness of the sampling (in particular in the case of multi-modal posteriors). The idea is to define a ladder of :math:`t` temperatures :math:`(\beta_1, \dots, \beta_t)` with :math:`\beta_1=1`, and :math:`\beta_i < \beta_{i+1}`. We then sample, on a :math:`td`-dimensional space, the posterior .. math:: \tilde{P}\left(x_1, \dots, x_d\right) \equiv \Pi_{i=1}^t~P(x_i, \beta_i) Than we can perform a :ref:`multi step ` as follows * **t steps** involve the above posterior and proposals that sequentially act only on 1 temperature chain .. math:: Q^{(k)}\left(x_1, \dots, x_t \rightarrow x_1, \dots, y_t \right) = q^{(k)}\left(x_k \rightarrow y_k \right)\Pi_{j\neq k}\delta\left(x_j - y_j\right)~~~~~~k=1, \dots, t according to what we said in the :ref:`parallel MCMC section `, these steps can be performed in parallel. The corresponding acceptance is .. math:: A^{(k)} = \frac{L(y_k)^{\beta_k}~\pi(y_k)}{L(x_k)^{\beta_k}~\pi(x_k)} \frac{q^{(k)}\left(y_k \rightarrow x_k\right)}{q^{(k)}\left(x_k \rightarrow y_k\right)} Thus essentially the acceptance of a simple MCMC chain sampling a :math:`P(x, \beta)` posterior. Note that each :math:`q^{(k)}` can be a mix of multiple proposals as described in the :ref:`mixing proposals section `. * **t / 2 steps** in which we propose the swap of the point of each chain with odd index :math:`k` with the point of the next one, i.e. .. math:: Q^{(k)}\left(x_1, \dots, x_t \rightarrow x_1, \dots, y_t \right) = \delta \left(x_k - y_{k+1}\right) \delta \left(y_k - x_{k+1}\right)~ \Pi_{j\neq k, j\neq k + 1}~\delta \left(x_j - y_j\right) for each :math:`k` so that .. math:: k = 1, \dots, t~~~~~ k \mod 2 = 1 ~~~~~ k + 1 \leq t Again we can see that all the resulting steps can be :ref:`performed in parallel `. The acceptance is (note that the proposal is symmetric) .. math:: A^{(k)} = \frac{P(x_{k+1}, \beta_k)~P(x_{k}, \beta_{k+1})}{P(x_{k}, \beta_k)~P(x_{k+1}, \beta_{k+1})} = = \left(\frac{L(x_{k})}{L(x_{k+1})}\right)^{\beta_{k+1} - \beta_k} * **t / 2 steps** in which we propose the swap of the point of each chain with even index :math:`k` with the point of the next one. That is the same proposal defined above with :math:`k` so that .. math:: k = 1, \dots, t~~~~~ k \mod 2 = 0 ~~~~~ k + 1 \leq t These steps can also be :ref:`performed in parallel ` and the acceptance expression is the same as above. .. _algo3: In practice the algorithm goes as follows:: # Algorithm 3 X[n] # Point at the n-th markov iteration # This is a (N. temperatures, dim) matrix Beta[t] # temperature ladder Proposal[t] # single chain proposal for all temperatures # adding mixing is strightforward. L, Pi # likelihood and prior loop on temperatures t: \\ this can be performed in parallel Y, Fwd, Bkd = Proposal[t](X[n][t]) # Propose a point Y sampling Q[t](X[n][t] -> Y) # Fwd = Q[t](X[n][t] -> Y) # Bkd = Q[t](Y -> X[n][t]) Z = Uniform(0, 1) # Get a random Z between 0 and 1 A = (L(Y) / L(X[n][t]))^Beta[t] * Pi(y) / Pi(x) * Bkd / Fwd # Acceptance if Z > A: X[n + 1][t] = X[n][t] # Refused else: X[n + 1][t] = Y # Accepted loop on odd temperatures t so that temperature t + 1 exists: \\ this can be performed in parallel Z = Uniform(0, 1) # Get a random Z between 0 and 1 A = (L(X[n + 1][t]) / L(X[n + 1][t + 1])^(Beta[t+1] - Beta[t]) if Z > A: # Accepted Y = X[n + 1][t] X[n + 1][t] = X[n + 1][t + 1] # Swap X[n + 1][t + 1] = Y loop on even temperatures t so that temperature t + 1 exists: \\ this can be performed in parallel # Same as previous loop # Loop to get X[n + 2]