Ensemble Sampling

Ensemble Sampling#

Another possible strategy when sampling a posterior using Metropolis Hastings MCMC is to sample multiple chains in parallel allowing some miwed proposals between the chains. This strategy is called Ensemble Sampling.

Consider having a target posterior \(P(x)\) on a \(d\)-dimensional space. We can sample, on a \(Wd\)-dimensional space: (\(W\) is the number of parallel chains, usually called walkers), the following posterior

\[\tilde{P}\left(x_1, \dots, x_W\right) = \Pi_{i=1}^W~P(x_i)\]

We can clearly consider having a multi-step MCMC with each step defined by a proposal (or a mix of proposals) that only involves 1 walker. Thus a porposal in the form

\[Q\left(x_1, \dots, x_W \rightarrow y_1, \dots, y_W\right) = q\left(x_j \rightarrow y_j\right)~\Pi_{k \neq j}\delta\left(x_k - y_k\right)\]

These steps can of course be performed in parallel and each has a simple single walker accaptance

\[A = \frac{P(y_j)}{P(x_j)}\frac{q\left(y_j \rightarrow x_j\right)}{q\left(x_j \rightarrow y_j\right)}\]

This is nothing but having multiple parallel sampling of the same posterior. What is interesting, with ensemble sampling, is the possibility to have cross walkers proposals. So we can have a cycle with 1 step per walker in which we propose a new point for that walker based on the ensemble status of the walkers. Thus a proposal in the form

\[Q\left(x_1, \dots, x_W \rightarrow y_1, \dots, y_W\right) = q\left(x_1,\dots, x_W \rightarrow y_j\right)~\Pi_{k \neq j}\delta\left(x_j - y_j\right)\]

The acceptance form is still the one above but such steps cannot be performed in parallel. One solution to parallelize is to divide the walkers in two group, red and blue, say the first and second half. We first cycle on the red walkers with proposals in the form (here \(j \leq W/2\))

\[Q\left(x_1, \dots, x_W \rightarrow y_1, \dots, y_W\right) = q\left(x_j,x_{\frac{W}2 + 1}\dots, x_W \rightarrow y_j\right)~\Pi_{k \neq j}\delta\left(x_j - y_j\right)\]

and these steps can be performed in parallel. Then we cycle on the blue worker (\(j > W/2\)) and use proposals like

\[Q\left(x_1, \dots, x_W \rightarrow y_1, \dots, y_W\right) = q\left(x_{1}\dots, x_{\frac{W}2}, x_j \rightarrow y_j\right)~\Pi_{k \neq j}\delta\left(x_j - y_j\right)\]

and perform the steps in parallel.

Note that the above strategy can be achieved by explicitly run a “3 steps” procedure (single walker, red and blue) or by having a mix of proposals with single walker, red and blue proposals. Note that red and blue versions of the same proposal should have equal weight to avoid byasing a subset of walker.

In the simple algorithm below we consider having a single proposal of each type (single, red, blue) and we explicitly write the 3 phases.:

# Algorithm 4

X[n]                           # Point at the n-th markov iteration
                               # This is a (N. walkers, dim) matrix

SingleProp                     # single chain proposal

RBProp                         # redblue proposal

loop on walkers k:
\\ this can be performed in parallel

    Y, Fwd, Bkd = SingleProp[t](X[n][w])   # Propose a point Y sampling Q[t](X[n][t] -> Y)
                                           # Fwd = Q[t](X[n][t] -> Y)
                                           # Bkd = Q[t](Y -> X[n][t])

    Z = Uniform(0, 1)                    # Get a random Z between 0 and 1

    A = P(Y) / P(X[n][w]) * Bkd / Fwd #  Acceptance

   if Z > A:

        X[n + 1][w] = X[n][w]            # Refused

   else:

        X[n + 1][w] = Y                  # Accepted

loop on first half of walkers w:
\\ this can be performed in parallel

    Y, Fwd, Bkd = SingleProp[t](X[n + 1][(W/2 + 1)..W], X[n][w])
                                          # Propose a point Y sampling Q[t](X[n][t] -> Y)
                                          # Fwd = Q[t](X[n][t] -> Y)
                                          # Bkd = Q[t](Y -> X[n][t])

    Z = Uniform(0, 1)                    # Get a random Z between 0 and 1

    A = P(Y) / P(X[n + 1][w]) * Bkd / Fwd #  Acceptance

    if Z > A:

        X[n + 1][w] = X[n + 1][w]         # Refused

    else:

        X[n + 1][w] = Y                   # Accepted



loop on first half of walkers w:
\\ this can be performed in parallel

    Y, Fwd, Bkd = SingleProp[t](X[n + 1][1..W/2], X[n][w])
                                          # Propose a point Y sampling Q[t](X[n][t] -> Y)
                                          # Fwd = Q[t](X[n][t] -> Y)
                                          # Bkd = Q[t](Y -> X[n][t])

    # the rest goes an in previous loop

# Loop to get X[n + 2]