emcee + PyMC3
August 21, 2018
This post was generated from an IPython notebook that can be downloaded here.
In this post, I will demonstrate how you can use emcee to sample models defined using PyMC3. Thomas Wiecki wrote about how to do this this with an earlier version of PyMC, but I needed an update since I wanted to do a comparison and PyMC's interface has changed a lot since he wrote his post. This isn't necessarily something that you'll want to do (and I definitely don't recommend it in general), but I figured that I would post it here for posterity.
For simplicity, let's use the simulated data from my previous blog post:
import numpy as np import matplotlib.pyplot as plt np.random.seed(42) true_params = np.array([0.5, -2.3, -0.23]) N = 50 t = np.linspace(0, 10, 2) x = np.random.uniform(0, 10, 50) y = x * true_params + true_params y_obs = y + np.exp(true_params[-1]) * np.random.randn(N) plt.plot(x, y_obs, ".k", label="observations") plt.plot(t, true_params*t + true_params, label="truth") plt.xlabel("x") plt.ylabel("y") plt.legend(fontsize=14);
Then, we can code up the model in PyMC3 following Jake VanderPlas' notation, and sample it using PyMC3's NUTS[sic] sampler:
import pymc3 as pm import theano.tensor as tt with pm.Model() as model: logs = pm.Uniform("logs", lower=-10, upper=10) alphaperp = pm.Uniform("alphaperp", lower=-10, upper=10) theta = pm.Uniform("theta", -2*np.pi, 2*np.pi, testval=0.0) # alpha_perp = alpha * cos(theta) alpha = pm.Deterministic("alpha", alphaperp / tt.cos(theta)) # beta = tan(theta) beta = pm.Deterministic("beta", tt.tan(theta)) # The observation model mu = alpha * x + beta pm.Normal("obs", mu=mu, sd=tt.exp(logs), observed=y_obs) trace = pm.sample(draws=2000, tune=2000)
Auto-assigning NUTS sampler... Initializing NUTS using jitter+adapt_diag... Multiprocess sampling (2 chains in 2 jobs) NUTS: [theta, alphaperp, logs] Sampling 2 chains: 100%|██████████| 8000/8000 [00:04<00:00, 1881.78draws/s]
import corner samples = np.vstack([trace[k] for k in ["alpha", "beta", "logs"]]).T corner.corner(samples, truths=true_params);
Sampling the PyMC3 model using emcee¶
To sample this using emcee, we'll need to do a little bit of bookkeeping. I've coded this up using version 3 of emcee that is currently available as the master branch on GitHub or as a pre-release on PyPI, so you'll need to install that version to run this.
To sample from this model, we need to expose the Theano method for evaluating the log probability to Python. There is a version of this built into PyMC3, but I also want to return the values of all the deterministic variables using the "blobs" feature in emcee so the function is slightly more complicated.
import theano with model: f = theano.function(model.vars, [model.logpt] + model.deterministics) def log_prob_func(params): dct = model.bijection.rmap(params) args = (dct[k.name] for k in model.vars) results = f(*args) return tuple(results)
And now we can run the sampler:
import emcee with model: # First we work out the shapes of all of the deterministic variables res = pm.find_MAP() vec = model.bijection.map(res) initial_blobs = log_prob_func(vec)[1:] dtype = [(var.name, float, np.shape(b)) for var, b in zip(model.deterministics, initial_blobs)] # Then sample as usual coords = vec + 1e-5 * np.random.randn(25, len(vec)) nwalkers, ndim = coords.shape sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob_func, blobs_dtype=dtype) sampler.run_mcmc(coords, 5000, progress=True)
logp = -63.46, ||grad|| = 386.74: 100%|██████████| 28/28 [00:00<00:00, 4362.09it/s] 100%|██████████| 5000/5000 [00:08<00:00, 578.23it/s]
And we can use this to make the same corner plot as above:
import pandas as pd df = pd.DataFrame.from_records(sampler.get_blobs(flat=True, discard=100, thin=30)) corner.corner(df[["alpha", "beta", "logs"]], truths=true_params);
The last thing that we might want to look at is the integrated autocorrelation time for each method. First, as expected, the autocorrelation for PyMC3 is very short (about 1 step):
[float(emcee.autocorr.integrated_time(np.array(trace.get_values(var.name, combine=False)).T)) for var in model.free_RVs]
[1.263846510183499, 0.8891706657305904, 0.9649242063729033]
And, the autocorrelation for emcee is about 40 steps:
array([39.37563208, 39.56714997, 38.32092758])
If you want to compare these results in detail, you'll want to make sure that you take into account the fact that each step of NUTS is significantly more expensive than one step with emcee, but that's way beyond the scope of this post...
11/22/18: This post has been updated with suggestions from Thomas Wiecki. The
find_MAP call has been removed from the PyMC sampling, and
model.bijection is now used to map between arrays and dicts of parameters.