Licensed under the Creative Commons attribution-noncommercial license, http://creativecommons.org/licenses/by-nc/3.0/. Please share and remix noncommercially, mentioning its origin.

Produced in R version 3.2.3.


Objectives

  1. To review some basic ideas in Monte Carlo computation and the simulation of random variables.

  2. To provide a basic introduction to the Monte Carlo approach, and the generation of simulated random variables, for those who haven’t seen it before.




Our context: Monte Carlo methods for POMP models.

Let’s consider a general POMP model. As before, let \(y_{1:N}^*\) be the data, and let the model consist of a latent process \(X_{0:N}\) and an observable process \(Y_{1:N}\). Then the likelihood function is \[\begin{eqnarray} \mathscr{L}(\theta) &=& f_{Y_{1:N}}(y_{1:N}^*;\theta) \\ &=& \int_{x_0}\cdots\int_{x_N}\! f_{X_0}(x_0;\theta)\,\prod_{n=1}^{N}\! f_{Y_n|X_n}(y_n^*\,|\,x_n;\theta)\, f_{X_n|X_{n-1}}(x_n\,|\,x_{n-1};\theta)\, dx_0 \dots dx_N, \end{eqnarray}\] i.e., computation of the likelihood requires integrating (or summing, for a discrete model) over all possible values of the unobserved latent process at each time point. This is very hard to do, in general.

Let’s review, and/or learn, some Monte Carlo approaches for evaluating this and other difficult integrals. An excellent technical reference on Monte Carlo techniques is Robert and Casella (2004).




The fundamental theorem of Monte Carlo integration


Theorem. Let \(f(x)\) be the probability distribution function for a random variable \(X\), and let \(X_{1:J}=\{X_j, j=1,\dots,J\}\) be an independent and identically distributed sample of size \(J\) from \(f\). Let \(H_J\) be the sample average of \(h(X_1),\dots,h(X_J)\), \[H_J = \frac{1}{J}\,\sum_{j=1}^{J}\!h(X_j).\] Then \(H_J\) converges to \(H=\mathbb{E}[h(X)]\) as \(J\to\infty\) with probability 1. Less formally, we write \[H_J \approx \int\!h(x)\,f(x)\,dx.\]


Proof. This is the strong law of large numbers, together with the identity that \[{\mathbb{E}}[h(X)] = \int\!h(x)\,f(x)\,dx.\]
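
For instance, here is a minimal sketch in R (the example and its constants are our own, not part of the original notes): we estimate \(\mathbb{E}[X^2]\) for \(X\sim\mathrm{normal}(0,1)\), for which the true value is 1.

    set.seed(12)              # for reproducibility
    J <- 100000               # Monte Carlo sample size
    X <- rnorm(J)             # iid draws from f = normal(0,1)
    h <- function (x) x^2     # the function whose expectation we want
    mean(h(X))                # H_J, which should be close to E[X^2] = 1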





Importance sampling
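
  • Sometimes it is hard to sample from \(f\) itself, but easy to sample from another density \(g\) whose support contains that of \(f\).

  • Since \[\int\!h(x)\,f(x)\,dx = \int\!h(x)\,\frac{f(x)}{g(x)}\,g(x)\,dx,\] the fundamental theorem applies to draws \(X_{1:J}\) from \(g\), with \(h\) replaced by \(h\,f/g\): \[H_J = \frac{1}{J}\sum_{j=1}^{J} h(X_j)\,w_j \approx \int\!h(x)\,f(x)\,dx, \qquad w_j=\frac{f(X_j)}{g(X_j)}.\] The ratios \(w_j\) are called importance weights.

A minimal sketch in R (the choice of target, proposal, and constants is ours, for illustration only): we again estimate \(\mathbb{E}[X^2]=1\) for \(X\sim\mathrm{normal}(0,1)\), but using draws from a heavier-tailed Cauchy proposal.

    set.seed(34)
    J <- 100000
    Y <- rcauchy(J)              # draws from the proposal g = Cauchy
    w <- dnorm(Y)/dcauchy(Y)     # importance weights f/g
    h <- function (x) x^2
    mean(h(Y)*w)                 # importance-sampling estimate of E[X^2] = 1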




Simulation techniques for general distributions




The transformation method

  • This method works for discrete or continuous scalar random variables.

  • Let \(f\) be the probability distribution function we seek to draw from (known as the target distribution) and \(F\) be the corresponding cumulative distribution function, i.e., \(F(x) = \int_{-\infty}^x f(v)\, dv\).

  • Let \(F^{-1}(u) = \inf\{x: F(x)\,\ge\,u\}\) be the inverse of \(F\).

  • A basic fact is that, if \(X\!\sim\!f\) and \(X\) is continuous, then \(F(X)\!\sim\!\mathrm{uniform}(0,1)\).
    Proof: For \(0<u<1\), \[\begin{eqnarray} \mathbb{P}[F(X)\,\le\,u] &=& \mathbb{P}[X\,\le\,F^{-1}(u)] \\ &=& F\big(F^{-1}(u)\big) = u. \end{eqnarray}\]

  • This suggests that, if we can compute \(F^{-1}\), we can use the following algorithm to generate \(X\!\sim\!f\):

  1. draw \(U \sim \mathrm{uniform}(0,1)\).

  2. let \(X = F^{-1}(U)\).
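
For example, a minimal sketch in R (the exponential target and its rate are our own choices): if \(f(x)=\lambda e^{-\lambda x}\), then \(F(x)=1-e^{-\lambda x}\) and \(F^{-1}(u)=-\log(1-u)/\lambda\).

    set.seed(56)
    lambda <- 2
    U <- runif(10000)         # step 1: U ~ uniform(0,1)
    X <- -log(1-U)/lambda     # step 2: X = F^{-1}(U) is exponential(lambda)
    c(mean(X), 1/lambda)      # sample mean should be close to the true mean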




The rejection method

  • The transformation method is very efficient in that we are guaranteed to obtain a valid \(X\) from the density \(f\) for every \(U \sim \mathrm{uniform}(0,1)\) we generate.

  • Sometimes, however, we cannot compute the inverse of the cumulative distribution function, as required by the transformation method.

  • Under such circumstances, the rejection method offers a less efficient, but more flexible, alternative.

  • We’ll see how and why this method works.


The rejection method for uniform random variables on arbitrary sets

  • Let a random variable \(X\) take values in \({\mathbb{R}}^{{\mathrm{dim}(X)}}\).

  • Suppose that \(X\) is uniformly distributed over a region \(D\subset {\mathbb{R}}^{{\mathrm{dim}(X)}}\). This means that, for any \(A\subset D\), \[\mathbb{P}[X\in A]=\frac{\mathrm{area}(A)}{\mathrm{area}(D)}.\] We write \[X \sim \mathrm{uniform}(D).\]

  • Let’s suppose that we wish to simulate \(X\!\sim\!\mathrm{uniform}(D)\).

  • Suppose that we don’t know how to directly simulate a random draw from \(D\), but we know \(D\) is a subset of some nicer region \(U\subset {\mathbb{R}}^{{\mathrm{dim}(X)}}\).

  • If we know how to generate \(Y\!\sim\!\mathrm{uniform}(U)\), then we can simply do so until \({Y}\in{D}\), at which point we take \(X=Y\).

  • Since for any \(A\subset D\), \[\begin{eqnarray} \mathbb{P}[X\in A] &=& \mathbb{P}[Y\in A\,|\,Y\in D] \\ &=& \frac{\mathrm{area}(A)}{\mathrm{area}(U)}\Big{/}\frac{\mathrm{area}(D)}{\mathrm{area}(U)} \\ &=& \frac{\mathrm{area}(A)}{\mathrm{area}(D)}, \end{eqnarray}\] it follows that \(X\!\sim\!\mathrm{uniform}(D)\).

  • Consider an analogy to throwing darts. If the darts are thrown in such a way as to be equally likely to land anywhere in \(U\), then those that do land in \(D\) are equally likely to land anywhere in \(D\).
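
A minimal sketch of this in R (the disk-in-a-square example is ours, not from the original notes): take \(D\) to be the unit disk and \(U=[-1,1]^2\); the fraction of accepted darts then estimates \(\mathrm{area}(D)/\mathrm{area}(U)=\pi/4\).

    set.seed(78)
    J <- 100000
    Y1 <- runif(J, -1, 1); Y2 <- runif(J, -1, 1)  # Y ~ uniform(U), U = [-1,1]^2
    keep <- (Y1^2 + Y2^2 <= 1)                    # accept when Y falls in the disk D
    X1 <- Y1[keep]; X2 <- Y2[keep]                # (X1,X2) ~ uniform(D)
    4*mean(keep)                                  # acceptance fraction x area(U) ~ pi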




The rejection method for dominated densities

  • A useful little fact allows us to extend the rejection method from uniform distributions to arbitrary densities.

  • Robert and Casella (2004) refer to this fact as the fundamental theorem of simulation.


  • Let \(h\) be an arbitrary positive, integrable function.

  • Define \[D=\{(x,u): 0{\le}u{\le}h(x)\},\] i.e., \(D\) is the region under the graph of \(h\).

  • Consider the random pair \((X,U)\!\sim\!\mathrm{uniform}(D)\).

  • What is the marginal distribution of \(X\)? Integrating out \(u\) gives \[\int_0^{h(x)}\!\mathrm{d}u = h(x),\] so the marginal density of \(X\) is proportional to \(h\).

  • So, when \(h\) integrates to 1, \(h\) is the probability distribution function for \(X\)!

  • To carry out the rejection method, simulating from \(g\) to obtain a sample from \(f\), we take our region \(D\) to be the region under the graph of \(M\,g\), i.e., \[D = \{(x,y): 0\le y\le M\,g(x)\},\] where \(M\) is a constant chosen so that \(M\,g(x)\ge f(x)\) for all \(x\).

  • We propose points \((X,Y)\) by drawing them uniformly from the area under the graph of \(M\,g\).

  • We accept the point \((X,Y)\) if it lies under the graph of \(f\).

  • Then, the \(X\)-component of the \((X,Y)\) pair is distributed according to \(f\).


  • This suggests the following rejection method for simulating an arbitrary random variable.

  • Let \(f\) be the target distribution and \(g\) be another distribution function, from which it is easy to simulate.

  • Let \(M\) be such that \(M\,g(x){\ge}f(x)\) for all \(x\).

  • The following procedure simulates \(X\!\sim\!f\).

  1. draw \(Y\!\sim\!g\) and \(U\!\sim\!\mathrm{uniform}(0,M\,g(Y))\).

  2. if \(U{\le}f(Y)\), then let \(X=Y\); otherwise, repeat step 1.
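
A minimal sketch of this procedure in R (the Beta(2,2) target, uniform proposal, and value of \(M\) are our own choices): the density \(f(x)=6x(1-x)\) on \([0,1]\) has maximum \(1.5\), so \(M=1.5\) dominates it with \(g=\mathrm{uniform}(0,1)\).

    set.seed(90)
    f <- function (x) dbeta(x, 2, 2)   # target density, maximum value 1.5
    g <- function (x) dunif(x)         # proposal density, easy to simulate
    M <- 1.5                           # chosen so that M*g(x) >= f(x) for all x
    J <- 10000
    X <- numeric(J)
    for (j in seq_len(J)) {
      repeat {
        Y <- runif(1)                  # step 1: draw Y ~ g
        U <- runif(1, 0, M*g(Y))       # ... and U ~ uniform(0, M*g(Y))
        if (U <= f(Y)) { X[j] <- Y; break }  # step 2: accept, i.e., X = Y
      }
    }
    c(mean(X), var(X))                 # Beta(2,2) has mean 0.5, variance 0.05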



Acknowledgement: These notes derive from notes by Aaron King.

References

Robert, C. P., and G. Casella. 2004. Monte Carlo Statistical Methods. 2nd edition. Springer-Verlag.