Sampling Techniques

In this chapter, we will discuss how to sample from a probability distribution. Sampling from a probability distribution is a fundamental problem in statistics and machine learning. It is used in various applications like Monte Carlo methods, Bayesian inference, and reinforcement learning.

To understand what it means to sample from a probability distribution, let’s consider a simple example. Let \(X\) be a discrete random variable that takes values \(1, 2, \ldots, n\) with probabilities \(p_1, p_2, \ldots, p_n\). To sample from this distribution, we want to generate a random variable \(X\) such that \(\mathbb{P}(X = i) = p_i\) for all \(i = 1, 2, \ldots, n\) i.e. we want to select a random integer \(i\) with probability \(p_i\). If we generate enough samples \(x_1, x_2, \ldots, x_N\) from this distribution, then the fraction of samples that are equal to \(i\) will be approximately equal to \(p_i\) for large \(N\).

More formally, let \((\Omega, \mathcal{F}, P)\) be a probability space and let \(X: \Omega \to \mathcal{X}\) be a random variable with distribution \(P_X\). Sampling from the distribution of \(X\) means generating independent realizations \(x_1, x_2, \ldots, x_n\) such that each \(x_i\) has the same distribution as \(X\). By the strong law of large numbers, we have

\[\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} f(x_j) = \mathbb{E}[f(X)]\]

almost surely for any measurable function \(f\) such that \(\mathbb{E}[|f(X)|] < \infty\).

The fundamental challenge in sampling is transforming uniform random numbers, which are readily available from pseudorandom number generators, into samples from arbitrary probability distributions. Most computational environments provide uniform random number generators that produce sequences \(U_1, U_2, \ldots\) where each \(U_i\) is approximately uniformly distributed on \([0,1]\) and the sequence has good statistical properties.

In the following chapters, we will examine several sampling techniques in detail, providing both theoretical foundations and practical algorithms for implementation.