RemNote Community

Foundations of the Central Limit Theorem

Understand the core statement, assumptions, and extensions of the Central Limit Theorem—from the classical i.i.d. case to Lyapunov, Lindeberg–Feller, and multivariate formulations.


Summary

Introduction to the Central Limit Theorem

The Central Limit Theorem (CLT) is one of the most powerful and important results in statistics. It explains why normal distributions appear so frequently in real-world data, even when the quantities being measured do not naturally follow a normal distribution. Understanding the CLT is essential for hypothesis testing, confidence intervals, and many other statistical inference methods you'll encounter.

What is the Central Limit Theorem?

The Central Limit Theorem states that when you take the average of a random sample from a population, that average follows an approximately normal (bell-shaped) distribution, provided your sample is large enough. Remarkably, this holds regardless of the shape of the original population distribution.

Here's the key insight: imagine repeating an experiment many times, calculating the sample mean each time. If you plotted all those sample means, they would form a bell curve. This is powerful because it lets us use normal-distribution methods for inference even when the population itself is not normally distributed.

A typical illustration shows a non-normal population distribution on the left and, on the right, the sampling distribution of the sample mean, which is approximately normal even though the population is not.

The Mathematical Formulation

Let's define the problem formally. Suppose you have a random sample $X_1, X_2, \ldots, X_n$ from a population with:

Mean: $\mu$
Variance: $\sigma^2$ (which must be positive and finite)

The sample mean is:

$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n}X_i$$

The Central Limit Theorem tells us that the standardized sample mean converges to a standard normal distribution as $n$ grows large:

$$\frac{\sqrt{n}\,(\bar{X}_n-\mu)}{\sigma} \xrightarrow{d} N(0,1)$$

The notation $\xrightarrow{d}$ means "converges in distribution to," and $N(0,1)$ is the standard normal distribution with mean 0 and variance 1.

Why standardize?
Without standardization, as $n$ increases the sample mean gets closer and closer to $\mu$, and its distribution would collapse to a single point. Multiplying by $\sqrt{n}$ rescales the problem so we can see how the sample mean fluctuates around $\mu$.

The Classical Central Limit Theorem

The classical CLT is the version you'll use most often. It applies when your random variables satisfy two key requirements.

Required Assumptions

Independence and Identical Distribution (i.i.d.): Each $X_i$ must be independent of the others, and all must come from the same population distribution.
Finite Mean and Variance: The population must have a finite mean $\mu$ and finite positive variance $\sigma^2$.

These assumptions are crucial. If they are violated, the theorem may not apply, or a different version of the CLT may be needed.

The Statement

For i.i.d. random variables, the CLT can also be expressed in terms of the standardized sum:

$$S_n^{*} = \frac{\sum_{i=1}^{n}X_i - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0,1)$$

This is equivalent to the formulation using the sample mean, since the sum equals $n$ times the mean. Both forms say the same thing: the properly standardized average converges to a normal distribution.

Practical Implication

In practice, this means that for reasonably large $n$, you can approximate the distribution of $\bar{X}_n$ as:

$$\bar{X}_n \approx N\left(\mu, \frac{\sigma^2}{n}\right)$$

Notice that the variance of the sample mean is $\frac{\sigma^2}{n}$, which decreases as $n$ increases. This is why larger samples give more precise estimates: the sample mean varies less.
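The classical statement above is easy to check empirically. Here is a minimal simulation sketch (assuming NumPy; the exponential population is an arbitrary choice of a clearly non-normal distribution): it draws many samples of size $n$, computes each sample mean, and confirms that the means behave like $N(\mu, \sigma^2/n)$.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Population: exponential with rate 1 (clearly skewed, non-normal).
# Its mean and standard deviation are both 1.
mu, sigma = 1.0, 1.0
n = 100            # sample size
reps = 20_000      # number of repeated experiments

# Draw `reps` independent samples of size n; compute each sample mean.
samples = rng.exponential(scale=1.0, size=(reps, n))
means = samples.mean(axis=1)

# CLT predictions: mean(means) ~ mu, var(means) ~ sigma^2 / n,
# and the standardized means are approximately N(0, 1).
z = np.sqrt(n) * (means - mu) / sigma
print(means.mean())               # close to mu = 1.0
print(means.var())                # close to sigma^2 / n = 0.01
print((np.abs(z) < 1.96).mean())  # close to 0.95
```

Increasing $n$ tightens the normal approximation; for strongly skewed populations like this one, small $n$ leaves visible skew in the sampling distribution.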
Connection to the Law of Large Numbers

The Law of Large Numbers (LLN) and the CLT work together but describe different things:

Law of Large Numbers: $\bar{X}_n \xrightarrow{p} \mu$, i.e. the sample mean converges to the true mean as $n \to \infty$.
Central Limit Theorem: describes how fast this convergence happens and how the sample mean fluctuates around $\mu$ in finite samples.

Think of it this way: the LLN guarantees that your sample average eventually hits the bullseye (the true mean), while the CLT describes the pattern of shots around the bullseye as the number of shots increases.

Uniformity of Convergence

One important technical detail: the convergence of the cumulative distribution functions (CDFs) is uniform. The quality of the approximation does not depend on the specific value at which you evaluate the CDF; the normal approximation works equally well across the entire distribution, not just in the center. This uniformity makes the CLT reliable in practical applications.

When the Classical Assumptions Don't Hold: Generalizations

The classical CLT assumes i.i.d. variables with finite variance. What if your data violates these assumptions? Fortunately, mathematicians have developed generalizations.

The Lyapunov Central Limit Theorem

The Lyapunov CLT relaxes the requirement that the variables be identically distributed. This matters because in real applications you often have independent measurements drawn from slightly different distributions.

What Changed?

Variables need only be independent, not identically distributed.
Each variable can have a different distribution.
All variables must have finite means $\mu_i$ (which may differ).

The Lyapunov Condition

The Lyapunov CLT replaces the "identical distribution" requirement with a technical condition called the Lyapunov condition.
Define the sum of variances:

$$s_n^{2} = \sum_{i=1}^{n}\operatorname{Var}(X_i)$$

The Lyapunov condition requires that for some $r > 2$:

$$\lim_{n\to\infty}\frac{1}{s_n^{r}}\sum_{i=1}^{n}E\left[|X_i-\mu_i|^{\,r}\right]=0$$

What does this mean intuitively? The condition ensures that no single variable's variability dominates the sum. In other words, the variability is spread across many roughly comparable terms, with no single term being extreme.

The Conclusion

If the Lyapunov condition holds, then:

$$\frac{\sum_{i=1}^{n}(X_i-\mu_i)}{s_n} \xrightarrow{d} N(0,1)$$

The Lindeberg–Feller Central Limit Theorem

<extrainfo>
The Lindeberg–Feller CLT provides an even weaker (and more general) condition than Lyapunov's.

Lindeberg Condition

For each $\varepsilon > 0$, the condition requires:

$$\frac{1}{s_n^{2}}\sum_{i=1}^{n}E\left[(X_i-\mu_i)^{2}\,\mathbf{1}\{|X_i-\mu_i|>\varepsilon s_n\}\right]\to 0$$

This condition says: among all the variability, the contribution from large deviations (values far from their means) must vanish relative to the total variation.

Relationship to Lyapunov

An important theoretical fact: the Lyapunov condition implies the Lindeberg condition, but not vice versa. Lindeberg is therefore the weaker (more general) condition and applies in more situations. The Lyapunov condition, however, is usually easier to check in practice.
</extrainfo>

The Multidimensional Central Limit Theorem

The CLT extends naturally to situations where you measure multiple variables simultaneously. This is important for understanding how multiple measurements behave together.
Random Vectors

Suppose you observe random vectors (points in $d$-dimensional space):

$$\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$$

each coming from the same distribution in $\mathbb{R}^{d}$ with:

Mean vector: $\boldsymbol{\mu}$
Covariance matrix: $\Sigma$

The Multivariate CLT

The sample mean vector

$$\bar{\mathbf{X}}_n = \frac{1}{n}\sum_{i=1}^{n}\mathbf{X}_i$$

satisfies:

$$\sqrt{n}\,(\bar{\mathbf{X}}_n - \boldsymbol{\mu}) \xrightarrow{d} N(\mathbf{0}, \Sigma)$$

This means the sample mean vector is approximately normally distributed with the given covariance structure.

How This is Proven

The proof technique is the Cramér–Wold device. Instead of proving the multivariate result directly, one projects the random vector onto arbitrary one-dimensional directions and shows that each projection satisfies the one-dimensional CLT. Since this holds for every direction, the multivariate result follows. This is an elegant example of reducing a complex problem to simpler, known cases.

<extrainfo>
Alternative Formulations: The Local Limit Theorem

Besides the distributional convergence discussed above, there is another perspective on the CLT called the Local Limit Theorem.

Density Function Perspective

Instead of asking "what is the distribution of the sample mean?", we can ask "what does its probability density function look like?" The Local Limit Theorem states that, under suitable regularity conditions, the probability density function of the normalized sample mean approaches the normal density function. More concretely: if you convolve many probability densities together (as happens when summing independent variables), the resulting density becomes increasingly normal. This shows why the normal distribution is so natural: it is what you get when many independent random influences combine.
</extrainfo>

Summary of Key Takeaways

The Central Limit Theorem is foundational to statistics because it justifies using normal-based inference methods broadly:

Classical CLT: for i.i.d. samples with finite variance, the sample mean is approximately normal for large $n$.
Extensions exist: the Lyapunov and Lindeberg generalizations handle non-identical distributions.
Works in multiple dimensions: the theorem extends naturally to multivariate settings.
Remarkably general: the theorem applies regardless of the original population distribution, as long as the assumptions hold.

This universality is what makes the CLT so powerful for practical statistics.
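The multivariate statement can be checked empirically in the same spirit. Below is a minimal simulation sketch (assuming NumPy; the population in $\mathbb{R}^2$ with independent exponential and uniform coordinates is an arbitrary non-normal choice). It also illustrates the Cramér–Wold idea: a fixed one-dimensional projection of $\sqrt{n}\,(\bar{\mathbf{X}}_n - \boldsymbol{\mu})$ has variance close to the projected covariance.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Population in R^2: independent exponential(1) and uniform(0,1) coordinates,
# a clearly non-normal joint distribution.
mu = np.array([1.0, 0.5])           # E[exp(1)] = 1, E[unif(0,1)] = 0.5
Sigma = np.diag([1.0, 1.0 / 12.0])  # Var = 1 and 1/12; coordinates independent

n, reps = 200, 20_000
x1 = rng.exponential(scale=1.0, size=(reps, n))
x2 = rng.uniform(0.0, 1.0, size=(reps, n))
means = np.stack([x1.mean(axis=1), x2.mean(axis=1)], axis=1)

# Multivariate CLT: sqrt(n) * (mean vector - mu) has covariance close to Sigma.
z = np.sqrt(n) * (means - mu)
print(np.cov(z, rowvar=False))   # close to diag(1, 1/12)

# Cramér–Wold flavor: a fixed projection behaves like a 1-D normal
# with variance u' Sigma u.
u = np.array([0.6, 0.8])
proj = z @ u
print(proj.var())                # close to u @ Sigma @ u
```

Here the seed, sample sizes, and projection direction are illustrative choices; any fixed direction $u$ should give a projected variance near $u^\top \Sigma\, u$.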
Flashcards
What happens to the distribution of a normalized sample mean as the sample size grows according to the central limit theorem?
It converges to a standard normal distribution.
Does the central limit theorem require the original random variables to be normally distributed?
No, it applies even when they are not normally distributed.
What is the primary practical significance of the central limit theorem regarding non-normal distributions?
Methods that assume normality can often be used for many other distributions.
In the basic statistical formulation of the CLT, what expression represents the random variable that converges to $Z \sim N(0,1)$?
$\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma}$ (where $n$ is sample size, $\bar{X}_n$ is the sample mean, $\mu$ is the population mean, and $\sigma$ is the population standard deviation).
Intuitively, what distribution is formed by repeating an experiment many times and computing the average each time?
An approximately normal distribution (for large sample sizes).
What are the two primary assumptions for the random variables in the classical CLT?
They must be independent and identically distributed (i.i.d.). Each must have a finite mean $\mu$ and finite variance $\sigma^2$.
What is the formula for the normalized sum $S_n^{*}$ that converges to $N(0,1)$ in the classical CLT?
$S_n^{*} = \frac{\sum_{i=1}^{n}X_i - n\mu}{\sigma\sqrt{n}}$
While the Law of Large Numbers guarantees the sample mean converges to $\mu$, what does the CLT describe specifically?
How the fluctuations around $\mu$ behave when scaled by $\sqrt{n}$.
What is the nature of the convergence of the cumulative distribution functions in the classical CLT?
The convergence is uniform in the argument.
What may happen to the limiting distribution if the underlying variables have infinite variance?
The CLT may fail and other stable laws (like the Cauchy distribution) become the limits.
How does the requirement for random variables in the Lyapunov CLT differ from the classical CLT?
They only need to be independent; they do not need to be identically distributed.
What moment condition must be satisfied for the Lyapunov CLT?
Each variable must have a finite $r$-th moment for some $r > 2$.
In the Lyapunov CLT, if the Lyapunov condition holds, what does the standardized sum $\frac{\sum_{i=1}^{n}(X_i-\mu_i)}{s_n}$ converge to?
$N(0,1)$ (the standard normal distribution).
What is the relationship between the Lyapunov condition and the Lindeberg condition?
The Lyapunov condition implies the Lindeberg condition, but not vice versa.
Is the Lindeberg condition stronger or weaker than the Lyapunov condition?
It is a weaker condition.
What is the limiting distribution for a normalized sum of i.i.d. random vectors in $\mathbb{R}^d$?
A multivariate normal distribution $N(\mathbf{0}, \Sigma)$ (where $\Sigma$ is the covariance matrix).
Which mathematical tool is used to reduce the multivariate CLT case to a one-dimensional CLT by projecting onto arbitrary directions?
The Cramér–Wold device.
What does the density-function view (Local Limit Theorem) state regarding the convolution of many probability densities?
It approaches the normal density under suitable regularity conditions.

Key Concepts
Central Limit Theorems
Central Limit Theorem
Lyapunov Central Limit Theorem
Lindeberg–Feller Central Limit Theorem
Multivariate Central Limit Theorem
Cramér–Wold device
Convergence Theorems
Law of Large Numbers
Uniform convergence of distribution functions
Local limit theorem
Special Distributions
Stable distribution