Law of Large Numbers – Study Guide
📖 Core Concepts
Law of Large Numbers (LLN) – As the number of independent, identically distributed (i.i.d.) draws grows, the sample mean
\[
\bar X_n=\frac{1}{n}\sum_{i=1}^{n}X_i
\]
converges to the true expected value \(\mu = E[X_i]\).
Weak LLN – Convergence in probability: for any \(\varepsilon>0\),
\[
\Pr\big(|\bar X_n-\mu|>\varepsilon\big)\to 0 \quad (n\to\infty).
\]
Strong LLN – Almost‑sure convergence:
\[
\Pr\!\big(\lim_{n\to\infty}\bar X_n=\mu\big)=1.
\]
Minimal requirements – Independence, identical distribution, finite mean. Finite variance is not required (it just makes proofs easier).
Heavy‑tailed failure – Distributions with infinite or undefined mean (e.g., Cauchy, Pareto with \(\alpha<1\)) break the LLN.
Selection bias – Systematic bias is not eliminated by simply increasing \(n\).
Borel’s LLN – Formalizes the intuitive “probability = long‑run relative frequency’’: the proportion of an event’s occurrences converges to its probability almost surely.
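The stabilizing behavior above can be seen directly by tracking the running sample mean of i.i.d. draws. A minimal Python sketch (the helper name `sample_mean_path` is illustrative, not from the source):

```python
import random

def sample_mean_path(n_max, seed=0):
    """Running sample means of n_max i.i.d. Bernoulli(0.5) draws (mu = 0.5)."""
    rng = random.Random(seed)
    total = 0.0
    means = []
    for n in range(1, n_max + 1):
        total += rng.random() < 0.5  # one fair-coin flip
        means.append(total / n)
    return means

means = sample_mean_path(100_000)
print(means[9], means[-1])  # the early mean fluctuates; the late mean sits near 0.5
```

Plotting `means` against \(n\) shows the characteristic narrowing band around \(\mu\) that the LLN guarantees.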
---
📌 Must Remember
Weak LLN statement (probability convergence).
Strong LLN statement (almost‑sure convergence).
Chebyshev bound for i.i.d. finite‑variance variables:
\[
\Pr\big(|\bar X_n-\mu|>\varepsilon\big)\le\frac{\sigma^{2}}{n\varepsilon^{2}}.
\]
Kolmogorov strong law condition for independent, non‑identical variables:
\[
\sum_{i=1}^{\infty}\frac{\operatorname{Var}(X_i)}{i^{2}}<\infty \;\Longrightarrow\; \frac{1}{n}\sum_{i=1}^{n}\big(X_i-E[X_i]\big)\to 0 \text{ a.s.}
\]
Heavy‑tailed exception – No LLN convergence when \(E[|X|]=\infty\).
Borel’s law – Relative frequency \(\frac{N_n(E)}{n}\to p\) a.s. for event \(E\) with probability \(p\).
Monte Carlo – Accuracy improves with \(n\) because the estimator obeys the strong LLN.
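The Chebyshev bound above can be inverted to estimate a sample size: solve \(\sigma^{2}/(n\varepsilon^{2})\le\delta\) for \(n\). A small Python sketch (the helper name `chebyshev_n` is illustrative):

```python
import math

def chebyshev_n(sigma2, eps, delta):
    """Smallest n with sigma^2/(n*eps^2) <= delta,
    i.e. Pr(|Xbar_n - mu| > eps) <= delta by Chebyshev."""
    return math.ceil(sigma2 / (delta * eps ** 2))

# variance 1, want the mean within 0.1 with probability >= 95%
print(chebyshev_n(1.0, 0.1, 0.05))  # 2000
```

Chebyshev is conservative; in practice the normal approximation from the CLT usually gives a much smaller \(n\), but the bound requires nothing beyond finite variance.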
---
🔄 Key Processes
Check conditions – Independence, identical distribution (or verify Kolmogorov’s variance sum for non‑identical data), finite mean.
Choose law –
Use Weak LLN when only a probabilistic guarantee is needed (e.g., confidence intervals).
Use Strong LLN for almost‑sure statements (e.g., Monte Carlo convergence).
Apply Chebyshev (if variance finite) – Compute \(\sigma^{2}\), plug into bound to estimate required \(n\) for a given \(\varepsilon\).
Monte Carlo integration –
a. Sample \(X_1,\dots,X_n\) from the target distribution.
b. Compute the estimator \(\hat I_n = \frac{1}{n}\sum f(X_i)\).
c. Invoke the Strong LLN → \(\hat I_n\to I\) a.s. as \(n\) grows.
Empirical probability – Count successes, divide by \(n\); convergence follows from Borel’s LLN.
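The Monte Carlo steps above can be sketched in a few lines of Python; `mc_integral` is an illustrative helper name, and uniform sampling on \([0,1]\) is assumed for simplicity:

```python
import random

def mc_integral(f, n, seed=1):
    """Monte Carlo estimate of the integral of f over [0, 1]:
    average f over n i.i.d. Uniform(0, 1) draws; strong LLN gives a.s. convergence."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n

# integral of x^2 over [0, 1] is exactly 1/3
est = mc_integral(lambda x: x * x, 200_000)
print(est)  # close to 0.3333
```

The same averaging with an indicator function \(f = \mathbf{1}_E\) recovers the empirical-probability recipe: count successes and divide by \(n\).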
---
🔍 Key Comparisons
Weak vs. Strong LLN
Weak: “high probability” that \(\bar X_n\) is close to \(\mu\); the weak law alone does not rule out large deviations recurring infinitely often.
Strong: “eventually always” after some random \(N\); deviations cease almost surely.
Finite variance needed?
Weak: No – finite mean suffices; finite variance only gives a simple Chebyshev proof.
Strong: No – for i.i.d. variables a finite mean suffices (Kolmogorov); the variance‑sum condition is an alternative route for independent, non‑identical variables.
Borel’s LLN vs. Weak LLN
Borel focuses on relative frequencies of a single event; weak LLN deals with sample means of general r.v.’s.
Heavy‑tailed (Cauchy) vs. Light‑tailed
Light‑tailed: mean exists → LLN holds.
Heavy‑tailed with infinite mean → LLN fails.
---
⚠️ Common Misunderstandings
Gambler’s fallacy – Believing a short sequence must “balance out”; LLN only guarantees balance as \(n\to\infty\).
“LLN fixes bias” – Systematic selection bias is not corrected by larger \(n\).
“Finite variance required” – Only a sufficient condition for a simple proof, not a necessity.
Confusing probability convergence with almost‑sure convergence.
Assuming LLN works for Cauchy or other infinite‑mean distributions.
---
🧠 Mental Models / Intuition
“Stabilizing average” – Picture a crowd of random walkers; as more join, the center of mass drifts less and eventually hovers near the true mean.
Weak LLN = “most of the time” – Like a weather forecast that is right 90 % of the time; occasional wrong days remain possible.
Strong LLN = “once settled, never leaves” – After enough steps, the walker never strays far again.
Heavy‑tailed = “wild horse” – No matter how many rides you take, the horse’s jumps (samples) can still be arbitrarily large, preventing settling.
---
🚩 Exceptions & Edge Cases
Cauchy distribution – No finite expectation ⇒ sample mean does not converge.
Pareto with \(\alpha<1\) – Infinite mean ⇒ LLN fails.
Selection bias – Persistent systematic error despite large \(n\).
Non‑i.i.d. but independent – Strong LLN may still hold if \(\sum \operatorname{Var}(X_i)/i^2\) converges.
Varying variances – Weak LLN holds (via Chebyshev) whenever \(\operatorname{Var}(\bar X_n)=\frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)\to 0\).
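The Cauchy edge case above is easy to witness numerically: a standard Cauchy variate can be generated as \(\tan(\pi(U-\tfrac12))\) for \(U\sim\text{Uniform}(0,1)\) (inverse-CDF method), and its running mean refuses to settle. An illustrative Python sketch:

```python
import math
import random

def cauchy_mean(n, seed=2):
    """Sample mean of n standard Cauchy draws, generated by the
    inverse-CDF transform tan(pi * (U - 1/2)) for U ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    return sum(math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)) / n

# unlike the Bernoulli example, these means keep jumping as n grows:
# the sample mean of n Cauchy draws is itself standard Cauchy, for every n
for n in (1_000, 10_000, 100_000):
    print(n, cauchy_mean(n, seed=n))
```

No expected values are shown because there is no limit to converge to; rerunning with different seeds produces wildly different means at every \(n\).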
---
📍 When to Use Which
Weak LLN – Estimating a mean with a confidence statement, when only probability of closeness matters (e.g., quick sanity checks).
Strong LLN – Proving almost‑sure convergence of estimators (Monte Carlo integration, long‑run frequency results).
Chebyshev bound – When you know finite variance and need a concrete sample‑size estimate for a desired \(\varepsilon\).
Kolmogorov’s condition – For independent but non‑identical data; verify \(\sum \operatorname{Var}(X_i)/i^2<\infty\).
Borel’s LLN – When the question asks directly about relative frequency of a single event.
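Checking Kolmogorov’s condition often reduces to a comparison test on \(\sum\operatorname{Var}(X_i)/i^{2}\); partial sums make the convergent vs. divergent cases concrete. A small Python sketch (`kolmogorov_partial_sum` is an illustrative helper name):

```python
def kolmogorov_partial_sum(var_fn, n):
    """Partial sum of Var(X_i) / i^2 for i = 1..n.
    Bounded partial sums as n grows suggest Kolmogorov's condition holds."""
    return sum(var_fn(i) / i ** 2 for i in range(1, n + 1))

# bounded variances: sum 1/i^2 converges (to pi^2/6 ~ 1.645), condition holds
print(kolmogorov_partial_sum(lambda i: 1.0, 10_000))  # ~ 1.645

# Var(X_i) = i: the sum becomes the harmonic series, which diverges like ln n
print(kolmogorov_partial_sum(lambda i: float(i), 10_000))  # ~ 9.79, still climbing
```

Only the first case licenses the strong-law conclusion; in the second, the condition fails and a different argument would be needed.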
---
👀 Patterns to Recognize
\(1/n\) shrinkage – Variance of the sample mean always appears as \(\sigma^{2}/n\) (or an analogous average‑variance term).
Empirical vs. theoretical probability – Look for ratios like \(\frac{N_n(E)}{n}\) converging to a constant \(p\).
Heavy‑tailed flag – Presence of Cauchy, Pareto with \(\alpha<1\), or any distribution lacking a finite first moment ⇒ expect LLN to fail.
Independence cue – Problems that explicitly mention “independent trials” are setting up LLN conditions.
---
🗂️ Exam Traps
“More samples always removes bias” – Distractor that ignores selection bias.
“Finite variance is required for any LLN” – Over‑states the necessity; only needed for the elementary Chebyshev proof.
Confusing weak & strong guarantees – Choosing “probability 1” as the answer for a weak‑law question, or vice‑versa.
Assuming convergence for Cauchy – Many test‑makers include Cauchy as a “counter‑example” to catch this mistake.
Borel vs. Weak LLN – Selecting the weak‑law definition when the question explicitly asks about “relative frequency of a single event”.
---