Central Limit Theorem Study Guide
📖 Core Concepts
Central Limit Theorem (CLT) – As the sample size \(n\) grows, the distribution of the normalized sample mean (or sum) approaches a standard normal distribution, regardless of the original population shape (provided certain conditions hold).
Standardized sum/mean – For i.i.d. variables \(X_i\) with mean \(\mu\) and variance \(\sigma^2\):
\[
Z_n = \frac{\sqrt{n}\,(\bar X_n-\mu)}{\sigma}
\quad\text{or}\quad
S_n = \frac{\sum_{i=1}^{n}X_i-n\mu}{\sigma\sqrt{n}}
\]
Both converge in distribution to \(N(0,1)\).
Independence & Identical Distribution (i.i.d.) – Classical CLT requires each \(X_i\) to be independent and share the same mean/variance.
Lyapunov & Lindeberg Conditions – Weaker requirements that allow non‑identical independent variables; they control higher‑order moments to guarantee normal convergence.
Multivariate CLT – Extends the result to random vectors; the normalized sum of i.i.d. vectors converges to a multivariate normal \(N(\mathbf 0,\Sigma)\).
Law of Large Numbers (LLN) – Guarantees \(\bar Xn \to \mu\) (consistency); CLT describes the rate and shape of fluctuations around \(\mu\).
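The convergence described above can be checked by simulation. The sketch below (a hypothetical illustration, not part of any standard library) draws repeated samples from a skewed exponential population, standardizes each sample mean, and confirms that the resulting values have mean near 0 and standard deviation near 1, as \(N(0,1)\) predicts:

```python
import random
import statistics

# Illustration: the exponential(rate=1) population is skewed, with
# mean 1 and standard deviation 1. Standardized sample means should
# nevertheless look like N(0, 1) for large n.
random.seed(0)
mu, sigma = 1.0, 1.0
n, reps = 200, 5000

z_values = []
for _ in range(reps):
    xs = [random.expovariate(1.0) for _ in range(n)]
    xbar = sum(xs) / n
    z = (n ** 0.5) * (xbar - mu) / sigma  # Z_n = sqrt(n)(xbar - mu)/sigma
    z_values.append(z)

# A standard normal has mean 0 and standard deviation 1.
print(round(statistics.mean(z_values), 2))
print(round(statistics.stdev(z_values), 2))
```

Swapping in any other finite-variance population (uniform, Bernoulli, etc.) gives the same limiting behavior; only the speed of convergence changes.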
📌 Must Remember
Classical CLT formula: \(\displaystyle S_n = \frac{\sum_{i=1}^{n}X_i - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0,1)\).
Lyapunov condition (for some \(r>2\)):
\[
\lim_{n\to\infty}\frac{1}{s_n^{r}}\sum_{i=1}^{n}E\!\left[|X_i-\mu_i|^{\,r}\right]=0,
\quad s_n^{2}= \sum_{i=1}^{n}\operatorname{Var}(X_i)
\]
Lindeberg condition (for every \(\varepsilon>0\)):
\[
\frac{1}{s_n^{2}}\sum_{i=1}^{n}E\!\left[(X_i-\mu_i)^{2}\,\mathbf{1}_{\{|X_i-\mu_i|>\varepsilon s_n\}}\right]\to 0 .
\]
Implication hierarchy: Lyapunov ⇒ Lindeberg ⇒ CLT (but not conversely).
“\(n>30\)” rule – Not a theorem; only a rough rule of thumb that can fail for highly skewed or heavy‑tailed data.
Failure case – If \(\operatorname{Var}(X_i)=\infty\), CLT does not apply; stable laws (e.g., Cauchy) take over.
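The failure case is easy to see numerically. In this sketch (a hypothetical illustration; the inverse-CDF trick `tan(pi*(U - 1/2))` is a standard way to generate standard Cauchy draws), the sample mean of Cauchy data is itself standard Cauchy for every \(n\), so its spread does not shrink like \(1/\sqrt{n}\):

```python
import math
import random

random.seed(1)

def cauchy_mean(n: int) -> float:
    """Mean of n standard-Cauchy draws (inverse-CDF via tan)."""
    return sum(math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)) / n

# Interquartile range of sample means at two sample sizes.
# Under the CLT it would shrink by a factor of 10 between n=10 and
# n=1000; for Cauchy data it stays roughly constant (near 2).
iqrs = {}
for n in (10, 1000):
    means = sorted(cauchy_mean(n) for _ in range(2000))
    iqrs[n] = means[1500] - means[500]
    print(n, round(iqrs[n], 2))
```

The standard Cauchy has quartiles at \(\pm 1\), so both interquartile ranges hover near 2 regardless of \(n\).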
🔄 Key Processes
Verify assumptions
Check independence.
For classical CLT: confirm identical distribution and finite variance.
If not identical, decide between Lyapunov or Lindeberg verification.
Compute standardized statistic
Find \(\mu\) and \(\sigma\) (or \(\mu_i, \sigma_i\) for non‑identical).
Form \(Z_n = \frac{\sqrt{n}(\bar X_n-\mu)}{\sigma}\) (or the sum version).
Apply normal approximation
Use \(Z_n \approx N(0,1)\) for large \(n\).
Translate to confidence intervals or hypothesis tests as needed.
Multivariate case
Compute sample mean vector \(\bar{\mathbf X}_n\) and covariance matrix \(\Sigma\).
Form \(\mathbf S_n = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(\mathbf X_i-\boldsymbol\mu)\) and treat it as approximately \(N(\mathbf 0,\Sigma)\).
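The univariate steps above can be sketched end to end. In this hypothetical example the population parameters \(\mu\) and \(\sigma\) are assumed known, so the standardized statistic and a 95% normal-approximation confidence interval follow directly:

```python
import math
import random

random.seed(2)
mu, sigma = 5.0, 2.0        # assumed known population parameters
n = 100
sample = [random.gauss(mu, sigma) for _ in range(n)]

# Step 2: compute the standardized statistic Z_n.
xbar = sum(sample) / n
z_n = math.sqrt(n) * (xbar - mu) / sigma

# Step 3: 95% CI for mu via the normal approximation,
# xbar +/- 1.96 * sigma / sqrt(n).
half_width = 1.96 * sigma / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)
print(round(z_n, 2), tuple(round(c, 2) for c in ci))
```

With \(\sigma\) known, the interval half-width is fixed at \(1.96\sigma/\sqrt{n}\); only the center moves with the data.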
🔍 Key Comparisons
Classical CLT vs. Lyapunov CLT
Classical: i.i.d., finite variance only.
Lyapunov: independent, possibly non‑identical; requires a higher‑order moment condition.
Lyapunov vs. Lindeberg
Lyapunov: stronger; checks a single \(r\)-th moment bound.
Lindeberg: weaker; checks tail contribution for every \(\varepsilon\).
Sample‑mean CLT vs. Original Distribution
Sample mean: becomes normal as \(n\) grows.
Original data: retains its original shape; CLT does not make raw data normal.
⚠️ Common Misunderstandings
“All data become normal” – CLT only concerns the distribution of the average (or sum), not each observation.
Universal “\(n>30\)” – No fixed cutoff; convergence speed depends on skewness, kurtosis, and tail heaviness.
Normality of errors in regression – The CLT justifies the approximate normality of the OLS coefficient estimators in large samples; it does not make the error terms themselves normal (that is a separate model assumption used for exact small‑sample inference).
Infinite variance – Assuming CLT works with Cauchy‑type data is wrong; the limit is a stable distribution, not normal.
🧠 Mental Models / Intuition
“Pile of sand” analogy: Each observation is a grain; piling many grains (adding) smooths out irregularities, producing a bell‑shaped mound (normal).
Projection view: In the multivariate case, any linear combination (projection) of the vector sum behaves like the one‑dimensional CLT – that’s the Cramér–Wold device.
Scaling by \(\sqrt{n}\): Fluctuations shrink like \(1/\sqrt{n}\); think of a jitter that gets finer as you average more points.
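The \(\sqrt{n}\) intuition can be verified directly. This sketch (an illustrative simulation, not library code) estimates the standard deviation of the sample mean of uniform draws at two sample sizes; quadrupling \(n\) should halve the spread:

```python
import random
import statistics

random.seed(3)

def sd_of_mean(n: int, reps: int = 4000) -> float:
    """Empirical standard deviation of the mean of n uniform(0,1) draws."""
    means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]
    return statistics.stdev(means)

sd_25, sd_100 = sd_of_mean(25), sd_of_mean(100)
# sd scales like sigma/sqrt(n), so the ratio should be close to 2.
print(round(sd_25 / sd_100, 2))
```

The same \(1/\sqrt{n}\) shrinkage holds for any finite-variance population, which is exactly why the \(\sqrt{n}\) factor appears in the standardized statistic.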
🚩 Exceptions & Edge Cases
Heavy‑tailed distributions (e.g., Cauchy) → infinite variance → CLT fails.
Strong dependence (e.g., time series with long‑range correlation) → independence assumption violated; specialized CLTs needed.
Small sample from a highly skewed population → normal approximation may be poor even if \(n=30\).
📍 When to Use Which
i.i.d., finite variance → apply Classical CLT (simplest).
Independent, non‑identical, finite variance → check Lyapunov first (easier to verify with known moments); if Lyapunov fails, test Lindeberg condition.
Multivariate data → use Multivariate CLT; compute mean vector & covariance, then treat linear combos as normal.
When variance unknown → replace \(\sigma\) with the sample standard deviation \(s\) and use the \(t\)-distribution for moderate \(n\); the \(t\)-distribution approaches the normal as \(n\) grows.
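The unknown-variance case looks like this in practice. The data and the critical value below are illustrative assumptions: \(t_{0.975,\,9} \approx 2.262\) is the 95% two-sided critical value for 9 degrees of freedom:

```python
import math
import statistics

# Hypothetical sample of n = 10 measurements.
sample = [4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.4, 5.1, 4.9]
n = len(sample)

xbar = statistics.mean(sample)
s = statistics.stdev(sample)   # sample sd (n - 1 denominator)
t_crit = 2.262                 # t critical value, 95%, df = 9
half_width = t_crit * s / math.sqrt(n)
print(round(xbar, 2), round(half_width, 2))
```

For large \(n\) the critical value would be replaced by the normal 1.96, since \(t \to N(0,1)\) as the degrees of freedom grow.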
👀 Patterns to Recognize
Sum of many independent terms → look for a CLT cue.
Presence of \(\sqrt{n}\) in the denominator → classic sign of a normalized CLT expression.
Tail‑control statements (e.g., “\(|X_i-\mu_i| > \varepsilon s_n\)”) → Lindeberg condition is being invoked.
Covariance matrix \(\Sigma\) in a limit → multivariate CLT is in play.
🗂️ Exam Traps
Choosing “\(n>30\)” as a guarantee – exam may present a highly skewed distribution; answer will note that the rule is not universal.
Confusing CLT with LLN – a distractor may claim CLT “ensures the sample mean equals the population mean”; correct answer emphasizes distribution shape, not convergence to the mean.
Using CLT when variance is infinite – a common wrong choice; the correct response points out the need for a stable law instead.
Applying classical CLT to non‑i.i.d. data – a trap; the right answer will invoke Lyapunov or Lindeberg conditions.
---
Study tip: Memorize the two core formulas (classical CLT and Lyapunov condition) and the implication chain Lyapunov ⇒ Lindeberg ⇒ CLT. When a problem mentions “different distributions” or “different variances,” immediately think Lyapunov/Lindeberg rather than the classical statement.