Normal Distribution Study Guide
📖 Core Concepts
Normal density:
$f(x)=\dfrac{1}{\sqrt{2\pi\sigma^{2}}}\exp\!\Bigl[-\dfrac{(x-\mu)^{2}}{2\sigma^{2}}\Bigr]$;
$\mu$ = mean = median = mode, $\sigma^{2}$ = variance, $\sigma=\sqrt{\sigma^{2}}$.
Standard normal: special case $\mu=0,\;\sigma^{2}=1$, density $\phi(z)=\dfrac{1}{\sqrt{2\pi}}e^{-z^{2}/2}$.
Standardization: $Z=\dfrac{X-\mu}{\sigma}\sim N(0,1)$; reverse $X=\mu+\sigma Z$.
CDF: $\Phi(x)=\int_{-\infty}^{x}\phi(t)\,dt=\tfrac12[1+\operatorname{erf}(x/\sqrt2)]$.
For $X\sim N(\mu,\sigma^{2})$, $F_X(x)=\Phi\!\bigl(\tfrac{x-\mu}{\sigma}\bigr)$.
Q‑function: $Q(x)=1-\Phi(x)=\int_{x}^{\infty}\phi(t)\,dt$.
Empirical (68‑95‑99.7) rule:
$P(|X-\mu|\le\sigma)=0.6827$, $P(|X-\mu|\le2\sigma)=0.9545$, $P(|X-\mu|\le3\sigma)=0.9973$.
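These three probabilities follow directly from the erf identity for the CDF, so they are easy to verify in code — a quick sketch using only the standard library:

```python
import math

def central_prob(k: float) -> float:
    """P(|X - mu| <= k*sigma) for X ~ N(mu, sigma^2), via P = erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"{k} sigma: {central_prob(k):.4f}")  # 0.6827, 0.9545, 0.9973
```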
Linear transformation: $Y=aX+b\;\Rightarrow\;Y\sim N(a\mu+b,\;a^{2}\sigma^{2})$.
Sum/Difference of independent normals:
$X+Y\sim N(\mu_X+\mu_Y,\;\sigma_X^{2}+\sigma_Y^{2})$,
$X-Y\sim N(\mu_X-\mu_Y,\;\sigma_X^{2}+\sigma_Y^{2})$.
Moment‑generating function: $M_X(t)=\exp\!\bigl(\mu t+\tfrac12\sigma^{2}t^{2}\bigr)$.
Characteristic function: $\varphi_X(t)=\exp\!\bigl(i\mu t-\tfrac12\sigma^{2}t^{2}\bigr)$.
CLT: $\displaystyle\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-\mu)\xrightarrow{d}N(0,\sigma^{2})$ for i.i.d. $X_i$ with mean $\mu$ and finite variance $\sigma^{2}$.
Parameter estimation (normal data):
$\hat\mu=\bar{x}=\frac1n\sum x_i$,
$\hat\sigma^{2}_{\text{MLE}}=\frac1n\sum (x_i-\bar{x})^{2}$,
unbiased $s^{2}=\frac1{n-1}\sum (x_i-\bar{x})^{2}$.
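The only difference between the two variance estimators is the divisor; a minimal sketch with an illustrative sample, cross-checked against the standard library's `statistics` module (which implements exactly these two formulas):

```python
import statistics

data = [4.8, 5.1, 5.0, 4.9, 5.3, 4.7]  # illustrative sample, assumed roughly normal
n = len(data)
xbar = sum(data) / n

ss = sum((x - xbar) ** 2 for x in data)
var_mle = ss / n             # biased MLE: divides by n
var_unbiased = ss / (n - 1)  # sample variance s^2: divides by n - 1

# statistics.pvariance / statistics.variance use the n and n-1 divisors respectively
assert abs(var_mle - statistics.pvariance(data)) < 1e-9
assert abs(var_unbiased - statistics.variance(data)) < 1e-9
print(var_mle, var_unbiased)
```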
Confidence interval for $\mu$ (unknown $\sigma$):
$\displaystyle \bar{x}\pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$ (Student‑$t$).
For $\sigma^{2}$: chi‑square limits
$\displaystyle \frac{(n-1)s^{2}}{\chi^{2}_{\alpha/2,\,n-1}}<\sigma^{2}<\frac{(n-1)s^{2}}{\chi^{2}_{1-\alpha/2,\,n-1}}$.
Bayesian conjugate priors:
Known $\sigma^{2}$ → normal prior for $\mu$, posterior normal with precision‑weighted mean.
Known $\mu$ → scaled‑inverse‑$\chi^{2}$ prior for $\sigma^{2}$.
Both unknown → normal–inverse‑gamma prior; the posterior is again normal–inverse‑gamma.
📌 Must Remember
Scaling: if $Z\sim N(0,1)$, then $X=\mu+\sigma Z$ has density $\frac1\sigma\phi\bigl(\frac{x-\mu}{\sigma}\bigr)$ — the $1/\sigma$ factor keeps the total area equal to 1.
$\Phi^{-1}(p)=\sqrt{2}\,\operatorname{erf}^{-1}(2p-1)$ (probit function).
$z_{0.975}=1.96$ is the critical value for a two‑sided 5 % test.
All odd central moments of a normal are zero.
Only the first two cumulants are non‑zero: $\kappa_{1}=\mu$, $\kappa_{2}=\sigma^{2}$.
The $t_{n-1}$ and $\chi^{2}_{n-1}$ pivots work because $\bar{x}$ and $s^{2}$ are independent for normal data (Cochran’s theorem).
In linear combinations $\sum a_i X_i$ of independent normals, variance adds as $\sum a_i^{2}\sigma_i^{2}$ (no covariance terms under independence).
🔄 Key Processes
Standardizing a value
Compute $z=\dfrac{x-\mu}{\sigma}$.
Use $\Phi(z)$ (tables/software) for $P(X\le x)$.
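The two steps above can be sketched in Python with the standard library's `statistics.NormalDist` supplying $\Phi$; the parameters and value here are illustrative:

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0  # assumed population parameters (illustrative)
x = 130.0

z = (x - mu) / sigma        # step 1: standardize
p = NormalDist().cdf(z)     # step 2: Phi(z) = P(X <= x)

# equivalently, skip standardizing and query the general normal directly
assert abs(p - NormalDist(mu, sigma).cdf(x)) < 1e-12
print(f"z = {z}, P(X <= {x}) = {p:.4f}")
```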
Finding a two‑sided CI for $\mu$
Compute $\bar{x}$ and $s$.
Look up $t_{\alpha/2,\,n-1}$.
Apply $\bar{x}\pm t\,s/\sqrt{n}$.
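A minimal sketch of these steps, on an illustrative sample. The standard library has no Student‑$t$ quantile, so this uses the normal quantile, which approximates $t_{\alpha/2,\,n-1}$ well for large $n$; for small $n$, substitute the exact value from a table or `scipy.stats.t.ppf`:

```python
import math
from statistics import NormalDist

data = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3]  # illustrative sample
n = len(data)
xbar = sum(data) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))

alpha = 0.05
crit = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96; replace with t quantile for small n
half = crit * s / math.sqrt(n)
print(f"95% CI for mu: {xbar:.3f} +/- {half:.3f}")
```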
Updating a normal mean posterior (known variance)
Compute prior precision $\tau_0=1/\sigma_0^{2}$ and data precision $n\tau$, where $\tau=1/\sigma^{2}$.
Posterior mean $\mu_n=(\tau_0\mu_0+n\tau\bar{x})/(\tau_0+n\tau)$.
Posterior precision $\tau_n=\tau_0+n\tau$.
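The precision-weighted update above is a one-liner in code; a sketch with illustrative prior and data values:

```python
def posterior_normal_mean(mu0, sigma0_sq, xbar, n, sigma_sq):
    """Conjugate update for a normal mean with known data variance sigma_sq.

    Prior: mu ~ N(mu0, sigma0_sq). Returns (posterior mean, posterior variance).
    """
    tau0 = 1.0 / sigma0_sq   # prior precision
    tau_data = n / sigma_sq  # data precision n * tau
    tau_n = tau0 + tau_data  # posterior precision
    mu_n = (tau0 * mu0 + tau_data * xbar) / tau_n
    return mu_n, 1.0 / tau_n

# illustrative numbers: a vague prior gets pulled almost entirely to the sample mean
mu_n, var_n = posterior_normal_mean(mu0=0.0, sigma0_sq=100.0, xbar=5.0, n=25, sigma_sq=4.0)
print(mu_n, var_n)
```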
Generating two independent $N(0,1)$ variables (Box–Muller)
Draw independent $U,V\sim U(0,1)$.
Compute $Z_1=\sqrt{-2\ln U}\cos(2\pi V)$, $Z_2=\sqrt{-2\ln U}\sin(2\pi V)$.
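A direct implementation of these two steps (a sketch; the `1 - u` guard is an implementation detail to avoid `log(0)`, since `random()` can return exactly 0):

```python
import math
import random

def box_muller(rng: random.Random) -> tuple[float, float]:
    """Return two independent N(0,1) draws from two independent U(0,1) draws."""
    u = rng.random()
    v = rng.random()
    r = math.sqrt(-2.0 * math.log(1.0 - u))  # 1 - u avoids log(0)
    return r * math.cos(2 * math.pi * v), r * math.sin(2 * math.pi * v)

rng = random.Random(42)
samples = [z for _ in range(50_000) for z in box_muller(rng)]
mean = sum(samples) / len(samples)
var = sum(z * z for z in samples) / len(samples) - mean ** 2
print(round(mean, 2), round(var, 2))  # should be close to 0 and 1
```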
Applying CLT to approximate a binomial
For $B(n,p)$, set $\mu=np$, $\sigma=\sqrt{np(1-p)}$.
Approximate $P(B\le k)\approx\Phi\!\bigl((k+0.5-\mu)/\sigma\bigr)$ (continuity correction).
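A sketch comparing the continuity-corrected approximation against the exact binomial CDF, with illustrative parameters:

```python
import math
from statistics import NormalDist

n, p, k = 40, 0.3, 10  # illustrative: P(B <= 10) for B ~ Binomial(40, 0.3)
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

approx = NormalDist().cdf((k + 0.5 - mu) / sigma)  # CLT with continuity correction
exact = sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))
print(f"approx {approx:.4f} vs exact {exact:.4f}")
```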
🔍 Key Comparisons
Standard normal vs. General normal
$Z\sim N(0,1)$ vs. $X\sim N(\mu,\sigma^{2})$; $X=\mu+\sigma Z$.
MLE variance vs. unbiased variance
$\hat\sigma^{2}_{\text{MLE}}=\frac1n\sum (x_i-\bar{x})^{2}$ (biased, lower MSE)
$s^{2}=\frac1{n-1}\sum (x_i-\bar{x})^{2}$ (unbiased).
t‑test vs. Z‑test
Use $t$ when $\sigma$ unknown and $n$ small (Student‑$t$).
Use $Z$ when $\sigma$ known or $n$ large (normal approx).
Box–Muller vs. Marsaglia Polar
Box–Muller needs one log, one square root, and two trig evaluations per pair; the polar method replaces the trig calls with a rejection step, discarding roughly 21 % of uniform pairs.
Normal–Inverse‑Gamma vs. Separate Normal & Inverse‑Gamma
Joint prior couples $\mu$ and $\sigma^{2}$; separate priors treat them independently (not conjugate).
⚠️ Common Misunderstandings
Misreading $N(\mu,\sigma^{2})$ – the second parameter is the variance $\sigma^{2}$, not the standard deviation $\sigma$.
Using $\Phi$ tables for non‑standard normal – always standardize first.
Assuming independence of $\hat\mu$ and $s^{2}$ holds for any distribution – true only for normal data.
Treating the empirical rule as exact – it’s an approximation; exact probabilities differ slightly (see exact values above).
Thinking the CLT guarantees normality for any sample size – convergence can be slow for highly skewed parent distributions; check with normality tests.
🧠 Mental Models / Intuition
“Stretch‑and‑Shift”: Think of the standard bell curve as a rubber sheet; stretching horizontally by $\sigma$ widens it, then sliding left/right by $\mu$ positions the center.
Standardization as “z‑score”: Measures how many σ’s a value is from its mean; a universal ruler for any normal.
Sum of normals → normal: Adding independent “noisy” quantities never creates new shapes; the noise just adds up.
CLT “averaging out”: Many small, independent pushes in different directions produce a symmetric bell‑shaped result, regardless of the original shape.
🚩 Exceptions & Edge Cases
Sum of dependent normals – variance adds plus $2\operatorname{Cov}(X,Y)$; independence is required for simple variance addition.
Finite‑sample $t$ vs. large‑sample $Z$ – for $n\le30$, $t$ critical values differ noticeably from $Z$.
Normality tests with estimated parameters – Lilliefors adjusts KS critical values; standard KS tables are invalid.
Generating normals with the CLT approximation – 12‑uniform‑sum method is only an approximation; tails are lighter than true normal.
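The truncation is easy to see in code: the 12-uniform sum (Irwin–Hall approximation) can never exceed $\pm 6$, so the extreme tails have probability exactly zero, unlike a true normal — a quick sketch:

```python
import random

def approx_normal(rng: random.Random) -> float:
    """Irwin-Hall trick: sum of 12 U(0,1) draws minus 6 is approximately N(0,1)."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(0)
draws = [approx_normal(rng) for _ in range(100_000)]
# every draw lies in [-6, 6]; a true normal has unbounded support
print(min(draws), max(draws))
```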
📍 When to Use Which
Exact normal probabilities → use $\Phi$ (or software) after standardizing.
Approximate binomial/Poisson probabilities → apply CLT with continuity correction.
Confidence interval for mean, unknown σ → Student‑$t$ interval.
Confidence interval for variance → chi‑square interval.
Bayesian update of mean, known σ → normal‑conjugate prior (use precision‑weighted average).
Generating random normals in code → prefer Box–Muller (simple) or Marsaglia Polar (no trig) unless a library provides randn.
👀 Patterns to Recognize
“σ appears under a square root” in standard errors: $\text{SE} = \sigma/\sqrt{n}$.
Quadratic exponent $-\tfrac12(\cdot)^2$ signals a normal density.
Even‑order central moments proportional to $\sigma^{2k}$; odd moments zero.
Linear combination → normal: whenever you see $\sum a_i X_i$ with independent normals, the result is normal.
“$+$” in exponent of MGF/characteristic function → mean term; “$-$” quadratic term → variance.
🗂️ Exam Traps
Choosing $z_{0.975}=1.96$ for a one‑sided test – the correct one‑sided 5 % critical value is $z_{0.95}=1.645$.
Using $n$ instead of $n-1$ in sample variance for an unbiased CI – leads to under‑coverage.
Applying the empirical 68‑95‑99.7 rule to a non‑normal dataset – the rule only holds for exact normals.
Confusing $Q(x)$ with $\Phi(x)$ – $Q(x)=1-\Phi(x)$ is the upper‑tail probability, not the lower.
Assuming independence when summing normals with common source – hidden covariance inflates variance.