Normal Distribution Study Guide
📖 Core Concepts
Normal density:
$f(x)=\dfrac{1}{\sqrt{2\pi\sigma^{2}}}\exp\!\Bigl[-\dfrac{(x-\mu)^{2}}{2\sigma^{2}}\Bigr]$;
$\mu$ = mean = median = mode, $\sigma^{2}$ = variance, $\sigma=\sqrt{\sigma^{2}}$.
Standard normal: special case $\mu=0,\;\sigma^{2}=1$, density $\phi(z)=\dfrac{1}{\sqrt{2\pi}}e^{-z^{2}/2}$.
Standardization: $Z=\dfrac{X-\mu}{\sigma}\sim N(0,1)$; reverse $X=\mu+\sigma Z$.
CDF: $\Phi(x)=\int_{-\infty}^{x}\phi(t)\,dt=\tfrac12[1+\operatorname{erf}(x/\sqrt2)]$.
For $X\sim N(\mu,\sigma^{2})$, $F_X(x)=\Phi\!\bigl(\tfrac{x-\mu}{\sigma}\bigr)$.
Q‑function: $Q(x)=1-\Phi(x)=\int_{x}^{\infty}\phi(t)\,dt$.
Empirical (68‑95‑99.7) rule:
$P(|X-\mu|\le\sigma)=0.6827$, $P(|X-\mu|\le2\sigma)=0.9545$, $P(|X-\mu|\le3\sigma)=0.9973$.
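These three probabilities follow directly from the erf identity for the CDF, so they are easy to verify in code — a quick sketch using only the standard library:

```python
import math

def central_prob(k: float) -> float:
    """P(|X - mu| <= k*sigma) for X ~ N(mu, sigma^2), via P = erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"{k} sigma: {central_prob(k):.4f}")  # 0.6827, 0.9545, 0.9973
```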
Linear transformation: $Y=aX+b\;\Rightarrow\;Y\sim N(a\mu+b,\;a^{2}\sigma^{2})$.
Sum/Difference of independent normals:
$X+Y\sim N(\mu_X+\mu_Y,\;\sigma_X^{2}+\sigma_Y^{2})$,
$X-Y\sim N(\mu_X-\mu_Y,\;\sigma_X^{2}+\sigma_Y^{2})$.
Moment‑generating function: $M_X(t)=\exp\!\bigl(\mu t+\tfrac12\sigma^{2}t^{2}\bigr)$.
Characteristic function: $\varphi_X(t)=\exp\!\bigl(i\mu t-\tfrac12\sigma^{2}t^{2}\bigr)$.
CLT: $\displaystyle\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-\mu)\xrightarrow{d}N(0,\sigma^{2})$ for i.i.d. $X_i$ with mean $\mu$ and finite variance $\sigma^{2}$.
Parameter estimation (normal data):
$\hat\mu=\bar{x}=\frac1n\sum x_i$,
$\hat\sigma^{2}_{\text{MLE}}=\frac1n\sum (x_i-\bar{x})^{2}$,
unbiased $s^{2}=\frac1{n-1}\sum (x_i-\bar{x})^{2}$.
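The only difference between the two variance estimators is the divisor; a minimal sketch with an illustrative sample, cross-checked against the standard library's `statistics` module (which implements exactly these two formulas):

```python
import statistics

data = [4.8, 5.1, 5.0, 4.9, 5.3, 4.7]  # illustrative sample, assumed roughly normal
n = len(data)
xbar = sum(data) / n

ss = sum((x - xbar) ** 2 for x in data)
var_mle = ss / n             # biased MLE: divides by n
var_unbiased = ss / (n - 1)  # sample variance s^2: divides by n - 1

# statistics.pvariance / statistics.variance use the n and n-1 divisors respectively
assert abs(var_mle - statistics.pvariance(data)) < 1e-9
assert abs(var_unbiased - statistics.variance(data)) < 1e-9
print(var_mle, var_unbiased)
```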
Confidence interval for $\mu$ (unknown $\sigma$):
$\displaystyle \bar{x}\pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$ (Student‑$t$).
For $\sigma^{2}$: chi‑square limits
$\displaystyle \frac{(n-1)s^{2}}{\chi^{2}_{\alpha/2,\,n-1}}<\sigma^{2}<\frac{(n-1)s^{2}}{\chi^{2}_{1-\alpha/2,\,n-1}}$.
Bayesian conjugate priors:
Known $\sigma^{2}$ → normal prior for $\mu$, posterior normal with precision‑weighted mean.
Known $\mu$ → scaled‑inverse‑$\chi^{2}$ prior for $\sigma^{2}$.
Both unknown → normal–inverse‑gamma prior; the posterior is again normal–inverse‑gamma.
📌 Must Remember
Scaling: if $Z\sim N(0,1)$, then $X=\mu+\sigma Z$ has density $\frac1\sigma\phi\bigl(\frac{x-\mu}{\sigma}\bigr)$ — the $1/\sigma$ factor keeps the total area equal to 1.
$\Phi^{-1}(p)=\sqrt{2}\,\operatorname{erf}^{-1}(2p-1)$ (probit function).
$z_{0.975}=1.96$ is the critical value for a two‑sided 5 % test.
All odd central moments of a normal are zero.
Only the first two cumulants are non‑zero: $\kappa_{1}=\mu$, $\kappa_{2}=\sigma^{2}$.
The $t_{n-1}$ and $\chi^{2}_{n-1}$ pivots work because $\bar{x}$ and $s^{2}$ are independent for normal data (Cochran’s theorem).
In linear combinations $\sum a_i X_i$ of independent normals, variance adds as $\sum a_i^{2}\sigma_i^{2}$ (no covariance terms under independence).
🔄 Key Processes
Standardizing a value
Compute $z=\dfrac{x-\mu}{\sigma}$.
Use $\Phi(z)$ (tables/software) for $P(X\le x)$.
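The two steps above can be sketched in Python with the standard library's `statistics.NormalDist` supplying $\Phi$; the parameters and value here are illustrative:

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0  # assumed population parameters (illustrative)
x = 130.0

z = (x - mu) / sigma        # step 1: standardize
p = NormalDist().cdf(z)     # step 2: Phi(z) = P(X <= x)

# equivalently, skip standardizing and query the general normal directly
assert abs(p - NormalDist(mu, sigma).cdf(x)) < 1e-12
print(f"z = {z}, P(X <= {x}) = {p:.4f}")
```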
Finding a two‑sided CI for $\mu$
Compute $\bar{x}$ and $s$.
Look up $t_{\alpha/2,\,n-1}$.
Apply $\bar{x}\pm t\,s/\sqrt{n}$.
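A minimal sketch of these steps, on an illustrative sample. The standard library has no Student‑$t$ quantile, so this uses the normal quantile, which approximates $t_{\alpha/2,\,n-1}$ well for large $n$; for small $n$, substitute the exact value from a table or `scipy.stats.t.ppf`:

```python
import math
from statistics import NormalDist

data = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3]  # illustrative sample
n = len(data)
xbar = sum(data) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))

alpha = 0.05
crit = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96; replace with t quantile for small n
half = crit * s / math.sqrt(n)
print(f"95% CI for mu: {xbar:.3f} +/- {half:.3f}")
```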
Updating a normal mean posterior (known variance)
Compute prior precision $\tau_0=1/\sigma_0^{2}$ and data precision $n\tau$, where $\tau=1/\sigma^{2}$.
Posterior mean $\mu_n=(\tau_0\mu_0+n\tau\bar{x})/(\tau_0+n\tau)$.
Posterior precision $\tau_n=\tau_0+n\tau$.
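The precision-weighted update above is a one-liner in code; a sketch with illustrative prior and data values:

```python
def posterior_normal_mean(mu0, sigma0_sq, xbar, n, sigma_sq):
    """Conjugate update for a normal mean with known data variance sigma_sq.

    Prior: mu ~ N(mu0, sigma0_sq). Returns (posterior mean, posterior variance).
    """
    tau0 = 1.0 / sigma0_sq   # prior precision
    tau_data = n / sigma_sq  # data precision n * tau
    tau_n = tau0 + tau_data  # posterior precision
    mu_n = (tau0 * mu0 + tau_data * xbar) / tau_n
    return mu_n, 1.0 / tau_n

# illustrative numbers: a vague prior gets pulled almost entirely to the sample mean
mu_n, var_n = posterior_normal_mean(mu0=0.0, sigma0_sq=100.0, xbar=5.0, n=25, sigma_sq=4.0)
print(mu_n, var_n)
```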
Generating two independent $N(0,1)$ variables (Box–Muller)
Draw independent $U,V\sim U(0,1)$.
Compute $Z_1=\sqrt{-2\ln U}\cos(2\pi V)$, $Z_2=\sqrt{-2\ln U}\sin(2\pi V)$.
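A direct implementation of these two steps (a sketch; the `1 - u` guard is an implementation detail to avoid `log(0)`, since `random()` can return exactly 0):

```python
import math
import random

def box_muller(rng: random.Random) -> tuple[float, float]:
    """Return two independent N(0,1) draws from two independent U(0,1) draws."""
    u = rng.random()
    v = rng.random()
    r = math.sqrt(-2.0 * math.log(1.0 - u))  # 1 - u avoids log(0)
    return r * math.cos(2 * math.pi * v), r * math.sin(2 * math.pi * v)

rng = random.Random(42)
samples = [z for _ in range(50_000) for z in box_muller(rng)]
mean = sum(samples) / len(samples)
var = sum(z * z for z in samples) / len(samples) - mean ** 2
print(round(mean, 2), round(var, 2))  # should be close to 0 and 1
```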
Applying CLT to approximate a binomial
For $B(n,p)$, set $\mu=np$, $\sigma=\sqrt{np(1-p)}$.
Approximate $P(B\le k)\approx\Phi\!\bigl((k+0.5-\mu)/\sigma\bigr)$ (continuity correction).
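A sketch comparing the continuity-corrected approximation against the exact binomial CDF, with illustrative parameters:

```python
import math
from statistics import NormalDist

n, p, k = 40, 0.3, 10  # illustrative: P(B <= 10) for B ~ Binomial(40, 0.3)
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

approx = NormalDist().cdf((k + 0.5 - mu) / sigma)  # CLT with continuity correction
exact = sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))
print(f"approx {approx:.4f} vs exact {exact:.4f}")
```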
🔍 Key Comparisons
Standard normal vs. General normal
$Z\sim N(0,1)$ vs. $X\sim N(\mu,\sigma^{2})$; $X=\mu+\sigma Z$.
MLE variance vs. unbiased variance
$\hat\sigma^{2}_{\text{MLE}}=\frac1n\sum (x_i-\bar{x})^{2}$ (biased, lower MSE)
$s^{2}=\frac1{n-1}\sum (x_i-\bar{x})^{2}$ (unbiased).
t‑test vs. Z‑test
Use $t$ when $\sigma$ unknown and $n$ small (Student‑$t$).
Use $Z$ when $\sigma$ known or $n$ large (normal approx).
Box–Muller vs. Marsaglia Polar
Box–Muller needs one log, one square root, and two trig evaluations per pair; the polar method replaces the trig calls with a rejection step, discarding roughly 21 % of uniform pairs.
Normal–Inverse‑Gamma vs. Separate Normal & Inverse‑Gamma
Joint prior couples $\mu$ and $\sigma^{2}$; separate priors treat them independently (not conjugate).
⚠️ Common Misunderstandings
Misreading $N(\mu,\sigma^{2})$ – the second parameter is the variance $\sigma^{2}$, not the standard deviation $\sigma$.
Using $\Phi$ tables for non‑standard normal – always standardize first.
Assuming independence of $\hat\mu$ and $s^{2}$ holds for any distribution – true only for normal data.
Treating the empirical rule as exact – it’s an approximation; exact probabilities differ slightly (see exact values above).
Thinking the CLT guarantees normality for any sample size – convergence can be slow for highly skewed parent distributions; check with normality tests.
🧠 Mental Models / Intuition
“Stretch‑and‑Shift”: Think of the standard bell curve as a rubber sheet; stretching horizontally by $\sigma$ widens it, then sliding left/right by $\mu$ positions the center.
Standardization as “z‑score”: Measures how many σ’s a value is from its mean; a universal ruler for any normal.
Sum of normals → normal: Adding independent “noisy” quantities never creates new shapes; the noise just adds up.
CLT “averaging out”: Many small, independent pushes in different directions produce a symmetric bell‑shaped result, regardless of the original shape.
🚩 Exceptions & Edge Cases
Sum of dependent normals – variance adds plus $2\operatorname{Cov}(X,Y)$; independence is required for simple variance addition.
Finite‑sample $t$ vs. large‑sample $Z$ – for $n\le30$, $t$ critical values differ noticeably from $Z$.
Normality tests with estimated parameters – Lilliefors adjusts KS critical values; standard KS tables are invalid.
Generating normals with the CLT approximation – 12‑uniform‑sum method is only an approximation; tails are lighter than true normal.
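The truncation is easy to see in code: the 12-uniform sum (Irwin–Hall approximation) can never exceed $\pm 6$, so the extreme tails have probability exactly zero, unlike a true normal — a quick sketch:

```python
import random

def approx_normal(rng: random.Random) -> float:
    """Irwin-Hall trick: sum of 12 U(0,1) draws minus 6 is approximately N(0,1)."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(0)
draws = [approx_normal(rng) for _ in range(100_000)]
# every draw lies in [-6, 6]; a true normal has unbounded support
print(min(draws), max(draws))
```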
📍 When to Use Which
Exact normal probabilities → use $\Phi$ (or software) after standardizing.
Approximate binomial/Poisson probabilities → apply CLT with continuity correction.
Confidence interval for mean, unknown σ → Student‑$t$ interval.
Confidence interval for variance → chi‑square interval.
Bayesian update of mean, known σ → normal‑conjugate prior (use precision‑weighted average).
Generating random normals in code → prefer Box–Muller (simple) or Marsaglia Polar (no trig) unless a library provides randn.
👀 Patterns to Recognize
“σ appears under a square root” in standard errors: $\text{SE} = \sigma/\sqrt{n}$.
Quadratic exponent $-\tfrac12(\cdot)^2$ signals a normal density.
Even‑order central moments proportional to $\sigma^{2k}$; odd moments zero.
Linear combination → normal: whenever you see $\sum a_i X_i$ with independent normals, the result is normal.
“$+$” in exponent of MGF/characteristic function → mean term; “$-$” quadratic term → variance.
🗂️ Exam Traps
Choosing $z_{0.975}=1.96$ for a one‑sided test – the correct one‑sided 5 % critical value is $z_{0.95}=1.645$.
Using $n$ instead of $n-1$ in sample variance for an unbiased CI – leads to under‑coverage.
Applying the empirical 68‑95‑99.7 rule to a non‑normal dataset – the rule only holds for exact normals.
Confusing $Q(x)$ with $\Phi(x)$ – $Q(x)=1-\Phi(x)$ is the upper‑tail probability, not the lower.
Assuming independence when summing normals with common source – hidden covariance inflates variance.