Expected Value Study Guide
📖 Core Concepts
Expected value (EV) – weighted average of a random variable’s outcomes.
Discrete: $E[X]=\sum_i x_i p_i$.
Continuous: $E[X]=\int_{-\infty}^{\infty} x\,f(x)\,dx$.
Existence – the series (or integral) must converge absolutely; otherwise $X$ has no finite expectation.
Positive/negative parts – $X^{+}= \max(X,0)$, $X^{-}= -\min(X,0)$.
$E[X]=E[X^{+}]-E[X^{-}]$ (Lebesgue definition).
$E[X]$ finite ⇔ $E[|X|]<\infty$.
Linearity – $E[aX+bY]=aE[X]+bE[Y]$ (when expectations exist).
Monotonicity – If $X\le Y$ a.s., then $E[X]\le E[Y]$.
Indicator variable – $1_A$ satisfies $E[1_A]=P(A)$.
Variance – $\operatorname{Var}(X)=E[X^{2}]-(E[X])^{2}$.
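The variance identity above can be checked exactly with rational arithmetic; a minimal sketch, using a fair six-sided die as an illustrative example:

```python
from fractions import Fraction

# Illustrative example: Var(X) = E[X^2] - (E[X])^2 for a fair six-sided die.
p = Fraction(1, 6)
ev = sum(x * p for x in range(1, 7))       # E[X]   = 7/2
ev2 = sum(x * x * p for x in range(1, 7))  # E[X^2] = 91/6
var = ev2 - ev ** 2
print(var)  # 35/12
```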
---
📌 Must Remember
$E[1_A]=P(A)$.
$|E[X]|\le E[|X|]$.
Markov: $P(X\ge a)\le \dfrac{E[X]}{a}$ for $X\ge0$.
Chebyshev: $P(|X-E[X]|\ge k\sigma)\le \dfrac1{k^{2}}$.
Jensen: convex $f$ → $f(E[X])\le E[f(X)]$.
Hölder: $E[|XY|]\le (E[|X|^{p}])^{1/p}(E[|Y|^{q}])^{1/q}$, $\frac1p+\frac1q=1$.
Minkowski: $(E[|X+Y|^{p}])^{1/p}\le (E[|X|^{p}])^{1/p}+(E[|Y|^{p}])^{1/p}$.
Sample mean $\bar X$ is unbiased: $E[\bar X]=E[X]$.
Characteristic function: $\varphi_X(t)=E[e^{itX}]$, $E[X]=i^{-1}\varphi_X'(0)$.
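Hölder with $p=q=2$ is the Cauchy–Schwarz inequality; a quick numeric sanity check on a small finite joint distribution (the paired values below are arbitrary, equal weight $1/4$):

```python
# Sanity check of Hölder with p = q = 2 (Cauchy–Schwarz):
# E[|XY|] <= sqrt(E[X^2]) * sqrt(E[Y^2]) on an arbitrary finite distribution.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 3.0]
p = 0.25  # each (x_i, y_i) pair has probability 1/4

e_xy = sum(abs(x * y) * p for x, y in zip(xs, ys))                 # 7.0
bound = (sum(x * x * p for x in xs) ** 0.5) * \
        (sum(y * y * p for y in ys) ** 0.5)                        # 7.5
print(e_xy <= bound)  # True
```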
---
🔄 Key Processes
Compute EV for a discrete r.v.
List outcomes $x_i$ and probabilities $p_i$.
Multiply each $x_i$ by $p_i$ and sum the products.
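These two steps, sketched for a fair six-sided die (an assumed example, computed exactly with rationals):

```python
from fractions import Fraction

# Step 1: list outcomes x_i and probabilities p_i (fair die).
outcomes = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

# Step 2: E[X] = sum_i x_i * p_i
ev = sum(x * p for x, p in zip(outcomes, probs))
print(ev)  # 7/2
```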
Compute EV for a continuous r.v.
Identify pdf $f(x)$.
Integrate $x f(x)$ over the support.
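The integration step can be sketched numerically with a midpoint Riemann sum, assuming an exponential pdf $f(x)=\lambda e^{-\lambda x}$ on $x\ge 0$, whose exact EV is $1/\lambda$:

```python
import math

# Assumed example: exponential pdf f(x) = lam * exp(-lam * x), x >= 0.
lam = 2.0

def pdf(x):
    return lam * math.exp(-lam * x)

# Midpoint Riemann sum of x * f(x) over [0, 25]; the tail beyond is negligible.
dx = 1e-4
n = int(25 / dx)
ev = sum((i * dx + dx / 2) * pdf(i * dx + dx / 2) * dx for i in range(n))
print(round(ev, 6))  # ~0.5, matching 1/lam
```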
Check existence of EV
Verify absolute convergence of $\sum_i |x_i|\,p_i$ (discrete) or $\int |x|\,f(x)\,dx$ (continuous).
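A sketch of the divergent case, using the St. Petersburg payoffs $x_i=2^i$ with $p_i=2^{-i}$: every term of $\sum_i |x_i|\,p_i$ equals 1, so the partial sums grow without bound and no finite EV exists.

```python
# St. Petersburg payoffs: x_i = 2**i, p_i = 2**(-i).
# Each term |x_i| * p_i equals 1, so the absolute series diverges.
def partial_sum(n):
    return sum((2 ** i) * (2 ** -i) for i in range(1, n + 1))

print(partial_sum(10), partial_sum(100))  # 10.0 100.0 — grows without bound
```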
Apply Markov/Chebyshev
Choose $a$ (or $k\sigma$) → plug into inequality.
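This process sketched on a fair die (an assumed example; Chebyshev is used in the equivalent form $P(|X-E[X]|\ge t)\le \operatorname{Var}(X)/t^2$ with $t=k\sigma$):

```python
from fractions import Fraction

# Assumed example: Markov and Chebyshev bounds for a fair six-sided die.
p = Fraction(1, 6)
outcomes = range(1, 7)
ev = sum(x * p for x in outcomes)                 # E[X]   = 7/2
var = sum(x * x * p for x in outcomes) - ev ** 2  # Var(X) = 35/12

# Markov (X >= 0): P(X >= 5) <= E[X]/5
markov_tail = sum(p for x in outcomes if x >= 5)  # 1/3
print(markov_tail <= ev / 5)                      # True: 1/3 <= 7/10

# Chebyshev with t = 2: P(|X - E[X]| >= 2) <= Var(X)/4
cheb_tail = sum(p for x in outcomes if abs(x - ev) >= 2)  # P({1, 6}) = 1/3
print(cheb_tail <= var / 4)                       # True: 1/3 <= 35/48
```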
Use Jensen
Identify convex (or concave) $f$.
Compare $f(E[X])$ with $E[f(X)]$.
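A sketch of the Jensen comparison, assuming the convex $f(x)=e^x$ and a fair coin taking values 0 and 1:

```python
import math

# Assumed example: Jensen for convex f(x) = e^x on a fair 0/1 coin.
values, probs = [0.0, 1.0], [0.5, 0.5]

mean = sum(v * p for v, p in zip(values, probs))                 # E[X] = 0.5
f_of_mean = math.exp(mean)                                       # e^0.5 ~ 1.649
mean_of_f = sum(math.exp(v) * p for v, p in zip(values, probs))  # (1+e)/2 ~ 1.859

print(f_of_mean <= mean_of_f)  # True: f(E[X]) <= E[f(X)]
```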
---
🔍 Key Comparisons
Markov vs. Chebyshev
Markov: works for any non‑negative $X$, bounds $P(X\ge a)$.
Chebyshev: uses variance, bounds deviation from the mean $P(|X-E[X]|\ge k\sigma)$.
Discrete vs. Continuous EV
Discrete: sum $\sum_i x_i p_i$.
Continuous: integral $\int x f(x)dx$.
Linear vs. Non‑linear Transformations
Linearity: $E[aX+b]=aE[X]+b$.
Non‑linear: need Jensen or moment‑generating techniques.
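Linearity can be checked exactly; a minimal sketch with $a=3$, $b=2$ and a fair die (illustrative values):

```python
from fractions import Fraction

# Illustrative check of E[aX + b] = a*E[X] + b on a fair die, a = 3, b = 2.
p = Fraction(1, 6)
a, b = 3, 2

lhs = sum((a * x + b) * p for x in range(1, 7))  # E[3X + 2]
rhs = a * sum(x * p for x in range(1, 7)) + b    # 3*E[X] + 2
print(lhs, rhs)  # 25/2 25/2
```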
---
⚠️ Common Misunderstandings
“Expectation always exists.” → False; divergence of the series/integral gives no finite EV.
Confusing $E[X^2]$ with $\operatorname{Var}(X)$. → Variance = $E[X^2]-(E[X])^2$, not just $E[X^2]$.
Applying Markov to variables that can be negative. → Markov requires $X\ge0$.
Assuming Jensen works for any function. → It only holds for convex (or concave) $f$.
---
🧠 Mental Models / Intuition
EV = “center of mass” of the probability distribution. Think of each outcome as a weight placed at $x_i$; the EV is where the balance point lies.
Markov: If the average height is $h$, the fraction of people taller than $2h$ can’t exceed $1/2$.
Chebyshev: Most data cluster within a few standard deviations; the farther you go, the fewer points can be there (inverse‑square law).
Hölder/Minkowski: Generalizations of Cauchy‑Schwarz; they tell you how “size” (norm) behaves under multiplication/addition.
---
🚩 Exceptions & Edge Cases
Non‑absolute convergence → the sum/integral may converge conditionally (or as a symmetric principal value), but the EV is still undefined. E.g., for the Cauchy distribution the principal value is 0, yet $E[|X|]=\infty$, so $E[X]$ does not exist.
Infinite variance – Chebyshev requires finite variance; heavy‑tailed distributions (Cauchy) invalidate it.
Indicator expectation – works for any event, even if the event has probability 0 or 1.
---
📍 When to Use Which
Use Markov when you have a non‑negative r.v. and only the mean is known.
Use Chebyshev when variance is known and you need a bound on deviation from the mean.
Use Jensen to bound expectations of convex/concave transformations (e.g., $E[e^{X}]$).
Use Hölder for products of random variables when you have $p$‑th and $q$‑th moments.
Use Minkowski to bound the $p$‑norm of a sum (useful in $L^p$ spaces).
---
👀 Patterns to Recognize
“Mean of a sum = sum of means” → whenever a problem involves $E[X+Y]$, immediately apply linearity.
“Bound on tail probability” → look for $P(X\ge a)$ or $P(|X-E[X]|\ge t)$ → think Markov or Chebyshev.
“Convex function of a r.v.” → Jensen is the go‑to inequality.
“Product of random variables” → check Hölder if moments of each factor are known.
---
🗂️ Exam Traps
Choosing the wrong inequality: Using Markov on a variable that can be negative yields an invalid bound.
Dropping the absolute value in Markov: the plain form $P(X\ge a)\le E[X]/a$ requires $X\ge0$; for a signed variable, apply Markov to $|X|$: $P(|X|\ge a)\le E[|X|]/a$.
Confusing unbiasedness with consistency: the sample mean is unbiased and, by the LLN, also consistent; but unbiasedness alone says nothing about how close $\bar X$ is to the mean for any finite $n$.
Miscalculating variance: don’t forget the square on the mean term: $\operatorname{Var}(X)=E[X^2]-(E[X])^2$, not $E[X]^2-E[X^2]$.
Assuming Jensen works both ways: For concave $f$, inequality reverses: $f(E[X])\ge E[f(X)]$.
---