Case–control study Study Guide
📖 Core Concepts
Case‑control study – Observational design that starts with outcome status (cases = disease, controls = no disease) and looks backward to compare exposure frequencies.
Odds Ratio (OR) – Primary effect measure; compares the odds of exposure among cases with the odds of exposure among controls.
Rare‑disease assumption – When the disease is uncommon, the OR ≈ relative risk (RR).
Control selection – Controls must come from the same source population that gave rise to the cases and be chosen independently of exposure.
Prospective vs. retrospective –
Prospective: set up the study first, then enroll cases as they are diagnosed during the study period, with controls sampled alongside them.
Retrospective: identify existing cases/controls and look back for past exposures.
Matching / adjustment – Strategies (e.g., matching on age, sex) used to control confounding.
Bias sources – Selection bias, recall (misclassification) bias, loss to follow‑up.
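The rare‑disease assumption above can be checked numerically. The sketch below assumes a hypothetical population in which the true disease risks are known (something a case‑control study never observes directly) and compares the RR with the OR. The function name and the risk values are invented for illustration.

```python
def risk_ratio_and_odds_ratio(risk_exposed, risk_unexposed):
    """Return (RR, OR) given the disease risk in exposed and unexposed people."""
    rr = risk_exposed / risk_unexposed
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return rr, odds_exposed / odds_unexposed

# Hypothetical rare disease: risks of 0.2% vs. 0.1%
rare_rr, rare_or = risk_ratio_and_odds_ratio(0.002, 0.001)

# Hypothetical common disease: risks of 40% vs. 20%
common_rr, common_or = risk_ratio_and_odds_ratio(0.40, 0.20)

print(f"rare:   RR = {rare_rr:.3f}, OR = {rare_or:.3f}")
print(f"common: RR = {common_rr:.3f}, OR = {common_or:.3f}")
```

Both scenarios have RR = 2. The OR stays close to 2 when the disease is rare and drifts to about 2.67 when it is common.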
---
📌 Must Remember
OR formula (exposure odds in cases ÷ exposure odds in controls):
$$OR = \frac{a/c}{b/d} = \frac{ad}{bc}$$
where a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls.
Control‑to‑case ratio: raising it up to about 4:1 improves power; beyond that the gains are small.
Rare disease → OR ≈ RR; otherwise the OR lies further from 1 than the RR and overstates the strength of the association.
Controls ≠ necessarily healthy; they can have other diseases as long as it’s not the disease under study.
Prospective case‑control → generally less prone to recall bias than retrospective.
Logistic regression provides adjusted ORs for multiple covariates.
RR, risk difference (RD), and incidence rates cannot be estimated directly from standard case‑control data, because the investigator fixes the ratio of cases to controls (special designs are the exception).
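As a minimal sketch of the OR formula, the function below uses the same cell labels (a, b, c, d) as the definition in this section. The counts are hypothetical and the function name is illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    if b == 0 or c == 0:
        raise ValueError("OR is undefined when b or c is zero")
    return (a * d) / (b * c)

# Hypothetical counts: 100 cases and 100 controls
a, b, c, d = 40, 20, 60, 80

or_cross_product = odds_ratio(a, b, c, d)   # ad / bc
or_from_odds = (a / c) / (b / d)            # exposure odds in cases / in controls

print(round(or_cross_product, 3), round(or_from_odds, 3))
```

Both expressions give the same value (about 2.67 here), which is why the cross‑product form ad/bc is the usual shortcut.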
---
🔄 Key Processes
Define source population → ensure cases and controls arise from the same base.
Select cases (all with disease of interest).
Select controls (independent of exposure, matched or unmatched).
Determine exposure status for each participant (blinded if possible).
Compute 2 × 2 table → calculate OR.
If multiple confounders → fit logistic regression, obtain adjusted ORs.
Interpret:
OR > 1 → exposure associated with higher odds of disease.
OR = 1 → no association.
OR < 1 → exposure associated with lower odds of disease (possible protective effect).
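The tabulation and interpretation steps can be sketched as follows. The sketch assumes each participant is stored as an `(is_case, is_exposed)` pair. That record layout, the helper names, and the counts are hypothetical.

```python
from collections import Counter

def two_by_two(records):
    """Tabulate (is_case, is_exposed) pairs into the cells a, b, c, d."""
    counts = Counter(records)
    a = counts[(True, True)]     # exposed cases
    b = counts[(False, True)]    # exposed controls
    c = counts[(True, False)]    # unexposed cases
    d = counts[(False, False)]   # unexposed controls
    return a, b, c, d

def interpret(or_value):
    """Translate an OR into the wording used in the interpretation step."""
    if or_value > 1:
        return "exposure associated with higher odds of disease"
    if or_value < 1:
        return "exposure associated with lower odds of disease"
    return "no association"

# Hypothetical participants: 50 cases and 100 controls
records = (
    [(True, True)] * 30 + [(True, False)] * 20
    + [(False, True)] * 25 + [(False, False)] * 75
)

a, b, c, d = two_by_two(records)
or_value = (a * d) / (b * c)
print(a, b, c, d, or_value, interpret(or_value))
```

This yields only the crude OR. Adjusting for confounders still requires a model such as logistic regression.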
---
🔍 Key Comparisons
Case‑control vs. Cohort
Start point: outcome vs. exposure.
Time direction: backward vs. forward.
Efficiency: case‑control is better for rare diseases; cohort is better for common outcomes.
Prospective vs. Retrospective case‑control
Bias: prospective → fewer recall/selection biases.
Timing: prospective follows forward after enrollment; retrospective relies on past records.
Controls vs. Healthy subjects
Controls may have other diseases; they are not required to be completely healthy.
---
⚠️ Common Misunderstandings
“OR equals RR always” – True only under the rare‑disease assumption or special designs.
“Controls must be disease‑free” – They must lack the disease of interest, but can have other conditions.
“More than 4 controls per case always adds power” – Gains plateau after 4:1; extra controls give diminishing returns.
“Case‑control can directly give incidence rates” – Standard designs cannot; incidence requires cohort data.
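The plateau in power from extra controls can be illustrated with the standard error of log(OR). The sketch assumes Woolf's large‑sample approximation, SE = sqrt(1/a + 1/b + 1/c + 1/d), which is a standard formula but is not given in this guide. The counts are hypothetical.

```python
import math

def log_or_standard_error(a, b, c, d):
    """Woolf's large-sample standard error of log(OR) for a 2 x 2 table."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Hypothetical study: 100 cases (40 exposed, 60 unexposed) held fixed,
# with k controls per case and 25% of controls exposed, so OR = 2 for every k.
standard_errors = {}
for k in (1, 2, 4, 8):
    exposed_controls = 25 * k
    unexposed_controls = 75 * k
    standard_errors[k] = log_or_standard_error(
        40, exposed_controls, 60, unexposed_controls
    )

gain_1_to_4 = standard_errors[1] - standard_errors[4]
gain_4_to_8 = standard_errors[4] - standard_errors[8]

for k, se in standard_errors.items():
    print(f"{k}:1 controls per case -> SE(log OR) = {se:.2f}")
```

The standard error falls from about 0.31 at 1:1 to about 0.23 at 4:1, but only to about 0.22 at 8:1. The 1/a + 1/c contribution from the fixed number of cases sets a floor that more controls cannot lower.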
---
🧠 Mental Models / Intuition
“Backwards detective” – Start at the crime scene (the disease) and work backward: compare how often a particular clue (the exposure) turns up among victims (cases) versus non‑victims (controls).
“Odds as a ratio of probabilities” – Think of odds as “how many exposed vs. unexposed” within each group; the OR is simply the ratio of those odds.
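A short worked derivation may make this concrete. It uses the same cell labels as the OR formula (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls) and the general definition of odds for a probability p:

$$\text{odds} = \frac{p}{1-p}$$

Among cases, the probability of exposure is a/(a+c), so the group total cancels:

$$\text{odds}_{\text{cases}} = \frac{a/(a+c)}{c/(a+c)} = \frac{a}{c}, \qquad \text{odds}_{\text{controls}} = \frac{b/(b+d)}{d/(b+d)} = \frac{b}{d}$$

$$OR = \frac{\text{odds}_{\text{cases}}}{\text{odds}_{\text{controls}}} = \frac{a/c}{b/d} = \frac{ad}{bc}$$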
---
🚩 Exceptions & Edge Cases
Nested case‑control and case‑cohort designs allow the OR to estimate RR without needing the rare‑disease assumption.
Matching can introduce over‑matching if controls are matched on variables that are actually part of the exposure pathway, potentially masking true associations.
---
📍 When to Use Which
Use case‑control when the disease is rare or when you need a quick, inexpensive study.
Choose prospective case‑control if you can recruit and follow participants forward, to reduce recall bias.
Apply logistic regression when you have multiple confounders or want adjusted effect estimates.
Opt for a nested case‑control within an existing cohort if you already have a well‑defined source population and want RR‑type interpretation.
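To illustrate what an adjusted OR from logistic regression looks like, here is a small pure‑Python sketch that fits the model by Newton‑Raphson on grouped counts. It stands in for a statistics library, which is what would normally be used. The helper names (`solve_linear`, `fit_logistic`), the binary confounder, and all counts are hypothetical. The counts were chosen so that the OR is exactly 2 within each confounder stratum.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_logistic(groups, n_iter=25):
    """Fit logistic regression to grouped data by Newton-Raphson.

    groups: list of (x, cases, total), where x includes a leading 1
    for the intercept. Returns the fitted coefficients.
    """
    dim = len(groups[0][0])
    beta = [0.0] * dim
    for _ in range(n_iter):
        grad = [0.0] * dim
        hess = [[0.0] * dim for _ in range(dim)]
        for x, cases, total in groups:
            eta = sum(bj * xj for bj, xj in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(dim):
                grad[i] += x[i] * (cases - total * p)
                for j in range(dim):
                    hess[i][j] += total * p * (1 - p) * x[i] * x[j]
        step = solve_linear(hess, grad)
        beta = [bj + sj for bj, sj in zip(beta, step)]
    return beta

# x = (intercept, exposed, confounder); hypothetical counts per group
groups = [
    ((1, 1, 0), 10, 60),    # exposed, confounder absent: 10 cases, 50 controls
    ((1, 0, 0), 20, 220),   # unexposed, confounder absent: 20 cases, 200 controls
    ((1, 1, 1), 80, 120),   # exposed, confounder present: 80 cases, 40 controls
    ((1, 0, 1), 30, 60),    # unexposed, confounder present: 30 cases, 30 controls
]

beta = fit_logistic(groups)
adjusted_or = math.exp(beta[1])          # OR for exposure, adjusted for the confounder
crude_or = (90 * 230) / (90 * 50)        # collapsed 2 x 2 table, ignoring the confounder
print(f"crude OR = {crude_or:.2f}, adjusted OR = {adjusted_or:.2f}")
```

In this constructed example the crude OR is 4.6 while the adjusted OR is 2.0. The confounder, which is more common among both the exposed and the cases, inflates the unadjusted estimate.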
---
👀 Patterns to Recognize
2 × 2 table pattern: high exposure frequency in cases and low in controls → OR > 1.
Recall bias clue: exposure self‑reported after diagnosis, so cases may recall past exposures more thoroughly than controls → suspect over‑estimation of association.
Selection bias clue: controls drawn from a different setting (e.g., hospital patients with unrelated diseases) → may distort OR.
---
🗂️ Exam Traps
Distractor: “RR can be directly calculated from a case‑control study.” – Wrong; only OR is directly estimable (unless special designs).
Distractor: “More than 5 controls per case always improves power.” – Power gains plateau around 4:1; extra controls add little.
Distractor: “Matching eliminates the need for statistical adjustment.” – Matching controls confounding for matched variables but does not remove the need for adjustment of other covariates.
Distractor: “A non‑significant OR means no association.” – May be due to insufficient power or bias; consider confidence intervals and study limitations.
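The last trap can be illustrated with a confidence interval. The sketch assumes Woolf's large‑sample interval, exp(ln OR ± 1.96 × SE) with SE = sqrt(1/a + 1/b + 1/c + 1/d). This is a standard approximation but is not given in this guide. Both sets of counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) using Woolf's approximate 95% interval."""
    or_value = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z * se)
    upper = math.exp(math.log(or_value) + z * se)
    return or_value, lower, upper

# Same exposure pattern at two hypothetical study sizes
small_study = odds_ratio_ci(8, 4, 12, 16)       # 20 cases, 20 controls
large_study = odds_ratio_ci(80, 40, 120, 160)   # 200 cases, 200 controls

for label, (or_value, lower, upper) in (("small", small_study), ("large", large_study)):
    print(f"{label}: OR = {or_value:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Both studies give the same point estimate (about 2.67). The small study's interval spans 1, so it is "non‑significant", while the large study's interval excludes 1. The difference is power, not the absence of an association.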