Introduction to Hypothesis Tests
Understand the core concepts of hypothesis testing, how to make decisions using p‑values and significance levels, and how to manage errors while improving test power.
Summary
Foundations of Hypothesis Testing
What is Hypothesis Testing?
Hypothesis testing is a systematic statistical procedure that allows us to make decisions about population claims based on sample data. Rather than measuring an entire population (which is usually impossible), we collect a sample, analyze it, and use the results to decide whether a claim about the population is reasonable.
Think of it as a legal proceeding: we start by assuming a defendant is innocent (the claim we're testing), examine the evidence (our sample data), and decide whether that evidence is convincing enough to reject our initial assumption.
The Two Hypotheses
Every hypothesis test involves exactly two competing hypotheses:
The Null Hypothesis ($H_0$) represents the status quo—the claim we're skeptical of. It typically states that a population parameter equals a specific value, that there is no effect, or that nothing has changed. For example: "The average height of adult males is 70 inches" or "This new drug has no effect on blood pressure."
The Alternative Hypothesis ($H_a$ or $H_1$) represents what we're trying to find evidence for. It states that the population parameter differs from, is greater than, or is less than the value specified in the null hypothesis. It's the hypothesis we hope our data will support.
An important point: the alternative hypothesis is never neutral—it always represents either a specific direction (greater than or less than) or a general difference (not equal to).
Test Statistics and Sampling Distributions
A test statistic is a single number that summarizes how far our sample result is from what we'd expect if the null hypothesis were true. Common test statistics include the t-statistic and the z-statistic.
The key insight is this: if the null hypothesis is true, the test statistic follows a known probability distribution (called the sampling distribution under the null hypothesis). This allows us to calculate probabilities and make informed decisions.
When the null hypothesis is true but our sample happens to give us an unusual result, the test statistic will be far from zero. When the null hypothesis is true and our sample is typical, the test statistic will be close to zero. By measuring how extreme our observed test statistic is within this known distribution, we can assess whether our data are surprising or typical.
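As a concrete illustration, here is a minimal sketch (in Python, with hypothetical numbers) of how a z-statistic measures how far a sample mean sits from the null-hypothesis value, in units of standard errors:

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardized distance between the sample mean and the null value mu0,
    measured in standard errors (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

# Hypothetical example: H0 claims mu = 70; a sample of n = 25 has mean 72,
# and the population standard deviation is assumed known (sigma = 5).
z = z_statistic(sample_mean=72, mu0=70, sigma=5, n=25)
print(z)  # -> 2.0, i.e. the sample mean sits 2 standard errors above mu0
```

A sample mean exactly equal to the null value would give a z-statistic of zero; the further the sample mean drifts from $H_0$'s value, the more extreme the statistic becomes.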
Decision Rules in Hypothesis Testing
The p-value: Your Decision Guide
The p-value is perhaps the most important concept in hypothesis testing. It answers this question: If the null hypothesis were true, what is the probability of observing a test statistic at least as extreme as the one we actually got?
Notice the phrasing: it's not "the probability the null hypothesis is true." Rather, it's calculated assuming the null hypothesis is true, and then tells us how unlikely our data would be under that assumption.
Picture the bell curve of the sampling distribution under the null hypothesis. The area in the tail(s) beyond the observed test statistic represents the p-value—the probability of observing data as extreme as, or more extreme than, what we got. A small p-value means our observed data would be very unlikely if the null hypothesis were true.
Significance Level and Decision Rules
Before collecting data, we choose a significance level (denoted $\alpha$), which is typically set at 0.05. This is our threshold for decision-making.
The decision rule is simple:
If $p\text{-value} \leq \alpha$: We reject the null hypothesis in favor of the alternative hypothesis. This means our data are surprising enough under the null hypothesis that we don't believe it's true.
If $p\text{-value} > \alpha$: We fail to reject the null hypothesis. This means our data are not unusual enough to convince us the null hypothesis is false, though this doesn't mean it's true.
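In code, this decision rule is a single comparison. A sketch in Python (standard library only), assuming a z-statistic so that the two-sided p-value comes from the standard normal survival function:

```python
import math

def normal_sf(z):
    """Survival function of the standard normal: P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_sided_p_value(z):
    """Probability of a z-statistic at least as extreme as |z| in either tail."""
    return 2 * normal_sf(abs(z))

def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

p = two_sided_p_value(2.0)   # roughly 0.0455 for a z-statistic of 2.0
print(decide(p))             # prints "reject H0" at alpha = 0.05
```

Note that the function names and the hypothetical z-statistic of 2.0 are illustrative; the decision rule itself is exactly the comparison described above.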
A Critical Language Point
This language distinction matters: we say "fail to reject" rather than "accept." Why? Failing to reject the null hypothesis simply means the data didn't provide strong enough evidence against it—it doesn't prove the null hypothesis is actually true. There's an important difference between "not finding evidence against something" and "proving something is true."
Types of Errors and Power
Understanding Type I and Type II Errors
Because we're making decisions based on sample data (not the entire population), we can make mistakes:
Type I Error: We reject the null hypothesis when it is actually true. This is a "false positive"—we conclude there's an effect when there really isn't one. The probability of making a Type I error equals our chosen significance level: $P(\text{Type I Error}) = \alpha$.
Type II Error: We fail to reject the null hypothesis when it is actually false. This is a "false negative"—we miss a real effect because our sample didn't provide enough evidence.
Notice that lowering $\alpha$ to reduce Type I errors actually increases the risk of Type II errors, and vice versa. We can't eliminate both risks simultaneously—there's a trade-off.
Test Power
Power is the probability of correctly rejecting a false null hypothesis. In other words, it's the probability of finding an effect when one truly exists. Power equals $1 - P(\text{Type II Error})$.
High power is desirable because it means our test is likely to detect a real effect when it's there. We can increase power by:
Increasing sample size: Larger samples shrink the standard error, so a real effect produces a more extreme test statistic
Increasing the effect size: When the true effect is larger, it's easier to detect (though effect size is usually a property of reality rather than a design choice)
Using a more sensitive test statistic: Some tests are better designed for specific situations
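The first of these levers can be seen directly in a small Monte Carlo simulation (a sketch with hypothetical parameters): for a fixed true effect, the estimated power rises with the sample size.

```python
import math
import random

def simulated_power(true_mu, mu0, sigma, n, trials=2000, seed=0):
    """Fraction of simulated samples in which a two-sided z-test at
    alpha = 0.05 rejects H0: mu = mu0, when the true mean is true_mu."""
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided critical value for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(true_mu, sigma) for _ in range(n)) / n
        z = (xbar - mu0) / (sigma / math.sqrt(n))
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials

# Hypothetical true effect: true mean 71 vs H0 value 70, sigma = 5.
print(simulated_power(71, 70, 5, n=25) < simulated_power(71, 70, 5, n=100))
```

As a sanity check, running the simulation with `true_mu = mu0` (i.e., $H_0$ actually true) gives a rejection rate close to $\alpha$, which is exactly the Type I error rate discussed above.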
Practical Applications and Procedure
Step-by-Step Hypothesis Testing Procedure
Here's how to conduct a hypothesis test:
Formulate the hypotheses: Write out $H_0$ and $H_a$ clearly. The null hypothesis should specify a single value for the parameter.
Collect data and compute the test statistic: Gather a random sample and calculate the appropriate test statistic based on your data and hypotheses.
Determine the sampling distribution: Identify what probability distribution the test statistic follows when $H_0$ is true (this depends on your test choice and sample characteristics).
Calculate the p-value: Use the sampling distribution to find the probability of observing a test statistic at least as extreme as yours.
Compare to significance level: Is your p-value less than or equal to $\alpha$?
Make a decision: Reject or fail to reject $H_0$ based on the comparison.
Interpret in context: Explain what your decision means for the original research question. Don't just report the statistics—explain what they mean in plain language.
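The whole procedure can be walked through end to end. A minimal sketch in Python (standard library only), using hypothetical data and a known sigma so that a z-test applies:

```python
import math

# Step 1: H0: mu = 70 vs Ha: mu != 70 (two-sided); assume sigma = 5 is known.
mu0, sigma, alpha = 70.0, 5.0, 0.05

# Step 2: a hypothetical random sample and its test statistic.
sample = [72.1, 69.4, 71.8, 73.0, 70.2, 74.1, 68.9, 72.5, 71.3, 70.8]
n = len(sample)
xbar = sum(sample) / n
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Steps 3-4: under H0, z is standard normal; the two-sided p-value is the
# probability of a value at least as extreme as |z| in either tail.
p_value = math.erfc(abs(z) / math.sqrt(2))

# Steps 5-6: compare to alpha and decide.
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

# Step 7: interpret in context (here, in plain language about the mean).
print(f"z = {z:.2f}, p = {p_value:.3f}: {decision}")
```

With these made-up numbers the sample mean is only modestly above 70, the p-value is well above 0.05, and we fail to reject $H_0$: the data are not surprising enough to rule out a population mean of 70.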
Choosing the Right Test Statistic
The choice between a t-statistic and a z-statistic depends on what information you have:
Use a t-statistic when the population standard deviation is unknown and the sample size is small (typically $n < 30$)
Use a z-statistic when the population standard deviation is known or the sample size is large ($n \geq 30$)
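This rule of thumb is easy to encode; a hypothetical helper (the 30-observation cutoff is the conventional one used above):

```python
def choose_test_statistic(sigma_known: bool, n: int) -> str:
    """Pick t vs z per the conventional rule: t when the population
    standard deviation is unknown and the sample is small (n < 30)."""
    if not sigma_known and n < 30:
        return "t"
    return "z"

print(choose_test_statistic(sigma_known=False, n=20))  # -> t
print(choose_test_statistic(sigma_known=True, n=20))   # -> z
```

In practice many analysts use the t-statistic whenever sigma is unknown, regardless of sample size, since the t distribution converges to the normal as $n$ grows; the cutoff here just mirrors the rule stated in the text.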
The Role of Random Sampling
We emphasize random sampling because it ensures your sample is representative of the population. When your sample is representative, the theoretical sampling distribution accurately describes your test statistic's behavior, making your p-value calculation reliable. Without random sampling, the sampling distribution assumptions may not hold, and your conclusions could be invalid.
Flashcards
What is the systematic method used by statisticians to decide if a claim about a population is believable based on sample data?
Hypothesis testing
What are the standard steps involved in the hypothesis testing procedure?
Formulate the null and alternative hypotheses
Collect a random sample and compute the test statistic
Determine the sampling distribution under the null hypothesis
Calculate the p-value
Compare the p-value to the significance level $\alpha$ (alpha)
Reject or fail to reject the null hypothesis
Interpret the result in the research context
Which hypothesis represents the status-quo or "no effect" claim and states a population parameter equals a specific value?
The null hypothesis
Why is the null hypothesis never "accepted" outright based on a p-value comparison?
Because we only have enough evidence to either reject it or fail to reject it
Which hypothesis represents the claim that a population parameter is different, greater, or smaller than the value in the null hypothesis?
The alternative hypothesis
What does a test statistic, such as a $t$ or $z$ value, measure in the context of hypothesis testing?
How far the observed data are from what is expected if the null hypothesis were true
Under what conditions should a $t$ value be used as the test statistic rather than a $z$ value?
When the population standard deviation is unknown and the sample size is small
Under what conditions is a $z$ value used as the test statistic?
When the population standard deviation is known or the sample size is large
What is the probability of obtaining a test statistic at least as extreme as the observed value, assuming the null hypothesis is true?
The p-value
What decision is made regarding the null hypothesis if the p-value is less than or equal to the significance level $\alpha$ (alpha)?
Reject the null hypothesis
What is the conclusion if the p-value is greater than the significance level $\alpha$ (alpha)?
Fail to reject the null hypothesis
What is the common value set for the significance level $\alpha$ (alpha) before conducting an analysis?
0.05
The probability of making a Type I error is equal to which pre-selected value?
The significance level $\alpha$ (alpha)
What type of error occurs when the null hypothesis is rejected even though it is actually true?
Type I error
What type of error occurs when the researcher fails to reject the null hypothesis even though the alternative hypothesis is true?
Type II error
What term describes the probability of correctly rejecting a false null hypothesis?
Test power
How is test power mathematically related to the probability of a Type II error ($\beta$)?
$1 - \beta$ (one minus the probability of a Type II error)
Quiz
Introduction to Hypothesis Tests Quiz Question 1: What is the commonly used significance level in hypothesis testing?
- 0.05 (chosen before analysis) (correct)
- 0.01 (determined after seeing the data)
- 0.10 (used only for exploratory studies)
- 1.00 (to always reject the null)
Introduction to Hypothesis Tests Quiz Question 2: When do we fail to reject the null hypothesis?
- When the p‑value is greater than alpha (correct)
- When the p‑value is less than alpha
- When the test statistic is negative
- When the sample mean matches the hypothesized mean
Introduction to Hypothesis Tests Quiz Question 3: What is the correct interpretation regarding the null hypothesis after testing?
- We never accept it; we only reject or fail to reject (correct)
- We can declare it true if the p‑value is high
- We replace it with the alternative hypothesis
- We must always reject it at the 0.05 level
Introduction to Hypothesis Tests Quiz Question 4: What is a Type I error?
- Rejecting a true null hypothesis (correct)
- Failing to reject a false null hypothesis
- Choosing the wrong test statistic
- Using a non‑random sample
Introduction to Hypothesis Tests Quiz Question 5: What does the significance level alpha equal?
- The probability of making a Type I error (correct)
- The probability of making a Type II error
- The power of the test
- The sample size needed for significance
Introduction to Hypothesis Tests Quiz Question 6: Which of the following can increase the power of a hypothesis test?
- Increasing the sample size (correct)
- Choosing a smaller effect size
- Lowering the significance level alpha
- Using a less sensitive test statistic
Introduction to Hypothesis Tests Quiz Question 7: What is calculated using the observed test statistic and its sampling distribution?
- The p‑value (correct)
- The confidence level
- The effect size
- The sample variance
Introduction to Hypothesis Tests Quiz Question 8: What does the null hypothesis usually assert regarding a population parameter?
- That the parameter equals a specified value, indicating no effect. (correct)
- That the parameter is greater than the specified value.
- That the parameter differs from the specified value.
- That the parameter is unknown.
Introduction to Hypothesis Tests Quiz Question 9: Hypothesis testing is best described as a _____ method for evaluating claims about a population.
- systematic (correct)
- anecdotal
- speculative
- descriptive
Introduction to Hypothesis Tests Quiz Question 10: A researcher obtains a p‑value of 0.04 and has set the significance level α = 0.05. What conclusion should be drawn?
- Reject the null hypothesis (correct)
- Fail to reject the null hypothesis
- Accept the null hypothesis as true
- Collect more data before deciding
Introduction to Hypothesis Tests Quiz Question 11: Which test statistic should be used when the population standard deviation is unknown and the sample size is 20?
- t‑value (correct)
- z‑value
- χ²‑value
- F‑value
Key Concepts
Hypothesis Testing Concepts
Hypothesis testing
Null hypothesis
Alternative hypothesis
Test statistic
p‑value
Significance level (α)
Errors and Power
Type I error
Type II error
Statistical power
Sampling and Distribution
Sampling distribution
Definitions
Hypothesis testing
A statistical method for deciding whether a claim about a population is supported by sample data.
Null hypothesis
The default assumption that a population parameter equals a specified value, indicating no effect.
Alternative hypothesis
The competing claim that a population parameter differs from, exceeds, or falls below the null value.
Test statistic
A calculated value (e.g., t or z) that quantifies the discrepancy between observed data and the null hypothesis.
p‑value
The probability of obtaining a test statistic at least as extreme as observed, assuming the null hypothesis is true.
Significance level (α)
A pre‑selected threshold (commonly 0.05) for deciding when to reject the null hypothesis.
Type I error
The mistake of rejecting a true null hypothesis.
Type II error
The mistake of failing to reject a false null hypothesis.
Statistical power
The probability of correctly rejecting a false null hypothesis (1 − β).
Sampling distribution
The probability distribution of a test statistic under repeated sampling when the null hypothesis holds.