Introduction to Hypothesis Tests
Understand the core concepts of hypothesis testing, how to make decisions using p‑values and significance levels, and how to manage errors while improving test power.
Summary
Foundations of Hypothesis Testing
What is Hypothesis Testing?
Hypothesis testing is a systematic statistical procedure that allows us to make decisions about population claims based on sample data. Rather than measuring an entire population (which is usually impossible), we collect a sample, analyze it, and use the results to decide whether a claim about the population is reasonable.
Think of it as a legal proceeding: we start by assuming a defendant is innocent (the claim we're testing), examine the evidence (our sample data), and decide whether that evidence is convincing enough to reject our initial assumption.
The Two Hypotheses
Every hypothesis test involves exactly two competing hypotheses:
The Null Hypothesis ($H_0$) represents the status quo—the claim we're skeptical of. It typically states that a population parameter equals a specific value, that there is no effect, or that nothing has changed. For example: "The average height of adult males is 70 inches" or "This new drug has no effect on blood pressure."
The Alternative Hypothesis ($H_a$ or $H_1$) represents what we're trying to find evidence for. It states that the population parameter differs from, is greater than, or is less than the value specified in the null hypothesis. It's the hypothesis we hope our data will support.
An important point: the alternative hypothesis is never neutral—it always represents either a specific direction (greater than or less than) or a general difference (not equal to).
Test Statistics and Sampling Distributions
A test statistic is a single number that summarizes how far our sample result is from what we'd expect if the null hypothesis were true. Common test statistics include the t-statistic and the z-statistic.
The key insight is this: if the null hypothesis is true, the test statistic follows a known probability distribution (called the sampling distribution under the null hypothesis). This allows us to calculate probabilities and make informed decisions.
When the null hypothesis is true but our sample happens to give us an unusual result, the test statistic will be far from zero. When the null hypothesis is true and our sample is typical, the test statistic will be close to zero. By measuring how extreme our observed test statistic is within this known distribution, we can assess whether our data are surprising or typical.
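As a concrete illustration, here is a minimal sketch (in Python, with hypothetical numbers) of how a z-statistic measures how far a sample mean sits from the null-hypothesis value, in units of standard errors:

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardized distance between the sample mean and the null value mu0,
    measured in standard errors (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

# Hypothetical example: H0 claims mu = 70; a sample of n = 25 has mean 72,
# and the population standard deviation is assumed known (sigma = 5).
z = z_statistic(sample_mean=72, mu0=70, sigma=5, n=25)
print(z)  # -> 2.0, i.e. the sample mean sits 2 standard errors above mu0
```

A sample mean exactly equal to the null value would give a z-statistic of zero; the further the sample mean drifts from $H_0$'s value, the more extreme the statistic becomes.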
Decision Rules in Hypothesis Testing
The p-value: Your Decision Guide
The p-value is perhaps the most important concept in hypothesis testing. It answers this question: If the null hypothesis were true, what is the probability of observing a test statistic at least as extreme as the one we actually got?
Notice the phrasing: it's not "the probability the null hypothesis is true." Rather, it's calculated assuming the null hypothesis is true, and then tells us how unlikely our data would be under that assumption.
Picture the bell curve of the sampling distribution under the null hypothesis. The area in the tail(s) beyond the observed test statistic represents the p-value—the probability of observing data as extreme as, or more extreme than, what we got. A small p-value means our observed data would be very unlikely if the null hypothesis were true.
Significance Level and Decision Rules
Before collecting data, we choose a significance level (denoted $\alpha$), which is typically set at 0.05. This is our threshold for decision-making.
The decision rule is simple:
If $p\text{-value} \leq \alpha$: We reject the null hypothesis in favor of the alternative hypothesis. This means our data are surprising enough under the null hypothesis that we don't believe it's true.
If $p\text{-value} > \alpha$: We fail to reject the null hypothesis. This means our data are not unusual enough to convince us the null hypothesis is false, though this doesn't mean it's true.
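In code, this decision rule is a single comparison. A sketch in Python (standard library only), assuming a z-statistic so that the two-sided p-value comes from the standard normal survival function:

```python
import math

def normal_sf(z):
    """Survival function of the standard normal: P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_sided_p_value(z):
    """Probability of a z-statistic at least as extreme as |z| in either tail."""
    return 2 * normal_sf(abs(z))

def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

p = two_sided_p_value(2.0)   # roughly 0.0455 for a z-statistic of 2.0
print(decide(p))             # prints "reject H0" at alpha = 0.05
```

Note that the function names and the hypothetical z-statistic of 2.0 are illustrative; the decision rule itself is exactly the comparison described above.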
A Critical Language Point
This language distinction matters: we say "fail to reject" rather than "accept." Why? Failing to reject the null hypothesis simply means the data didn't provide strong enough evidence against it—it doesn't prove the null hypothesis is actually true. There's an important difference between "not finding evidence against something" and "proving something is true."
Types of Errors and Power
Understanding Type I and Type II Errors
Because we're making decisions based on sample data (not the entire population), we can make mistakes:
Type I Error: We reject the null hypothesis when it is actually true. This is a "false positive"—we conclude there's an effect when there really isn't one. The probability of making a Type I error equals our chosen significance level: $P(\text{Type I Error}) = \alpha$.
Type II Error: We fail to reject the null hypothesis when it is actually false. This is a "false negative"—we miss a real effect because our sample didn't provide enough evidence.
Notice that lowering $\alpha$ to reduce Type I errors actually increases the risk of Type II errors, and vice versa. We can't eliminate both risks simultaneously—there's a trade-off.
Test Power
Power is the probability of correctly rejecting a false null hypothesis. In other words, it's the probability of finding an effect when one truly exists. Power equals $1 - P(\text{Type II Error})$.
High power is desirable because it means our test is likely to detect a real effect when it's there. We can increase power by:
Increasing sample size: Larger samples shrink the standard error, so a real effect produces a more extreme test statistic
Increasing the effect size: When the true effect is larger, it's easier to detect (though effect size is usually a property of reality rather than a design choice)
Using a more sensitive test statistic: Some tests are better designed for specific situations
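The first of these levers can be seen directly in a small Monte Carlo simulation (a sketch with hypothetical parameters): for a fixed true effect, the estimated power rises with the sample size.

```python
import math
import random

def simulated_power(true_mu, mu0, sigma, n, trials=2000, seed=0):
    """Fraction of simulated samples in which a two-sided z-test at
    alpha = 0.05 rejects H0: mu = mu0, when the true mean is true_mu."""
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided critical value for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(true_mu, sigma) for _ in range(n)) / n
        z = (xbar - mu0) / (sigma / math.sqrt(n))
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials

# Hypothetical true effect: true mean 71 vs H0 value 70, sigma = 5.
print(simulated_power(71, 70, 5, n=25) < simulated_power(71, 70, 5, n=100))
```

As a sanity check, running the simulation with `true_mu = mu0` (i.e., $H_0$ actually true) gives a rejection rate close to $\alpha$, which is exactly the Type I error rate discussed above.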
Practical Applications and Procedure
Step-by-Step Hypothesis Testing Procedure
Here's how to conduct a hypothesis test:
Formulate the hypotheses: Write out $H_0$ and $H_a$ clearly. The null hypothesis should specify a single value for the parameter.
Collect data and compute the test statistic: Gather a random sample and calculate the appropriate test statistic based on your data and hypotheses.
Determine the sampling distribution: Identify what probability distribution the test statistic follows when $H_0$ is true (this depends on your test choice and sample characteristics).
Calculate the p-value: Use the sampling distribution to find the probability of observing a test statistic at least as extreme as yours.
Compare to significance level: Is your p-value less than or equal to $\alpha$?
Make a decision: Reject or fail to reject $H_0$ based on the comparison.
Interpret in context: Explain what your decision means for the original research question. Don't just report the statistics—explain what they mean in plain language.
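The whole procedure can be walked through end to end. A minimal sketch in Python (standard library only), using hypothetical data and a known sigma so that a z-test applies:

```python
import math

# Step 1: H0: mu = 70 vs Ha: mu != 70 (two-sided); assume sigma = 5 is known.
mu0, sigma, alpha = 70.0, 5.0, 0.05

# Step 2: a hypothetical random sample and its test statistic.
sample = [72.1, 69.4, 71.8, 73.0, 70.2, 74.1, 68.9, 72.5, 71.3, 70.8]
n = len(sample)
xbar = sum(sample) / n
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Steps 3-4: under H0, z is standard normal; the two-sided p-value is the
# probability of a value at least as extreme as |z| in either tail.
p_value = math.erfc(abs(z) / math.sqrt(2))

# Steps 5-6: compare to alpha and decide.
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

# Step 7: interpret in context (here, in plain language about the mean).
print(f"z = {z:.2f}, p = {p_value:.3f}: {decision}")
```

With these made-up numbers the sample mean is only modestly above 70, the p-value is well above 0.05, and we fail to reject $H_0$: the data are not surprising enough to rule out a population mean of 70.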
Choosing the Right Test Statistic
The choice between a t-statistic and a z-statistic depends on what information you have:
Use a t-statistic when the population standard deviation is unknown and the sample size is small (typically $n < 30$)
Use a z-statistic when the population standard deviation is known or the sample size is large ($n \geq 30$)
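This rule of thumb is easy to encode; a hypothetical helper (the 30-observation cutoff is the conventional one used above):

```python
def choose_test_statistic(sigma_known: bool, n: int) -> str:
    """Pick t vs z per the conventional rule: t when the population
    standard deviation is unknown and the sample is small (n < 30)."""
    if not sigma_known and n < 30:
        return "t"
    return "z"

print(choose_test_statistic(sigma_known=False, n=20))  # -> t
print(choose_test_statistic(sigma_known=True, n=20))   # -> z
```

In practice many analysts use the t-statistic whenever sigma is unknown, regardless of sample size, since the t distribution converges to the normal as $n$ grows; the cutoff here just mirrors the rule stated in the text.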
The Role of Random Sampling
We emphasize random sampling because it ensures your sample is representative of the population. When your sample is representative, the theoretical sampling distribution accurately describes your test statistic's behavior, making your p-value calculation reliable. Without random sampling, the sampling distribution assumptions may not hold, and your conclusions could be invalid.
Flashcards
What is the systematic method used by statisticians to decide if a claim about a population is believable based on sample data?
Hypothesis testing
What are the standard steps involved in the hypothesis testing procedure?
Formulate the null and alternative hypotheses
Collect a random sample and compute the test statistic
Determine the sampling distribution under the null hypothesis
Calculate the p-value
Compare the p-value to the significance level $\alpha$ (alpha)
Reject or fail to reject the null hypothesis
Interpret the result in the research context
Which hypothesis represents the status-quo or "no effect" claim and states a population parameter equals a specific value?
The null hypothesis
Why is the null hypothesis never "accepted" outright based on a p-value comparison?
Because we only have enough evidence to either reject it or fail to reject it
Which hypothesis represents the claim that a population parameter is different, greater, or smaller than the value in the null hypothesis?
The alternative hypothesis
What does a test statistic, such as a $t$ or $z$ value, measure in the context of hypothesis testing?
How far the observed data are from what is expected if the null hypothesis were true
Under what conditions should a $t$ value be used as the test statistic rather than a $z$ value?
When the population standard deviation is unknown and the sample size is small
Under what conditions is a $z$ value used as the test statistic?
When the population standard deviation is known or the sample size is large
What is the probability of obtaining a test statistic at least as extreme as the observed value, assuming the null hypothesis is true?
The p-value
What decision is made regarding the null hypothesis if the p-value is less than or equal to the significance level $\alpha$ (alpha)?
Reject the null hypothesis
What is the conclusion if the p-value is greater than the significance level $\alpha$ (alpha)?
Fail to reject the null hypothesis
What is the common value set for the significance level $\alpha$ (alpha) before conducting an analysis?
0.05
The probability of making a Type I error is equal to which pre-selected value?
The significance level $\alpha$ (alpha)
What type of error occurs when the null hypothesis is rejected even though it is actually true?
Type I error
What type of error occurs when the researcher fails to reject the null hypothesis even though the alternative hypothesis is true?
Type II error
What term describes the probability of correctly rejecting a false null hypothesis?
Test power
How is test power mathematically related to the probability of a Type II error ($\beta$)?
$1 - \beta$ (one minus the probability of a Type II error)
Quiz
Introduction to Hypothesis Tests Quiz Question 1: What is the commonly used significance level in hypothesis testing?
- 0.05 (chosen before analysis) (correct)
- 0.01 (determined after seeing the data)
- 0.10 (used only for exploratory studies)
- 1.00 (to always reject the null)
Introduction to Hypothesis Tests Quiz Question 2: When do we fail to reject the null hypothesis?
- When the p‑value is greater than alpha (correct)
- When the p‑value is less than alpha
- When the test statistic is negative
- When the sample mean matches the hypothesized mean
Introduction to Hypothesis Tests Quiz Question 3: What is the correct interpretation regarding the null hypothesis after testing?
- We never accept it; we only reject or fail to reject (correct)
- We can declare it true if the p‑value is high
- We replace it with the alternative hypothesis
- We must always reject it at the 0.05 level
Introduction to Hypothesis Tests Quiz Question 4: What is a Type I error?
- Rejecting a true null hypothesis (correct)
- Failing to reject a false null hypothesis
- Choosing the wrong test statistic
- Using a non‑random sample
Introduction to Hypothesis Tests Quiz Question 5: What does the significance level alpha equal?
- The probability of making a Type I error (correct)
- The probability of making a Type II error
- The power of the test
- The sample size needed for significance
Introduction to Hypothesis Tests Quiz Question 6: Which of the following can increase the power of a hypothesis test?
- Increasing the sample size (correct)
- Choosing a smaller effect size
- Lowering the significance level alpha
- Using a less sensitive test statistic
Introduction to Hypothesis Tests Quiz Question 7: What is calculated using the observed test statistic and its sampling distribution?
- The p‑value (correct)
- The confidence level
- The effect size
- The sample variance
Introduction to Hypothesis Tests Quiz Question 8: What does the null hypothesis usually assert regarding a population parameter?
- That the parameter equals a specified value, indicating no effect. (correct)
- That the parameter is greater than the specified value.
- That the parameter differs from the specified value.
- That the parameter is unknown.
Introduction to Hypothesis Tests Quiz Question 9: Hypothesis testing is best described as a _____ method for evaluating claims about a population.
- systematic (correct)
- anecdotal
- speculative
- descriptive
Introduction to Hypothesis Tests Quiz Question 10: A researcher obtains a p‑value of 0.04 and has set the significance level α = 0.05. What conclusion should be drawn?
- Reject the null hypothesis (correct)
- Fail to reject the null hypothesis
- Accept the null hypothesis as true
- Collect more data before deciding
Introduction to Hypothesis Tests Quiz Question 11: Which test statistic should be used when the population standard deviation is unknown and the sample size is 20?
- t‑value (correct)
- z‑value
- χ²‑value
- F‑value
Key Concepts
Hypothesis Testing Concepts
Hypothesis testing
Null hypothesis
Alternative hypothesis
Test statistic
p‑value
Significance level (α)
Errors and Power
Type I error
Type II error
Statistical power
Sampling and Distribution
Sampling distribution
Definitions
Hypothesis testing
A statistical method for deciding whether a claim about a population is supported by sample data.
Null hypothesis
The default assumption that a population parameter equals a specified value, indicating no effect.
Alternative hypothesis
The competing claim that a population parameter differs from, exceeds, or falls below the null value.
Test statistic
A calculated value (e.g., t or z) that quantifies the discrepancy between observed data and the null hypothesis.
p‑value
The probability of obtaining a test statistic at least as extreme as observed, assuming the null hypothesis is true.
Significance level (α)
A pre‑selected threshold (commonly 0.05) for deciding when to reject the null hypothesis.
Type I error
The mistake of rejecting a true null hypothesis.
Type II error
The mistake of failing to reject a false null hypothesis.
Statistical power
The probability of correctly rejecting a false null hypothesis (1 − β).
Sampling distribution
The probability distribution of a test statistic under repeated sampling when the null hypothesis holds.