What is a hypothesis test, in relation to a sample proportion?

Oftentimes in the world, claims will be made about a given population proportion. Through hypothesis testing, we can assess the validity of such claims.

Hypothesis testing with sample proportions explained

Going off our example in What is a z-score, in relation to a sample proportion? and What is a confidence interval, in relation to a sample proportion?, imagine that you're sitting in class at Crammer Nation University and notice that a large number of your fellow freshmen are wearing their new Greek Life chapter merch.

The university released a statement saying that the proportion of freshmen students who joined Greek Life this year was 30%, but you want to test that claim out. You take a random sample of 50 freshmen students to see whether the data give you enough evidence to doubt the university's number.

As discussed in What is a hypothesis test, in relation to a sample mean?, herein lies the purpose of a hypothesis test!

Hypothesis testing is a way to test a claim about a given population. It enables you to determine whether or not an outcome of a given sample was due to random chance or was statistically significant.

What’s the population in this situation? The freshmen student body at Crammer Nation University.

What’s the sample? The 50 freshmen that you sampled.

What’s the claim that you’re testing? That the proportion of freshmen students who joined Greek Life is actually 0.30.

Potential outcomes of this hypothesis test

As stated in What is a hypothesis test, in relation to a sample mean?, there are two potential conclusions our hypothesis test can come to based on our sample.

• We do have enough evidence to reject Crammer Nation University's claim that 30% of their freshmen students joined Greek Life.
• We don't have enough evidence to reject (a.k.a. "fail to reject") Crammer Nation University's claim that 30% of their freshmen students joined Greek Life.

To arrive at either of the above two conclusions, there are 4 crucial steps that we'll take:

1. State the hypotheses
2. Calculate the test statistic
3. Find the p-value
4. Make your concluding statement

Let's dig into the key elements of each step, in relation to a sample proportion problem!

How to conduct a hypothesis test

Before beginning, let's formalize the above prompt:

Crammer Nation University claims that 30% of their freshmen students joined a Greek Life chapter this year. You are curious if that's a truthful proportion, or if a different proportion of students joined a chapter. You collect a random sample of 50 freshmen students and find that 22 of them joined a chapter this year. Provide support for your claim using a hypothesis test with an alpha level of 0.05.

Step 1 - State the hypotheses

As stated in What is a hypothesis test, in relation to a sample mean?, there will be two hypotheses that you need to state in any hypothesis test problem: (1) the null hypothesis and (2) the alternative hypothesis.

Each of these hypotheses will be making a claim about the population parameter (in this case, p). We will not make claims about the sample statistic (in this case, p-hat), because the whole point of us taking the sample is to figure out if we have enough evidence to support a claim made about the population!

As stated in What is a hypothesis test, in relation to a sample mean?...

Your hypotheses will involve the population parameter (mean "µ" or proportion "p"), not the sample statistic!

Understanding the null hypothesis

Your null hypothesis (H0) embodies the claim made about the population parameter.

In the case of the situation above, it's that Crammer Nation University freshmen students joined Greek Life this year at a proportion of 0.30.

We'd write that null hypothesis (H0) like so:

H0: p = 0.30

Essentially, what we're saying here is that the true population proportion for Greek Life involvement among Crammer Nation University freshmen is equal to 0.30.

Something else important to note...

The null hypothesis will always have an equal sign.

Why? Because there will always be a claim made that the population parameter equals something.

Understanding the alternative hypothesis

Your alternative hypothesis (Ha) makes a claim that the population parameter differs from what the null hypothesis (H0) says.

In the case of the situation above, we're claiming that Crammer Nation University freshmen students joined Greek Life this year at a proportion not equal to 0.30.

We'd write that alternative hypothesis (Ha) like so:

H0: p = 0.30
Ha: p ≠ 0.30

Essentially, what we're saying here is that the true population proportion for Greek Life involvement among Crammer Nation University freshmen is actually not equal to 0.30.

A helpful tip when writing alternative hypotheses:

The alternative hypothesis can have the greater than (>), less than (<), or not equal to (≠) sign.

In this situation, we're testing the claim that Crammer Nation University freshmen students joined Greek Life at a proportion not equal to 0.30 (≠). However... we could've tested that the proportion is greater than 0.30 (>) or less than 0.30 (<).
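If it helps to see how the sign in the alternative hypothesis changes what gets measured, here's a minimal sketch. It assumes you already have a z-score (we'll calculate ours in Step 2) and uses Python's built-in statistics.NormalDist for the area under the z-distribution. The function name and the "two-sided" / "greater" / "less" labels are made up for this illustration.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative="two-sided"):
    """Tail area under the standard normal curve for a given z-score.

    `alternative` is a hypothetical label for the sign in Ha:
    "two-sided" for p ≠ p0, "greater" for p > p0, "less" for p < p0.
    """
    left_area = NormalDist().cdf(z)  # area to the left of z
    if alternative == "greater":     # right-tail test
        return 1 - left_area
    if alternative == "less":        # left-tail test
        return left_area
    if alternative == "two-sided":   # both tails: double the smaller one
        return 2 * min(left_area, 1 - left_area)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")

# The same z-score gives a different p-value for each kind of test.
for alternative in ("greater", "less", "two-sided"):
    print(alternative, round(p_value_from_z(1.0, alternative), 4))
```

The takeaway: the z-score doesn't change, only which tail (or tails) you add up.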

Step 2 - Calculating your test statistic

Before even beginning to calculate our test statistic, we have to check our assumptions!

We're working with a sample proportion here, so in accordance with Assumptions for sampling distributions, that means we must check the following assumptions:

1. Sample is randomly selected from the population
2. The sample size (n) is less than or equal to 10% of the population size (N)
3. There are at least 10 successes and at least 10 failures in the sample OR np >= 10 and nq >= 10

For the sake of zoning in on the hypothesis test, we're going to assume that the assumptions are met and move on.
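That said, the third assumption is quick to check yourself. Below is a rough sketch that plugs in the claimed proportion (0.30) for p, which is an assumption on my part about which proportion to use. The function name and the `minimum` parameter are made up for this illustration.

```python
def success_failure_check(n, p, minimum=10):
    """Check that np >= minimum and nq >= minimum, where q = 1 - p."""
    q = 1 - p
    expected_successes = n * p
    expected_failures = n * q
    passes = expected_successes >= minimum and expected_failures >= minimum
    return passes, expected_successes, expected_failures

# n = 50 freshmen sampled, p = 0.30 (the university's claimed proportion)
passes, successes, failures = success_failure_check(50, 0.30)
print(passes, round(successes, 2), round(failures, 2))
```

With these numbers, np works out to about 15 and nq to about 35, so the condition holds.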

Recognizing we'll use z-score

Remember, you'll only use t-score if you're dealing with sample means! Therefore, we know we will be using z-score here, which will be computed with the following formula:

z = (p-hat - p0) / sqrt(p0 * q0 / n)

You'll notice this formula is very similar to the formula for z-scores in What is a z-score, in relation to a sample proportion?...

...but now it's using p0 and q0 instead of p and q.

Long story short, that's because with hypothesis tests, we use the "0" (that little "0" is often called "naught") to signify that we don't know for sure that these are the true population proportions p and q. p0 is the claimed population proportion that's being tested within our hypothesis test!

Before plugging in variables into our z-score formula, it's often helpful to understand what's going on visually. Let's dig into that below!

Visualizing our z-score

Our z-distribution will look like so:

Considering that our alternative hypothesis is making a claim that the true proportion of Crammer Nation University freshmen who joined Greek Life is not equal to 0.30...

H0: p = 0.30
Ha: p ≠ 0.30

...we are going to be assessing if the combined p-values on the left and right tails of our z-distribution fit within the alpha level (⍺) of 0.05.

Why are we splitting our alpha level among both the left and right tails of the z-distribution?

In simple terms, it's because our alternative hypothesis is working with ≠.

That means we're not assessing if the true population proportion is only greater than 0.30 (which would be a right-tail test)...

...or is only less than 0.30 (which would be a left-tail test)...

...we're assessing if it does not equal 0.30.

In other words, do we have evidence that the true population proportion is something greater than or less than the claimed one?

This means that we need to assess both tails of our sampling distribution, and therefore split our alpha level in half to account for both tails.

Since we're splitting our alpha level among the two tails, that means we'll also be reflecting our z-score onto both the right and left tails of our z-distribution. Keep reading to see this in action.

When your alternative hypothesis deals with ≠, that means you're working with a two-tail hypothesis test. Therefore, split your alpha level in half on both the right and left tails of the sampling distribution! Don't forget to reflect your z-scores among the two tails!

For example, imagine if our p-hat value was here.

Through calculating z-score, we'd find a corresponding p-value of 0.01 on the right-tail of the sampling distribution.

This p-hat value is a certain distance from the claimed population proportion (p0) at the center of this sampling distribution.

Since we're doing a two-tail hypothesis test, we need to additionally test for the p-hat value on the opposite side of the claimed population proportion. Therefore, we'll reflect our p-hat value onto the left-tail like so:

Since our p-values on both tails are within the range of our split alpha level...

...this indicates that, if the university's claim were true, the probability of a sample of the same size having a sample proportion (p-hat) at least this far from the claimed population proportion (p0) is so slim that our result is statistically significant and is likely due to something other than random chance.

On the flip side, imagine if our p-hat value was here.

Through calculating z-score, we'd find a p-value of 0.03 on the right-tail of the sampling distribution.

This p-hat value is a certain distance from the claimed population proportion (p0) at the center of this sampling distribution.

Since we're doing a two-tail hypothesis test, we need to additionally test for the p-hat value on the opposite side of the claimed population proportion. Therefore, we'll reflect our p-hat value onto the left-tail like so:

Since our p-values on the right-tail and left-tail are outside the range of our split alpha level...

...this indicates that, if the university's claim were true, the probability of a sample of the same size having a sample proportion (p-hat) at least this far from the claimed population proportion (p0) is not slim enough to indicate statistical significance, so our result could just be due to random chance.
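The two scenarios above boil down to one comparison: does a single tail's area fit inside half of the alpha level? Here's a small sketch of that check using the illustrative tail areas of 0.01 and 0.03. The function name is made up for this example.

```python
ALPHA = 0.05  # alpha level from the prompt

def fits_in_split_alpha(one_tail_area, alpha=ALPHA):
    """True when a single tail's area fits inside its half of alpha."""
    return one_tail_area <= alpha / 2

for one_tail_area in (0.01, 0.03):
    if fits_in_split_alpha(one_tail_area):
        verdict = "statistically significant"
    else:
        verdict = "could just be random chance"
    print(one_tail_area, "vs", ALPHA / 2, "->", verdict)
```

Comparing one tail against alpha / 2 gives the same answer as doubling the tail area and comparing it against the full alpha level.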

What if our original p-hat value occurs on the left-side of the sampling distribution?

...we'd still reflect it to the other side of the sampling distribution!

We'll get into this a little more in "Step 4 - Make your concluding statement". Let's move on and calculate our z-score!

Plugging in the variables for your z-score

Here, again, is our equation for our z-score:

z = (p-hat - p0) / sqrt(p0 * q0 / n)

For the sake of focusing on hypothesis tests, I will tell you the final answer here: our z-score equals 2.15.

If, however, you'd like to see how z-score was calculated, click below. Keep in mind, this is no different than how we calculated it in What is a z-score, in relation to a sample proportion?, so if you read that article, you're probably good to move on!

I want a walkthrough of how z-score was calculated.

Based on the prompt, the sample proportion was 0.44 (because 22 / 50 = 0.44)...

...therefore, we'll plug in 0.44 for p-hat.

Crammer Nation is claiming that the population proportion is 0.30...

...therefore, we'll plug in 0.30 for p0.

Next, for our claimed population proportion of failure (q0), we'll do what we did in What is a z-score, in relation to a sample proportion? and utilize the following formula:

q0 = 1 - p0

Since p0 equals 0.30...

q0 = 1 - 0.30

...this results in q0 equalling 0.70...

q0 = 1 - 0.30 = 0.70

...so we'll plug in 0.70 for q0!

Lastly, the prompt states that the sample size is 50...

...therefore, we'll plug in 50 for n!

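With everything plugged in, our equation looks like this:

z = (0.44 - 0.30) / sqrt((0.30)(0.70) / 50)

When we solve this out, it results in a z-score of about 2.15! (If you carry full precision through the square root you'll get closer to 2.16. The conclusion of the test is the same either way.)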
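If you'd rather let a computer do the arithmetic, here's a minimal sketch of the same calculation. The function name is made up for this illustration. Because it doesn't round anything along the way, it prints a value slightly above the rounded 2.15 used in the rest of this article.

```python
from math import sqrt

def one_proportion_z(successes, n, p0):
    """z-score for a sample proportion against a claimed proportion p0."""
    p_hat = successes / n               # sample proportion: 22 / 50 = 0.44
    q0 = 1 - p0                         # claimed proportion of failure
    standard_error = sqrt(p0 * q0 / n)  # denominator of the z-score formula
    return (p_hat - p0) / standard_error

z = one_proportion_z(successes=22, n=50, p0=0.30)
print(round(z, 2))
```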

Step 3 - Find your p-value

Finding your p-value works differently between z-scores vs. t-scores. If you want to see how it's done with t-scores, click here to access What is a hypothesis test, in relation to a sample mean?

For the sake of focusing on hypothesis tests, I will tell you the final answer here: our p-value equals 0.9842.

If you'd like to see how this p-value was found, click below. Keep in mind, this is no different than how we found it in What is a z-score, in relation to a sample proportion?, so if you read that article, you're probably chillin'!

I want a walkthrough of how p-value was found.

Knowing that our z-score is 2.15, all we need to do is go to our z-table...

...find "2.1" in the left-hand column (representing the 2.1 in 2.15)...

...and then "0.05" in the top row (representing the final 0.05 in 2.15)...

...to locate our p-value of 0.9842!

Wait... a p-value of 0.9842? That's way bigger than our alpha level of 0.05... why?

We need to remember that the z-table displays the area to the left of your z-score...

...therefore this p-value of 0.9842 can be understood visually like so:

To find the p-value to the right of our p-hat value, we must subtract 0.9842 from 1.

1.00 - 0.9842 = 0.0158
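No z-table handy? As a rough sketch, Python's built-in statistics.NormalDist can stand in for one: its cdf method returns the area to the left of a z-score, just like the table does. The variable names are made up for this illustration.

```python
from statistics import NormalDist

z = 2.15  # the z-score used in this walkthrough

left_area = NormalDist().cdf(z)  # area to the left of z (what the z-table shows)
right_tail = 1 - left_area       # area to the right of z

print(round(left_area, 4))   # matches the z-table value of 0.9842
print(round(right_tail, 4))  # matches 1.00 - 0.9842 = 0.0158
```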

When we reflect our p-hat and p-value onto the left-tail of the distribution...

...we're able to see that each tail's area of 0.0158 fits within its half of our alpha level (0.05 / 2 = 0.025)!

Step 4 - Make your concluding statement

Your concluding statement is going to center around the alpha level declared in the problem. In most cases, that alpha level will be 0.05. Each problem should explicitly state the alpha level. In our problem, it's 0.05.

As stated in What is a hypothesis test, in relation to a sample mean?...

- If the p-value is below the alpha level, then we reject the null hypothesis
- If the p-value is above the alpha level, then we fail to reject the null hypothesis.

Since we're dealing with a two-tail test here, our alpha level (⍺) was split among the left-tail and right-tail of our sampling distribution.

That meant that we reflected (a.k.a. duplicated, or "x2") our p-value among both tails...

...therefore, our total p-value was:

0.0158 x 2 = 0.0316

A p-value of 0.0316 is below our alpha level of 0.05, therefore we'll reject our null hypothesis!
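The decision rule above can be sketched in a few lines. This assumes the right-tail area of 0.0158 found in Step 3, and the function name is made up for this illustration.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

right_tail = 0.0158              # one-tail area from Step 3
two_tail_p = 2 * right_tail      # two-tail test, so double it
print(round(two_tail_p, 4), decide(two_tail_p))
```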

(If you want an example of us failing to reject the null hypothesis, click here to see that in What is a hypothesis test, in relation to a sample mean?)

What does it mean to "reject" the null hypothesis?

When we reject the null hypothesis, we're essentially saying that we have enough evidence to support the alternative hypothesis.

Why is this the case?

Because our p-value, the probability of getting sample results at least as extreme as ours if the null hypothesis were true, is below our alpha level!

In other words, if the null hypothesis was actually a truthful claim about the population, then the probability of a sample of the same size producing results at least as extreme as ours (the p-value) is so low (below the alpha level) that it indicates the results of our sample hold statistical significance and are likely due to something other than random chance.

Since the probability of our sample results occurring was so low, that means we have enough evidence to reject the null hypothesis (the baseline claim about the population parameter) and support the alternative hypothesis.

When you reject the null hypothesis, you are saying that the outcome of the sample was statistically significant enough to support the alternative. It does not mean you accept the alternative hypothesis!

NOTE: we are NOT accepting the alternative hypothesis! That would mean that we are 100% certain that the alternative hypothesis is true... which is not the case. Rather, the outcome of our sample provided enough evidence to support the alternative hypothesis we proposed, and therefore we "reject" the null hypothesis.

Write your concluding statement (z-score edition)

Here is the template to write your concluding statement with z-scores:

Since our p-value of [p-value] is [less / greater] than our alpha level of [alpha level value], we [reject / fail to reject] the null hypothesis and [do / don't] have enough evidence to support the alternative hypothesis, implying that [description of alternative hypothesis].

Based on what we found with the Crammer Nation University Greek Life involvement situation above, here's what our concluding statement would look like:

Since our p-value of 0.0316 is less than our alpha level of 0.05, we reject the null hypothesis and do have enough evidence to support the alternative hypothesis, implying that the proportion of freshmen students at Crammer Nation University who joined Greek Life this year is not equal to 0.30.
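If you find yourself writing lots of these, the template can be filled in mechanically. Here's a minimal sketch. The function and parameter names are made up for this illustration, and the wording simply follows the template above.

```python
def concluding_statement(p_value, alpha, alternative_description):
    """Fill in the concluding-statement template for a z-score hypothesis test."""
    if p_value < alpha:
        comparison, decision, evidence = "less than", "reject", "do"
    else:
        comparison, decision, evidence = "greater than", "fail to reject", "don't"
    return (
        f"Since our p-value of {p_value} is {comparison} our alpha level of {alpha}, "
        f"we {decision} the null hypothesis and {evidence} have enough evidence to "
        f"support the alternative hypothesis, implying that {alternative_description}."
    )

statement = concluding_statement(
    0.0316,
    0.05,
    "the proportion of freshmen students at Crammer Nation University "
    "who joined Greek Life this year is not equal to 0.30",
)
print(statement)
```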