What is a z-score, in relation to a sample proportion?

In What is a z-score, in relation to a data point? and What is a z-score, in relation to a sample mean?, we focused on means of numeric values. We were dealing with questions like this...

What's the probability of finding a sample of 30 people with a mean IQ score of less than 105?

...which deal with the mean of a numeric value (IQ score).

A problem is working with sample means if it's dealing with the average of a set of numeric values from a sample.

What if, instead, we were given a situation like this?

Crammer Nation University released news that 30% of the most recent freshmen class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

This situation is a little different, because we're no longer dealing with the average of a set of numeric values. We're dealing with a Yes/No ratio of students who did join Greek Life vs. those who didn't.

A problem is working with a sample proportion if it's dealing with a ratio derived from a set of Yes/No values.
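To make that definition concrete, here's a minimal Python sketch. The list `responses` is made-up illustrative data (18 Yes and 32 No, mirroring the counts in the prompt above), not real survey results.

```python
# Hypothetical Yes/No responses: True = joined Greek Life, False = did not.
# 18 Yes and 32 No, mirroring the counts in the prompt (illustrative data only).
responses = [True] * 18 + [False] * 32

# A sample proportion is the number of "Yes" values divided by the sample size.
sample_size = len(responses)
yes_count = sum(responses)  # True counts as 1, False counts as 0
sample_proportion = yes_count / sample_size

print(sample_proportion)  # 0.36
```

Notice there's no averaging of IQ-style measurements here, only counting Yes answers and dividing by how many people were asked.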

Z-score with a sample proportion explained

For the sake of continuity, let's zero in on the problem statement from above...

Crammer Nation University released news that 30% of the most recent freshmen class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

...and understand the inner workings of sample proportion distributions. Then, we'll be able to comprehend how to solve for the z-score!

Visualizing the population distribution

If you remember from What is a z-score, in relation to a data point?, we had a population distribution of all IQ scores that formed a bell curve like this:

Created with Statistics Kingdom

In the case of our above situation with Crammer Nation University Greek Life involvement, the population distribution would look like this...

...which looks a little different than a bell-curve. Why?

Because students are in one of two groups: (1) involved in Greek Life or (2) not involved in Greek Life. There are no numeric values here to create a bell-curve shape for the population distribution; each student is a simple Yes/No value.

So... how can we compute z-score of a sample proportion if we don't have a bell-curve shaped population distribution?

Herein lies the value of the Central Limit Theorem!

Addressing the Central Limit Theorem

The Central Limit Theorem states the following:

As you increase your sample size, your sampling distribution becomes approximately normal, whether or not your population distribution is normal!

You might've seen this in What is a z-score, in relation to a sample mean?, but I absolutely love this graphic from Statistics How To.

It's a fantastic illustration of how even though the distribution of dice rolls with a single die is completely flat (a.k.a. "uniform")...

...as we increase our sample size (the number of dice that we're rolling), our sampling distribution becomes more and more normal.
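If you'd like to see the dice example play out for yourself, here's a rough simulation sketch. The function names, the seed, and the number of trials are all arbitrary choices made for illustration; they don't come from the graphic.

```python
import random
import statistics

random.seed(0)  # arbitrary fixed seed so the run is repeatable

def sample_mean_of_dice(num_dice):
    """Roll `num_dice` fair six-sided dice and return their average."""
    rolls = [random.randint(1, 6) for _ in range(num_dice)]
    return statistics.mean(rolls)

def simulate(num_dice, trials=10_000):
    """Return `trials` sample means, each from rolling `num_dice` dice."""
    return [sample_mean_of_dice(num_dice) for _ in range(trials)]

for num_dice in (1, 2, 10):
    means = simulate(num_dice)
    # The center stays near 3.5 while the spread shrinks as num_dice grows.
    print(num_dice, round(statistics.mean(means), 2), round(statistics.stdev(means), 2))
```

With one die, every average from 1 to 6 is about equally likely. With ten dice, the averages cluster tightly around 3.5, which is the bell shape starting to show up.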

Visualizing the sampling distribution

Let's dig into a quick scenario to see how the Central Limit Theorem applies to our Greek Life involvement situation.

Let's say we took a sample of 50 random freshmen at Crammer Nation University and found that 25% of them joined Greek Life.

You decide to take another random sample of 50 freshmen and find that 30% of them joined Greek Life.

Out of pure curiosity, you take a third random sample of 50 freshmen and find that once again 30% of your sample joined Greek Life.

You take one last sample of 50 freshmen and find that this time, 35% of them joined Greek Life.

If we were to repeat this process for all potential samples of 50 freshmen at Crammer Nation University, these little dots would start to stack up on each other and form a bell-curve like this!

(Obviously there'd be way more black dots, but don't get caught up on that.)
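Here's a hedged simulation sketch of that stacking-up process. It can't run through every possible sample, so it assumes 20,000 simulated samples as a stand-in; the names `count_joined` and `NUM_SAMPLES` and the seed are made up for this example.

```python
import random
import statistics
from collections import Counter

random.seed(1)  # arbitrary fixed seed so the run is repeatable

P = 0.30              # population proportion from the prompt
N = 50                # sample size from the prompt
NUM_SAMPLES = 20_000  # assumed stand-in for "all potential samples"

def count_joined(p, n):
    """Simulate n freshmen; each one joins Greek Life with probability p."""
    return sum(1 for _ in range(n) if random.random() < p)

# One sample proportion (one "dot") per simulated sample of 50 freshmen.
joined_counts = [count_joined(P, N) for _ in range(NUM_SAMPLES)]
proportions = [count / N for count in joined_counts]

center = statistics.mean(proportions)
spread = statistics.stdev(proportions)
print(f"center = {center:.3f}, spread = {spread:.3f}")

# Crude text histogram: one '*' per 100 samples that landed on each proportion.
tally = Counter(joined_counts)
for count in sorted(tally):
    print(f"{count / N:.2f} {'*' * (tally[count] // 100)}")
```

The rows of stars pile up around 0.30 and thin out on both sides, just like the dots in the picture.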

Circling back to the Central Limit Theorem

See how even though our distribution of individual Greek Life involvement responses (the population distribution) was not a bell-curve...

...our distribution of sample proportions of sample size 50 (the sampling distribution) was a bell-curve?

That's because of the Central Limit Theorem!

Visually understanding our z-score

Now that we've got our sampling distribution (and it's a normal distribution with a bell-curve shape), visualizing the z-score will be very similar to how we did it in What is a z-score, in relation to a sample mean?

With our prompt...

Crammer Nation University released news that 30% of the most recent freshmen class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

...we essentially need to ask ourselves "Out of the distribution of all possible sample proportions with a sample size of 50...

...what percentage of sample proportions are greater than 0.36?"

How to calculate z-score with a sample proportion

Given our prompt from above...

Crammer Nation University released news that 30% of the most recent freshmen class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

...how can we find the z-score (and, in turn, the probability, a.k.a. the p-value)?

We'll use the below formula!
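Written out with the symbols used in the plug-in steps that follow (p, q, p-hat, and n), the standard z-score formula for a sample proportion is:

```latex
z = \frac{\hat{p} - p}{\sqrt{\dfrac{p\,q}{n}}}
```

Here p-hat is the sample proportion, p is the population proportion, q is 1 - p, and n is the sample size.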

We used standard error (SE) with sample means. Where is it for sample proportions?

It still exists in the denominator...

...just like it did with sample means.

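The formula looks a little different because we're working with proportions. A Yes/No population doesn't come with a separately reported standard deviation; its spread is determined entirely by p and q, as the square root of p times q!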
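Side by side, the two standard errors look like this. The symbol σ (the population standard deviation from the sample-mean case) is an assumption here, since it doesn't appear elsewhere in this article.

```latex
\text{sample means: } SE = \frac{\sigma}{\sqrt{n}}
\qquad
\text{sample proportions: } SE = \sqrt{\frac{p\,q}{n}} = \frac{\sqrt{p\,q}}{\sqrt{n}}
```

In other words, the square root of p times q fills the slot that σ fills for numeric values.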

Plugging in the population proportion

The prompt states that the population proportion is 30%...

Crammer Nation University released news that 30% of the most recent freshmen class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

...therefore, we'll plug in 0.30 for p!

Plugging in the population proportion of failure

The population proportion of failure, q, is essentially the opposite of p. It's the proportion of students not involved in Greek Life at Crammer Nation University, which we can solve with the following formula:

q = 1 - p

When we plug in p as 0.30...

q = 1 - 0.30

...this results in q equaling 0.70.

q = 0.70

Therefore, let's plug in 0.70 for q!

Plugging in the sample proportion

The sample proportion is the proportion of students out of our sample who are involved in Greek Life. Our sample size was 50...

Crammer Nation University released news that 30% of the most recent freshmen class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

...and 18 of those students were involved in Greek Life.

Therefore, our sample proportion is 0.36.

18 / 50 = 0.36

Let's go ahead and plug in 0.36 for p-hat!

Plugging in the sample size

As we stated before, our sample size was 50...

Crammer Nation University released news that 30% of the most recent freshmen class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

...so let's plug in 50 for n!

Solving for z-score

When we solve the equation...

...we get a z-score of 0.92 (about 0.9258, cut to two decimal places for the z-table)!
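If you'd rather check the arithmetic in code, here's a small sketch using the values plugged in above. The variable names are my own; only the numbers come from the prompt.

```python
import math

p = 0.30         # population proportion (from the prompt)
q = 1 - p        # population proportion of failure
p_hat = 18 / 50  # sample proportion
n = 50           # sample size

# Standard error sits in the denominator of the z-score formula.
standard_error = math.sqrt(p * q / n)
z = (p_hat - p) / standard_error

print(round(standard_error, 4))  # 0.0648
print(round(z, 4))               # 0.9258
```

The unrounded result is about 0.9258; the walkthrough continues with 0.92.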

How to associate a p-value to your z-score

This process is the exact same as it was in What is a z-score, in relation to a data point? and What is a z-score, in relation to a sample mean?

Using the z-table...

...find "0.9" in the left-hand column (the ones and tenths digits of 0.92)...

...and then "0.02" in the top row (the hundredths digit of 0.92)...

...to locate our p-value of 0.8212!
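If you don't have a z-table handy, the same lookup can be sketched with Python's standard library. `NormalDist().cdf` returns the area under the standard normal curve to the left of a z-score, which is the same quantity the z-table reports.

```python
from statistics import NormalDist

z = 0.92

# Area under the standard normal curve (mean 0, standard deviation 1)
# to the left of z.
area_left = NormalDist().cdf(z)

print(round(area_left, 4))  # 0.8212
```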

Understanding your p-value visually

Once again... this process is literally the exact same as What is a z-score, in relation to a data point? and What is a z-score, in relation to a sample mean?

Our p-value of 0.8212 means that on our distribution of sample proportions of sample size 50, the area under the curve to the left of a sample proportion of 0.36...

...is equal to 0.8212, or 82.12%.

However, in our prompt we're looking for the percentage of sample proportions of sample size 50 that are greater than 0.36...

Crammer Nation University released news that 30% of the most recent freshmen class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

...therefore, to find the area to the right of our z-score...

...all we need to do is subtract 0.8212 from 1.00 (since the area under the curve is equal to 1.00)...

1.00 - 0.8212 = 0.1788

...to get 0.1788, or 17.88%!

This answers our prompt! The chance of finding a sample of 50 students with a sample proportion greater than 0.36 is 17.88%!
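As a final check, here's a sketch that computes the right-hand area in code. It does it twice: once with the two-decimal z-score of 0.92 used in the z-table, and once with the unrounded z-score, which lands slightly lower. The variable names are assumptions for this example.

```python
import math
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Right-tail area using the two-decimal z-score from the z-table lookup.
area_right_table = 1 - standard_normal.cdf(0.92)

# Right-tail area using the unrounded z-score from the formula.
z_exact = (0.36 - 0.30) / math.sqrt(0.30 * 0.70 / 50)
area_right_exact = 1 - standard_normal.cdf(z_exact)

print(round(area_right_table, 4))  # 0.1788
print(round(area_right_exact, 4))
```

The small gap between the two results comes purely from how the z-score was rounded before the lookup.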

