In What is a z-score, in relation to a data point? and What is a z-score, in relation to a sample mean?, we focused on *means*. We were dealing with questions like this...

What's the probability of finding a sample of 30 people with a mean IQ score of less than 105?

...which deal with the *mean* of a numeric value (IQ score).

A problem is working with sample **means** if it's dealing with the average of a set of **numeric values** from a sample.

What if instead, we were prompted with a situation like so?

Crammer Nation University released news that 30% of the most recent freshman class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

This situation is a little different, because we're no longer dealing with the average of a set of numeric values. We're dealing with a Yes/No ratio of students who did join Greek Life vs. those who didn't.

A problem is working with a sample **proportion** if it's dealing with a ratio derived from a set of Yes/No values.

## Z-score with a sample proportion explained

For the sake of continuity, let's zone in on the problem statement we stated above...

Crammer Nation University released news that 30% of the most recent freshman class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

...and understand the inner-workings of sample proportion distributions. Then, we'll be able to comprehend how to solve for z-score!

### Visualizing the population distribution

If you remember from What is a z-score, in relation to a data point?, we had a population distribution of all IQ scores that formed a bell curve like this:

In the case of our above situation with Crammer Nation University Greek Life involvement, the population distribution would look like this...

...which looks a little different than a bell-curve. Why?

Because students are in one of two groups: (1) involved in Greek Life or (2) not involved in Greek Life. There are no *numeric* values here to create a bell-curve shape for the population distribution; each response is a simple *Yes/No* value.

So... how can we compute z-score of a sample proportion if we don't have a bell-curve shaped population distribution?

Herein lies the value of the Central Limit Theorem!

### Addressing the Central Limit Theorem

The Central Limit Theorem states the following:

The **Central Limit Theorem** states that as you increase your sample size, your sampling distribution will become approximately normal, no matter if your population distribution is normal or not!

You might've seen this in What is a z-score, in relation to a sample mean?, but I absolutely *love* this graphic from Statistics How To.

It's a fantastic illustration of how even though the distribution of dice rolls with a single die is completely flat (a.k.a. "uniform")...

...as we increase our sample size (the number of dice that we're rolling), our sampling distribution becomes more and more normal.
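If you'd like to see this for yourself, here's a minimal simulation sketch. It is only an illustration: the function name `mean_of_dice`, the seed, and the number of repetitions are my own arbitrary choices, not something taken from the graphic. It rolls 1, 2, and then 30 dice over and over, and reports how tightly the averages cluster.

```python
import random
import statistics

def mean_of_dice(num_dice, rng):
    """Roll `num_dice` fair six-sided dice and return their average."""
    return statistics.mean(rng.randint(1, 6) for _ in range(num_dice))

rng = random.Random(42)  # fixed seed so the run is repeatable (arbitrary choice)
repetitions = 10_000     # number of simulated samples (arbitrary choice)

for num_dice in (1, 2, 30):
    averages = [mean_of_dice(num_dice, rng) for _ in range(repetitions)]
    print(
        f"{num_dice:>2} dice: mean of averages = {statistics.mean(averages):.2f}, "
        f"spread (std dev) = {statistics.stdev(averages):.2f}"
    )
```

The averages should stay centered near 3.5 while the spread shrinks as the number of dice grows. Plotting a histogram of `averages` for 30 dice is the quickest way to see the bell shape itself.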

### Visualizing the sampling distribution

Let's dig into a quick scenario to see how the Central Limit Theorem applies to our Greek Life involvement situation.

Let's say we took a sample of 50 random freshmen at Crammer Nation University and found that 26% of them (13 students) joined Greek Life.

You decide to take another random sample of 50 freshmen and find that 30% of them joined Greek Life.

Out of pure curiosity, you take a third random sample of 50 freshmen and find that once again 30% of your sample joined Greek Life.

You take one last sample of 50 freshmen and find that this time, 34% of them (17 students) joined Greek Life.

If we were to repeat this process for *all potential samples of 50 freshmen at Crammer Nation University*, these little dots would start to stack up on each other and form a bell-curve like this!

(Obviously there'd be *way* more black dots, but don't get caught up on that.)

#### Circling back to the Central Limit Theorem

See how even though our distribution of *individual* Greek Life involvement responses (the population distribution) was not a bell-curve...

...our distribution of sample proportions of sample size 50 (the sampling distribution) was a bell-curve?

That's because of the Central Limit Theorem!
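We obviously can't survey every possible sample of 50 freshmen, but we can fake it. The sketch below is a simulation under assumptions: it treats each student as an independent Yes with probability 0.30, and the function name `sample_proportion`, the seed, and the 20,000 repetitions are arbitrary choices of mine.

```python
import random
import statistics

def sample_proportion(p, n, rng):
    """Draw n Yes/No responses (Yes with probability p) and return the share of Yes."""
    yes_count = sum(rng.random() < p for _ in range(n))
    return yes_count / n

rng = random.Random(0)   # arbitrary seed for repeatability
p, n = 0.30, 50          # population proportion and sample size from the prompt
proportions = [sample_proportion(p, n, rng) for _ in range(20_000)]

print(f"center of the sample proportions: {statistics.mean(proportions):.3f}")
print(f"spread of the sample proportions: {statistics.stdev(proportions):.3f}")
print(f"theoretical standard error:       {(p * (1 - p) / n) ** 0.5:.3f}")
```

The center should land close to 0.30, and the simulated spread should sit close to the theoretical standard error of roughly 0.065, even though every individual response is just a Yes or a No.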

### Visually understanding our z-score

Now that we've got our sampling distribution (and it's a normal distribution with a bell-curve shape), visualizing z-score will be very similar to how we did it in What is a z-score, in relation to a sample mean?

With our prompt...

Crammer Nation University released news that 30% of the most recent freshman class joined Greek Life. You conduct a sample of 50 randomly selected freshmen and find that 18 of them joined a Greek Life chapter. What is the probability of running another sample with a proportion greater than the one found in your first sample?

...we essentially need to ask ourselves "Out of the distribution of *all possible sample proportions with a sample size of 50*...

...what percentage of sample proportions are greater than 0.36?"

## How to calculate z-score with a sample proportion

Given our prompt from above...

...how can we find the z-score (to then in-turn find the probability, a.k.a. the p-value)?

We'll use the below formula!
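Written out with the symbols we plug in over the next few steps (**p-hat**, **p**, **q**, and **n**), the standard z-score formula for a sample proportion is:

$$
z = \frac{\hat{p} - p}{\sqrt{\dfrac{p \, q}{n}}}
$$

Here $\hat{p}$ is the sample proportion, $p$ is the population proportion, $q = 1 - p$, and $n$ is the sample size.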

A standard error still exists in the denominator...

...just like it did with sample means.

The formula is a little different due to the fact that we're utilizing proportions, and we aren't handed a separate standard deviation. The spread comes from **p** and **q** instead!

### Plugging in the population proportion

Based on the prompt, it states that the population proportion is 30%...

...therefore, we'll plug in 0.30 for **p**!

### Plugging in the population proportion of failure

The population proportion of failure, **q**, is essentially the opposite of **p**. It's the proportion of students *not* involved in Greek Life at Crammer Nation University, which we can solve with the following formula:

**q** = 1 - **p**

When we plug in **p** as 0.30...

**q** = 1 - 0.30

...this results in **q** equaling 0.70.

**q** = 0.70

Therefore, let's plug in 0.70 for **q**!

### Plugging in the sample proportion

The sample proportion is the proportion of students out of our sample who are involved in Greek Life. Our sample size was 50...

...and 18 of those students were involved in Greek Life.

Therefore, our sample proportion is 0.36.

18 / 50 = 0.36

Let's go ahead and plug in 0.36 for **p-hat**!

### Plugging in the sample size

As we stated before, our sample size was 50...

...so let's plug in 50 for **n**!

### Solving for z-score

When we solve the equation...

...we get a z-score of 0.92 (if we round the standard error to 0.065 before dividing)!
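If you'd rather let Python do the arithmetic, here's a minimal sketch of the same plug-and-solve steps. The variable names are my own, and it assumes the standard formula z = (p-hat - p) / sqrt(p * q / n).

```python
import math

p = 0.30          # population proportion from the prompt
q = 1 - p         # population proportion of "failure"
n = 50            # sample size
p_hat = 18 / n    # sample proportion (18 of 50 students)

standard_error = math.sqrt(p * q / n)
z = (p_hat - p) / standard_error

print(f"standard error = {standard_error:.4f}")
print(f"z-score        = {z:.4f}")
```

With no intermediate rounding this prints a z-score of about 0.9258. Dividing 0.06 by a standard error rounded to 0.065 is what gives the 0.92 used in the rest of this walkthrough.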

## How to associate a p-value to your z-score

This process is the exact same as it was in What is a z-score, in relation to a data point? and What is a z-score, in relation to a sample mean?

Using the z-table...

...find "0.9" in the left-hand column (the first two digits of 0.92)...

...and then "0.02" in the top row (the hundredths digit of 0.92)...

...to locate our p-value of 0.8212!
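No z-table handy? As a sketch, Python's built-in `statistics.NormalDist` can stand in for the lookup, since its `cdf` method returns the area to the left of a value. The variable names here are my own.

```python
from statistics import NormalDist

z = 0.92  # z-score from the previous step, rounded to two decimals like the z-table
area_to_left = NormalDist().cdf(z)  # standard normal: mean 0, standard deviation 1

print(f"area to the left of z = {z}: {area_to_left:.4f}")
```

This should agree with the 0.8212 we found in the z-table.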

### Understanding your p-value visually

Once again... this process is *literally* the exact same as What is a z-score, in relation to a data point? and What is a z-score, in relation to a sample mean?

Our p-value of 0.8212 means that on our distribution of sample proportions of sample size 50, the area under the curve to the left of a sample proportion of 0.36...

...is equal to 0.8212, or 82.12%.

However, in our prompt we're looking for the percentage of sample proportions of sample size 50 that are *greater than* 0.36...

...therefore, to find the area to the *right* of our z-score...

...all we need to do is subtract 0.8212 from 1.00 (since the area under the curve is equal to 1.00)...

1.00 - 0.8212 = 0.1788

...to get 0.1788, or 17.88%!

This answers our prompt! The chance of finding a sample of 50 students with a sample proportion greater than 0.36 is 17.88%!
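To tie the whole process together, here's a hedged end-to-end sketch. The helper name `prob_sample_proportion_above` is hypothetical (my own invention), and it relies on the same normal approximation we've been using throughout.

```python
import math
from statistics import NormalDist

def prob_sample_proportion_above(p_hat, p, n):
    """Approximate P(sample proportion > p_hat) using the normal approximation."""
    standard_error = math.sqrt(p * (1 - p) / n)
    z = (p_hat - p) / standard_error
    return 1 - NormalDist().cdf(z)  # area to the right of the z-score

print(f"{prob_sample_proportion_above(18 / 50, 0.30, 50):.4f}")
```

Because this version never rounds the z-score to two decimals, expect a result a hair under the 17.88% from the z-table (somewhere around 17.7%). The gap comes purely from rounding, not from a different method.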

I’m a Miami University (OH) 2021 alumnus who majored in Information Systems. At Miami, I tutored students in Python, SQL, JavaScript, and HTML for 2+ years. I’m a huge fantasy football fan, Marvel nerd, and love hanging out with my friends here in Chicago where I currently reside.