# z-scores and normal distributions

Here at Crammer Nation, we're big believers in explaining through examples. Let's learn about z-scores and normal distributions through an IQ score example!

## The IQ scenario explained

You've always been curious how smart you are compared to the average person, so you decide to take an IQ test. After taking the test, you obtain a score of 105.

Good for you! But... what does that score mean? How smart are you compared to the average person? You've always heard people claim they're "smarter than __% of the population", and you want to know what percent of people you're smarter than!

To formalize our question...

What percentage of the population has an IQ score less than 105?

Well, it is known that IQ scores follow a normal distribution like this...

...with a mean of 100 and a standard deviation of 15.

Because IQ scores follow a normal distribution, they also follow the Empirical Rule.

This means that 68% of all IQ scores fall within one standard deviation of the mean...

...and that 95% of all IQ scores fall within two standard deviations of the mean...

...and that 99.7% of all IQ scores fall within three standard deviations of the mean.
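To put actual IQ numbers on those three ranges, here's a quick sketch in Python. The helper name `iq_range` is made up for this example; the mean of 100 and standard deviation of 15 come straight from the scenario above.

```python
# Mean and standard deviation of IQ scores, as stated in the scenario.
MEAN = 100
STD_DEV = 15

# Approximate share of scores within k standard deviations (Empirical Rule).
EMPIRICAL_RULE = {1: "68%", 2: "95%", 3: "99.7%"}


def iq_range(k, mean=MEAN, std_dev=STD_DEV):
    """Return the (low, high) IQ range within k standard deviations of the mean."""
    return mean - k * std_dev, mean + k * std_dev


for k, share in EMPIRICAL_RULE.items():
    low, high = iq_range(k)
    print(f"About {share} of IQ scores fall between {low} and {high}")
```

So one standard deviation covers 85 to 115, two covers 70 to 130, and three covers 55 to 145.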

You scored a 105...

...so what percentage of IQ scores fall below your score?

In other words... what's your "__%" value in the statement that you are "smarter than __% of the population"?

To find this "__%" value, we must first determine the z-score associated with your IQ score. Then, we'll associate that with a p-value!

## Pause... let's hit rewind & make sure we understand.

I threw a LOT at you in the above text. Let's take a second and make sure we understand the important things.

### What's a "normal distribution"?

A normal distribution is a symmetrical, bell-shaped curve that is unimodal, meaning that it has one peak. The area under a normal distribution is equal to 1.00.

In our normal distribution for IQ scores, we can see that it has one peak at an x-value of 100 (since that's the mean of all IQ scores).

Additionally, the entire area under this normal distribution equals 1.00. This'll come into play with p-values, which are between 0.000 and 1.000.

Throughout the world, normal distributions occur very often. Whether it be exam scores, human height, blood pressure, etc... a lot of things in this world follow a normal distribution.

### How do IQ scores follow a normal distribution?

In this scenario, you scored a 105 on your IQ test.

You ask Johnny across the street, and he scored a 95.

Camila from class scored a 100.

You ask Joseph from your volleyball team, and he scored a 100 too.

If you were to ask everyone in the United States what their IQ score is, these little dots would start to stack up on each other and form a bell curve like this!

In summary...

A distribution is formed by all the different data points that occupy it.

In the case of a normal distribution, these data points typically congregate around the mean of all the data points (a.k.a. the center), causing the bell-shape to occur!
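If you'd like to watch the dots stack up yourself, here's a rough simulation. It's only a sketch: the 10,000 "people" are random draws from Python's `random.gauss`, not real survey data, and the 10-point bin width is an arbitrary choice for this example.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical survey: 10,000 simulated people, IQ drawn from a normal
# distribution with mean 100 and standard deviation 15.
scores = [random.gauss(100, 15) for _ in range(10_000)]

# Group scores into 10-point-wide bins (the "100" bin covers 100 up to just under 110).
bins = Counter(int(score // 10) * 10 for score in scores)

# Print a sideways histogram: one "#" per 50 people in the bin.
for start in sorted(bins):
    print(f"{start:>4}-{start + 9:<4} {'#' * (bins[start] // 50)}")
```

The longest bars should land in the bins right around 100, with the bars shrinking as you move away from the mean in either direction.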

Something important to understand here... if the normal distribution is formed by all the different data points under it, that means that the area under the normal distribution curve (the entire red area)...

...accounts for 100% of the data points.

That's why...

The total area under a normal distribution curve is 1.00, or 100%.
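You don't have to take that on faith. Here's a small numerical check, assuming Python's built-in `statistics.NormalDist` as a stand-in for the IQ curve. It adds up the area of lots of thin rectangles under the curve (the range of 0 to 200 and the number of rectangles are arbitrary choices for this sketch).

```python
from statistics import NormalDist

# The IQ curve from the scenario: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# Approximate the area under the curve with thin rectangles (a Riemann sum)
# from IQ 0 to IQ 200; scores outside that range are vanishingly rare.
steps = 2000
step = 200 / steps
area = sum(iq.pdf(i * step) * step for i in range(steps))

print(f"Area under the curve: {area:.4f}")
```

The sum should come out essentially equal to 1, which is the "100% of the data points" idea in number form.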

(We kinda already alluded to this in the picture below.)

### What's the "Empirical Rule"?

The Empirical Rule (commonly referred to as the "68-95-99.7 rule") is a defining trait of normal distributions. All normal distributions follow it.

The Empirical Rule states that in a normal distribution, approximately 68% of data points fall within 1 standard deviation (σ) of the mean, approximately 95% of data points fall within 2 standard deviations (σ) of the mean, and approximately 99.7% of data points fall within 3 standard deviations (σ) of the mean.

Put in visual terms (with standard deviation represented by "σ")...
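The same percentages can also be checked with a few lines of Python. This is just a sketch using the standard library's `statistics.NormalDist`; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"Within {k}σ of the mean: {share_within(k):.1%}")
```

The results land very close to 68%, 95%, and 99.7%. The rule's numbers are rounded, so expect small differences in the decimals.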

So... why is the Empirical Rule important?

It enables us to easily calculate the percentage of results that occur above / below a given data point. Here's a quick example:

If I had an IQ of 115...

...that would be 1 standard deviation away from the mean of all IQ scores of 100.

Considering that 68% of all results fall within 1 standard deviation of the mean...

...that means that between the mean and our data point of 115, there's 68% / 2 = 34% of all data points.

Considering that a normal distribution is symmetrical about the mean, this means that all the results to the left of the mean account for 50% of all data points...

...therefore, an IQ score of 115 indicates that the person is smarter than 34% + 50% = 84% of the population.
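Here's that same arithmetic as a short sketch, with a cross-check against the exact normal curve. The cross-check uses Python's `statistics.NormalDist` and assumes the mean of 100 and standard deviation of 15 from the scenario.

```python
from statistics import NormalDist

# Empirical Rule estimate: everything below the mean, plus half of the 68% band.
below_mean = 0.50
mean_to_115 = 0.68 / 2
estimate = below_mean + mean_to_115

# Cross-check against the exact normal curve for IQ (mean 100, sd 15).
exact = NormalDist(mu=100, sigma=15).cdf(115)

print(f"Empirical Rule estimate: {estimate:.0%}")
print(f"Exact normal value:      {exact:.1%}")
```

The two numbers agree to within a fraction of a percent, which is why the Empirical Rule is such a handy shortcut when your score sits exactly 1, 2, or 3 standard deviations from the mean.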

But... this gets a little tricky when you're given an IQ score like 105, which falls somewhere between 0 and 1 standard deviations from the mean.

That's where z-scores come into play!

### So... why are z-scores important?

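In essence, the z-score tells us how many standard deviations a given data point is away from the mean of our distribution. We can see this represented in its formula:

$$z = \frac{x - \mu}{\sigma}$$

Here, x is the data point, μ is the mean of the distribution, and σ is its standard deviation.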

Z-scores are valuable because they're a way to standardize the distance a given data point is from the mean across all different types of populations.

Think about it: IQ scores follow a normal distribution with a mean of 100 points and a standard deviation of 15 points. Human height (in the US) also follows a normal distribution, but with a mean of 70 inches and a standard deviation of 3 inches. These populations have different means, standard deviations, and units... and z-scores enable us to standardize those differences!
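To see that standardizing in action, here's a minimal sketch. The function name `z_score` and the two example values (an IQ of 115 and a height of 73 inches) are picked just for illustration; the means and standard deviations are the ones quoted above.

```python
def z_score(x, mean, std_dev):
    """How many standard deviations x sits above (+) or below (-) the mean."""
    return (x - mean) / std_dev


# Hypothetical example values: an IQ of 115 and a height of 73 inches.
iq_z = z_score(115, mean=100, std_dev=15)
height_z = z_score(73, mean=70, std_dev=3)

print(f"IQ 115 -> z = {iq_z}")
print(f"Height 73 in -> z = {height_z}")
```

Points and inches can't be compared directly, but both values come out to a z-score of 1. In other words, each one sits exactly 1 standard deviation above the mean of its own population.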

Z-scores are a standardized measure of how many standard deviations a given data point is away from the mean of its data set, given that the data set follows a normal distribution.
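As a sneak peek at where this is heading, here's a rough sketch of the z-score for your IQ of 105. The last step, turning the z-score into a percentage, leans on Python's `statistics.NormalDist` as an assumed stand-in for looking the value up by hand.

```python
from statistics import NormalDist

# Values from the scenario: your score, the mean, and the standard deviation.
score = 105
mean = 100
std_dev = 15

# Data point minus the mean, divided by the standard deviation.
z = (score - mean) / std_dev

# Share of a standard normal distribution (mean 0, sd 1) that falls below z.
share_below = NormalDist().cdf(z)

print(f"z-score: {z:.2f}")
print(f"Share of scores below 105: {share_below:.1%}")
```

The z-score comes out to roughly a third of a standard deviation above the mean, which is exactly the kind of in-between value the Empirical Rule alone can't handle.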