# T-scores

The first crucial thing to understand about t-scores is...

You will only use t-scores when working with sample means.

The second crucial thing to understand about t-scores is that, when working with sample means, you can use the following diagram to decide whether a t-score is needed (graphic from Statology):

Okay... now that we've got that out of the way, let's dig into why t-scores are actually needed in statistics.

## Why are t-scores necessary?

In most real-world scenarios, we're either (1) not going to be given the population standard deviation or (2) our sample size will be too small to ensure that our sampling distribution is approximately normal. In those cases, we need to utilize a t-score instead of a z-score.

t-scores are used to account for the absence of population standard deviation or a small sample size.

### Missing population standard deviation

When we don't have the population standard deviation (σ), we have to utilize our "best guess" at the population standard deviation with our sample standard deviation (s).

Considering that we're utilizing a "best guess"... we have to provide room for error in case the sample standard deviation (s) is not an accurate representation of the population standard deviation (σ).

This "room for error" is accounted for with t-scores!
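To make the "best guess" idea concrete, here is a minimal sketch of the standard one-sample t-score, t = (x̄ − μ₀) / (s / √n), which plugs the sample standard deviation (s) in where σ would normally go. The function name `t_score`, the hypothesized mean `mu0`, and the sample values are all assumptions made up for illustration; they don't come from this article.

```python
import math
import statistics

def t_score(sample, mu0):
    """One-sample t-score: (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to compute s")
    x_bar = statistics.mean(sample)
    # Sample standard deviation (divides by n - 1): our "best guess" at sigma
    s = statistics.stdev(sample)
    return (x_bar - mu0) / (s / math.sqrt(n))

# Hypothetical data, made up for illustration only
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(t_score(sample, mu0=4))
```

The only difference from a z-score calculation is that `s` stands in for σ; the t-distribution is what supplies the extra "room for error".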

### Sample size too small

This circles back to the Central Limit Theorem from z-scores with sampling distributions:

The Central Limit Theorem states that as you increase your sample size, the sampling distribution of the sample mean gets closer and closer to normal, no matter if your population distribution is normal or not!

If our sample size is too small, then we can't assume that the sampling distribution is a normal distribution. As a rule of thumb, the sample size must be at least 30 for us to make the assumption that it's normal.

This can be visualized in this graphic from Statistics How To:

Notice how as our sample size increases, we're getting closer and closer to a normal distribution shape like the one below that we used for z-scores.

While the distribution with "Ten dice" looks normal...

...it's close, but not close enough to treat as normal. And if we can't treat it as normal, we can't use z-scores. We have to settle for t-scores.

If there were a graphic with 30 dice, then we'd be able to use z-scores, since by the rule of thumb that distribution would be close enough to normal.

If our sample size is at least 30, then we can assume our sampling distribution has an approximately normal shape. If it's below 30, then we must utilize a t-distribution to account for the fact that our sampling distribution might not be normal.
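The dice idea can be checked numerically. The sketch below builds the exact distribution of the sum of several fair dice (the mean is just the sum divided by the number of dice, so it has the same shape) and measures how far that shape is from normal using excess kurtosis, which is exactly 0 for a normal distribution. Excess kurtosis is only one possible measure of shape, chosen here as an assumption for illustration, and the function names are hypothetical.

```python
def sum_distribution(num_dice):
    """Exact probability distribution of the sum of `num_dice` fair six-sided dice."""
    dist = {0: 1.0}
    for _ in range(num_dice):
        new_dist = {}
        for total, prob in dist.items():
            for face in range(1, 7):
                # Each face is equally likely, so spread the probability evenly
                new_dist[total + face] = new_dist.get(total + face, 0.0) + prob / 6
        dist = new_dist
    return dist

def excess_kurtosis(dist):
    """Excess kurtosis of a discrete distribution (0 for a normal distribution)."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    fourth = sum((x - mean) ** 4 * p for x, p in dist.items())
    return fourth / var ** 2 - 3

for n in (1, 2, 10, 30):
    print(n, excess_kurtosis(sum_distribution(n)))
```

The printed values move toward 0 as the number of dice grows, which is the Central Limit Theorem at work: the shape never becomes perfectly normal, it just gets close enough that the difference stops mattering.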

### Addressing degrees of freedom (df)

When working with t-scores, we'll utilize degrees of freedom (df) to determine the shape of our t-distribution. Degrees of freedom (df) are tied directly to your sample size: the larger your sample, the more degrees of freedom you have, and the more "normal" your t-distribution will be!

All you need to understand about degrees of freedom right now is the following:

When working with t-scores, to calculate your degrees of freedom (df), just subtract one from your sample size!
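As a tiny sketch of that rule (the function name is hypothetical, and the check for at least two observations is an added assumption, since a single observation leaves no degrees of freedom to work with):

```python
def degrees_of_freedom(sample_size):
    """Degrees of freedom for a one-sample t-score: sample size minus one."""
    if sample_size < 2:
        raise ValueError("need a sample size of at least 2")
    return sample_size - 1

print(degrees_of_freedom(12))
```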

### t-distributions vs. the normal distribution

In articles prior to this, we were only working with the normal distribution (also called the z-distribution, because it's used with z-scores).

t-scores, however, do not use the normal distribution! They utilize their own slightly different t-distribution.

Similar to how z-scores are utilized with normal distributions (also referred to as z-distributions) to determine p-values, t-scores are utilized with t-distributions to determine p-values.

Check out the below visual from JMP that visualizes t-distributions with varying degrees of freedom vs. the normal distribution.

The main takeaway is this: As our degrees of freedom increase from purple to blue to orange, the t-distribution gets closer and closer to looking like a normal distribution (represented with the green curve)!

This is because higher degrees of freedom are a direct result of a larger sample size. And when you have a larger sample size...

• Your sample standard deviation (s) is a more accurate representation of population standard deviation (σ)
• Your sampling distribution becomes more and more like a normal distribution due to the Central Limit Theorem

To be clear: the normal distribution is the "gold standard". You should always strive to use a normal distribution when you can. If you cannot (due to no population standard deviation or a sample size below 30), then you'll have to settle for a t-distribution.

A t-distribution is similar to a normal distribution, but is not exactly normal. It has heavier tails to account for an unknown population standard deviation (σ) or a sample size that is too small. The t-distribution becomes more and more normal as the degrees of freedom increase!

### This flowchart will save you some headache...

For the sake of your sanity, refer to the below graphic if you're ever unsure whether to use a t-score or a z-score.

Put in verbal form...

• If you don't have the population standard deviation (σ), utilize a t-score to provide "room for error" for using the sample standard deviation (s).
• If you have the population standard deviation (σ) but your sample size is below 30, utilize a t-score. If your sample is too small, the Central Limit Theorem can't guarantee that your sampling distribution is normal!
• If you have the population standard deviation (σ) and your sample size is at least 30, utilize a z-score!
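The same decision can be written as a small function. This is only a sketch of the rule for sample means: the function name and parameters are hypothetical, and it assumes a sample size of at least 30 counts as "large enough", matching the rule of thumb stated earlier in this article.

```python
def choose_score(sigma_known, sample_size, threshold=30):
    """Return "t" or "z" for a sample mean, following the rules listed above."""
    if not sigma_known:
        # No population standard deviation: s is only a best guess at sigma
        return "t"
    if sample_size < threshold:
        # Sample too small to lean on the Central Limit Theorem
        return "t"
    return "z"

print(choose_score(sigma_known=False, sample_size=100))
print(choose_score(sigma_known=True, sample_size=12))
print(choose_score(sigma_known=True, sample_size=50))
```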

## Why don't proportions use t-scores?

When we're not given the population standard deviation, we must use t-scores with sample means.

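However... proportions don't have a separate standard deviation to estimate! The spread of a sample proportion is determined entirely by the proportion itself, since its standard error is √(p(1 − p)/n). That's why with proportions, we're free to go ahead and utilize z-scores.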

Now remember: this does not mean that you can throw the assumptions out the window! You still need to check those for each sample that you work with.
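For comparison with sample means, here is a minimal sketch of a z-score for a sample proportion. Notice that no standard deviation gets passed in: the standard error comes straight from the hypothesized proportion. The function name and the example numbers are assumptions for illustration, and the sketch does not check the assumptions mentioned above.

```python
import math

def proportion_z_score(p_hat, p0, n):
    """z-score for a sample proportion p_hat against a hypothesized proportion p0."""
    if not 0 < p0 < 1:
        raise ValueError("p0 must be strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    # The standard error depends only on p0 and n, so nothing has to be estimated
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# Hypothetical numbers, made up for illustration only
print(proportion_z_score(p_hat=0.6, p0=0.5, n=100))
```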

## Practice, practice, practice!

To get some practice working with t-scores, let's go through Question #3 of the Practice Midterm!