# Chi Square – Independence (Hypothesis test)

Question: Crammer Nation University just completed their Greek Life rush. Sigma Apple Pi is curious whether the hat that a recruit wore during recruitment had an impact on their bid across all fraternities. They conduct a random sample of 85 recruits. Utilizing an alpha level of 0.05, compute a hypothesis test and determine whether these two variables are independent of one another.

As with all hypothesis test problems, we're going to take the following steps to solve!

• Step 1 - State the hypotheses
• Step 2 - Calculate the test statistic
• Step 3 - Find the p-value
• Step 4 - Make the concluding statement

## Step 1 - State the hypotheses

Let's start with the null hypothesis (H0), then we'll move onto the alternative hypothesis (Ha).

### Defining your null hypothesis (H0)

When working with Chi Square - Independence Test, your null hypothesis claims that the two variables are independent of each other.

We will write that like so:

H0: The variables are independent.

To be clear, the first variable we're testing is the type of hat they were wearing...

...and the second is whether or not they received a bid.

### Defining your alternative hypothesis (Ha)

When working with Chi Square - Independence Test, your alternative hypothesis claims that the two variables are not independent of each other.

We will write that like so:

H0: The variables are independent.
Ha: The variables are not independent.

In other words, the alternative hypothesis is claiming that the hat a recruit wore and whether or not they got a bid are not independent of each other!

Does this mean that they're "dependent" on each other?

Not necessarily, it just means that we have enough evidence to claim that they're not independent... a.k.a. they could be related.

Think about it: being related to something doesn't mean you're dependent on it!

## Step 2 - Calculate the test statistic

Before even beginning to calculate our test statistic, we have to check our assumptions!

We're working with Chi Square - Independence Test, therefore we need to check the following assumptions:

1. The data is counted.
2. The counts must be randomly selected from the population.
3. Each count must be 5 or greater.

Concerning #1, we're dealing with counts here, so this assumption is met!

Concerning #2, our counts were randomly taken from the population.

Concerning #3, all of our counts are greater than 5.

Therefore, all of our assumptions are passed! Now we can move onto calculating our test statistic!

### The Chi Square test statistic formula

Like we did in Chi Square - Goodness of Fit (Hypothesis test), we'll utilize the following formula when calculating the X2 test statistic for our sample:

X2 = Σ (obs - exp)^2 / exp

Just to be clear here, the "X2" represents the test statistic (also referred to as the "Chi Square test statistic")...

...and this "Σ" symbol...

...means that we're going to run this equation...

...on each cell in our table.

Let's go ahead and refactor this table setup so that it's easier to see our expected values for each cell, which will then enable us to calculate our "(obs - exp)^2 / exp" value for each cell.

### Calculating the expected values

To compute the expected value for each cell, we have to go back to the original table...

...and compute the following equation for each cell:

Expected value = (row total x column total) / table total

So, in the case of those who received a bid and wore a backwards hat...

...their row total was 39...

Expected value = (39 x column total) / table total

...and their column total was 34...

Expected value = (39 x 34) / table total

...and considering that the table total was 85...

Expected value = (39 x 34) / 85

...we can compute an Expected value of 15.6!

Expected value = (39 x 34) / 85 = 15.6
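As a quick check of the arithmetic above, here is a minimal Python sketch of the expected value formula. The function name `expected_count` is our own label for illustration, not something defined in this lesson.

```python
def expected_count(row_total, column_total, table_total):
    """Expected count for one cell if the two variables were independent."""
    return row_total * column_total / table_total


# Received a bid (row total 39) and wore a backwards hat (column total 34),
# out of 85 recruits in total.
backwards_bid_expected = expected_count(39, 34, 85)
print(round(backwards_bid_expected, 1))
```

The same function can be reused for every other cell by swapping in that cell's row total and column total.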

Let's go ahead and place that expected value into our refactored table.

When we solve out the expected values for the rest of these rows, we get the following output!

### Calculating the X2 value for each row

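Sticking with those who received a bid and wore a backwards hat, we take the observed count from the original table...

...and the expected value is 15.6...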

...therefore, the X2 value for this row is equal to 1.869.
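Here is a small sketch of that single-cell calculation in Python. One caveat: the observed count is not written out in the text above, so the value 21 below is an assumption, picked because it is the whole number that reproduces the stated result of 1.869 with an expected value of 15.6. The function name `cell_contribution` is also just an illustrative label.

```python
def cell_contribution(observed, expected):
    """One cell's share of the X2 statistic: (obs - exp)^2 / exp."""
    return (observed - expected) ** 2 / expected


# ASSUMPTION: the observed count (21) is not stated in the text; it is the
# whole number consistent with the stated result of 1.869.
observed = 21
expected = 15.6
print(round(cell_contribution(observed, expected), 3))
```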

When we compute this for each of the rows, we get the following values:

### Summing the X2 values

When we sum all of these values...

...we get a final X2 value of 8.541!
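To see the whole calculation in one place, here is a hedged Python sketch that computes every expected value and sums the per-cell values. The original table is not reproduced in this text, so the counts below are hypothetical: they were chosen only so that the totals match the ones stated above (a row total of 39, a column total of 34, a table total of 85). Treat them as a stand-in, not as the actual data. Because the code sums unrounded values, its result can differ from a hand total of rounded values in the third decimal place.

```python
def chi_square_statistic(observed):
    """X2 statistic for a table of counts given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    table_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected value = (row total x column total) / table total
            exp = row_totals[i] * column_totals[j] / table_total
            statistic += (obs - exp) ** 2 / exp
    return statistic


# HYPOTHETICAL counts (not taken from the lesson's table).
# Rows: received a bid, did not receive a bid.
# Columns: backwards hat, then two other hat categories.
hypothetical_table = [
    [21, 8, 10],
    [13, 7, 26],
]
print(round(chi_square_statistic(hypothetical_table), 3))
```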

## Step 3 - Find your p-value

We're going to use the X2 table for this...

...which corresponds to the area to the right of our X2 test statistic under the X2 curve, which typically looks something like this (it becomes broader with more degrees of freedom!):

Can I see how the X2 curve becomes broader with more degrees of freedom?

Going from 5 to 10 to 20 degrees of freedom, the curve shifts to the right and spreads out further each time.

Before we can find our p-value, we must determine our degrees of freedom!

### Calculating your degrees of freedom (df)

To calculate your degrees of freedom (df) with a Chi Square - Independence Test, you'll utilize the following formula:

df = (# of rows - 1) x (# of columns - 1)

In our case, we've got 2 rows...

df = (2 - 1) x (# of columns - 1)

...and 3 columns.

df = (2 - 1) x (3 - 1)

Therefore, we have 2 degrees of freedom!

df = (2 - 1) x (3 - 1) = 1 x 2 = 2
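The same formula as a tiny Python sketch (the function name `degrees_of_freedom` is ours, for illustration):

```python
def degrees_of_freedom(num_rows, num_columns):
    """df for a Chi Square - Independence Test: (rows - 1) x (columns - 1)."""
    return (num_rows - 1) * (num_columns - 1)


# Our table has 2 rows and 3 columns (totals are not counted).
print(degrees_of_freedom(2, 3))
```

Only the category rows and columns are counted here; any "total" row or column in the table is left out.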

### Recognizing our alpha level (α)

Considering that our alpha level (α) is 0.05...

...this means that on our X2 curve with 2 degrees of freedom...

...we'll be assessing if our p-value occurs in this alpha level (α) area!

### Finding our p-value

We know our X2 value is 8.541, so that means we'll be using the X2 table to find a range of X2 values that our X2 value fits between.

To start, let's zone in on the row corresponding to 2 degrees of freedom (df).

From here, we can identify that our X2 value of 8.541 falls between 7.378 and 9.210.

This corresponds to a p-value between 0.025 and 0.01!

0.01 < p < 0.025
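If you want to double-check the table lookup, there is a shortcut that works only when there are exactly 2 degrees of freedom: the area to the right of a value x under the X2 curve equals e^(-x/2). The sketch below uses that special case with Python's standard library; for any other df you would need the table or a statistics package. The function name is our own.

```python
import math


def chi_square_right_tail_df2(x2_value):
    """Area to the right of x2_value under the X2 curve, valid for df = 2 only."""
    return math.exp(-x2_value / 2)


# X2 value from Step 2.
p_value = chi_square_right_tail_df2(8.541)
print(round(p_value, 4))

# Check that it lands inside the range read off the table.
print(0.01 < p_value < 0.025)
```

Plugging the table's boundary values 7.378 and 9.210 into the same function gives areas very close to 0.025 and 0.01, which is why those two columns bracket our p-value.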

### Visualizing our p-value

Since the X2 table represents p-values to the right of our X2 value of 8.541...

...this means that the area to the right of our X2 value has a p-value somewhere between 0.01 and 0.025.

This provides us with all the information that we need to know! It shows us that our p-value (corresponding to our X2 value) is within our alpha level (α) of 0.05!

Why do we not need an exact p-value?

Because in hypothesis tests, all that matters is whether or not your p-value is above or below your alpha level!

Knowing that our p-value lies somewhere between 0.01 and 0.025...

0.01 < p < 0.025

...is enough intel for us to determine that our actual p-value corresponding to our X2 value of 8.541 is below our alpha level!

## Step 4 - Make your concluding statement

Your concluding statement is going to center around the alpha level declared in the problem. In most cases, that alpha level will be 0.05. Each problem should explicitly state the alpha level. In our problem, it's 0.05.

As we stated in What is a hypothesis test?...

- If the p-value is below the alpha level, then we reject the null hypothesis
- If the p-value is above the alpha level, then we fail to reject the null hypothesis.

In our case, our p-value range is below the alpha level. Remember: all values between 0.01 and 0.025 are less than 0.05!

0.01 < p < 0.025

This means that we reject our null hypothesis!
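The decision rule can be sketched in Python as well. Since the table only gives us a range for the p-value, this version takes the lower and upper ends of that range. The function name and the wording of the third outcome are our own assumptions: the lesson does not cover the case where the range straddles the alpha level, so the sketch simply flags it.

```python
def conclude_from_range(p_lower, p_upper, alpha):
    """Apply the reject / fail-to-reject rule to a p-value range."""
    if p_upper < alpha:
        # The whole range is below alpha.
        return "reject the null hypothesis"
    if p_lower > alpha:
        # The whole range is above alpha.
        return "fail to reject the null hypothesis"
    # ASSUMPTION: not covered in the lesson; the range alone cannot decide.
    return "range straddles alpha; a more precise p-value is needed"


print(conclude_from_range(0.01, 0.025, 0.05))
```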

### Applying the Chi Square answer template

If you remember in What is a hypothesis test?, we gave the following answer template when working with t-scores:

Since our p-value range of [p-value range] is [less / greater] than our alpha level of [alpha level value], we [reject / fail to reject] the null hypothesis. We [do / don't] have enough evidence to support the alternative hypothesis, which states that [description of the alternative hypothesis].

We're going to use this same exact template for Chi Square tests!

Applied to our question, this would give us the following answer to our original question!

Answer: Since our p-value range of 0.01 < p < 0.025 is less than our alpha level of 0.05, we reject the null hypothesis. We do have enough evidence to support the alternative hypothesis, which states that the hat that a recruit wore and whether or not they received a bid are not independent of each other at Crammer Nation University.