Chi Square – Independence (Hypothesis test)

Question: Crammer Nation University just completed their Greek Life rush. Sigma Apple Pi is curious whether the hat that a recruit wore during recruitment had an impact on their bid across all fraternities. They conduct a random sample of 85 recruits. Utilizing an alpha level of 0.05, compute a hypothesis test and determine whether these two variables are independent of one another.

| | Backwards hat | No hat | Bucket hat | Total |
|---|---|---|---|---|
| Received bid | 21 | 10 | 8 | 39 |
| Did not receive bid | 13 | 26 | 7 | 46 |
| Total | 34 | 36 | 15 | 85 |

As with all hypothesis test problems, we're going to take the following steps to solve!

  • Step 1 - State the hypotheses
  • Step 2 - Calculate the test statistic
  • Step 3 - Find the p-value
  • Step 4 - Make the concluding statement

Step 1 - State the hypotheses

Let's start with the null hypothesis (H0), then we'll move on to the alternative hypothesis (Ha).

Defining your null hypothesis (H0)

When working with a Chi Square - Independence Test, your null hypothesis claims that the two variables are independent of each other.

We will write that like so:

H0: The variables are independent.

To be clear, the first variable we're testing is the type of hat they were wearing...


...and the second is whether or not they received a bid.


Defining your alternative hypothesis (Ha)

When working with a Chi Square - Independence Test, your alternative hypothesis claims that the two variables are not independent of each other.

We will write that like so:

H0: The variables are independent.
Ha: The variables are not independent.

In other words, the alternative hypothesis is claiming that the hat a recruit wore and whether or not they got a bid are not independent of each other!

Does this mean that they're "dependent" on each other?

Not necessarily, it just means that we have enough evidence to claim that they're not independent... a.k.a. they could be related.

Think about it: being related to something doesn't mean you're dependent on it!

Step 2 - Calculate the test statistic

Before even beginning to calculate our test statistic, we have to check our assumptions!

We're working with a Chi Square - Independence Test, therefore we need to check the following assumptions:

1. The data are counts.
2. The counts must be randomly selected from the population.
3. Each count must be 5 or greater.

Concerning #1, every cell in our table is a count of recruits, so this assumption is met!

Concerning #2, the question states that a random sample of 85 recruits was taken, so our counts were randomly selected from the population.

Concerning #3, all of our counts are 5 or greater (the smallest cell count is 7).

Therefore, all of our assumptions are met! Now we can move on to calculating our test statistic!

The Chi Square test statistic formula

Like we did in Chi Square - Goodness of Fit (Hypothesis test), we'll utilize the following formula when calculating the X² test statistic for our sample:

X² = Σ [ (obs - exp)² / exp ]

Just to be clear here, the "X²" represents the test statistic (also referred to as the "Chi Square test statistic"), and the "Σ" symbol means that we're going to run the "(obs - exp)² / exp" equation on each cell in our table, then add up the results.


Let's go ahead and refactor this table setup so that it's easier to see our expected values for each cell, which will then enable us to calculate our "(obs - exp)² / exp" value for each cell.

| Description | Observed | Expected | (obs - exp)² / exp |
|---|---|---|---|
| Received bid, wore backwards hat | 21 | | |
| Received bid, wore no hat | 10 | | |
| Received bid, wore bucket hat | 8 | | |
| Denied bid, wore backwards hat | 13 | | |
| Denied bid, wore no hat | 26 | | |
| Denied bid, wore bucket hat | 7 | | |

Calculating the expected values

To compute the expected value for each cell, we have to go back to the original table...


...and compute the following equation for each cell:

Expected value = (row total x column total) / table total

So, in the case of those who received a bid and wore a backwards hat...

...their row total (the "Received bid" row) was 39...

Expected value = (39 x column total) / table total

...and their column total (the "Backwards hat" column) was 34...

Expected value = (39 x 34) / table total

...and considering that the table total was 85...

Expected value = (39 x 34) / 85

...we can compute an expected value of 15.6!

Expected value = (39 x 34) / 85 = 15.6

Let's go ahead and place that expected value into our refactored table.

| Description | Observed | Expected | (obs - exp)² / exp |
|---|---|---|---|
| Received bid, wore backwards hat | 21 | (39 x 34) / 85 = 15.6 | |
| Received bid, wore no hat | 10 | | |
| Received bid, wore bucket hat | 8 | | |
| Denied bid, wore backwards hat | 13 | | |
| Denied bid, wore no hat | 26 | | |
| Denied bid, wore bucket hat | 7 | | |

When we solve out the expected values for the rest of these rows, we get the following output!

| Description | Observed | Expected | (obs - exp)² / exp |
|---|---|---|---|
| Received bid, wore backwards hat | 21 | (39 x 34) / 85 = 15.6 | |
| Received bid, wore no hat | 10 | (39 x 36) / 85 = 16.518 | |
| Received bid, wore bucket hat | 8 | (39 x 15) / 85 = 6.882 | |
| Denied bid, wore backwards hat | 13 | (46 x 34) / 85 = 18.4 | |
| Denied bid, wore no hat | 26 | (46 x 36) / 85 = 19.482 | |
| Denied bid, wore bucket hat | 7 | (46 x 15) / 85 = 8.118 | |

Note that the "Denied bid" rows use their own row total of 46, not 39.
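If you'd like to double-check the expected values without a calculator, here is a minimal Python sketch of the same "(row total x column total) / table total" rule. The variable names (`observed`, `row_totals`, `col_totals`, `expected`) are just illustrative choices, not something from the original problem.

```python
# Observed counts from the question's table.
# Rows: bid outcome. Columns: backwards hat, no hat, bucket hat.
observed = [
    [21, 10, 8],   # received bid
    [13, 26, 7],   # did not receive bid
]

row_totals = [sum(row) for row in observed]         # [39, 46]
col_totals = [sum(col) for col in zip(*observed)]   # [34, 36, 15]
table_total = sum(row_totals)                       # 85

# Expected value = (row total x column total) / table total, for every cell.
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

for row in expected:
    print([round(value, 3) for value in row])
# [15.6, 16.518, 6.882]
# [18.4, 19.482, 8.118]
```

Each expected value uses the total of its own row, which is why the second row is built from 46 rather than 39.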

Calculating the X² value for each row

Let's start with the first row, "Received bid, wore backwards hat". The observed value is 21...

(21 - exp)² / exp

...and the expected value is 15.6...

(21 - 15.6)² / 15.6

...therefore, the X² value for this row is equal to 1.869.

(21 - 15.6)² / 15.6 = 29.16 / 15.6 = 1.869

When we compute this for each of the rows, we get the following values:

| Description | Observed | Expected | (obs - exp)² / exp |
|---|---|---|---|
| Received bid, wore backwards hat | 21 | 15.6 | (21 - 15.6)² / 15.6 = 1.869 |
| Received bid, wore no hat | 10 | 16.518 | (10 - 16.518)² / 16.518 = 2.572 |
| Received bid, wore bucket hat | 8 | 6.882 | (8 - 6.882)² / 6.882 = 0.181 |
| Denied bid, wore backwards hat | 13 | 18.4 | (13 - 18.4)² / 18.4 = 1.585 |
| Denied bid, wore no hat | 26 | 19.482 | (26 - 19.482)² / 19.482 = 2.180 |
| Denied bid, wore bucket hat | 7 | 8.118 | (7 - 8.118)² / 8.118 = 0.154 |

Summing the X² values

When we sum all of these values...

| Description | Observed | Expected | (obs - exp)² / exp |
|---|---|---|---|
| Received bid, wore backwards hat | 21 | 15.6 | 1.869 |
| Received bid, wore no hat | 10 | 16.518 | 2.572 |
| Received bid, wore bucket hat | 8 | 6.882 | 0.181 |
| Denied bid, wore backwards hat | 13 | 18.4 | 1.585 |
| Denied bid, wore no hat | 26 | 19.482 | 2.180 |
| Denied bid, wore bucket hat | 7 | 8.118 | 0.154 |
| Total | | | 8.541 |

...we get a final X² value of 8.541!
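As a sanity check on the sum, the whole Step 2 calculation can be sketched in a few lines of Python. This is only an illustration with made-up variable names (`observed`, `expected`, `chi_square`); it recomputes the expected values from the table and then adds up "(obs - exp)² / exp" over all six cells.

```python
# Observed counts (rows: received bid / did not receive bid;
# columns: backwards hat / no hat / bucket hat).
observed = [
    [21, 10, 8],
    [13, 26, 7],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

# Expected value for each cell = (row total x column total) / table total.
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

# Add up (obs - exp)^2 / exp over every cell in the table.
chi_square = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(observed, expected)
    for obs, exp in zip(obs_row, exp_row)
)

print(round(chi_square, 3))
```

Because the code keeps full precision instead of rounding each row to three decimals first, it lands on roughly 8.542 rather than the hand-summed 8.541. The tiny gap is only rounding, and it doesn't change anything that follows.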

Step 3 - Find your p-value

We're going to use the X² table for this...

[Image: Chi Square table]

...which corresponds to the area to the right of our X² test statistic under the X² curve, which typically looks something like this (it becomes broader with more degrees of freedom!):

[Figure: X² curve with 5 degrees of freedom]

Can I see how the X² curve becomes broader with more degrees of freedom?

Here's how it looks with 5 degrees of freedom (this is the same image as above):

[Figure: X² curve with 5 degrees of freedom]

Here's how it looks with 10 degrees of freedom:

[Figure: X² curve with 10 degrees of freedom]

Here's how it looks with 20 degrees of freedom:

[Figure: X² curve with 20 degrees of freedom]
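To put rough numbers on "broader", here is a small sketch that evaluates the height of the X² curve at its peak for 5, 10, and 20 degrees of freedom. It relies on the standard Chi Square density formula and on the fact that the peak sits at df - 2 (for df of 2 or more); neither appears in this walkthrough, and the function name `chi_square_pdf` is just an illustrative choice.

```python
import math

def chi_square_pdf(x, df):
    """Height of the Chi Square curve at x (x > 0) for the given degrees of freedom."""
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))

# For df >= 2, the curve peaks at x = df - 2.
peak_heights = {df: chi_square_pdf(df - 2, df) for df in (5, 10, 20)}

for df, height in peak_heights.items():
    print(f"df={df}: peak at x={df - 2}, height={height:.4f}")
```

The peak moves to the right and gets lower as the degrees of freedom increase. Since the total area under the curve always equals 1, a lower peak means the area is spread over a wider range of X² values.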

Before we can find our p-value, we must determine our degrees of freedom!

Calculating your degrees of freedom (df)

To calculate your degrees of freedom (df) with a Chi Square - Independence Test, you'll utilize the following formula:

df = (# of rows - 1) x (# of columns - 1)

In our case, we've got 2 rows ("Received bid" and "Did not receive bid")...

df = (2 - 1) x (# of columns - 1)

...and 3 columns ("Backwards hat", "No hat", and "Bucket hat"). The "Total" row and column don't count as categories.

df = (2 - 1) x (3 - 1)

Therefore, we have 2 degrees of freedom!

df = (2 - 1) x (3 - 1) = 1 x 2 = 2
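The same count can be sketched in Python by storing only the category cells (no "Total" row or column) and reading the number of rows and columns off that grid. The names `observed`, `n_rows`, and `n_cols` are illustrative.

```python
# Only the category cells go in the grid; the "Total" row and column are left out.
observed = [
    [21, 10, 8],   # received bid
    [13, 26, 7],   # did not receive bid
]

n_rows = len(observed)      # 2 bid outcomes
n_cols = len(observed[0])   # 3 hat types

# df = (# of rows - 1) x (# of columns - 1)
df = (n_rows - 1) * (n_cols - 1)
print(df)  # 2
```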

Recognizing our alpha level (α)

Considering that the question gives us an alpha level (α) of 0.05, this means that on our X² curve with 2 degrees of freedom, the alpha level (α) area is the rightmost 5% of the area under the curve.

We'll be assessing whether our p-value (the area to the right of our X² value) is smaller than this alpha level (α) area!

Finding our p-value

We know our X² value is 8.541, so that means we'll be using the X² table to find a range of X² values that our X² value fits between.

To start, let's zone in on the row corresponding to 2 degrees of freedom (df).

From here, we can identify that our X² value of 8.541 falls between 7.378 (the X² value for a right-tail area of 0.025) and 9.210 (the X² value for a right-tail area of 0.01).

This corresponds to a p-value between 0.01 and 0.025!

0.01 < p < 0.025
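As a cross-check on the table lookup, here is a minimal sketch that computes the p-value directly. It leans on a shortcut that holds only for 2 degrees of freedom: the area to the right of x under the X² curve equals e^(-x/2). That shortcut is an outside fact, not part of this walkthrough, and for other degrees of freedom you would need the table or statistical software.

```python
import math

chi_square = 8.541          # test statistic from Step 2
df = (2 - 1) * (3 - 1)      # 2 degrees of freedom

# Valid ONLY when df == 2: right-tail area under the curve is exp(-x / 2).
assert df == 2
p_value = math.exp(-chi_square / 2)

# The two table entries that bracket our statistic in the df = 2 row.
area_at_7_378 = math.exp(-7.378 / 2)   # about 0.025
area_at_9_210 = math.exp(-9.210 / 2)   # about 0.01

print(round(p_value, 3))
print(area_at_9_210 < p_value < area_at_7_378)  # True
```

The computed p-value comes out at roughly 0.014, which sits inside the 0.01 to 0.025 range read from the table.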

Visualizing our p-value

Since the X² table represents p-values to the right of our X² value of 8.541...

...this means that the area to the right of our X² value (our p-value) is somewhere between 0.01 and 0.025.

This provides us with all the information that we need to know! It shows us that our p-value (corresponding to our X² value) is below our alpha level (α) of 0.05!

Why do we not need an exact p-value?

Because in hypothesis tests, all that matters is whether or not your p-value is above or below your alpha level!

Knowing that our p-value lies somewhere between 0.01 and 0.025...

0.01 < p < 0.025

...is enough intel for us to determine that our actual p-value corresponding to our X² value of 8.541 is below our alpha level!

Step 4 - Make your concluding statement

Your concluding statement is going to center around the alpha level declared in the problem. In most cases, that alpha level will be 0.05. Each problem should explicitly state the alpha level. In our problem, it's 0.05.


As we stated in What is a hypothesis test?...

- If the p-value is below the alpha level, then we reject the null hypothesis.
- If the p-value is above the alpha level, then we fail to reject the null hypothesis.

In our case, our p-value range is below the alpha level. Remember: all values between 0.01 and 0.025 are less than 0.05!

0.01 < p < 0.025

This means that we reject our null hypothesis!
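The decision rule can be sketched as a tiny Python function that works on a p-value range rather than an exact p-value. The function name `decide` and its return strings are hypothetical, chosen only for this illustration.

```python
def decide(p_lower, p_upper, alpha):
    """Decision for a p-value known only to lie strictly between p_lower and p_upper."""
    if p_upper <= alpha:
        # Every value in the range is below alpha.
        return "reject the null hypothesis"
    if p_lower >= alpha:
        # Every value in the range is above alpha.
        return "fail to reject the null hypothesis"
    # The range straddles alpha, so the table alone can't settle it.
    return "undetermined: need a narrower p-value range"

print(decide(0.01, 0.025, alpha=0.05))  # reject the null hypothesis
```

This is why a range is enough here: the whole range 0.01 < p < 0.025 sits on one side of 0.05. Only a range that straddles the alpha level would call for a more exact p-value.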

Applying the Chi Square answer template

If you remember in What is a hypothesis test?, we gave the following answer template when working with t-scores:

Since our p-value range of [p-value range] is [less / greater] than our alpha level of [alpha level value], we [reject / fail to reject] the null hypothesis. We [do / don't] have enough evidence to support the alternative hypothesis, which states that [description of the alternative hypothesis].

We're going to use this same exact template for Chi Square tests!

Applied to our question, this would give us the following answer to our original question!

Answer: Since our p-value range of 0.01 < p < 0.025 is less than our alpha level of 0.05, we reject the null hypothesis. We do have enough evidence to support the alternative hypothesis, which states that the hat that a recruit wore and whether or not they received a bid are not independent of each other at Crammer Nation University.
