Understanding the result of a chi-square test

Description

Going beyond the chi-square statistic and its P-value

Introduction

A chisquare test can be used for assessing whether there is association between two categorical variables. The problem it has is that knowing that an association exists is only part of the story; we want to know what is making the association happen. This is the same kind of thing that happens with analysis of variance: a significant \(F\)-test indicates that the group means are not all the same, but not which ones are different.

Recently I discovered that R’s chisq.test has something that will help in understanding this.

Packages

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.1     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.2
✔ purrr     1.2.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

which I always seem to need for something.

Example

How do males and females differ in their choice of eyewear (glasses, contacts, neither), if at all? Some data (frequencies):

eyewear <- tribble(
  ~gender, ~contacts, ~glasses, ~none,
  "female", 121, 32, 129,
  "male", 42, 37, 85
)
eyewear

It is a little difficult to compare since there are fewer males than females here, but we might suspect that males proportionately are more likely to wear glasses and less likely to wear contacts than females.

Does the data support an association at all?

eyewear %>% select(-gender) %>% chisq.test() -> z
z


    Pearson's Chi-squared test

data:  .
X-squared = 17.718, df = 2, p-value = 0.0001421

There is indeed an association.

Coding note: normally chisq.test accepts as input a matrix (eg. output from table), but it also accepts a data frame as long as all the columns are frequencies. So I had to remove the gender column first.¹

So, what kind of association? chisq.test has, as part of its output, residuals. Maybe you remember calculating these tests by hand, and have, lurking in the back of your mind somewhere, “observed minus expected, squared, divide by expected”. There is one of these for each cell, and you add them up to get the test statistic. The “Pearson residuals” in a chi-squared table are the signed square roots of these, where the sign is negative if observed is less than expected:

eyewear

z$residuals

      contacts   glasses       none
[1,]  1.766868 -1.760419 -0.5424069
[2,] -2.316898  2.308440  0.7112591

The largest (in size) residuals make the biggest contribution to the chi-squared test statistic, so these are the ones where observed and expected are farthest apart. Hence, here, fewer males wear contacts and more males wear glasses compared to what you would expect if there were no association between gender and eyewear.

I am not quite being sexist here: the male and female frequencies are equally far away from the expected in absolute terms:

eyewear

z$expected

      contacts glasses      none
[1,] 103.06278 43.6278 135.30942
[2,]  59.93722 25.3722  78.69058

but the contribution to the test statistic is more for the males because there are fewer of them altogether.

Footnotes

This behaviour undoubtedly comes from the days when matrices had row names which didn’t count as a column.↩︎