The National Cancer Institute’s SEER program has also computed the Yost index at the census tract level (link), and researchers there asked me to check how well my values agree with theirs. The good news is that they are virtually identical. We did a few minor things differently:
- My values include Puerto Rico, while the SEER values do not.
- I used more conservative exclusion criteria. Following a rule of thumb first proposed by Ana Diez Roux, I excluded tracts with a population below 100, fewer than 30 housing units, or more than one-third of the population living in group quarters. For the first two, the concern is that the census estimates are too unstable. For the third, the concern is that measures like median home value and unemployment rate are not meaningful in places like college campuses, military bases, and prisons. I am not sure of the exact rules that SEER used, but I ended up with about 1% fewer tracts than they did.
- I imputed missing census values; I do not know exactly how SEER handled these. The share of missing values ranges from virtually 0% (education and poverty rate) to 3% (median house value and median rent).
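
To make the exclusion rule of thumb concrete, here is a minimal Python sketch of the three criteria. It is not the code I actually ran; the `Tract` class, its field names, and the example tracts are hypothetical, and I am assuming the group-quarters share is measured against the tract's total population.

```python
from dataclasses import dataclass


@dataclass
class Tract:
    # Hypothetical record layout; field names are illustrative only.
    geoid: str
    population: int
    housing_units: int
    group_quarters_pop: int


def is_excluded(tract, min_pop=100, min_units=30, max_gq_share=1 / 3):
    """Return True if the tract fails any of the three exclusion criteria."""
    # Too few people: census estimates are too unstable.
    if tract.population < min_pop:
        return True
    # Too few housing units: same stability concern.
    if tract.housing_units < min_units:
        return True
    # Mostly group quarters (campuses, bases, prisons): the component
    # measures are not meaningful. Population is at least min_pop here,
    # so the division is safe for any positive min_pop.
    return tract.group_quarters_pop / tract.population > max_gq_share


# Made-up tracts, one per case.
tracts = [
    Tract("A", population=4000, housing_units=1500, group_quarters_pop=50),
    Tract("B", population=80, housing_units=40, group_quarters_pop=0),
    Tract("C", population=2500, housing_units=20, group_quarters_pop=0),
    Tract("D", population=3000, housing_units=200, group_quarters_pop=2400),
]

kept = [t for t in tracts if not is_excluded(t)]
print([t.geoid for t in kept])
```

Only the first tract passes all three checks; the others fail on population, housing units, and group-quarters share respectively.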
For the years we have in common (2010-2014 and 2013-2017), the mean difference in percentile ranking is a mere 0.02 and the maximum difference is 4. The R-squared is above 0.99. That seemed too good to be true, so I rechecked it several times, and it is correct.
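
For readers who want to run a similar comparison, here is a rough Python sketch of the three agreement statistics. It rests on assumptions: I take "difference" to mean absolute difference and R-squared to mean the squared Pearson correlation between the two sets of percentiles. The function name and the sample values are made up for illustration and are not the real data.

```python
def agreement(mine, theirs):
    """Return (mean abs difference, max abs difference, R-squared)."""
    n = len(mine)
    if n != len(theirs) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")

    # Tract-by-tract absolute differences in percentile rank.
    diffs = [abs(a - b) for a, b in zip(mine, theirs)]
    mean_abs = sum(diffs) / n
    max_abs = max(diffs)

    # Squared Pearson correlation, computed from sums of squares.
    mean_x = sum(mine) / n
    mean_y = sum(theirs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(mine, theirs))
    sxx = sum((a - mean_x) ** 2 for a in mine)
    syy = sum((b - mean_y) ** 2 for b in theirs)
    if sxx == 0 or syy == 0:
        raise ValueError("R-squared is undefined when one list is constant")
    r_squared = sxy ** 2 / (sxx * syy)

    return mean_abs, max_abs, r_squared


# Made-up percentile ranks for five tracts matched on tract ID.
my_percentiles = [10, 25, 50, 75, 90]
seer_percentiles = [10, 25, 51, 75, 90]

mean_abs, max_abs, r_squared = agreement(my_percentiles, seer_percentiles)
print(mean_abs, max_abs, round(r_squared, 4))
```

The two lists have to be matched on tract ID first, so tracts that appear in only one source (Puerto Rico, or tracts I excluded) drop out before the comparison.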