2011-07-02

Correlation and Causation

"Correlation does not necessarily imply causation" is probably the first lesson taught in a first course in statistical inference. In the second lesson, we are told that "the point" of statistical inference is to come up with "good" theories that explain the observed correlation. (What constitutes a good theory is a subject deserving of its own post!) A lot of the time, we remember the first lesson but forget the second.

Take a look at newspapers that publish shocking results from studies, like increased ice cream sales being linked to increased crime, or children who attend a lot of 4th of July parades tending to grow up to be Republicans. These correlations are undoubtedly interesting, so it isn't surprising that the media would use them as headlines. But in doing so, the most important 90% of the social science gets neglected: the causal mechanism that explains why ice cream sales are related to crime, or what the hell parade attendance has to do with party identification.

A lot of the time, I see people dismiss these studies as unbelievable. The media overemphasizes the shocking correlation, and the causal mechanism that actually sheds light on the situation gets pushed into second place. The readers, painfully aware of the first lesson, probably find it hard to translate the correlation into a real-life explanation, so they dismiss the result. This is where people start to contrast "those studies" with reality.

It is true that an unexpected correlation can be the seed for developing a neat theory that explains it, but we should never forget that the causal mechanism is still the most important part. I want to know that it is warm weather that leads to increases in both ice cream sales and crime (more crime because the weather lets people stay outside more). I actually don't see how the parade one works without being super cynical.
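
To make the ice cream story concrete, here is a small simulation sketch. Everything in it is made up for illustration: the variable names, the coefficients, and the noise levels are my own assumptions, not numbers from any study. Temperature drives both ice cream sales and crime, and the two never affect each other directly. The raw correlation between sales and crime still comes out clearly positive, and it roughly vanishes once the temperature trend is removed from each series.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


def simulate(n_days=5000, seed=0):
    """Hypothetical world: temperature causes both outcomes, nothing else does."""
    rng = random.Random(seed)
    # Assumed daily temperatures; the mean and spread are arbitrary.
    temperature = [rng.gauss(20, 8) for _ in range(n_days)]
    # Assumed effects of temperature plus independent noise for each outcome.
    ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
    crime = [0.5 * t + rng.gauss(0, 4) for t in temperature]

    raw = pearson(ice_cream, crime)
    # Correlation after removing the linear temperature trend from both series.
    adjusted = pearson(residuals(ice_cream, temperature),
                       residuals(crime, temperature))
    return raw, adjusted


raw, adjusted = simulate()
print(f"raw correlation:            {raw:.2f}")
print(f"after removing temperature: {adjusted:.2f}")
```

The adjustment only works here because I built the world and know temperature is the common cause. With real data, picking what to adjust for is exactly the theory-building part that the headlines skip.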
