Counterintuitive fact o' the day (and one I'm a little embarrassed I just learned).
Let's say you run an experiment on a sample population. You do this by dividing the population into two subgroups and running the experiment on each separately. The experiment succeeds for each subgroup.
If the experiment succeeded for each sub-sample, then it necessarily succeeded for the combined sample, right?
Nope. Sometimes an experiment's apparent success on two samples disappears when the samples are combined. This is Simpson's paradox.
Example: A pharmaceutical company wants to determine whether a new drug performs better than an old drug. The manufacturer runs a trial comparing the two on 300 patients in Chicago and 300 patients in Cleveland. In Chicago, 240 of the patients get New Drug and the other 60 get Old Drug. Ninety of the former (37.5%) and 20 of the latter (33.3%) recover. Thus, New Drug performs better than Old Drug in the Chicago trial.
In Cleveland, 60 of the patients get New Drug and 240 get Old Drug. Thirty of the former (50%) and 110 of the latter (45.8%) recover. Hence New Drug performs better than Old Drug in the Cleveland trial, too.
But New Drug's apparent superiority disappears once the samples are combined -- 120 of the 300 patients who took New Drug recovered (40%) while 130 of the 300 who took Old Drug recovered (43.3%). Thus Old Drug did better than New Drug in the aggregate sample even though New Drug did better than Old Drug in each of the two sub-samples.*
Weird.
(*Example lifted from Simon & Blume, Mathematics for Economists)
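
If you'd like to check the arithmetic yourself, here's a quick Python sketch. The counts come straight from the example above; the names (trials, rate, pooled) are just ones I made up for illustration.

```python
from fractions import Fraction

# (recovered, treated) counts for each drug, taken from the example above
trials = {
    "Chicago": {"new": (90, 240), "old": (20, 60)},
    "Cleveland": {"new": (30, 60), "old": (110, 240)},
}


def rate(recovered, treated):
    """Recovery rate as an exact fraction."""
    return Fraction(recovered, treated)


def pooled(drug):
    """Recovery rate for one drug after adding up the counts from both cities."""
    recovered = sum(city[drug][0] for city in trials.values())
    treated = sum(city[drug][1] for city in trials.values())
    return rate(recovered, treated)


# Per-city comparison: New Drug comes out ahead in each city
for name, city in trials.items():
    new, old = rate(*city["new"]), rate(*city["old"])
    print(f"{name}: new {float(new):.1%}, old {float(old):.1%}")

# Pooled comparison: the ordering flips
print(f"Combined: new {float(pooled('new')):.1%}, old {float(pooled('old')):.1%}")
```

The numbers also hint at where the flip comes from: 240 of New Drug's 300 patients were in Chicago, where recovery rates were lower for both drugs, while 240 of Old Drug's 300 were in Cleveland, where they were higher. So each pooled rate leans toward a different city.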
