Sometimes even things that have little statistical significance can still carry tremendous social significance.
A colleague sent me an email with the above title and the following content.
We were talking about the Jamaica childhood intervention study. The Science paper on returns to the intervention 20 years later found a 25% increase, but an earlier draft had reported a 42% increase. See here.
Well, it turns out the same authors are back in the stratosphere! In a Sept 2021 preprint, they report a 43% increase, but now 30 rather than 20 years after the intervention (see abstract). It seems to be the same dataset and they again appear to have a p-value right around the threshold (I think this is the 0.04 in the first row of Table 1 but I did not check super carefully).
Of course, no mention I could find of selection effects, the statistical significance filter, Type M errors, the winner’s curse or whatever term you want to use for it.
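The significance filter / winner's curse mentioned above is easy to demonstrate with a toy simulation. All the numbers below are illustrative assumptions of mine, not figures from the study: a small true effect, measured with noise on roughly the scale implied by a ±44% interval, looks much bigger once you condition on statistical significance.

```python
import random
import statistics

# Hypothetical setup: true wage effect of 10%, each study's estimate has
# a standard error of 22% (roughly what a +/-44% interval implies).
# These are assumed values for illustration only.
random.seed(1)
true_effect = 10.0   # percent
se = 22.0            # percent

estimates = [random.gauss(true_effect, se) for _ in range(100_000)]

# The "significance filter": keep only estimates that clear the usual
# 5% threshold, i.e. |estimate| > 1.96 * se.
significant = [e for e in estimates if abs(e) > 1.96 * se]

print(f"mean of all estimates:         {statistics.mean(estimates):.1f}%")
print(f"mean of significant estimates: {statistics.mean(significant):.1f}%")
print(f"share reaching significance:   {len(significant) / len(estimates):.1%}")
```

Under these assumptions the estimates that survive the filter average several times the true effect, which is the Type M (magnitude) error in a nutshell: selecting on significance in a noisy setting guarantees that published estimates are exaggerated.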
Great article describing how statistics are misused in complex studies, and how even findings that might not be statistically significant by the math can still be important when dealing with real people.
Essentially, they found a 37% improvement in wages from early childhood interventions 20 years later. That is a great number. But they do a lot of handwaving so that people don’t notice the error is ±44%! And a lot of bending over backwards to make this statistically significant.
As explained, this shows how noisy the data are. The real value would be somewhere between a 7% negative effect and an 81% positive effect. If we were dealing with a biochemical reaction, such noisiness would likely prevent the paper from being published as is. These numbers do not indicate statistical significance. The error needs to be reduced.
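The arithmetic behind those numbers is worth making explicit. Taking the quoted 37% ± 44% at face value as a 95% interval under a normal approximation (my assumption, not a detail from the paper), a few lines recover the range and the implied z-score:

```python
# Interval arithmetic for the quoted figures: a 37% estimate with a
# +/-44% half-width, treated as a 95% normal-approximation interval.
estimate = 37.0      # reported wage effect, percent
half_width = 44.0    # half-width of the interval, percent

lower, upper = estimate - half_width, estimate + half_width
se = half_width / 1.96   # implied standard error
z = estimate / se        # implied z-score

print(f"95% interval: [{lower:.0f}%, {upper:.0f}%]")
print(f"implied standard error: {se:.1f}%")
print(f"implied z-score: {z:.2f}")
```

The interval runs from −7% to +81%, and the implied z-score of about 1.65 falls short of the 1.96 needed for significance at the 5% level, which is exactly why the interval crosses zero.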
But we are dealing with people and with very complex situations. Even a 10% increase would be huge for the society. And if it was over 50%, it would be miraculous.
So, while the numbers are not definitive in a statistical sense, they have huge social and economic ramifications. We know the “real” number is likely somewhere between the limits. So as far as public policy goes, the data do show a positive effect, and one that could be huge.
Interesting view from a statistician, revealing how, depending on the setting, noisy data can still be put to a real positive purpose. Here it does not really matter if the ‘real’ number is 10%, 20% or 80%. What matters is the likely positive effect.
The author of the post thinks papers should be published whether or not the data are statistically significant. And in this case I can see his point. The data are always going to be horribly noisy. That a positive effect can still be seen provides information for making policy decisions.
[Image: Jose Camões Silva]