Jonah Lehrer has posted a reply to criticisms of his article on the ‘decline effect’ in science. It doesn’t work, and his arguments miss the mark. As I said, what the statistics show is that science is difficult, and sometimes answers are very hard to come by; that does not justify his attempt to couch the problem in language that suggests a systematic failure of science.
John Allen Paulos is saying the same thing. You can trust him; he’s a professional mathematician.
In his 1974 commencement address – Cargo Cult Science – Richard Feynman discussed all of this, directed at a scientific audience rather than a lay one. We in science know this, but the media do a terrible job of explaining just how hard science is and how skeptical most scientists are of dazzling new results.
The scientific method works in spite of the fact that fallible humans are involved. That so many published results turn out not to be as amazing on reinvestigation demonstrates the strength of science, not its weakness.
Some quotes from the wisdom of Feynman almost 40 years ago:
We’ve learned from experience that the truth will come out. Other experimenters will repeat your experiment and find out whether you were wrong or right. Nature’s phenomena will agree or they’ll disagree with your theory. And, although you may gain some temporary fame and excitement, you will not gain a good reputation as a scientist if you haven’t tried to be very careful in this kind of work. And it’s this type of integrity, this kind of care not to fool yourself, that is missing to a large extent in much of the research in cargo cult science.
or this one:
We have learned a lot from experience about how to handle some of the ways we fool ourselves. One example: Millikan measured the charge on an electron by an experiment with falling oil drops, and got an answer which we now know not to be quite right. It’s a little bit off, because he had the incorrect value for the viscosity of air. It’s interesting to look at the history of measurements of the charge of the electron, after Millikan. If you plot them as a function of time, you find that one is a little bigger than Millikan’s, and the next one’s a little bit bigger than that, and the next one’s a little bit bigger than that, until finally they settle down to a number which is higher.
Why didn’t they discover that the new number was higher right away? It’s a thing that scientists are ashamed of–this history–because it’s apparent that people did things like this: When they got a number that was too high above Millikan’s, they thought something must be wrong–and they would look for and find a reason why something might be wrong. When they got a number closer to Millikan’s value they didn’t look so hard. And so they eliminated the numbers that were too far off, and did other things like that. We’ve learned those tricks nowadays, and now we don’t have that kind of a disease.
or my favorite:
The first principle is that you must not fool yourself–and you are the easiest person to fool. So you have to be very careful about that. After you’ve not fooled yourself, it’s easy not to fool other scientists. You just have to be honest in a conventional way after that.
Too many scientists and entirely too many journalists allow themselves to be fooled. And Feynman discussed something mentioned in the above articles:
If you’ve made up your mind to test a theory, or you want to explain some idea, you should always decide to publish it whichever way it comes out. If we only publish results of a certain kind, we can make the argument look good. We must publish both kinds of results.
The publication incentives of academia often cut against this need. Eventually the truth does come out, but it can sometimes take a long time in an area with lots of conflicting theories. For example – and Feynman mentions this – one lab’s results become another lab’s control. Except the second lab might not repeat the control, assuming the original paper was correct. Thus they can report ‘new’ data that may only represent regression to the mean.
Whereas if they repeated the control, they would see no real difference. If your job depends on producing novel papers, and repeating a control makes it less likely your paper gets published, then it is human nature not to repeat the control. Of course – and Feynman states this – there is usually strong social pressure in science to do the proper controls. If we read a paper and see that an obvious control was not mentioned, our first impulse is to wonder why.
Sometimes, especially when dealing with human populations, redoing the control can be quite expensive. Much better to assume that what another lab published will serve as an historical control for the present work. This is another shortcut that permits people to get their work published where it looks like they are finding something new, rather than just reporting regression to the correct value.
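Regression to the mean is easy to see in a simulation. In this sketch (all numbers are hypothetical choices for illustration), 200 labs estimate the same modest true effect through measurement noise; only the labs whose first estimate looked striking “publish”, and an independent replication of just those labs falls back toward the true value:

```python
# Hypothetical illustration of regression to the mean: many labs estimate
# the same modest true effect with noisy measurements; only striking first
# results get "published", and replication regresses toward the truth.
import random

random.seed(7)

TRUE_EFFECT = 0.2   # the real effect size (assumed for this sketch)
NOISE = 0.5         # measurement noise per study (assumed)

def measure():
    # One lab's noisy estimate of the true effect.
    return random.gauss(TRUE_EFFECT, NOISE)

def avg(xs):
    return sum(xs) / len(xs)

first_run = [measure() for _ in range(200)]
published = [e for e in first_run if e > 0.8]     # only striking results get written up
replications = [measure() for _ in published]     # independent re-measurement of those labs

print(f"published first-run average: {avg(published):.2f}")
print(f"replication average:         {avg(replications):.2f}")
print(f"true effect:                 {TRUE_EFFECT}")
```

The selection step is the whole mechanism: by publishing only the estimates that landed high, the field guarantees that an honest re-measurement will look like a “decline”, even though nothing about the underlying effect changed.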
Another point – scientists conventionally take a p value of 5% or less as statistically significant. That is, if there were no real effect, there would be less than a 1 in 20 chance of seeing results at least that extreme by random chance alone. But if, say, 40 labs run the experiment, then a couple of them WILL get a ‘statistically significant’ result by pure chance – no real effect, just random fluctuation. Those labs then get to publish a paper on data that are really only noise. When anyone else tries to repeat the experiment, we see regression to the mean and the significance disappears.
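That arithmetic can be checked with a quick simulation. In this sketch (hypothetical sample sizes; a normal-approximation test stands in for a proper t-test), 40 labs each compare a treatment group to a control group drawn from the same distribution, so any “significant” result is pure chance:

```python
# Simulate 40 labs testing an effect that does not exist, and count how
# many reach "statistical significance" at p < 0.05 purely by chance.
import math
import random

random.seed(42)

def two_sided_p(sample_a, sample_b):
    # Two-sample z test (normal approximation to the t-test).
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (n_b - 1)
    z = (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

def run_lab(n=30):
    # Treatment and control come from the SAME distribution:
    # any "effect" a lab sees is random fluctuation.
    treated = [random.gauss(0, 1) for _ in range(n)]
    control = [random.gauss(0, 1) for _ in range(n)]
    return two_sided_p(treated, control)

labs = 40
significant = sum(1 for _ in range(labs) if run_lab() < 0.05)
print(f"{significant} of {labs} labs found p < 0.05 with no real effect")
```

With a 5% threshold and 40 independent labs, roughly 40 × 0.05 = 2 false positives are expected on average – and those are exactly the results most likely to be written up.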
This is why so many researchers hold new results in abeyance until others have looked at them. And, conversely, it is why they will hold onto models that have demonstrated great explanatory value: so much of the underlying data has been independently repeated.
Finally, there is strong social pressure to do things right and not take the sorts of shortcuts that academia sometimes pushes for (e.g. lack of a relevant control). Simply look at what happened recently when the “arsenic-based life” story came out. There were many criticisms of the work, and some people even wondered how it could have been published. The hell that can be raised, especially today, if you get a major paper published that still has several holes in it serves as a very strong negative restraint.
While there are definite incentives to put out shoddy papers that lack proper controls, there are even stronger incentives to do it right, particularly today, when so many people online can examine the work post-publication.
Coming up with a new model is wonderful, but it had better be able to withstand the assault of repetition and regression to the mean. That is what separates great work from the merely mediocre.