How much of science is junk science?

Just how much of science is actually cargo cult science? Recall that cargo cult science is activity that looks like science but does not actually work. So how do we tell whether science “works”? I would propose that it works when an experiment and its results can be independently replicated. So we need to ask how much of science has indeed been independently replicated.

We don’t really know how much of science is actually cargo cult science, but I suspect it is more than most people think. Consider Ed Yong’s article “Replication studies: Bad copy,” whose subtitle reads, “In the wake of high-profile controversies, psychologists are facing up to problems with replication.” He begins by noting an experiment, published in the peer-reviewed literature, that reported evidence of psychic abilities. Because the findings were so controversial, three other labs tried to independently replicate them (less controversial studies apparently do not receive the same scrutiny). They failed to do so. But then they stumbled upon a bigger problem in science: “they faced serious obstacles to publishing their results.” Yong explains the situation as follows:

Positive results in psychology can behave like rumours: easy to release but hard to dispel. They dominate most journals, which strive to present new, exciting research. Meanwhile, attempts to replicate those studies, especially when the findings are negative, go unpublished, languishing in personal file drawers or circulating in conversations around the water cooler. “There are some experiments that everyone knows don’t replicate, but this knowledge doesn’t get into the literature,” says Wagenmakers. The publication barrier can be chilling, he adds. “I’ve seen students spending their entire PhD period trying to replicate a phenomenon, failing, and quitting academia because they had nothing to show for their time.” These problems occur throughout the sciences, but psychology has a number of deeply entrenched cultural norms that exacerbate them. It has become common practice, for example, to tweak experimental designs in ways that practically guarantee positive results. And once positive results are published, few researchers replicate the experiment exactly, instead carrying out ‘conceptual replications’ that test similar hypotheses using different methods. This practice, say critics, builds a house of cards on potentially shaky foundations.

It is interesting to note that Yong could write in 2012 that “once positive results are published, few researchers replicate the experiment exactly, instead carrying out ‘conceptual replications’ that test similar hypotheses using different methods.” Recall that Feynman described the same situation going back to the 1940s:

She was very delighted with this new idea, and went to her professor. And his reply was, no, you cannot do that, because the experiment has already been done and you would be wasting time. This was in about 1947 or so, and it seems to have been the general policy then to not try to repeat psychological experiments, but only to change the conditions and see what happens.

So a great deal of science is not truly being replicated, and this has apparently been the case since the 1940s. That doesn’t mean it is necessarily cargo cult science, but it does mean a lot of it could be cargo cult science. Yet it gets worse than this. Joseph P. Simmons, Leif D. Nelson, and Uri Simonsohn recently published a paper in Psychological Science entitled “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” They show that it is just too easy to generate positive results for just about any topic. They write:

Perhaps the most costly error is a false positive, the incorrect rejection of a null hypothesis. First, once they appear in the literature, false positives are particularly persistent. Because null results have many possible causes, failures to replicate previous findings are never conclusive. Furthermore, because it is uncommon for prestigious journals to publish null findings or exact replications, researchers have little incentive to even attempt them. Second, false positives waste resources: They inspire investment in fruitless research programs and can lead to ineffective policy changes. Finally, a field known for publishing false positives risks losing its credibility.

The key point, IMO, is how a failure to replicate previous findings can be explained away with many possible causes. This means there is a built-in intellectual inertia that favors the existence of cargo cult science within science. In other words, if lab X cannot replicate the work of lab Y, lab X’s failure is likely to be attributed to technicalities and to unseen, but assumed-to-be-unimportant, variables. So the failure to replicate is simply filed away. False positives are thus immunized to a certain extent. But it’s even worse than that:

In this article, we show that despite the nominal endorsement of a maximum false-positive rate of 5% (i.e., p ≤ .05), current standards for disclosing details of data collection and analyses make false positives vastly more likely. In fact, it is unacceptably easy to publish “statistically significant” evidence consistent with any hypothesis.

So it’s easy to statistically demonstrate a positive result when no such positive result truly exists. In fact, consider the abstract of this paper:

In this article, we accomplish two things. First, we show that despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process.
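
The mechanism behind this is easy to see with a quick simulation of one’s own. The sketch below is mine, not the authors’ code; it assumes just two of the “researcher degrees of freedom” they describe: testing several correlated outcome measures (and their average), and adding more subjects after a first peek at the data. The data are pure noise, so every “significant” result it finds is a false positive, yet the rate comes out well above the nominal 5%.

```python
# Minimal sketch (not from the paper): how analytic flexibility inflates
# false positives even when there is no real effect in the data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
COV = [[1.0, 0.5], [0.5, 1.0]]  # two outcome measures per subject, correlated r = .5

def draw(n):
    """Draw n subjects per group; both groups come from the same distribution (no true effect)."""
    return rng.multivariate_normal([0, 0], COV, n), rng.multivariate_normal([0, 0], COV, n)

def any_significant(a, b, alpha=0.05):
    """Declare 'success' if DV1, DV2, or their average differs between groups at p < alpha."""
    tests = [(a[:, 0], b[:, 0]), (a[:, 1], b[:, 1]), (a.mean(axis=1), b.mean(axis=1))]
    return any(stats.ttest_ind(x, y).pvalue < alpha for x, y in tests)

def one_flexible_study(n_initial=20, n_extra=10):
    a, b = draw(n_initial)
    if any_significant(a, b):
        return True                       # stop and report the "effect"
    extra_a, extra_b = draw(n_extra)      # otherwise collect more data and look again
    return any_significant(np.vstack([a, extra_a]), np.vstack([b, extra_b]))

n_sims = 2000
rate = sum(one_flexible_study() for _ in range(n_sims)) / n_sims
print(f"False-positive rate with flexible analysis: {rate:.1%}")  # well above the nominal 5%
```

Each individual test still respects p < .05; it is the unreported freedom to choose among outcomes and to keep collecting data that quietly multiplies the chances of a spurious “finding,” which is exactly the disclosure problem the paper’s six requirements are meant to address.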

So, if it is indeed “unacceptably easy … to accumulate (and report) statistically significant evidence for a false hypothesis,” and a large chunk of science is not independently replicated and confirmed, for a variety of reasons, then it stands to reason that a large chunk of mainstream science is actually cargo cult science. Ah, but maybe this is a problem only for the social sciences, which rely heavily on statistics to show their positive results. Surely this is not a problem for the hard sciences, right? Well, consider this recent report:

A former researcher at Amgen Inc has found that many basic studies on cancer — a high proportion of them from university labs — are unreliable, with grim consequences for producing new medicines in the future. During a decade as head of global cancer research at Amgen, C. Glenn Begley identified 53 “landmark” publications — papers in top journals, from reputable labs — for his team to reproduce. Begley sought to double-check the findings before trying to build on them for drug development. Result: 47 of the 53 could not be replicated. He described his findings in a commentary piece published on Wednesday in the journal Nature.

Here’s some more:

Other scientists worry that something less innocuous explains the lack of reproducibility. Part way through his project to reproduce promising studies, Begley met for breakfast at a cancer conference with the lead scientist of one of the problematic studies. “We went through the paper line by line, figure by figure,” said Begley. “I explained that we re-did their experiment 50 times and never got their result. He said they’d done it six times and got this result once, but put it in the paper because it made the best story. It’s very disillusioning.”

and

“The surest ticket to getting a grant or job is getting published in a high-profile journal,” said Fang. “This is an unhealthy belief that can lead a scientist to engage in sensationalism and sometimes even dishonest behavior.” The academic reward system discourages efforts to ensure a finding was not a fluke. Nor is there an incentive to verify someone else’s discovery. As recently as the late 1990s, most potential cancer-drug targets were backed by 100 to 200 publications. Now each may have fewer than half a dozen. “If you can write it up and get it published you’re not even thinking of reproducibility,” said Ken Kaitin, director of the Tufts Center for the Study of Drug Development. “You make an observation and move on. There is no incentive to find out it was wrong.”

So it’s not just a problem for the social sciences, now is it? What are we to make of all this? Does it mean we can toss out all of science? Of course not, as many scientific findings have been replicated in the lab or validated by the new technologies built on them.

What it means is that anytime we are presented with new research findings, we should remain skeptical until the results have been independently replicated. This is especially true when someone who seems to have a political or metaphysical agenda is pushing the results from this study or that study. If you encounter such a person, the proper response is: “Has that study been replicated and, if so, can you provide the reference?” If they cannot show the study has been independently replicated by a different lab, then it is perfectly fair to remain skeptical and raise the very real possibility that someone is trying to advance their agenda by citing cargo cult science. They may accuse you of being “anti-science” for rejecting their favorite study, but you should then point out that anyone who is bothered or upset by the need to replicate scientific findings is the one who is truly “anti-science.”


10 Responses to How much of science is junk science?

  1. bbrown1 says:

    I spent two years doing basic science research on the replication of T4 virus DNA at Haverford College, then enrolled in a PhD program at Vanderbilt University, expanding on that research. I attended conferences where research was presented and immersed myself in the basic science research world for many years.
    I’d say about 1/3 of all basic research, most of it paid for with tax dollars (it’s an unbelievably massive ‘industry’), was fraudulent. The incentives to falsify one’s findings, often subtly, are overwhelming. The actual percentage of fraudulent studies might actually exceed 50%.
    I’m very surprised that this has not become a national scandal. I would encourage you to pursue this line of inquiry. You are on to something that is beyond massive. It was so obvious to me when I was in it, yet almost no one who is not a part of the basic research community has a clue, and it’s so easy for scientists to hide behind their jargon and stats.
    The waste of taxpayer money is criminal. The NIH is one of the worst culprits.

  2. Bilbo says:

    Mike,

    I’m wondering if a “simple” solution would be to start a peer-reviewed journal whose only goal would be to publish work that attempts to replicate previously published results.

  3. Bilbo says:

    The more I think about it, this is probably an ingenious suggestion on my part. Once this journal gets started (let’s come up with a good name for it), the question will no longer be, “Is it peer-reviewed?” but “Has it been duplicated?” Researchers will no longer submit sloppy or fraudulent work, knowing that someone out there will have the ability to try to duplicate their efforts and there would be a high probability of their duplication attempts being published.

    I love it!

  4. Bilbo says:

    Oh, I guess “replicate” was the right word.

  5. Bilbo says:

    In fact, that might be a good name for the journal: “Replicate.”

  6. Michael says:

    It’s a nice idea, but I don’t see it ever happening.

    1. You can’t replicate all studies, so how do you pick which ones to replicate?
    2. How do we interpret failure to replicate?
    3. Most importantly, there is no incentive for such a journal and such work.

    In the meantime, we are left with what we have. So I would suggest that next time someone like Art tries to dazzle you with some abiogenesis paper, make sure the work has been replicated by a completely different lab.

  7. Crude says:

    2 stands out to me.

    In fact, one question I have would be… is attempting to replicate at all discouraged in science? I mean, would calling up some scientist and saying “Hey, we want to replicate your research to make sure your interpretation is right. Of course if it’s wrong we’ll announce this all over.” be, at least sometimes, interpreted negatively?

  8. Bilbo says:

    Mike,

    I’m disappointed. Here I thought you were not responding to my idea because you instantly saw how brilliant it was and were contacting your science buddies everywhere, figuring out how to make the idea a reality. I wasn’t even going to insist on credit for it. Instead, I get what are very weak responses (especially for you).

    1. “You can’t replicate all studies, so how do you pick which ones to replicate?”

    That’s up to the researchers.

    2. “How do we interpret failure to replicate?”

    That question can be interpreted as, “How do we interpret whether the attempt to replicate was similar enough?” in which case by comparing the methodology. Or it can mean, “How do we interpret whether the results supported or failed to support the initial experiment?” in which case, it’s not important to determine whether the results pass or fail. It’s important that the results be published for others to see and determine whether further similar experiments need to be performed.

    3. “Most importantly, there is no incentive for such a journal and such work.”

    Horse pucky. You’ve already pointed out PhD candidates who replicate work and can’t get it published. Now they would have even more incentive to do such work. And the incentive for the publishers? They become the gold standard for the second and equally important part of science — replication. Everyone would be looking forward to the next issue of their publication to see whose work is put to the test.

    And then you top it off by telling me to make sure Art’s work has been replicated! And how the hell am I supposed to make sure that happens? But if there were a journal that rewards people for trying to replicate the work, then yeah, there’s a good chance somebody out there will try to replicate Art’s work, knowing that regardless of the results, they have a good chance of being published.

    Come on, Mike! I hand you the idea of a lifetime and you pee on it! I’m serious about this, man. Wake up!

  9. Bilbo says:

    Darn, forgot to close emphasis.
