Blog – Negative Data

Blog from Dr Yvonne Couch

17/06/2022

I was chatting with a friend recently about publishing things and he was lamenting that he had a ton of stuff that was just sort of languishing unfinished. It was interesting but just not complete enough to be considered a paper. He followed it up by saying that he thought people should know about it, and that people would be interested in the data, it was just hard to get it published. This turned into a conversation about scientific publishing and how it has evolved into what it is today. We didn’t really conclude anything, or fix anything, but he did say something which has stuck with me: ‘It used to be about moving science on, advancing the field, not just getting as much data as possible into a paper’. So today we’re going to talk about how we, as early career researchers, can help the scientific field change by thinking about how, where and what we publish.

As an early lockdown project, I helped another friend write a historical review on the field of extracellular vesicles. It was immensely fun because I got to go back and dig out manuscripts written in the 40s and 50s. One of them, from 1946, is a seminal paper. It is 8 pages long and contains 5 tables. That’s it. And the best thing that they do in all of these papers is give you the data they have. They don’t try and solve everything, they just give you what they have and repeatedly use phrases like ‘this will be discussed in detail on a later occasion’.

I know we don’t live in the era of the gentleman scientist anymore, we’re all funding-driven crazy people. But fundamental to our core ethos is the progress of science. The discovery of new things. The majority of us are not in this for kudos or the prospect of a Nobel prize, we’re in it to figure out puzzles and problems and learn things. And part of that learning process is the fundamentals of the scientific method, i.e. if we plan, control and execute our experiments properly, they will either prove or disprove our hypothesis.

And that should be it. We should just go ‘here are my data, use them as you see fit’. But we don’t. We have to push on beyond the graphs where all the bars are the same to the graphs that look shiny. (Side note: I have a theory that this is why we’re suddenly all into heat maps and single cell sequencing. Red and blue are always going to look different, so they’ve sort of got the shiny built in, even if the data fundamentally doesn’t mean anything.)

In clinical trial work there have been studies which show that up to a third of clinical trials remain unpublished, largely due to a lack of findings in favour of the novel therapy being tested.

Results were described as having a ‘lack of interest’. But here’s the thing. They’re still interesting, even if they’re negative. Why did the novel drug not work? How can we improve it? Even taking it back to more fundamental principles, if you show me the novel drug doesn’t work then I won’t waste my time and money trying to test the same thing you did.

In stroke research this kind of publication bias is particularly prevalent. A great study by Emily Sena and colleagues at the CAMARADES group showed that out of over 500 publications in the pre-clinical stroke field, only 2% reported no significant effects. This has a knock-on effect on progress, with many of these overly positive findings progressing erroneously to clinical trials, those clinical trials failing and then not being published. Sena and co estimated a further 14% of work in the field had not been published at all.

I have experienced this first hand. I did some work on post-stroke depression in a pre-clinical stroke model. The results were entirely negative: the animals could not have cared less that they had had a stroke. I took it to a conference as a poster and the number of people I met who said ‘oh yeah, we tried that, it doesn’t work’ was astounding. I had wasted months of my own time because people had chosen not to publish their findings.

But what causes publication bias? I suspect what underlies it all is something called the Pollyanna Principle (come on, you knew you were waiting for me to say something like this). In the late 1970s, Matlin and Stang demonstrated that “cognitive processes selectively favor processing of pleasant over unpleasant information”. In theory, data is just data. It’s neither pleasant nor unpleasant. But in science we struggle with our data because we often don’t approach the scientific method in the right way.

What we should be doing, and what we do in our grants, is write an achievable and appropriate hypothesis. We should then test this hypothesis and, if our experiments are appropriately controlled and conducted, we should get a result. This result should refute or confirm the hypothesis.

What we’re often actually doing is going ‘oooo….I wonder what this does?’.

Publishing negative data can lead to greater information sharing and better results for the scientific community. Often, people learn more from making a mistake than from achieving success, and the same can be said for the scientific community as a whole.

Going ‘oooo….I wonder what this does?’ is actually a great way to do science, using our fundamental curiosity to drive our experiments. But what it results in is a type of negative data that Bradley Alger, in his work on the scientific hypothesis, refers to as ‘file drawer data’. He proposes that what we actually end up with in science as we currently undertake it is two types of negative data: actionable and non-actionable. The former are valuable and refute a specific hypothesis. The latter are less valuable, and are negative because they were incomplete or uninterpretable, likely due to poor experimental design. He highlights the importance of hypothesis testing, suggesting that rewarding researchers for effectively doing this should encourage the publication of negative data.

So how can we do that?

Right now, I’m not sure. The fundamental changes in behaviour need to come from everyone. But there are platforms out there designed to at least encourage this kind of behaviour.

Registered reports are predominantly used by the social sciences and are designed to specifically peer review the methods. Researchers say ‘this is what I’m going to do’, the reviewers say yay or nay and if they get the go ahead then the data is what it is, and is published after the registered report. Journals like Nature Human Behaviour and the European Journal of Personality specifically state that ‘papers resulting from this approach will be published regardless of the study outcomes’. But in the biological sciences this approach precludes the ‘oooo….I wonder what this does’ approach which might lead to serendipitous discoveries.

Registered reports also physically slow science down, forcing researchers to get the methods approved before doing anything. Preprint servers, increasingly popular in the life sciences, do the opposite. They allow researchers to bypass peer review to get their work out in its raw form. Whilst this allows them to claim ownership and use this data to back up grant applications and so forth, the lack of peer review (even with all its inherent problems) means that data can be misinterpreted and poorly cited.

In between these two are journals which are trying to actively encourage the publication of negative data. The Journal of Cerebral Blood Flow and Metabolism has a negative results section, the PLOS collection has a variety of platforms for this kind of data, including one charmingly entitled Positively Negative, and F1000 is a platform designed for preliminary data, negative data and replication studies. Replication studies are particularly important in pre-clinical work, where experimental variability is high and translation is poor, but they are not actively encouraged by the majority of journals. This means there is little to no incentive to perform them and therefore to do good science.

To encourage you to find appropriate platforms for your research and find space for what Alger described as ‘file drawer’ data, some engaging researchers in Berlin and the States have established a platform gloriously called fiddle (file drawer data liberation effort). It aims to prevent data from languishing uselessly somewhere and to get it out into the world where it could do some good, reducing publication bias and encouraging data sharing.

Hopefully this meander through the pitfalls of the scientific method and publishing system as it stands has given you some ideas about what to do with your negative data. Alger concluded that he believed “a bunch of uninterpretable results should not count as ‘data’ at all and that, in general, there is no real problem if they continue to rest in file drawers”. But if you’ve specifically tested a hypothesis and found the data goes in the opposite direction to what you had expected or hoped, get it out there and help other people who might be trying to do the same thing.

Dr Yvonne Couch

Author

Dr Yvonne Couch is an Alzheimer’s Research UK Fellow at the University of Oxford. Yvonne studies extracellular vesicles and their role in changing the function of the vasculature after stroke, aiming to discover why the prevalence of dementia after stroke is three times higher than the average. It is her passion for problem solving and love of science that drive her in advancing our knowledge of disease. Yvonne has joined the team of staff bloggers at Dementia Researcher, and will be writing about her work and life as she takes a new road into independent research.
