
If there’s one thing scientists can add to maximize their data’s value, it’s “metadata, metadata, metadata”.
Miguel Acevedo typically gets two questions about his research on malaria in lizards. “Do lizards really get malaria?” (The answer is yes.) And, “Will I get malaria from a lizard?” (Not likely.)
Lizard malaria is a model for vector-borne disease ecology and evolution [1]. A colleague had been pursuing the same problem, at the same site in Puerto Rico, since the 1990s, and Acevedo, a wildlife ecologist at the University of Florida in Gainesville, wanted to combine those older data with his own to perform a long-term analysis. It was easier said than done. Whereas Acevedo’s data were logged using a standardized data-entry template, the colleague’s data were recorded in a mix of paper notebooks, Excel spreadsheets and hand-drawn maps. “It was some of the most organized data of that era, but we didn’t have the standards then that we have today,” he says. Columns weren’t necessarily consistent from sheet to sheet, nor did they use the same units, and it wasn’t always clear which sampling sites were being measured.
In the end, what could have been a morning’s effort took “six or seven months”, Acevedo says. “It’s a lot of work, and it’s not fun work, you know?”
Funder and publisher mandates, coupled with a growing emphasis on open science and reproducibility, mean that researchers are increasingly depositing data alongside their publications. Other scientists can use those data to drive new research. But not every journal requires that authors make their data sets available, and some authors decline to do so, either for fear of getting scooped or for lack of time. (The research data policy for Springer Nature, which publishes Nature, “strongly encourage[s] that all datasets supporting the analysis and conclusions of the paper are made publicly available at the time of publication”, and mandates “the sharing of community-endorsed data types”.)
Nature asked data scientists about their best practices for publishing usable, high-quality data. Here’s what they said…
Read the full article on the Nature Careers website.