Report proposes standards for sharing data and code used in computational studies

Victoria Stodden, Associate Professor

Reporting new research results involves detailed descriptions of methods and materials used in an experiment. But when a study uses computers to analyze data, create models or simulate things that can’t be tested in a lab, how can other researchers see what steps were taken or potentially reproduce results?

A new report by prominent leaders in computational methods and reproducibility lays out recommendations for how researchers, institutions, agencies and journal publishers can work together to standardize the sharing of data sets and software code. The paper, "Enhancing reproducibility for computational methods," appears in the journal Science.

"We have a real issue in disclosure and reporting standards for research that involves computation – which is basically all research today," said Victoria Stodden, a University of Illinois professor of information science and the lead author of the paper. "The standards for putting enough information out there with your findings so that other researchers in the area are able to understand and potentially replicate your work were developed before we used computers."

[video:https://youtu.be/94qM6tnDtcQ]

"It is becoming increasingly accepted for researchers to value open data standards as an essential part of modern scholarship, but it is nearly impossible to reproduce results from original data without the authors' code," said Marcia McNutt, the president of the National Academy of Sciences and a co-corresponding author of the study. "This policy forum makes recommendations to enable practical and useful code sharing."

Sharing complete computational methods – data, code, parameters and the specific steps taken to arrive at the results – is difficult for researchers because there are no standards or guides to refer to, Stodden said. It's an extra step for busy researchers to incorporate into their reporting routine, and even if someone wants to share their data or code, there are questions of how to format and document it, where to store it and how to make it accessible.

The report makes seven specific recommendations, including documenting digital objects and making them retrievable, using open licensing, placing links to datasets and workflows in scientific articles, and performing reproducibility checks before publication in a scholarly journal.

The authors hope that disclosing computational methods will allow other researchers not only to verify and reproduce results but also to build upon completed studies, for example by performing different analyses with a dataset or using an established workflow with new data.

"Things like how you prepped your data – what you did with outliers or how you normalized variables, all the things that are standard in data analysis – can make a big impact on results," Stodden said. "Some researchers make code and data accessible on point of principle, so it's possible. But it takes time. We know it's hard, but in this report we're trying to say in a very productive and positive way that data, code and workflows need to be part of what gets disclosed as a scientific finding."

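Stodden's point about data preparation can be illustrated with a small sketch. The example below is not taken from the report: the measurements are invented, and the function names and the outlier rule (dropping values more than three median absolute deviations from the median) are assumptions chosen only to show how an undisclosed preprocessing step can change a reported number.

```python
import statistics

# Hypothetical measurements (invented for illustration); the last value is an outlier.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 25.0]


def mean_keep_all(values):
    """Report the mean of every value, with no outlier handling."""
    return statistics.mean(values)


def mean_drop_outliers(values, max_mad_multiple=3.0):
    """Report the mean after dropping values far from the median.

    A value is dropped when its distance from the median exceeds
    max_mad_multiple times the median absolute deviation (MAD).
    This rule is an assumption for the sketch, not a recommended standard.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    kept = [v for v in values if abs(v - median) <= max_mad_multiple * mad]
    return statistics.mean(kept)


if __name__ == "__main__":
    print("mean, all values kept:  ", mean_keep_all(measurements))
    print("mean, outliers dropped: ", mean_drop_outliers(measurements))
```

With these invented numbers, the two choices give means of about 12.5 and 10.0. A reader who sees only the final figure cannot tell which rule was applied, which is why the code and parameters need to accompany the result.
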
