Report proposes standards for sharing data and code used in computational studies

Victoria Stodden, Associate Professor

Reporting new research results involves detailed descriptions of methods and materials used in an experiment. But when a study uses computers to analyze data, create models or simulate things that can’t be tested in a lab, how can other researchers see what steps were taken or potentially reproduce results?

A new report by prominent leaders in computational methods and reproducibility recommends ways that researchers, institutions, agencies and journal publishers can work together to standardize the sharing of data sets and software code. The paper, "Enhancing reproducibility for computational methods," appears in the journal Science.

"We have a real issue in disclosure and reporting standards for research that involves computation – which is basically all research today," said Victoria Stodden, a University of Illinois professor of information science and the lead author of the paper. "The standards for putting enough information out there with your findings so that other researchers in the area are able to understand and potentially replicate your work were developed before we used computers."

[video:https://youtu.be/94qM6tnDtcQ]

"It is becoming increasingly accepted for researchers to value open data standards as an essential part of modern scholarship, but it is nearly impossible to reproduce results from original data without the authors' code," said Marcia McNutt, the president of the National Academy of Sciences and a co-corresponding author of the study. "This policy forum makes recommendations to enable practical and useful code sharing."

Sharing complete computational methods – data, code, parameters and the specific steps taken to arrive at the results – is difficult for researchers because there are no standards or guides to refer to, Stodden said. It's an extra step for busy researchers to incorporate into their reporting routine, and even if someone wants to share their data or code, there are questions of how to format and document it, where to store it and how to make it accessible.

The report makes seven specific recommendations, including documenting digital objects and making them retrievable, using open licensing, placing links to datasets and workflows in scientific articles, and performing reproducibility checks before publication in a scholarly journal.

The authors hope that disclosing computational methods will allow other researchers not only to verify and reproduce results but also to build upon completed studies, for example by performing different analyses with a dataset or by using an established workflow with new data.

"Things like how you prepped your data – what you did with outliers or how you normalized variables, all the things that are standard in data analysis – can make a big impact on results," Stodden said. "Some researchers make code and data accessible on point of principle, so it's possible. But it takes time. We know it's hard, but in this report we're trying to say in a very productive and positive way that data, code and workflows need to be part of what gets disclosed as a scientific finding."

