School of Information Sciences

Report proposes standards for sharing data and code used in computational studies

Reporting new research results involves detailed descriptions of methods and materials used in an experiment. But when a study uses computers to analyze data, create models or simulate things that can’t be tested in a lab, how can other researchers see what steps were taken or potentially reproduce results?

A new report by prominent leaders in computational methods and reproducibility lays out recommendations for ways researchers, institutions, agencies and journal publishers can work together to standardize sharing of datasets and software code. The paper, "Enhancing reproducibility for computational methods," appears in the journal Science.

"We have a real issue in disclosure and reporting standards for research that involves computation – which is basically all research today," said Victoria Stodden, a University of Illinois professor of information science and the lead author of the paper. "The standards for putting enough information out there with your findings so that other researchers in the area are able to understand and potentially replicate your work were developed before we used computers."

[video:https://youtu.be/94qM6tnDtcQ]

"It is becoming increasingly accepted for researchers to value open data standards as an essential part of modern scholarship, but it is nearly impossible to reproduce results from original data without the authors' code," said Marcia McNutt, the president of the National Academy of Sciences and a co-corresponding author of the study. "This policy forum makes recommendations to enable practical and useful code sharing."

Sharing complete computational methods – data, code, parameters and the specific steps taken to arrive at the results – is difficult for researchers because there are no standards or guides to refer to, Stodden said. It's an extra step for busy researchers to incorporate into their reporting routine, and even if someone wants to share their data or code, there are questions of how to format and document it, where to store it and how to make it accessible.

The report makes seven specific recommendations, such as documenting digital objects and making them retrievable, using open licensing, placing links to datasets and workflows in scientific articles, and performing reproducibility checks before publication in a scholarly journal.

The authors hope that disclosing computational methods will allow other researchers not only to verify and reproduce results, but also to build upon studies that have been done, such as by performing different analyses with a dataset or using an established workflow with new data.

"Things like how you prepped your data – what you did with outliers or how you normalized variables, all the things that are standard in data analysis – can make a big impact on results," Stodden said. "Some researchers make code and data accessible on point of principle, so it's possible. But it takes time. We know it's hard, but in this report we're trying to say in a very productive and positive way that data, code and workflows need to be part of what gets disclosed as a scientific finding."

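Stodden's point about data preparation can be illustrated with a small sketch. The example below is not from the report: the measurements, the `summarize` function and the `drop_above` cutoff are all hypothetical, chosen only to show how a single undisclosed preprocessing choice, here whether an outlier is dropped, changes a reported result, and how recording that choice alongside the result makes the step visible to other researchers.

```python
import statistics

# Hypothetical measurements (not from the report); the last value is an outlier.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 42.0]


def summarize(values, drop_above=None):
    """Return the mean together with the preprocessing setting that produced it.

    drop_above is an illustrative cutoff: values greater than it are discarded.
    """
    kept = [v for v in values if drop_above is None or v <= drop_above]
    return {
        "drop_above": drop_above,  # the disclosed preprocessing choice
        "n_kept": len(kept),       # how many observations survived it
        "mean": round(statistics.mean(kept), 2),
    }


with_outlier = summarize(measurements)
without_outlier = summarize(measurements, drop_above=20.0)

print(with_outlier)
print(without_outlier)
```

Both summaries come from the same data, yet the means differ substantially. Without the recorded cutoff, a reader given only the data could not tell which analysis produced a published number.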