School of Information Sciences

Is code enough? Stodden, Marinov project examines sharing code rather than generated data alongside research

Sharing and reusing research data is becoming increasingly common in the scientific world, allowing researchers to more easily build on the work of others as they seek new discoveries.

A new research project being conducted by Associate Professor Victoria Stodden and Illinois Computer Science Professor Darko Marinov aims to answer key questions about how researchers can reliably share the code used to generate their data rather than the more costly data itself.

"Our question was, when is it possible to save only the code that produced simulated data, and that’s all I need to save, and when do I also need to save the data?" Stodden said. "Simulation codes can produce massive amounts of data, for example petabytes of data. If I can rerun the code and regenerate the data, in theory I don't even need to save the data. For what types of codes is that possible? That's exactly the question we're trying to answer."
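The idea Stodden describes can be illustrated with a small sketch. This is not the researchers' method; it is a minimal example that assumes a simulation is fully determined by its code and a random seed, so that only a short checksum of the output needs to be stored to confirm a later rerun regenerated the same data. The names `simulate`, `digest`, and `can_regenerate` are hypothetical and chosen only for illustration.

```python
import hashlib
import random
import struct


def simulate(seed: int, steps: int) -> list:
    """Toy simulation: a 1-D random walk driven by a seeded generator."""
    rng = random.Random(seed)  # private generator, so the seed fully determines the run
    position = 0.0
    trajectory = []
    for _ in range(steps):
        position += rng.gauss(0.0, 1.0)
        trajectory.append(position)
    return trajectory


def digest(values) -> str:
    """SHA-256 over the exact 64-bit representation of each value."""
    h = hashlib.sha256()
    for v in values:
        h.update(struct.pack("<d", v))  # little-endian IEEE 754 double
    return h.hexdigest()


def can_regenerate(seed: int, steps: int, recorded_digest: str) -> bool:
    """Rerun the code and check the output against the stored checksum."""
    return digest(simulate(seed, steps)) == recorded_digest


# At publication time: keep the code, the seed, and a 64-character digest
# instead of the full output.
recorded = digest(simulate(seed=42, steps=1000))

# Later: a rerun with the same code and seed should match the digest.
print(can_regenerate(42, 1000, recorded))
```

In this toy case the rerun matches because nothing but the seed influences the result. Real simulation codes may not be bit-for-bit repeatable: parallel floating-point summation order, library versions, and hardware differences can all change the output, which is one reason the question of when code alone suffices is not trivial.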

The National Science Foundation is funding the work by Stodden and Marinov, providing $300,000 over two years. Stodden, who also is an Illinois Computer Science faculty affiliate, is the PI on the project. Marinov is the co-PI.

As Stodden explains, the format for scholarly articles has changed little in decades. It provides only a small space to discuss how researchers derived their results.

But as computation has become more integral to research across virtually every scientific field, that format has become inadequate, she said.

"There's such an amount of complexity – the computer can do X calculations per second. So how do you actually explain the increased complexity of computational research in words in a small section in a paper? It can be very, very difficult," Stodden said.

Now some journals, she said, have begun to require researchers to publish their data and code along with their findings.

Stodden and Marinov, an expert on the testing and reliability of software, wondered whether providing the code alone could reliably allow the results of a given paper to be reproduced. And if it is the code that accompanies the published research, what kind of standard should it meet?

"If code is going to travel with this scholarly output, the community will need to come to some type of agreement regarding code standards," Stodden said.

For their project, the two are focusing on physics research as an example because of its intensive computational needs.

In preliminary work using articles from the Journal of Computational Physics, Stodden and her group tried to replicate the computational results from 55 articles and were unable to reproduce any. After contacting the authors, Stodden said, the group came away with the impression that many of them believed reproducing their computational results would be straightforward, which the group had found was not the case.

Eventually, Stodden and Marinov hope to determine whether and how code could be reliably substituted for data across a wide range of fields.

"We want to learn how to do better scientific software, software that is more reliable, and that researchers can trust more," Stodden said. "These questions have come about not because the scientific community isn't doing a good job; they came about because computation is so important, and increasingly so. We're chasing fascinating opportunities here."


