This project examines the impact of different research funding structures on the training of future scientists, particularly graduate students and postdoctoral fellows, and on their subsequent outcomes. Our proposed research begins by examining the way in which research (and most training) is funded and done. We classify projects by funding size (large or small scale), by whether they involve multiple researchers, and by whether they span multiple institutions. We construct different measures of project teams, and capture the subsequent trajectories of the students and postdoctoral fellows during and after their contact with the teams. We make use of a natural experiment and quasi-experimental statistical techniques to separate the effect of funding structures from the other factors contributing to...

The goal of this research is to help researchers develop and use relatively simple tools to describe species in a way that makes those descriptions easier to share with other scientists and easier for computers to process and analyze. The approach is bottom-up and iterative, involving the rapid prototyping of tools, the combining of existing tools, and the tailoring of applications developed for one purpose but now being reused for this scientific activity. Innovation from this project is applicable to the long-term development of open source software initiatives serving labs throughout the world. The project provides rich, real-world training for graduate students in library and information sciences, training them to be much-needed cross-disciplinary researchers in a field desperate for...

Data Observation Network for Earth (DataONE) is a collaborative, global project that is laying the groundwork for a new, innovative approach to conducting environmental science research. DataONE is a distributed framework and sustainable infrastructure poised to resolve many of the key challenges that hinder the realization of more global, open, and reproducible science, through four interrelated cyberinfrastructure (CI) activities:

  • significantly expanding the volume and diversity of data available to researchers for large-scale scientific innovation and discovery;
  • incorporating innovative and high-value science-enabling features into the DataONE CI;
  • maintaining and improving core software and...
Taxonomists are scientists who describe the world’s biodiversity. These descriptions of millions of species allow scientists to do many different kinds of research, including basic biology, environmental science, climate research, agriculture, and medicine. The problem is that describing any one species is not easy. The language used by taxonomists to describe their data is complex, and typically not easily understandable by computers or even by other scientists. This situation makes it harder to search for patterns across millions of species documented by thousands of researchers over many decades of work worldwide.

The goal of this research is to help researchers develop and use relatively simple tools to describe species in a way that makes those descriptions easier to share...

How can we be rule-compliant and still innovate? The collection and analysis of human-centered and/or data are governed by multiple sets of norms and regulations. Problems can arise when researchers are unaware of applicable rules, uninformed about their practical meaning and compatibility, and insufficiently skilled in implementing them. We are developing and delivering educational modules to address this issue.


Oct. 3, 2017

Professor and Center for Informatics Research in Science and Scholarship (CIRSS) Director Bertram Ludäscher and collaborators are presenting their joint work and tools for data quality, cleaning, and provenance at the 33rd Annual Biodiversity Information Standards conference, TDWG 2017, held October 1-6 in Ottawa, Canada. The annual conference provides a forum for developing standards and demonstrating new technologies and tools for biodiversity informatics. This year's theme is "Data Integration in a Big Data Universe: Associating Occurrences with Genes, Phenotypes, and Environments."

Three of the abstracts presented at TDWG 2017 are outcomes of the Kurator project, a collaboration between Illinois and the Museum of Comparative Zoology (MCZ) at Harvard University. Kurator is a suite of biodiversity...

Aug. 25, 2017

Thanks to a new online resource for paleoenvironmental data and models under development at Illinois and partner institutions, historian Richard Flint can gauge whether environmental factors played an important role in driving the migration of Pueblo Indians from the Spanish province of New Mexico in the seventeenth century. Using SKOPE (Synthesizing Knowledge of Past Environments), scholars such as Flint and the larger community of archaeologists will be able to discover, explore, visualize, and synthesize knowledge of environments in the recent or remote past.

"We are aiming to support different types of users—from researchers asking fundamental questions in the historical social sciences using climate retrodictions from tree-ring...

Aug. 1, 2017

Assistant Professor Jodi Schneider will serve as a keynote speaker for the eighth annual VIVO Conference, which will be held August 2-4 in New York City. VIVO is member-supported, open-source software and an ontology for representing scholarship. Hundreds of universities around the world are using VIVO software to showcase the experts, publications, and impact of researchers in academic institutions.

The international conference brings together the VIVO community and its partners to share the latest developments in Semantic Web academic profiles. Schneider will give the keynote, "Viewing universities as landscapes of scholarship."

Abstract: The university can be seen as a collection of individuals, or as an administrative engine, but what sets a university apart is the production of knowledge and knowledgeable people, through teaching, learning, and scholarly inquiry. In 2000, Michael Heaney proposed that the...

Jul. 24, 2017

Assistant Professor Matthew Turk is partnering on a project to help resolve the growing gap between food supply and demand in the face of global climate change. Led by Amy Marshall-Colón, principal investigator and assistant professor of plant biology, Crops in silico (Cis) will integrate a suite of virtual plant models at different scales through $274,000 in funding from The Foundation for Food and Agriculture Research (FFAR), a nonprofit organization that builds unique partnerships to support innovative and actionable science addressing today's food and agriculture challenges. The FFAR grant matches seed funding the project has received from the Institute for Sustainability,...

Dec. 9, 2016

Associate Professor Victoria Stodden will present her research at A University Symposium: Promoting Credibility, Reproducibility and Integrity in Research on December 9 at Columbia University. Hosted by Columbia's Office of the Executive Vice President for Research and other New York City research institutions, the symposium will bring together leading experts, journal editors, funders, and researchers to discuss how issues of reproducibility and research integrity are being handled by institutions, journals, and federal agencies.  

Stodden will participate in the session, "Repeat After Me: Current Issues in Reproducibility," with Jeffrey Drazen, editor-in-chief of The New England Journal of Medicine; Hany Farid, professor and chair of computer science at Dartmouth; Leonard Freeman, president of the Global Biological Standards Institute; and Londa Schiebinger, John L. Hinds Professor of History of...

Dec. 8, 2016

Reporting new research results involves detailed descriptions of methods and materials used in an experiment. But when a study uses computers to analyze data, create models or simulate things that can’t be tested in a lab, how can other researchers see what steps were taken or potentially reproduce results?

A new report by prominent leaders in computational methods and reproducibility lays out recommendations for ways researchers, institutions, agencies and journal publishers can work together to standardize sharing of data sets and software code. The paper "Enhancing reproducibility for computational methods" appears in the journal Science.

"We have a real issue in disclosure and reporting standards for research that involves computation – which is basically all research today," said Victoria Stodden, a University of Illinois professor of information science and the lead author of the paper. "The standards for putting enough...

Nov. 16, 2016

Three iSchool students will participate in the Library and Information Technology Association (LITA) Forum, which will be held November 17-20 in Fort Worth, Texas. The LITA Forum is the annual conference for professionals in archives, libraries, and other information services.

Nicholas Wolf, master's student and research data management librarian at New York University (NYU), will give a talk with Vicky Steeves, NYU librarian for research data management and reproducibility, titled "Using Openness as Foundation for Research Data Management Services." 

Abstract: This talk will describe the building and scaling up of research data management services at NYU solely using open source tools and data for instruction and best practices recommendations. Through demonstrating the applicability of tools such as OpenRefine, the Open Science Framework, ReproZip, and languages such as Python and R in library instruction...