Data Curation


Institute of Museum and Library Services

This project will create both a master’s and doctoral-level specialization in Socio-technical Data Analytics (SODA). Partnerships with local researchers and businesses that already work with large datasets will enable master's graduates to receive first-hand experience with both the social and technical implications of large digital data collections, and thus be well-prepared for leadership roles in academic and corporate environments. Similarly, doctoral students will consider multiple stages of the information lifecycle, which will help to ensure that their research findings will generalize to a range of scholarly and business practices. Case studies from these partners will be incorporated into new courses that will initially be held on campus and will later be evolved to the School...

University of Illinois Extension

The Illinois Digital Innovation Leadership Program will increase opportunities for entrepreneurship, economic development, and innovation through the expansion of digital manufacturing, digital media production, and data analytics. Supported by the University of Illinois Extension, the project will engage Illinoisans with mobile digital design and innovation labs, or “DigiTech Hubs,” which will serve as high-tech inventor workshops equipped with tools for everything from audio production to 3D printing. Digital Innovation Leadership staff will work with 4-H clubs, public libraries, and public schools to develop permanent community-based and -supported studios, creating a network that will build statewide capacity in digital...


Feb. 21, 2017

iSchool staff and students will participate in the 12th International Digital Curation Conference (IDCC), which will be held on February 20-23 in Edinburgh, Scotland. IDCC is organized annually by the UK-based Digital Curation Centre and provides opportunities for educators and professionals to consider digital curation in a multidisciplinary context. The theme of this year's conference is "Upstream, Downstream: embedding digital curation workflows for data science, scholarship and society."
iSchool presentations include:

"When Scientists Become Social Scientists: How Citizen Science Projects Learn About Volunteers," a paper authored by iSchool Assistant Professor Peter Darch.

"Revealing the Detailed Lineage of Script Outputs using Hybrid Provenance," a paper authored by iSchool postdoctoral research associates Qian Zhang and Yang Cao and Professor Bertram Ludäscher, director of...

Nov. 18, 2016

Where in the world is Carmen Sandiego? Children playing this educational video game on their school's computer in the 1990s got an entertaining geography lesson while in hot pursuit of Carmen and her villains. Preserving a video game such as this for future generations to study and appreciate involves challenges beyond the obvious fact that computers no longer support the software needed to play the game. In "Where Does Significance Lie: Locating the Significant Properties of Video Games in Preserving Virtual Worlds II Data," Rhiannon Bettivia, a postdoctoral research associate at the iSchool, examines some of the difficulties inherent in video game preservation and comes to the...

Oct. 5, 2016

Professor Bertram Ludäscher will present the international tutorial at the thirty-first Brazilian Symposium on Databases (SBBD2016) in Salvador-Bahia on October 4-7. SBBD, an official event of the Brazilian Computer Society, is the largest venue in Latin America for presenting and discussing research results in the database domain. The symposium brings together researchers, students, and practitioners from Brazil and abroad for technical sessions, invited talks, and tutorials given by distinguished speakers from the international research community. Ludäscher’s tutorial is titled "Provenance in Databases and Scientific Workflows."

Abstract: In computer science, data provenance describes the lineage and processing history of data as it is transformed through queries or workflows. Different computer science sub-disciplines have studied approaches to capture and exploit provenance, e.g., the systems and programming...

Oct. 3, 2016

Provenance information describes the origin and history of artifacts. Because of the vital role played by data and workflow provenance in support of transparency and reproducibility in computational and data science, creating tools for capturing and using provenance information is an important yet challenging task.

Postdoctoral Research Associate Yang Cao and Professor Bertram Ludäscher recently presented joint work on data provenance at the Data Observation Network for Earth (DataONE) All Hands Meeting in Santa Ana Pueblo, New Mexico. In their poster and system demonstration, jointly authored by a team of University of Illinois students and staff as well as collaborators from the UK, Cao and Ludäscher demonstrated how the YesWorkflow tool is "Revealing the Detailed History of Script Outputs with Hybrid Provenance Queries."

In an earlier article for the Winter 2015/6 issue of DataONE News, "Your Data has a History,...

Sep. 19, 2016

Most academic librarians stepping into a position can model their work on that of their predecessors. But not Thomas Padilla (MS '14). Since his appointment in April as the first humanities data curator at the University of California, Santa Barbara (UCSB) Library (and the first in the entire University of California system), Padilla has had to draw on a number of different disciplines to shape his role of working with data throughout its life cycle, creating a support plan for digital humanities researchers, and providing research data consultation. Formerly the digital scholarship librarian at Michigan State University Libraries, Padilla is pioneering a new niche for academic...

Sep. 14, 2016

Policies and practices in data management—including data preservation and sharing—are increasingly important and complicated aspects of research today. Scientific research and data centers as well as universities and academic libraries are leading the way in developing and implementing best practices in data management. But how do they integrate data management strategies and experts into their workflows?

It is at this intersection of people and institutions that doctoral candidate Cheryl Thompson is conducting her research. Specifically, she explores how organizations develop data expertise and services to support science.

“My research focuses on the role of institutions in data use and access in scientific and research environments. By studying organizations and professions, I investigate the conditions that advance or hinder data-intensive research as well as the emerging data profession and its required expertise,” said Thompson.

“As the need for quality...

Jun. 24, 2016

Developed in the 1940s and 1950s, nuclear magnetic resonance (NMR) spectroscopy measures physical and chemical properties of atoms or molecules by detecting changes in the magnetic resonance of atomic nuclei. Scientists use the process for a variety of applications, such as substance identification. In biomolecular science, NMR supports the discovery and identification of new drugs, disease and metabolic research, the study of structural biology, and more.

Advances in computational applications and data-sharing tools have opened new doors for the use of information gleaned from NMR spectroscopy, but new challenges have emerged as well. Its varied applications rely on myriad software tools that come from a range of sources and use a variety of semantic approaches. This complicates data management and inhibits the dissemination and reproduction of important findings.

A research team based at the iSchool at Illinois, the University of Wisconsin (UW), and the...