At the National Center for Atmospheric Research (NCAR), scientists study major global issues from climate change to solar activity to air quality. The data collected at NCAR is shared with other researchers in the United States and internationally, contributing to solutions to worldwide environmental challenges.
A thousand miles away, students at the University of Illinois have access to one of the few programs that prepare professionals to manage the growing amounts of data collected by research centers like NCAR. The Graduate School of Library and Information Science (GSLIS) is among a handful of schools to offer a specialization in data curation, the active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education.
GSLIS students complement their coursework in the specialization with hands-on experience. To facilitate practical experience in a real research environment, GSLIS, with collaborators at the University of Tennessee’s School of Information Sciences and NCAR, created the Data Curation Education in Research Centers (DCERC) program, which began in 2010 and is funded by the Institute of Museum and Library Services.
The goal of the program is to develop a sustainable model for data curation education that can be expanded to produce more graduates who are prepared to face the growing challenges of data management in data-intensive science. A key component of that model will be the establishment of relationships between iSchools and data centers that will provide students with practical field experience and mentorship in exemplar data centers. Students work closely with data and science mentors to gain a holistic understanding of managing data to advance the research process.
“It’s always important for students to get out and see how things are done in practice, but this opportunity with NCAR is extraordinary because it is one of the premier data centers in the world. [The students] are learning not only about the best practices in data management and data services that have evolved in this very mature, sophisticated, large-scale data center, but they also are exposed to the cutting-edge research,” said principal investigator and GSLIS Professor Emerita Carole Palmer. “It’s one of the few places where you get that integration and experience it all. It’s one of the best places I can imagine placing students.”
During the final field experience term of the program, two master’s students from each of the partner schools spent their summers doing data curation work at the Boulder, Colorado, facility. The experience gave each of these students the opportunity to apply the skills they had learned in their coursework and see what working at a major research and data center is really like.
GSLIS student Chung-Yi Hou arrived at NCAR’s Computational and Information Systems Laboratory (CISL) with a strong idea about what she wanted to take away from her experience. Hou enrolled in the master’s program at GSLIS with the goal of shifting to an environmentally focused career after working for several years in electrical engineering. Just two semesters into her specialization in data curation, she has proven herself to be up to the challenge and reaffirmed her career goals.
“I wanted to have an opportunity to work on as many aspects of the data lifecycle as possible,” said Hou. She proposed a project in which she verified data quality, harvested metadata descriptions, and collected and published provenance documentation for a climate re-analysis dataset of more than 184,000 files. Her work has improved usability and understanding of the dataset, which is now accessible online.
Joining Hou in pursuit of a career that puts technical expertise to work for a good cause is fellow GSLIS student Sean Gordon. He worked at NCAR’s Earth Observing Laboratory on an initiative to make the lab’s data more discoverable to researchers internally, across NCAR units, and outside the center.
Gordon developed geospatial metadata templates in multiple standards, which prepared a sample of the lab’s metadata catalog for integration into geoportals at another NCAR lab and at NASA’s Global Change Master Directory (GCMD). Geoportals promote dataset discovery by aggregating metadata catalogs from multiple repositories. The proof-of-concept model he developed may serve as a prototype for the future connection of all the repositories at NCAR so that researchers can search all of them simultaneously through a single interface.
“It’s been the biggest opportunity and learning experience I’ve gotten to enjoy through GSLIS,” he said. In addition to leaving with new skills, Gordon returned from his experience in Boulder with a new career outlook. He has decided to point his post-graduation job search toward positions that have a positive impact on the world while allowing him to do the technical work he enjoys.
“When your user group is trying to do something not only good for themselves but good for the whole world, it’s a really great feeling to know that you’re able to support that and able to help them enhance their science,” said Gordon.
Gordon and Hou will each present posters on their NCAR work at the 2014 American Geophysical Union (AGU) Fall Meeting, the world’s largest conference on Earth and space science. GSLIS doctoral student and DCERC project coordinator Cheryl Thompson will also present at the AGU meeting.
Thompson is on-site at NCAR this fall to wrap up the DCERC project evaluation and investigate methods for sustaining the program model. She is also conducting her own research into workforce needs in the field and how large research centers organize data work. In keeping with evaluation conducted since the project began, Thompson expects to find confirmation of the program’s success.
“It’s been very positively received,” she said. “I think the blending of the theory that they get in the classroom and the hands-on, real-world experience at NCAR makes this a very valuable learning experience. Students are learning new skills, but they’re also taking the theory, especially in thinking about the lifecycle model, and they’re bringing it into the projects here. NCAR is learning and benefiting a lot by having students here because they bring a different perspective.”
Matt Mayernik, research data services specialist at the NCAR library, worked closely with the data curation students and saw first-hand how LIS perspectives have contributed to NCAR’s data management approach. He found that the center’s data-focused teams were able to take on new projects that might not have been possible without help from the DCERC students.
Project leaders on both sides are looking ahead to possible future collaborations. “Both the iSchools and the data center have benefited tremendously,” Palmer said. “What we would like to see is an expanded set of iSchools and data centers in a sort of consortium where we can propagate the model out into a larger exchange, because it’s been very successful.”