The Socio-technical Data Analytics (SODA) specialization at GSLIS is making sure that students have the skills necessary to thrive in the messiness that surrounds working with real data. Now in its second year, the specialization has students producing a portfolio of projects that showcase how data analysis can contribute to solving real-world problems. Over the last two years, student projects have run the gamut, analyzing medical, linguistic, environmental, and social science data.
“SODA's aim is to prepare this generation of librarians and information scientists with tools and methods to tackle the large wave of data dissemination needs in every part of our everyday lives,” said Associate Professor Catherine Blake, who oversees the SODA program. “In contrast to traditional approaches where data collection is designed specifically to address a research question or business need, much of the potential in big data comes from reusing data that was collected for a very different purpose. This means that you have to spend time understanding how the data was created and then create tailored preprocessing, transformations, and resampling approaches to ensure that you have taken the data creation process into account when you do the analysis.”
One of the most distinctive aspects of the specialization is the opportunity for students to begin a data analysis portfolio. Students are paired with an organization from private industry or academia, which either provides a dataset or works with the student to identify datasets that should be combined to explore real problems in medicine, the environment, education, and the humanities.
Students begin this exploration during the Evidence-based Discovery course and refine it during the Introduction to Socio-technical Data Analytics course, both core requirements of the SODA specialization. At the end of the program, students complete a deeper analysis through an internship, practicum, or thesis with the project partner, bringing together what they have learned.
By the time students are ready to graduate, they have first-hand experience with the challenges involved in reusing data, typically from multiple sources. Students in the program form substantial relationships with data holders outside of GSLIS, and the skills they acquire through these projects help them stand out as applicants in private industry and academia.
“I think what is most exciting is that you can see the same set of principles applied to so many domains,” said Blake. Presentations over the last year have spanned a variety of topics, including using sentiment to understand fiction, exploring potential correlations between local respiratory illness and ozone, and analyzing communication patterns used when a business is in crisis. The development of the SODA program was made possible in part by a grant from the Institute of Museum and Library Services.
Pictured above: GSLIS faculty working in SODA include (l-r) Miles Efron, Vetle Torvik, Catherine Blake, and Jana Diesner.