Generative AI and the Future of Research Speaker Series: Lucy Li

Lucy Li will present "Language Models for People and Culture – From Pretraining to Application."
Lucy Li is a PhD candidate at the University of California, Berkeley, affiliated with Berkeley AI Research and the School of Information. Her research intersects natural language processing with computational social science and digital humanities (e.g. cultural analytics). She has worked with Microsoft Research’s Fairness, Accountability, Transparency, and Ethics (FATE) team and the Allen Institute for AI, and led collaborations with colleagues in education, psychology, and English literature. She has been recognized by EECS Rising Stars, Rising Stars in Data Science, an American Educational Research Association (AERA) Best Paper Award, and an NSF Graduate Research Fellowship.
Abstract:
Given the widespread use of language models (LMs) today, it is imperative that we examine assumptions around their supposed “general-purpose”, one-size-fits-all nature. In this talk, I’ll discuss three research projects that share this underlying theme. First, I’ll show how various notions of text “quality” used during LM pretraining data curation can result in language from different social groups being filtered at disparate rates. Next, I’ll present findings from a crowdsourcing study, in which we surface a range of expectations around what people believe “fair” or “good” model behavior should look like. Then, I’ll discuss experiments in which we interrogate the extent to which large LMs can assist cultural analytics scholarship. I’ll conclude with some thoughts on what I believe will shape the broader AI community in the future.
About the speaker series:
The CIRSS Speaker Series continues in Spring with a new theme of “Generative AI and the Future of Research.” Our speakers will share their research on the opportunities and risks associated with the rapidly evolving landscape of generative AI usage in scholarship.
We meet most Wednesdays, 9am-10am Central time, on Zoom. Everyone is welcome to attend. More information, including the upcoming speaker schedule and links to recordings, is available on the series website. For weekly updates on upcoming talks, subscribe to our CIRSS Seminars mailing list. Our Spring series is led by Yuanxi Fu and Timothy McPhillips, and supported by the Center for Informatics Research in Science and Scholarship (CIRSS) and the School of Information Sciences at the University of Illinois at Urbana-Champaign.
This event is sponsored by the Center for Informatics Research in Science and Scholarship.