Jaihyun Park's Dissertation Defense

Jay Park

Jaihyun Park will defend his dissertation “A Computational Approach toward Understanding Political Language of Ideologically Opposing Groups—From Historical Newspapers to Social Media.”

The committee includes Associate Professor Ryan Cordell (chair), Professor Ted Underwood, Assistant Professor JungHwan Yang (Liberal Arts and Sciences, Department of Communication), and Assistant Professor JooYoung Seo.


Living in a democratic society grants us the freedom to hold and express our beliefs and thoughts, a freedom protected under the principle commonly referred to as “freedom of conscience.” As such, individuals act and speak in accordance with their belief systems and values, engaging in conversations to share perspectives, spread information aligned with their beliefs, or even persuade others to support causes they deem just. However, this freedom has also contributed to a polarized society in which people find it difficult to agree with one another. Digital archives that collect text ranging from historical newspapers to social media provide an opportunity to study how people with opposing viewpoints engaged in political battles and how they generated discourse around the topics they supported. Taking advantage of this opportunity, the thesis applies a computational approach to examine the political language of ideologically opposing groups, using historical newspapers in Chronicling America and contemporary social media platforms such as Twitter and Parler.

For the historical context, the first two sections cover the issues of slavery and racism in the 19th and 20th centuries. (1) The first section examines the discourse around slaves and servants during the Civil War period and studies how newspapers from the South and the North used different words to create distinct discourse communities. Methodologically, it uses embedding-based text mining techniques and a method that accounts for possible OCR errors. (2) The second section focuses on a derogatory word referring to Asian workers and examines how this word was used. Methodologically, it uses embedding-based text mining techniques, the log-odds ratio with an informative Dirichlet prior, and network analysis to examine the most circulated story. (3) For the contemporary context, the third section presents the varied uses of toxic language on different social media platforms during the 2020 U.S. presidential election. It uses a state-of-the-art method for determining the degree of toxicity in text, a statistical approach to identify words over-represented on each platform, and network analysis.

By leveraging digital archives and applying computational text analysis, this thesis aims to bridge the gap between research on historical and contemporary political language, which has so far been divided between two separate fields: Digital Humanities and Computational Social Science.