Jiang and Mishra to present natural language processing research at COLING16

Doctoral students Ming Jiang and Shubhanshu Mishra will present research papers at the 26th International Conference on Computational Linguistics (COLING), which will be held December 11-16 in Osaka, Japan. Held every two years, COLING is one of the top international conferences in natural language processing and computational linguistics, fields that cover research topics such as question answering, text summarization, information extraction, and discourse structure.

Jiang will present a paper coauthored with Assistant Professor Jana Diesner titled "Says Who...? Identification of Expert versus Layman Critics’ Reviews of Documentary Films."

Abstract: We extend classic review mining work by building a binary classifier that predicts whether a review of a documentary film was written by an expert or a layman with 90.70% accuracy (F1 score), and compare the characteristics of the predicted classes. A variety of standard lexical and syntactic features was used for this supervised learning task. Our results suggest that experts write comparatively lengthier and more detailed reviews that feature more complex grammar and a higher diversity in their vocabulary. Layman reviews are more subjective and contextualized in people’s everyday lives. Our error analysis shows that laymen are about twice as likely to be mistaken as experts than vice versa. We argue that the type of author might be a useful new feature for improving the accuracy of predicting the rating, helpfulness and authenticity of reviews. Finally, the outcomes of this work might help researchers and practitioners in the field of impact assessment to gain a more fine-grained understanding of the perception of different types of media consumers and reviewers of a topic, genre or information product.

During the COLING16 Workshop on Noisy User-generated Text (WNUT), Mishra will present a paper coauthored with Diesner titled "Semi-supervised Named Entity Recognition in noisy-text."

Abstract: Named entity recognition (NER) has played an immense role in improving information retrieval, text mining, and text-based network construction. However, most existing NER techniques are based on syntactically correct news corpus data, and hence don’t give good results on noisy data such as tweets because of issues like spelling errors, concept drifts, and few context words. In this paper, we describe our submission to the WNUT 2016 NER shared task, and also present an improvement over it using a semi-supervised approach. Our models are based on linear chain conditional random fields (CRFs), and use the BIEOU NER chunking scheme; features based on word clusters and pre-trained distributed word representations; updated gazetteer features; global context predictions; and random feature dropout for up-sampling the training data. These approaches alleviate many issues related to NER on noisy data by allowing the meaning of new or rare tokens to be ingested into the system, while using existing training samples to improve the model.

Diesner joined the iSchool faculty in 2012 and is a 2016 Dori J. Maynard Senior Fellow. Her research in social computing combines theories and methods from natural language processing, social network analysis, and machine learning. In her lab, she and her students develop and advance computational solutions that help people measure and understand the interplay of information and socio-technical networks. They also bring these solutions into various application contexts, e.g., in the domain of impact assessment.

