School of Information Sciences

Diesner and Mishra publish paper on NER tool for social media research

Jana Diesner, Affiliate Associate Professor

The identification of proper names of people, organizations, and locations from raw texts, referred to as Named Entity Recognition (NER), can be highly accurate when researchers use NER tools on a large collection of text with proper syntax. However, using existing NER tools for analyzing social media text can lead to poor identification of named entities. In particular, Twitter text frequently includes inconsistent capitalization, spelling errors, and shortened versions of words.

TwitterNER, an open-source tool developed by doctoral student Shubhanshu Mishra, who is supervised by Assistant Professor Jana Diesner, can help researchers interested in performing NER on social media text. TwitterNER has recently been shown (in an independent evaluation by Humangeo) to perform better in terms of precision than some other publicly available systems for entity types of person, location, and organization, which are often of most interest to researchers.

"Our system relies on a combination of hand-engineered features," explained Mishra. "It follows the paradigm of transductive semi-supervised learning where all the labeled and unlabeled data is utilized to make predictions about the unlabeled data."
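To make the idea in Mishra's description concrete, here is a minimal sketch of transductive, self-training-style labeling on toy tokens. It is not the TwitterNER implementation: the feature set, the Jaccard similarity measure, the threshold, and the function names `features`, `jaccard`, and `transduce` are all assumptions chosen for illustration. It shows only the general pattern, in which labeled and unlabeled examples sit in one pool and predictions are made for the unlabeled examples in that pool.

```python
# Illustrative sketch only: NOT the TwitterNER implementation.
# Labeled and unlabeled tokens share one pool, and predictions are made
# only for the unlabeled tokens that are part of that pool.

def features(token):
    """Hand-engineered features for one token (hypothetical feature set)."""
    return {
        "lower=" + token.lower(),
        "is_title=" + str(token.istitle()),
        "has_at=" + str(token.startswith("@")),
        "suffix2=" + token[-2:].lower(),
    }

def jaccard(a, b):
    """Similarity between two feature sets, in [0, 1]."""
    return len(a & b) / len(a | b)

def transduce(labeled, unlabeled, threshold=0.5):
    """Label the given unlabeled tokens, growing the labeled pool as we go."""
    pool = dict(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Pick the single most similar (unlabeled token, pool token) pair.
        score, token, label = max(
            ((jaccard(features(u), features(t)), u, lab)
             for u in remaining for t, lab in pool.items()),
            key=lambda candidate: candidate[0],
        )
        if score < threshold:
            break  # nothing left is similar enough; the rest stay unlabeled
        pool[token] = label  # newly labeled tokens can inform later rounds
        remaining.remove(token)
    return {tok: pool.get(tok) for tok in unlabeled}

labeled = [("Chicago", "LOC"), ("the", "O"), ("Obama", "PER")]
predictions = transduce(labeled, ["chicago", "THE", "#xyz123"])
print(predictions)
```

In this toy run, the lowercase "chicago" and the all-caps "THE" inherit labels from their correctly capitalized counterparts, which mirrors the inconsistent capitalization common in Twitter text, while "#xyz123" stays unlabeled because nothing in the pool is similar enough.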

The original implementation of TwitterNER was created for the shared-task session at the 2016 Conference on Computational Linguistics (COLING) Workshop on "Noisy User-generated Text" (W-NUT). Workshop participants were asked to build an NER system for Twitter data, which was evaluated using a common test dataset. TwitterNER had a high level of precision among the various systems.

Diesner and Mishra then improved their approach and shared it with W-NUT by submitting the paper, "Semi-supervised Named Entity Recognition in noisy-text."

"Our original submission ranked seventh in the task, but our final improved version surpassed the second-best performing system on the concluded task," said Mishra. "The winning system was based on deep learning, but its implementation is not publicly available."

Mishra holds an integrated MS and BS in mathematics and computing from the Indian Institute of Technology Kharagpur. He is interested in analyzing how information is generated in social networks, such as those found in scholarly data and on social media websites. His prior projects include systems for user sentiment profiling, active learning with a human-in-the-loop design pattern, and novelty profiling in scholarly data.

Diesner is an expert in human-centered computing, network science, natural language processing, and machine learning. Recognition for her research expertise includes appointments as CIO Scholar for Information Research & Technology at Illinois (2018), as a faculty fellow at the National Center for Supercomputing Applications (NCSA) at Illinois (2015), and as a research fellow in the Dori J. Maynard Senior Research Fellows program through The Center for Investigative Reporting and The Robert C. Maynard Institute for Journalism Education (2016). She holds a PhD from the Computation, Organizations and Society (COS) program at Carnegie Mellon University's School of Computer Science.


