Naiman receives NASA grant to digitize astrophysical literature

Jill Naiman, Teaching Assistant Professor

Teaching Assistant Professor Jill Naiman has received a $506,912 grant from the National Aeronautics and Space Administration (NASA) to digitize predigital scientific literature. Her project, "The Reading Time Machine: Transforming Astrophysical Literature into Actionable Data," is a collaboration with Harvard University and the Astrophysics Data System (ADS), a digital library portal operated by the Smithsonian Astrophysical Observatory (SAO) under a NASA grant. With over 15 million records, ADS is one of the most important archives in the scientific field of astronomy.

"Newer documents are ‘born digital,’ making them machine-readable and parseable," said Naiman. "This has not only helped domain scientists find relevant research more efficiently, but through methods like natural language processing, it also has facilitated new discoveries in these fields."

Naiman's project aims to extend these capabilities to predigital documents by extracting their text, figures, and tables, allowing researchers to apply the same information mining methods that are available for "born digital" documents. The goal is to make these documents more easily searchable and to open the way to new discoveries. The work will also enhance the screen-reading capabilities of these documents to make them more accessible.

For the project, researchers will use optical character recognition and object detection methods to find and "extract" any tables, figures, and figure captions in the text. According to Naiman, this is something that has been done in biomedical literature but not in astronomy. After the images are extracted, they will be classified (e.g., graph, photo, picture of sky), and the figure labels will be parsed to extract science-relevant information.

"In each step, we plan on publishing a database—to be hosted by ADS—and the code so that other folks can do the same to their ‘old’ scientific literature," she said. "The wealth of science generated by such 'indexing' efforts in other STEM fields has demonstrated that we have only scratched the surface of the discoveries possible when the community has access to science-ready data collected from the literature." 

Naiman earned her PhD in astronomy and astrophysics from the University of California, Santa Cruz, and completed National Science Foundation and Institute for Theory and Computation postdoctoral fellowships at the Harvard-Smithsonian Center for Astrophysics before coming to the University of Illinois. She is a Fiddler Faculty Fellow at the National Center for Supercomputing Applications (NCSA) at Illinois.


Related News

Chan authors new book connecting eugenics and Big Tech

Associate Professor Anita Say Chan has authored a new book that identifies how the eugenics movement foreshadows the predatory data tactics used in today's tech industry. Her book, Predatory Data: Eugenics in Big Tech and Our Fight for an Independent Future, was released this month by the University of California Press and featured in the news outlets San Francisco Chronicle and Mother Jones.

CCB contributes to new Books to Parks site on Lyddie

The Center for Children's Books (CCB) collaborated with the National Park Service (NPS) to launch a new Books to Parks website on Lyddie, a 1991 novel by Katherine Paterson that highlights the experiences of young women working in textile mills in nineteenth-century Lowell, Massachusetts. 

Layne-Worthey edits book on digital humanities and LIS

Glen Layne-Worthey, associate director for research support services for the HathiTrust Research Center (HTRC), and Isabel Galina, researcher at the Institute for Bibliographic Studies at the National University of Mexico, have edited a new book, The Routledge Companion to Libraries, Archives, and the Digital Humanities, which was recently released by Routledge.

Wang group to present at BigData 2024

Members of Associate Professor Dong Wang's research group, the Social Sensing and Intelligence Lab, will present their research at the 2024 IEEE International Conference on Big Data (BigData 2024), which will be held December 15-18 in Washington, D.C. BigData 2024 is the premier venue to present and discuss progress in research, development, standards, and applications of topics in artificial intelligence, machine learning, and big data analytics.
