School of Information Sciences

An informatics approach helps better identify chemical combinations in consumer products

Catherine Blake, Professor

By using products such as soap, shampoo, body lotion, toothpaste and makeup, the average consumer may be exposed to dozens of chemicals each day. It's not easy, though, to know exactly what is in many consumer products or what potential risks they pose, either individually or in combination.

A doctoral student and a professor in the University of Illinois School of Information Sciences are using an informatics approach to help prioritize chemical combinations for further testing by determining the prevalence of individual ingredients and their most likely combinations in consumer products.

Doctoral student Henry Gabb and professor Catherine Blake published the results of the first phase of their work in Environmental Health Perspectives, a journal of the National Institute of Environmental Health Sciences, part of the National Institutes of Health.

People are now exposed to significantly higher levels of chemicals than in the past, and consumer products are one of many sources.

"We are, in effect, test subjects in an uncontrolled biochemistry experiment. This has become an accepted, or perhaps ignored, trade-off of life in modern society," Gabb said.

In order to identify the chemicals present in consumer products, Gabb used a web-scraping program to gather product names, categories and ingredient lists from online retail sites such as Drugstore.com. The database he created includes nearly 39,000 products and more than 32,000 ingredient names.

Once he had information on the ingredients in consumer products, he had to solve the problem of chemical synonymy – the use of different names for the same substance.

"The same chemical can appear on multiple product labels under many different names. Unless you can resolve them to a unique chemical, you don't really know what you're counting," Gabb said. For example, according to the PubChem Compound database from the National Library of Medicine, wintergreen oil is another name for methyl salicylate, a suspected endocrine disruptor.
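The synonym problem Gabb describes can be illustrated with a minimal Python sketch. It is not the researchers' code. The one-entry `SYNONYMS` table and the function names are assumptions made for illustration; a real table would be built from a resource such as PubChem. The sketch maps each label name to one canonical name before counting, so a chemical listed under two names is counted once per product.

```python
from collections import Counter

# Hypothetical synonym table for illustration only. The single entry
# reflects the wintergreen oil example given in the article.
SYNONYMS = {"wintergreen oil": "methyl salicylate"}


def normalize(name):
    """Lowercase and collapse whitespace so label variants compare equal."""
    return " ".join(name.lower().split())


def resolve(name):
    """Return the canonical name, or the normalized name if no synonym is known."""
    key = normalize(name)
    return SYNONYMS.get(key, key)


def count_products_per_chemical(labels):
    """Count how many products contain each canonical chemical.

    labels: one list of ingredient names per product.
    """
    counts = Counter()
    for ingredients in labels:
        # A set ensures a chemical listed twice on one label counts once.
        counts.update({resolve(n) for n in ingredients})
    return counts


# Toy labels, not real product data.
labels = [
    ["Water", "Wintergreen Oil"],
    ["Methyl Salicylate", "Glycerin"],
    ["water", "wintergreen  oil", "Methyl salicylate"],
]
counts = count_products_per_chemical(labels)
print(counts["methyl salicylate"])  # 3
```

Without the resolution step, the same three labels would report two separate substances and undercount both.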

Gabb and Blake targeted 55 potential endocrine-disrupting and asthma-associated chemicals from a prior study that used gas chromatography-mass spectrometry analysis to measure the levels of these chemicals in consumer products. They found 30 percent of the products in their database contained at least one of the 55 target chemicals, and 13 percent contained more than one.
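Percentages of this kind can be computed as in the following sketch. It assumes each product has already been reduced to a set of canonical ingredient names. The function name, the placeholder `target_b` and the four toy products are hypothetical and do not come from the study's data.

```python
def prevalence(products, targets):
    """Return (share with at least one target, share with more than one).

    products: dict mapping product name to a set of canonical ingredients.
    targets: set of canonical names of the chemicals of interest.
    """
    if not products:
        return 0.0, 0.0
    # Number of target chemicals found in each product.
    hits = [len(ingredients & targets) for ingredients in products.values()]
    total = len(hits)
    at_least_one = sum(h >= 1 for h in hits) / total
    more_than_one = sum(h > 1 for h in hits) / total
    return at_least_one, more_than_one


# Toy data, not real products. "target_b" is a placeholder name.
targets = {"methyl salicylate", "target_b"}
products = {
    "p1": {"water", "methyl salicylate"},
    "p2": {"water", "methyl salicylate", "target_b"},
    "p3": {"water"},
    "p4": {"glycerin"},
}
print(prevalence(products, targets))  # (0.5, 0.25)
```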

The informatics approach allows the researchers to look at many more products and detect many more chemicals than the gas chromatography-mass spectrometry approach, which is limited by the time it takes to prepare samples and run the experiments, among other things. However, the informatics approach is limited to what is actually listed on product labels, which are not always complete. Gas chromatography-mass spectrometry can identify chemicals that are not listed on a product label or even part of the product formulation, such as "chemicals that leach from the product packaging, degradation products or other impurities," Gabb said. The researchers said the two approaches should be considered complementary.

The initial informatics analysis considered chemical combinations within the same product, but combined exposure also occurs when several products are used in a given timeframe.
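One simple way to tally within-product combinations is to count how often each pair of target chemicals appears on the same label. The sketch below shows that idea under stated assumptions; it is not necessarily the method used in the study, and the names `pair_counts`, `a`, `b` and `c` are placeholders.

```python
from collections import Counter
from itertools import combinations


def pair_counts(products, targets):
    """Count products in which each pair of target chemicals co-occurs.

    products: dict mapping product name to a set of canonical ingredients.
    targets: set of canonical names of the chemicals of interest.
    """
    counts = Counter()
    for ingredients in products.values():
        # Sorting gives each pair a single, stable ordering.
        present = sorted(ingredients & targets)
        counts.update(combinations(present, 2))
    return counts


# Toy data with placeholder chemical names.
targets = {"a", "b", "c"}
products = {
    "p1": {"a", "b", "water"},
    "p2": {"a", "b", "c"},
    "p3": {"c", "water"},
}
pairs = pair_counts(products, targets)
print(pairs.most_common(1))  # [(('a', 'b'), 2)]
```

Ranking pairs by count gives a rough ordering of which combinations a consumer is most likely to meet in a single product.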

"This work provides another piece of the environmental-exposure puzzle and, unlike our genetic material, we can easily change our product usage," Blake said. "The combination of genetic susceptibility and individualized cumulative exposure – not just to other chemicals in consumer products, but from other sources such as air quality – empowers people to make informed decisions about changing the factors that directly influence their health outcomes."

Gabb and Blake hope their informatics approach can help prioritize testing based on the likelihood of exposure. They have started the next phase of their research, which expands the analysis from the 55 target chemicals of the first phase to thousands of chemicals.

They'll also study combinations of chemicals from multiple products based on actual consumer usage, rather than looking at products in isolation. They will use a dataset of consumer usage patterns that details which products are used and how often. The data can tell the researchers which chemicals and combinations of chemicals consumers are exposed to in a typical day or week.
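The multi-product step might look something like the sketch below, which pools the target chemicals across every product a person uses in a period. The usage list, the product contents and the names `target_a` and `target_b` are invented for illustration; the format of the actual usage dataset is not described here.

```python
from collections import Counter


def exposure_counts(used_products, product_ingredients, targets):
    """Count, per target chemical, how many product uses contained it.

    used_products: product names used in the period, one entry per use.
    product_ingredients: dict mapping product name to a set of ingredients.
    targets: set of canonical names of the chemicals of interest.
    """
    counts = Counter()
    for name in used_products:
        # Products missing from the database contribute nothing.
        ingredients = product_ingredients.get(name, set())
        counts.update(ingredients & targets)
    return counts


# Toy data, not real formulations.
targets = {"target_a", "target_b", "methyl salicylate"}
product_ingredients = {
    "shampoo": {"water", "target_a"},
    "lotion": {"water", "target_a", "target_b"},
    "toothpaste": {"water", "methyl salicylate"},
}
day = ["shampoo", "lotion", "toothpaste"]
daily = exposure_counts(day, product_ingredients, targets)
print(sorted(daily.items()))
# [('methyl salicylate', 1), ('target_a', 2), ('target_b', 1)]
```

In this toy day, `target_b` and methyl salicylate never share a label, yet both appear in the pooled result, which is the kind of combination a single-product analysis would miss.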

Researchers can further prioritize which chemicals to study by also considering retention, or how the product is used. For example, shampoo and soap are rinsed off the body right away, while lotion is left on. Toothpaste and other products that come in contact with mucous membranes will likely result in more absorption of chemicals than a hair product.

Gabb and Blake's analysis also illustrates the difficulty consumers have in deciding which products to use or avoid. Manufacturers don't have to disclose the ingredients that produce fragrance and flavor in their products if those mixtures are considered proprietary. In such cases, the label would list "fragrance" or "flavor" rather than the specific ingredients. On the other hand, a label might list the chemicals that contribute to a fragrance, but not use the word "fragrance," leading a consumer to believe he or she is buying a fragrance-free product. This, in addition to chemical synonymy, makes a case for amending the Fair Packaging and Labeling Act to standardize ingredient nomenclature, at least for ingredients that are suspected of being harmful, Gabb said.

Gabb emphasized that the study examines the presence of potentially harmful chemicals (as determined by various authoritative sources like the EPA and NIH) in consumer products, but that it makes no value judgments regarding the safety of the chemicals themselves. His immediate goal is simply to help toxicologists better prioritize which chemicals and chemical combinations should be subjected to cumulative risk assessments.


