School of Information Sciences

Haibo Jin

Doctoral Student

PhD, Information Sciences, Illinois (in progress)

Research focus

Development of trustworthy machine learning through robustness, interpretability, and alignment.

Publications & Papers

Jin, H., Zhang, P., Luo, M., & Wang, H. (2025). Reasoning Can Hurt the Inductive Abilities of Large Language Models. Advances in Neural Information Processing Systems.

Jin, H., Zhou, A., Menke, J., & Wang, H. (2024). Jailbreaking large language models against moderation guardrails via cipher characters. Advances in Neural Information Processing Systems, 37, 59408-59435.

Jin, H., Chen, R., Chen, J., Zheng, H., Zhang, Y., & Wang, H. (2024, September). CatchBackdoor: Backdoor detection via critical trojan neural path fuzzing. In European Conference on Computer Vision (pp. 90-106). Cham: Springer Nature Switzerland.

Chen, R., Jin, H., Liu, Y., Chen, J., Wang, H., & Sun, L. (2024, September). EditShield: Protecting unauthorized image editing by instruction-guided diffusion models. In European Conference on Computer Vision (pp. 126-142). Cham: Springer Nature Switzerland.

Zhang, P., Jin, H., Hu, L., Li, X., Kang, L., Luo, M., ... & Wang, H. (2025). Revolve: Optimizing AI Systems by Tracking Response Evolution in Textual Optimization. In Forty-second International Conference on Machine Learning.

Zhuang, J., Jin, H., Zhang, Y., Kang, Z., Zhang, W., Dagher, G. G., & Wang, H. (2025). Exploring the Vulnerability of the Content Moderation Guardrail in Large Language Models via Intent Manipulation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing.

School of Information Sciences

501 E. Daniel St.

MC-493

Champaign, IL 61820-6211

Voice: (217) 333-3280

Fax: (217) 244-3302

Email: ischool@illinois.edu
