My name is Liyan Tang (唐立言 in Chinese). I’m a fourth-year Ph.D. student in Computer Science in the TAUR Lab (Text Analysis, Understanding, and Reasoning) at UT Austin, advised by Greg Durrett. I have been fortunate to work with Ying Ding from UT iSchool, Yifan Peng from Weill Cornell Medicine, and Justin F. Rousseau from UT Southwestern Medical Center (alphabetical order).

My research mainly focuses on the automatic evaluation of LLMs, especially hallucination evaluation. More recently, I have been working on post-training in general, with a focus on improving the reasoning ability of LLMs via SFT and RL methods.

I was a research intern at Bespoke Labs (startup) in 2024-2025. Check out our SOTA fact-checking model Bespoke-MiniCheck-7B on the LLM-AggreFact leaderboard, which covers 11 hallucination detection datasets, and the real-time demo based on my MiniCheck paper.


Selected Papers

ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models
Liyan Tang, Grace Kim, Xinyu Zhao, Thom Lake, Wenxuan Ding, Fangcong Yin, Prasann Singhal, Manya Wadhwa, Zeyu Leo Liu, Zayne Sprague, Ramya Namuduri, Bodun Hu, Juan Diego Rodriguez, Puyuan Peng, Greg Durrett
arXiv preprint, 2025

MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
Liyan Tang, Philippe Laban, Greg Durrett
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Liyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yu’an Yang, Siffi Singh, Song Feng, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang, Saab Mansour, Kathleen McKeown
Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors
Liyan Tang, Tanya Goyal, Alexander R. Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryściński, Justin F. Rousseau, Greg Durrett
Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Evaluating Large Language Models on Medical Evidence Summarization
Liyan Tang, Zhaoyi Sun, Betina Idnay, Jordan G Nestor, Ali Soroush, Pierre A. Elias, Ziyang Xu, Ying Ding, Greg Durrett, Justin Rousseau, Chunhua Weng, Yifan Peng
npj Digital Medicine, 2023


News

06/02/2025: Research Intern at Google DeepMind, CA.
06/03/2024: Research Intern at Bespoke Labs (startup), CA.
12/16/2023: Completed my Master of Science degree in Computer Science at UT Austin.
05/15/2023: Applied Scientist Internship at Amazon, WA.
08/25/2021: Started my PhD at UT Austin.
05/14/2021: Completed my Bachelor of Science degree in Mathematics at UT Austin.