My name is Liyan Tang (唐立言 in Chinese). I’m a fourth-year Ph.D. student in Computer Science in the TAUR Lab (Text Analysis, Understanding, and Reasoning) at UT Austin, advised by Greg Durrett. I have been fortunate to work with Ying Ding from UT iSchool, Yifan Peng from Weill Cornell Medicine, and Justin F. Rousseau from UT Southwestern Medical Center (alphabetical order).
My research mainly focuses on the automatic evaluation of LLMs, especially hallucination evaluation. More recently, I have been working on post-training in general, with a focus on improving the reasoning ability of LLMs via SFT and RL methods.
I was a research intern at Bespoke Labs (a startup) from 2024 to 2025. Check out our SOTA fact-checking model, Bespoke-MiniCheck-7B, on the LLM-AggreFact leaderboard across 11 hallucination detection datasets, and the real-time demo based on my MiniCheck paper.
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models
Liyan Tang, Grace Kim, Xinyu Zhao, Thom Lake, Wenxuan Ding, Fangcong Yin, Prasann Singhal, Manya Wadhwa, Zeyu Leo Liu, Zayne Sprague, Ramya Namuduri, Bodun Hu, Juan Diego Rodriguez, Puyuan Peng, Greg Durrett
arXiv preprint 2025
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
Liyan Tang, Philippe Laban, Greg Durrett
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Liyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yu’an Yang, Siffi Singh, Song Feng, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang, Saab Mansour, Kathleen McKeown
Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors
Liyan Tang, Tanya Goyal, Alexander R. Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryściński, Justin F. Rousseau, Greg Durrett
Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Evaluating Large Language Models on Medical Evidence Summarization
Liyan Tang, Zhaoyi Sun, Betina Idnay, Jordan G Nestor, Ali Soroush, Pierre A. Elias, Ziyang Xu, Ying Ding, Greg Durrett, Justin Rousseau, Chunhua Weng, Yifan Peng
npj Digital Medicine, 2023
06/02/2025: Research Intern at Google DeepMind, CA.
06/03/2024: Research Intern at Bespoke Labs (startup), CA.
12/16/2023: Completed my Master of Science degree in Computer Science at UT Austin.
05/15/2023: Applied Scientist Internship at Amazon, WA.
08/25/2021: Started my PhD at UT Austin.
05/14/2021: Completed my Bachelor of Science degree in Mathematics at UT Austin.