LLM-Based Medical Document Evaluation: Integrating Human Expert Insights

Junhyuk Seo, Dasol Choi, Wonchul Cha, Taerim Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Large Language Models (LLMs) show potential in medical document generation, but ensuring reliability requires extensive expert involvement, limiting clinical applications. To address this challenge, we developed an LLM-based evaluation framework with three progressive Chain of Thought (CoT) strategies: Qualitative (expert persona), Quantitative-qualitative (error analysis), and Insight-integrated (expert reasoning). This framework captures nuanced evaluation patterns while maintaining efficiency. When tested on 33 LLM-generated Emergency Department records across five criteria, our Insight-integrated approach demonstrated strong correlation with expert evaluations (r=0.680, p < .001), outperforming both Qualitative (r=0.524) and Quantitative-qualitative (r=0.630) approaches. Our findings suggest that LLM-based evaluation frameworks can align with expert assessments as useful tools for validating medical documentation in clinical settings.
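The agreement figures above can be illustrated with a minimal sketch of how a correlation between expert scores and LLM-assigned scores might be computed. It assumes that r denotes Pearson's correlation coefficient, which the abstract does not state explicitly. The function name `pearson_r` and the score lists are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings for six records on one criterion (illustration only,
# NOT taken from the paper).
expert_scores = [4, 3, 5, 2, 4, 1]
llm_scores = [4, 3, 4, 2, 5, 2]

r = pearson_r(expert_scores, llm_scores)
print(f"r = {r:.3f}")
```

Values of r closer to 1 indicate that the automated scores rise and fall with the expert scores. A significance test such as the reported p-value would need an additional step that this sketch leaves out.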

Original language: English
Title of host publication: MEDINFO 2025 - Healthcare Smart x Medicine Deep
Subtitle of host publication: Proceedings of the 20th World Congress on Medical and Health Informatics
Editors: Mowafa S. Househ, Zain Ul Abideen Tariq, Mahmood Al-Zubaidi, Uzair Shah, Elaine Huesing
Publisher: IOS Press BV
Pages: 1029-1033
Number of pages: 5
ISBN (Electronic): 9781643686080
DOIs
State: Published - 7 Aug 2025
Event: 20th World Congress on Medical and Health Informatics, MEDINFO 2025 - Taipei, Taiwan, Province of China
Duration: 9 Aug 2025 to 13 Aug 2025

Publication series

Name: Studies in Health Technology and Informatics
Volume: 329
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365

Conference

Conference: 20th World Congress on Medical and Health Informatics, MEDINFO 2025
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 9/08/25 to 13/08/25

Keywords

  • clinical validation
  • evaluation framework
  • expert assessment
  • large language models
  • medical document evaluation
  • prompt engineering
