Abstract
Objectives: This study develops and validates the confidence-linked and uncertainty-based staged (CLUES) framework, which integrates large language models (LLMs) with uncertainty quantification to assist manual chart review while ensuring reliability through selective human review.

Materials and Methods: The CLUES framework was used to assess stroke-related hospitalizations from imaging reports for 1739 patients across 24 Korean hospitals (2011–2022). Uncertainty was quantified as the entropy of LLM-derived confidence values. The framework operated in 3 stages: (1) zero-shot prompting with ensemble averaging, with high-uncertainty cases advancing to stage 2; (2) few-shot prompting using retrieved low-uncertainty cases, with the remaining high-uncertainty cases proceeding to stage 3; and (3) manual chart review of the cases that were still uncertain. Performance was evaluated against physician-labeled data using F1-score and Cohen’s kappa.

Results: Among 1072 test cases, stage 1 classified 507 cases as low uncertainty and 565 as high uncertainty. Stage 2 reclassified 280 of these cases as low uncertainty, leaving 285 for manual review. Low-uncertainty cases consistently outperformed high-uncertainty cases in both stages (weighted F1-scores: 0.94 vs 0.57 in stage 1 and 0.82 vs 0.58 in stage 2). Overall framework performance improved progressively, with F1-scores rising from 0.840 (stage 1) to 0.878 (stage 2) to 0.955 (stage 3).

Discussion: The CLUES framework reduced manual review burden by 75% while maintaining high accuracy. By integrating uncertainty quantification with selective human oversight, it provides an efficient and reliable approach to phenotype validation.

Conclusion: This framework demonstrates the effective integration of LLMs into clinical workflows while ensuring human oversight, enhancing both accuracy and efficiency.
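The entropy-based routing described in the methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of Shannon entropy in bits over ensemble-averaged class probabilities, and the `threshold` value of 0.5 are all assumptions made for illustration, since the abstract does not specify them. Only the case counts at the end are taken from the abstract.

```python
import math
from typing import Sequence


def mean_confidence(runs: Sequence[Sequence[float]]) -> list[float]:
    """Average class-probability vectors across ensemble runs (hypothetical helper)."""
    n = len(runs)
    return [sum(column) / n for column in zip(*runs)]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def route(runs: Sequence[Sequence[float]], threshold: float = 0.5) -> str:
    """Label a case 'low' or 'high' uncertainty.

    The threshold is an assumed value, not one reported in the abstract.
    """
    return "low" if entropy(mean_confidence(runs)) <= threshold else "high"


# Two ensemble runs that agree on a confident label: low uncertainty.
confident_case = [[0.9, 0.1], [1.0, 0.0]]
# Two ensemble runs that disagree: the averaged confidence is maximally uncertain.
conflicting_case = [[0.6, 0.4], [0.4, 0.6]]

confident_label = route(confident_case)
conflicting_label = route(conflicting_case)

# Case counts reported in the abstract.
test_cases = 1072
stage1_low, stage1_high = 507, 565
stage2_low = 280
manual_review = stage1_high - stage2_low      # cases left for stage 3
manual_share = manual_review / test_cases     # fraction of test cases reviewed manually
```

In this sketch a case is routed onward only when the averaged confidence is spread across classes, which mirrors the staged idea of escalating uncertain cases first to few-shot prompting and then to manual chart review.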
| Original language | English |
|---|---|
| Pages (from-to) | 1320-1327 |
| Number of pages | 8 |
| Journal | Journal of the American Medical Informatics Association |
| Volume | 32 |
| Issue number | 8 |
| State | Published - 1 Aug 2025 |
| Externally published | Yes |
Keywords
- entropy
- large language models
- phenotype
- review
- uncertainty