Why does this query need to be labeled? Enhancing active learning through explanation-based interventions in query selection

Jaewoong Shim, Seokho Kang

Research output: Contribution to journal, Article, peer-review

Abstract

Given the large amount of unlabeled data and limited labeling budget, active learning has emerged as a promising solution that reduces the labeling costs in building high-performance prediction models. However, conventional active learning suffers from an inability to explain why each data instance should be labeled owing to the black-box nature of the query selection procedure. In this study, we propose an enhanced active learning framework with explanation-based interventions to effectively leverage the expertise of labelers during query selection. In each active learning iteration, the evaluation score for each unlabeled instance, which serves as the criterion for query selection, is explained by decomposing it into the attributions of individual features. Subsequently, labelers assign feature weights based on their prior knowledge and intuition. Using these weights, the evaluation score is adjusted as a weighted sum of the individual feature attributions. Through this framework, labelers can gain a better understanding of how each unlabeled instance is evaluated and can systematically intervene in the query selection process by applying their knowledge in the form of feature weights. We demonstrate that our framework is effective in typical situations where noisy features are present in the dataset.
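
The score adjustment described in the abstract can be illustrated with a minimal sketch. It assumes the per-feature attributions of each unlabeled instance's evaluation score are already available as a matrix (for example from a Shapley-based explainer, which is not run here), and it omits any base value that such an explainer would add to the sum. The function names `adjust_scores` and `select_queries`, the attribution values, and the feature weights are all hypothetical and are not taken from the paper.

```python
import numpy as np


def adjust_scores(attributions, feature_weights):
    """Return one adjusted score per instance: the weighted sum of its feature attributions."""
    attributions = np.asarray(attributions, dtype=float)
    feature_weights = np.asarray(feature_weights, dtype=float)
    if attributions.ndim != 2 or attributions.shape[1] != feature_weights.shape[0]:
        raise ValueError("expected attributions of shape (n_instances, n_features)")
    return attributions @ feature_weights


def select_queries(scores, batch_size):
    """Return the indices of the batch_size highest-scoring instances."""
    order = np.argsort(-np.asarray(scores), kind="stable")
    return order[:batch_size]


# Hypothetical attributions for 4 unlabeled instances and 3 features.
phi = np.array([
    [0.10, 0.05, 0.60],
    [0.30, 0.25, 0.05],
    [0.05, 0.10, 0.40],
    [0.20, 0.30, 0.00],
])

# Uniform weights reproduce the unadjusted score (up to the omitted base value).
uniform_weights = np.ones(3)
# Assumed labeler judgment: the third feature is noise, so its weight is zero.
labeler_weights = np.array([1.0, 1.0, 0.0])

original = adjust_scores(phi, uniform_weights)
adjusted = adjust_scores(phi, labeler_weights)

print("queries without intervention:", select_queries(original, 2))
print("queries with intervention:   ", select_queries(adjusted, 2))
```

In this toy setting, instances 0 and 1 are selected without intervention, while instances 1 and 3 are selected once the third feature is down-weighted, because instance 0 owed most of its score to that feature.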

Original language: English
Article number: 128443
Journal: Expert Systems with Applications
Volume: 290
State: Published - 25 Sep 2025

Keywords

  • Active learning
  • Explainable machine learning
  • Explanation-based interventions
  • Shapley additive explanations
