TY - JOUR
T1 - Why does this query need to be labeled?
T2 - Enhancing active learning through explanation-based interventions in query selection
AU - Shim, Jaewoong
AU - Kang, Seokho
N1 - Publisher Copyright: © 2025 Elsevier Ltd
PY - 2025/9/25
Y1 - 2025/9/25
N2 - Given the large amount of unlabeled data and limited labeling budget, active learning has emerged as a promising solution that reduces the labeling costs in building high-performance prediction models. However, conventional active learning suffers from an inability to explain why each data instance should be labeled owing to the black-box nature of the query selection procedure. In this study, we propose an enhanced active learning framework with explanation-based interventions to effectively leverage the expertise of labelers during query selection. In each active learning iteration, the evaluation score for each unlabeled instance, which serves as the criterion for query selection, is explained by decomposing it into the attributions of individual features. Subsequently, labelers assign feature weights based on their prior knowledge and intuition. Using these weights, the evaluation score is adjusted as a weighted sum of the individual feature attributions. Through this framework, labelers can gain a better understanding of how each unlabeled instance is evaluated and can systematically intervene in the query selection process by applying their knowledge in the form of feature weights. We demonstrate that our framework is effective in typical situations where noisy features are present in the dataset.
KW - Active learning
KW - Explainable machine learning
KW - Explanation-based interventions
KW - Shapley additive explanations
UR - https://www.scopus.com/pages/publications/105007424067
U2 - 10.1016/j.eswa.2025.128443
DO - 10.1016/j.eswa.2025.128443
M3 - Article
AN - SCOPUS:105007424067
SN - 0957-4174
VL - 290
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 128443
ER -