TY - JOUR
T1 - A robust model training strategy using hard negative mining in a weakly labeled dataset for lymphatic invasion in gastric cancer
AU - Lee, Jonghyun
AU - Ahn, Sangjeong
AU - Kim, Hyun Soo
AU - An, Jungsuk
AU - Sim, Jongmin
N1 - Publisher Copyright: © 2023 The Authors. The Journal of Pathology: Clinical Research published by The Pathological Society of Great Britain and Ireland and John Wiley & Sons Ltd.
PY - 2024/1
Y1 - 2024/1
AB - Gastric cancer is a significant public health concern, emphasizing the need for accurate evaluation of lymphatic invasion (LI) for determining prognosis and treatment options. However, this task is time-consuming, labor-intensive, and prone to intra- and interobserver variability. Furthermore, the scarcity of annotated data presents a challenge, particularly in the field of digital pathology. Therefore, there is a demand for an accurate and objective method to detect LI using a small dataset, benefiting pathologists. In this study, we trained convolutional neural networks to classify LI using a four-step training process: (1) weak model training, (2) identification of false positives, (3) hard negative mining in a weakly labeled dataset, and (4) strong model training. To overcome the lack of annotated datasets, we applied a hard negative mining approach in a weakly labeled dataset, which contained only final diagnostic information, resembling the typical data found in hospital databases, and improved classification performance. Ablation studies were performed to simulate the lack of datasets and severely unbalanced datasets, further confirming the effectiveness of our proposed approach. Notably, our results demonstrated that, despite the small number of annotated datasets, efficient training was achievable, with the potential to extend to other image classification approaches used in medicine.
KW - artificial intelligence
KW - computational pathology
KW - gastric cancer
KW - hard negative mining
KW - lymphatic invasion
UR - https://www.scopus.com/pages/publications/85180198655
U2 - 10.1002/cjp2.355
DO - 10.1002/cjp2.355
M3 - Article
C2 - 38116763
AN - SCOPUS:85180198655
SN - 2056-4538
VL - 10
JO - Journal of Pathology: Clinical Research
JF - Journal of Pathology: Clinical Research
IS - 1
M1 - e355
ER -