TY - JOUR
T1 - Multiparametric MRI–based radiomics model for predicting human papillomavirus status in oropharyngeal squamous cell carcinoma
T2 - optimization using oversampling and machine learning techniques
AU - Sim, Yongsik
AU - Kim, Minjae
AU - Kim, Jinna
AU - Lee, Seung Koo
AU - Han, Kyunghwa
AU - Sohn, Beomseok
N1 - Publisher Copyright: © The Author(s), under exclusive licence to European Society of Radiology 2023.
PY - 2024/5
Y1 - 2024/5
N2 - Objectives: To develop and validate a multiparametric MRI–based radiomics model with optimal oversampling and machine learning techniques for predicting human papillomavirus (HPV) status in oropharyngeal squamous cell carcinoma (OPSCC). Methods: This retrospective, multicenter study included consecutive patients with newly diagnosed and pathologically confirmed OPSCC between January 2017 and December 2020 (110 patients in the training set, 44 patients in the external validation set). A total of 293 radiomics features were extracted from three sequences (T2-weighted images [T2WI], contrast-enhanced T1-weighted images [CE-T1WI], and ADC). Combinations of three feature selection, five oversampling, and 12 machine learning techniques were evaluated to optimize its diagnostic performance. The area under the receiver operating characteristic curve (AUC) of the top five models was validated in the external validation set. Results: A total of 154 patients (59.2 ± 9.1 years; 132 men [85.7%]) were included, and oversampling was employed to account for data imbalance between HPV-positive and HPV-negative OPSCC (86.4% [133/154] vs. 13.6% [21/154]). For the ADC radiomics model, the combination of random oversampling and ridge showed the highest diagnostic performance in the external validation set (AUC, 0.791; 95% CI, 0.775–0.808). The ADC radiomics model showed a higher trend in diagnostic performance compared to the radiomics model using CE-T1WI (AUC, 0.604; 95% CI, 0.590–0.618), T2WI (AUC, 0.695; 95% CI, 0.673–0.717), and a combination of both (AUC, 0.642; 95% CI, 0.626–0.657). Conclusions: The ADC radiomics model using random oversampling and ridge showed the highest diagnostic performance in predicting the HPV status of OPSCC in the external validation set. Clinical relevance statement: Among multiple sequences, the ADC radiomics model has a potential for generalizability and applicability in clinical practice. Exploring multiple oversampling and machine learning techniques was a valuable strategy for optimizing radiomics model performance. Key Points: • Previous radiomics studies using multiparametric MRI were conducted at single centers without external validation and had unresolved data imbalances. • Among the ADC, CE-T1WI, and T2WI radiomics models and the ADC histogram models, the ADC radiomics model was the best-performing model for predicting human papillomavirus status in oropharyngeal squamous cell carcinoma. • The ADC radiomics model with the combination of random oversampling and ridge showed the highest diagnostic performance.
KW - Diffusion magnetic resonance imaging
KW - Machine learning
KW - Oropharynx
KW - Papillomavirus infections
KW - Squamous cell carcinoma of head and neck
UR - https://www.scopus.com/pages/publications/85174305864
U2 - 10.1007/s00330-023-10338-3
DO - 10.1007/s00330-023-10338-3
M3 - Article
C2 - 37848774
AN - SCOPUS:85174305864
SN - 0938-7994
VL - 34
SP - 3102
EP - 3112
JO - European Radiology
JF - European Radiology
IS - 5
ER -