TY - GEN
T1 - Denchmark
T2 - 18th IEEE/ACM International Conference on Mining Software Repositories, MSR 2021
AU - Kim, Misoo
AU - Kim, Youngkyoung
AU - Lee, Eunseok
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/5
Y1 - 2021/5
AB - A growing interest in deep learning (DL) has instigated a concomitant rise in DL-related software (DLSW). Therefore, the importance of DLSW quality has emerged as a vital issue. Simultaneously, researchers have found DLSW more complicated than traditional SW and more difficult to debug owing to the black-box nature of DL. These studies indicate the necessity of automatic debugging techniques for DLSW. Although several validated debugging techniques exist for general SW, no such techniques exist for DLSW. There is no standard bug benchmark to validate these automatic debugging techniques. In this study, we introduce a novel bug benchmark for DLSW, Denchmark, consisting of 4,577 bug reports from 193 popular DLSW projects, collected through a systematic dataset construction process. These DLSW projects are further classified into eight categories: framework, platform, engine, compiler, tool, library, DL-based application, and others. All bug reports in Denchmark contain rich textual information and links with bug-fixing commits, as well as three levels of buggy entities, such as files, methods, and lines. Our dataset aims to provide an invaluable starting point for the automatic debugging techniques of DLSW.
KW - Automatic debugging
KW - Bug benchmark
KW - Bug report
KW - Deep learning-related software
UR - https://www.scopus.com/pages/publications/85113609569
U2 - 10.1109/MSR52588.2021.00070
DO - 10.1109/MSR52588.2021.00070
M3 - Conference contribution
AN - SCOPUS:85113609569
T3 - Proceedings - 2021 IEEE/ACM 18th International Conference on Mining Software Repositories, MSR 2021
SP - 540
EP - 544
BT - Proceedings - 2021 IEEE/ACM 18th International Conference on Mining Software Repositories, MSR 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 17 May 2021 through 19 May 2021
ER -