
An empirical study for class imbalance in extreme multi-label text classification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Extreme multi-label text classification (XMTC) is the problem of finding the most relevant labels for each text from a label set with millions of labels. One of the key challenges in XMTC is class imbalance: most labels appear only a few times. To overcome this problem, existing studies have suggested various methods based on different loss functions (e.g., focal loss) and data augmentation (e.g., mix-up). In this paper, we investigate the effectiveness of these two approaches on RNN-based and transformer-based deep XMTC models. In our experiments, we found that some improvement can be achieved when focal loss and mix-up are applied to deep XMTC models on various datasets.
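
For readers unfamiliar with the two techniques named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names `binary_focal_loss` and `mixup`, the default values `gamma=2.0` and `alpha=0.25`, and the choice to mix fixed-length feature vectors (for example, text embeddings) are all assumptions made for illustration. The focal loss shown is the common per-label binary form, which scales the cross-entropy term by `(1 - p_t) ** gamma` so that well-classified labels contribute less.

```python
import math
import random


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss over the labels of one example.

    probs   -- predicted probability for each label (floats in [0, 1])
    targets -- ground-truth indicator for each label (0 or 1)
    gamma   -- focusing parameter; gamma=0 gives alpha-weighted cross-entropy
    alpha   -- weight for positive labels (assumed default, not from the paper)
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p_t = p if y == 1 else 1.0 - p              # probability of the true outcome
        alpha_t = alpha if y == 1 else 1.0 - alpha  # class-dependent weight
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))
    return total / len(probs)


def mixup(x_a, y_a, x_b, y_b, lam):
    """Convex combination of two examples (features and label vectors)."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


if __name__ == "__main__":
    # A confident correct prediction is down-weighted relative to gamma=0.
    print(binary_focal_loss([0.9], [1]), binary_focal_loss([0.9], [1], gamma=0.0))

    # The mixing coefficient is commonly drawn from a Beta distribution.
    lam = random.betavariate(0.4, 0.4)
    print(mixup([1.0, 0.0], [1, 0], [0.0, 1.0], [0, 1], lam))
```

Note that mix-up produces soft label vectors, while the focal loss above assumes hard 0/1 targets; combining the two requires a soft-target variant of the loss, which this sketch leaves out.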

Original language: English
Title of host publication: Proceedings - 2021 IEEE International Conference on Big Data and Smart Computing, BigComp 2021
Editors: Herwig Unger, Jinho Kim, U Kang, Chakchai So-In, Junping Du, Walid Saad, Young-guk Ha, Christian Wagner, Julien Bourgeois, Chanboon Sathitwiriyawong, Hyuk-Yoon Kwon, Carson Leung
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 338-341
Number of pages: 4
ISBN (Electronic): 9781728189246
State: Published - Jan 2021
Event: 2021 IEEE International Conference on Big Data and Smart Computing, BigComp 2021 - Jeju Island, Korea, Republic of
Duration: 17 Jan 2021 to 20 Jan 2021

Publication series

Name: Proceedings - 2021 IEEE International Conference on Big Data and Smart Computing, BigComp 2021

Conference

Conference: 2021 IEEE International Conference on Big Data and Smart Computing, BigComp 2021
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 17/01/21 to 20/01/21
