TY - JOUR
T1 - Saliency as Pseudo-Pixel Supervision for Weakly and Semi-Supervised Semantic Segmentation
AU - Lee, Minhyun
AU - Lee, Seungho
AU - Lee, Jongwuk
AU - Shim, Hyunjung
N1 - Publisher Copyright: © 1979-2012 IEEE.
PY - 2023/10/1
Y1 - 2023/10/1
AB - Existing studies on semantic segmentation using image-level weak supervision have several limitations, including sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. To overcome these challenges, we propose a novel framework, an improved version of Explicit Pseudo-pixel Supervision (EPS++), which learns from pixel-level feedback by combining two types of weak supervision. Specifically, the image-level label provides the object identity via the localization map, and the saliency map from an off-the-shelf saliency detection model offers rich object boundaries. We devise a joint training strategy to fully utilize the complementary relationship between disparate information. Notably, we suggest an Inconsistent Region Drop (IRD) strategy, which effectively handles errors in saliency maps using fewer hyper-parameters than EPS. Our method can obtain accurate object boundaries and discard co-occurring pixels, significantly improving the quality of pseudo-masks. Experimental results show that EPS++ effectively resolves the key challenges of semantic segmentation using weak supervision, resulting in new state-of-the-art performances on three benchmark datasets in a weakly supervised semantic segmentation setting. Furthermore, we show that the proposed method can be extended to solve the semi-supervised semantic segmentation problem using image-level weak supervision. Surprisingly, the proposed model also achieves new state-of-the-art performances on two popular benchmark datasets.
KW - boundary
KW - co-occurrence
KW - saliency
KW - semantic segmentation
KW - semi-supervised
KW - weakly supervised
UR - https://www.scopus.com/pages/publications/85159820706
U2 - 10.1109/TPAMI.2023.3273592
DO - 10.1109/TPAMI.2023.3273592
M3 - Article
C2 - 37155377
AN - SCOPUS:85159820706
SN - 0162-8828
VL - 45
SP - 12341
EP - 12357
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 10
ER -