TY - GEN
T1 - Cross-Domain Semantic Segmentation on Inconsistent Taxonomy Using VLMs
AU - Lim, Jeongkee
AU - Kim, Yusung
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - The challenge of semantic segmentation in Unsupervised Domain Adaptation (UDA) emerges not only from domain shifts between source and target images but also from discrepancies in class taxonomies across domains. Traditional UDA research assumes consistent taxonomy between the source and target domains, thereby limiting their ability to recognize and adapt to the taxonomy of the target domain. This paper introduces a novel approach, Cross-Domain Semantic Segmentation on Inconsistent Taxonomy using Vision Language Models (CSI), which effectively performs domain-adaptive semantic segmentation even in situations of source-target class mismatches. CSI leverages the semantic generalization potential of Visual Language Models (VLMs) to create synergy with previous UDA methods. It leverages segment reasoning obtained through traditional UDA methods, combined with the rich semantic knowledge embedded in VLMs, to relabel new classes in the target domain. This approach allows for effective adaptation to extended taxonomies without requiring any ground truth label for the target domain. Our method has shown to be effective across various benchmarks in situations of inconsistent taxonomy settings (coarse-to-fine taxonomy and open taxonomy) and demonstrates consistent synergy effects when integrated with previous state-of-the-art UDA methods. The implementation is available at https://github.com/jkee58/CSI.
AB - The challenge of semantic segmentation in Unsupervised Domain Adaptation (UDA) emerges not only from domain shifts between source and target images but also from discrepancies in class taxonomies across domains. Traditional UDA research assumes consistent taxonomy between the source and target domains, thereby limiting their ability to recognize and adapt to the taxonomy of the target domain. This paper introduces a novel approach, Cross-Domain Semantic Segmentation on Inconsistent Taxonomy using Vision Language Models (CSI), which effectively performs domain-adaptive semantic segmentation even in situations of source-target class mismatches. CSI leverages the semantic generalization potential of Visual Language Models (VLMs) to create synergy with previous UDA methods. It leverages segment reasoning obtained through traditional UDA methods, combined with the rich semantic knowledge embedded in VLMs, to relabel new classes in the target domain. This approach allows for effective adaptation to extended taxonomies without requiring any ground truth label for the target domain. Our method has shown to be effective across various benchmarks in situations of inconsistent taxonomy settings (coarse-to-fine taxonomy and open taxonomy) and demonstrates consistent synergy effects when integrated with previous state-of-the-art UDA methods. The implementation is available at https://github.com/jkee58/CSI.
KW - Inconsistent Taxonomy
KW - Semantic Segmentation
KW - Unsupervised Domain Adaptation
UR - https://www.scopus.com/pages/publications/85210491127
U2 - 10.1007/978-3-031-73650-6_2
DO - 10.1007/978-3-031-73650-6_2
M3 - Conference contribution
AN - SCOPUS:85210491127
SN - 9783031736490
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 18
EP - 35
BT - Computer Vision – ECCV 2024 - 18th European Conference, Proceedings
A2 - Leonardis, Aleš
A2 - Ricci, Elisa
A2 - Roth, Stefan
A2 - Russakovsky, Olga
A2 - Sattler, Torsten
A2 - Varol, Gül
PB - Springer Science and Business Media Deutschland GmbH
T2 - 18th European Conference on Computer Vision, ECCV 2024
Y2 - 29 September 2024 through 4 October 2024
ER -