TY - GEN
T1 - Debiasing CLIP with Feature Augmentation and Selective Update to Preserve Zero-Shot Capability
AU - Hwang, Sung Joon
AU - Hong, Mann Soo
AU - Lee, Jee Hyong
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Vision-language models, including CLIP, have shown impressive performance across various tasks. However, these models often struggle to distinguish meaningful features from spurious correlations in dataset biases, resulting in generalization issues. Mitigating these biases without compromising CLIP's core strengths for zero-shot performance is crucial but has been overlooked. We propose a novel adversarial training method to debias CLIP, combining PGD on the projection layer for feature augmentation and Selective Update for Debiasing (SUD). Our method targets the projection layer, using minimally distorted feature augmentation and a selective update strategy guided by model predictions. PGD uses the embedding layer's gradients to generate biased features that induce misclassifications with minimal distortion. SUD applies different objective functions to refine features based on prediction accuracy. These methods effectively mitigate bias by training on misclassified samples and preserve the existing embedding space. Experiments confirm our method improves model robustness against biases while maintaining zero-shot performance. This approach offers a promising solution for debiasing vision-language models without degradation in performance.
AB - Vision-language models, including CLIP, have shown impressive performance across various tasks. However, these models often struggle to distinguish meaningful features from spurious correlations in dataset biases, resulting in generalization issues. Mitigating these biases without compromising CLIP's core strengths for zero-shot performance is crucial but has been overlooked. We propose a novel adversarial training method to debias CLIP, combining PGD on the projection layer for feature augmentation and Selective Update for Debiasing (SUD). Our method targets the projection layer, using minimally distorted feature augmentation and a selective update strategy guided by model predictions. PGD uses the embedding layer's gradients to generate biased features that induce misclassifications with minimal distortion. SUD applies different objective functions to refine features based on prediction accuracy. These methods effectively mitigate bias by training on misclassified samples and preserve the existing embedding space. Experiments confirm our method improves model robustness against biases while maintaining zero-shot performance. This approach offers a promising solution for debiasing vision-language models without degradation in performance.
KW - Adversarial Training
KW - Debiasing
KW - Feature Augmentation
KW - Selective Update
KW - Vision-Language Model
UR - https://www.scopus.com/pages/publications/85214707468
U2 - 10.1109/SCISISIS61014.2024.10759909
DO - 10.1109/SCISISIS61014.2024.10759909
M3 - Conference contribution
AN - SCOPUS:85214707468
T3 - 2024 Joint 13th International Conference on Soft Computing and Intelligent Systems and 25th International Symposium on Advanced Intelligent Systems, SCIS and ISIS 2024
BT - 2024 Joint 13th International Conference on Soft Computing and Intelligent Systems and 25th International Symposium on Advanced Intelligent Systems, SCIS and ISIS 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - Joint 13th International Conference on Soft Computing and Intelligent Systems and 25th International Symposium on Advanced Intelligent Systems, SCIS and ISIS 2024
Y2 - 9 November 2024 through 12 November 2024
ER -