TY - JOUR
T1 - Few-Shot Anomaly Detection via Personalization
AU - Kwak, Sangkyung
AU - Jeong, Jongheon
AU - Lee, Hankook
AU - Kim, Woohyuck
AU - Seo, Dongho
AU - Yun, Woojin
AU - Lee, Wonjin
AU - Shin, Jinwoo
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2024
Y1 - 2024
N2 - Even with a plenty amount of normal samples, anomaly detection has been considered as a challenging machine learning task due to its one-class nature, i. e., the lack of anomalous samples in training time. It is only recently that a few-shot regime of anomaly detection became feasible in this regard, e. g., with a help from large vision-language pre-trained models such as CLIP, despite its wide applicability. In this paper, we explore the potential of large text-to-image generative models in performing few-shot industrial anomaly detection. Specifically, recent text-to-image models have shown unprecedented ability to generalize from few images to extract their common and unique concepts, and even encode them into a textual token to 'personalize' the model: so-called textual inversion. Here, we question whether this personalization is specific enough to discriminate the given images from their potential anomalies, which are often, e. g., open-ended, local, and hard-to-detect. We observe that standard textual inversion exhibits a weaker understanding in localized details within objects, which is not enough for detecting industrial anomalies accurately. Thus, we explore the utilization of model personalization to address anomaly detection and propose Anomaly Detection via Personalization (ADP). ADP enables extracting fine-grained local details shared in the images with simple-yet an effective regularization scheme from the zero-shot transferability of CLIP. We also propose a self-tuning scheme to further optimize the performance of our detection pipeline, leveraging synthetic data generated from the personalized generative model. Our experiments show that the proposed inversion scheme could achieve state-of-the-art results on two industrial anomaly benchmarks, MVTec-AD and VisA, in the regime of few normal samples.
AB - Even with a plenty amount of normal samples, anomaly detection has been considered as a challenging machine learning task due to its one-class nature, i. e., the lack of anomalous samples in training time. It is only recently that a few-shot regime of anomaly detection became feasible in this regard, e. g., with a help from large vision-language pre-trained models such as CLIP, despite its wide applicability. In this paper, we explore the potential of large text-to-image generative models in performing few-shot industrial anomaly detection. Specifically, recent text-to-image models have shown unprecedented ability to generalize from few images to extract their common and unique concepts, and even encode them into a textual token to 'personalize' the model: so-called textual inversion. Here, we question whether this personalization is specific enough to discriminate the given images from their potential anomalies, which are often, e. g., open-ended, local, and hard-to-detect. We observe that standard textual inversion exhibits a weaker understanding in localized details within objects, which is not enough for detecting industrial anomalies accurately. Thus, we explore the utilization of model personalization to address anomaly detection and propose Anomaly Detection via Personalization (ADP). ADP enables extracting fine-grained local details shared in the images with simple-yet an effective regularization scheme from the zero-shot transferability of CLIP. We also propose a self-tuning scheme to further optimize the performance of our detection pipeline, leveraging synthetic data generated from the personalized generative model. Our experiments show that the proposed inversion scheme could achieve state-of-the-art results on two industrial anomaly benchmarks, MVTec-AD and VisA, in the regime of few normal samples.
KW - Industrial anomaly detection
KW - model personalization
KW - text-to-image diffusion model
KW - vision-language model
UR - https://www.scopus.com/pages/publications/85182951462
U2 - 10.1109/ACCESS.2024.3355021
DO - 10.1109/ACCESS.2024.3355021
M3 - Article
AN - SCOPUS:85182951462
SN - 2169-3536
VL - 12
SP - 11035
EP - 11051
JO - IEEE Access
JF - IEEE Access
ER -