TY - GEN
T1 - Understanding the Latent Space of Diffusion Models through the Lens of Riemannian Geometry
AU - Park, Yong Hyun
AU - Kwon, Mingi
AU - Choi, Jaewoong
AU - Jo, Junghyo
AU - Uh, Youngjung
N1 - Publisher Copyright:
© 2023 Neural information processing systems foundation. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Despite the success of diffusion models (DMs), we still lack a thorough understanding of their latent space. To understand the latent space xt ∈ X, we analyze them from a geometrical perspective. Our approach involves deriving the local latent basis within X by leveraging the pullback metric associated with their encoding feature maps. Remarkably, our discovered local latent basis enables image editing capabilities by moving xt, the latent space of DMs, along the basis vector at specific timesteps. We further analyze how the geometric structure of DMs evolves over diffusion timesteps and differs across different text conditions. This confirms the known phenomenon of coarse-to-fine generation, as well as reveals novel insights such as the discrepancy between xt across timesteps, the effect of dataset complexity, and the time-varying influence of text prompts. To the best of our knowledge, this paper is the first to present image editing through x-space traversal, editing only once at specific timestep t without any additional training, and providing thorough analyses of the latent structure of DMs. The code to reproduce our experiments can be found at https://github.com/enkeejunior1/Diffusion-Pullback.
AB - Despite the success of diffusion models (DMs), we still lack a thorough understanding of their latent space. To understand the latent space xt ∈ X, we analyze them from a geometrical perspective. Our approach involves deriving the local latent basis within X by leveraging the pullback metric associated with their encoding feature maps. Remarkably, our discovered local latent basis enables image editing capabilities by moving xt, the latent space of DMs, along the basis vector at specific timesteps. We further analyze how the geometric structure of DMs evolves over diffusion timesteps and differs across different text conditions. This confirms the known phenomenon of coarse-to-fine generation, as well as reveals novel insights such as the discrepancy between xt across timesteps, the effect of dataset complexity, and the time-varying influence of text prompts. To the best of our knowledge, this paper is the first to present image editing through x-space traversal, editing only once at specific timestep t without any additional training, and providing thorough analyses of the latent structure of DMs. The code to reproduce our experiments can be found at https://github.com/enkeejunior1/Diffusion-Pullback.
UR - https://www.scopus.com/pages/publications/85179030922
M3 - Conference contribution
AN - SCOPUS:85179030922
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
A2 - Oh, A.
A2 - Naumann, T.
A2 - Globerson, A.
A2 - Saenko, K.
A2 - Hardt, M.
A2 - Levine, S.
PB - Neural information processing systems foundation
T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
Y2 - 10 December 2023 through 16 December 2023
ER -