A Plug-and-Play Approach for Robust Image Editing in Text-to-Image Diffusion Models

Abstract

With the advancement of diffusion models, a wide range of image editing techniques has been developed. To support these, various inversion methods have been introduced to preserve the original content. However, these inversion methods are often unstable and fail to reconstruct certain images, particularly when applied to high-resolution diffusion models equipped with deep U-Nets. To address this issue, we propose a novel plug-and-play method, RLI (Residual Linear Interpolation). During the forward process, the method operates within the self-attention mechanism and interpolates between the attention values before and after the computation. This interpolation mitigates abrupt changes in the attention map, enabling smoother transitions in spatial representations and reducing unintended distortions of the original content. Our method is compatible with various existing diffusion model variants, inversion techniques, and image editing approaches. In particular, it addresses the reconstruction failure observed when using Null-text Inversion with SDXL, where the null-text optimization does not converge properly. In addition, we demonstrate that, when combined with diverse inversion and image editing methods across multiple diffusion models, our approach preserves the original content better, both quantitatively and qualitatively, without compromising the existing editing performance. The code is available at https://github.com/ugiugi0823/ICCVW-RLI © 2025 IEEE.
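The interpolation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on one possible reading of "before and after the computation": the value features before attention weighting are blended with the attention output. The function names, the blend weight `alpha`, and the pure-Python matrix helpers are all hypothetical and chosen for illustration only; the repository linked above is the authoritative reference.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def self_attention(x, w_q, w_k, w_v):
    """Return (values before attention, attention output) for one head."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    return v, matmul(softmax_rows(scores), v)


def interpolated_attention(x, w_q, w_k, w_v, alpha):
    """ASSUMPTION: blend pre-attention values with the attention output.

    alpha = 1.0 gives standard self-attention; alpha = 0.0 passes the
    values through unchanged. This is an illustrative reading only.
    """
    before, after = self_attention(x, w_q, w_k, w_v)
    return [
        [(1.0 - alpha) * b + alpha * a for b, a in zip(row_b, row_a)]
        for row_b, row_a in zip(before, after)
    ]
```

Under this reading, a smaller `alpha` keeps the output closer to the unmixed values, which is one way the "smoother transitions" described in the abstract could arise. How the actual method chooses or schedules the interpolation is not stated in the abstract.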

Keywords

Diffusion; Image editing; Image synthesis
Title
A Plug-and-Play Approach for Robust Image Editing in Text-to-Image Diffusion Models
Authors
Jo, Hyunwook; Maeng, Jiseung; Park, Jun Hyung; Ahn, Namhyuk; Park, In Kyu
DOI
10.1109/ICCVW69036.2025.00454
Publication date
2025
Type
Proceedings Paper
Journal
Proceedings - 2025 IEEE/CVF International Conference on Computer Vision Workshops, ICCV-W 2025
Pages
4380-4389