A Plug-and-Play Approach for Robust Image Editing in Text-to-Image Diffusion Models

Jo, Hyunwook; Maeng, Jiseung; Park, Jun Hyung; Ahn, Namhyuk; Park, In Kyu

doi:10.1109/ICCVW69036.2025.00454


Abstract

With the advancement of diffusion models, a wide range of image editing techniques have been developed. To support them, various inversion methods have been introduced to preserve the original content. However, these inversion methods are often unstable and fail to reconstruct certain images, particularly when applied to high-resolution diffusion models equipped with deep U-Nets. To address this issue, we propose a novel plug-and-play method, RLI (Residual Linear Interpolation). During the forward process, the method operates within the self-attention mechanism and interpolates between the attention values before and after the computation. This interpolation mitigates abrupt changes in the attention map, enabling smoother transitions in spatial representations and reducing unintended distortions of the original content. Our method is compatible with various existing diffusion model variants, inversion techniques, and image editing approaches. In particular, it addresses the reconstruction failure observed when using Null-text Inversion with SDXL, where the null-text optimization does not converge properly. In addition, we demonstrate that, when combined with diverse inversion and image editing methods across multiple diffusion models, our approach achieves superior preservation of the original content, both quantitatively and qualitatively, without compromising the existing editing performance. The code is available at https://github.com/ugiugi0823/ICCVW-RLI © 2025 IEEE.
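
The interpolation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that). It assumes that "before" refers to the value vectors entering self-attention, that "after" refers to the attention output, and that a single scalar weight `alpha` controls the blend. The function names `self_attention` and `rli_blend` and the parameter `alpha` are hypothetical, and projections are omitted so the tokens serve directly as queries, keys, and values.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(q, k, v):
    """Scaled dot-product attention over lists of token vectors."""
    scale = math.sqrt(len(q[0]))
    out = []
    for qi in q:
        # Attention weights of this query over all keys.
        weights = softmax(
            [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        )
        # Weighted sum of the value vectors, one dimension at a time.
        out.append(
            [sum(w * vj[d] for w, vj in zip(weights, v)) for d in range(len(v[0]))]
        )
    return out


def rli_blend(before, after, alpha):
    """Linear interpolation: (1 - alpha) * before + alpha * after.

    Assumption: alpha is a scalar in [0, 1]; alpha = 1 recovers plain
    attention, alpha = 0 passes the input values through unchanged.
    """
    return [
        [(1 - alpha) * b + alpha * a for b, a in zip(row_b, row_a)]
        for row_b, row_a in zip(before, after)
    ]


# Toy input: three tokens with two features each (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
attended = self_attention(tokens, tokens, tokens)
blended = rli_blend(tokens, attended, alpha=0.5)
```

Under these assumptions, a smaller `alpha` keeps the result closer to the input values, which is one way to read the abstract's claim that the interpolation dampens abrupt changes introduced by the attention step.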

Keywords

Diffusion; Image editing; Image synthesis

Title: A Plug-and-Play Approach for Robust Image Editing in Text-to-Image Diffusion Models

Authors: Jo, Hyunwook; Maeng, Jiseung; Park, Jun Hyung; Ahn, Namhyuk; Park, In Kyu

DOI: 10.1109/ICCVW69036.2025.00454

Publication date: 2025

Type: Proceedings Paper

Journal: Proceedings - 2025 IEEE/CVF International Conference on Computer Vision Workshops, ICCV-W 2025

Pages: 4380-4389