FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models

Technion – Israel Institute of Technology

Abstract

Editing real images using a pre-trained text-to-image (T2I) diffusion/flow model often involves inverting the image into its corresponding noise map. However, inversion by itself is typically insufficient for obtaining satisfactory results, and therefore many methods additionally intervene in the sampling process. Such methods achieve improved results but are not seamlessly transferable between model architectures. Here, we introduce FlowEdit, a text-based editing method for pre-trained T2I flow models, which is inversion-free, optimization-free, and model-agnostic. Our method constructs an ODE that directly maps between the source and target distributions (corresponding to the source and target text prompts) and achieves a lower transport cost than the inversion approach. This leads to state-of-the-art results, as we illustrate with Stable Diffusion 3 and FLUX.

Overview

The following video provides visual intuition for our method (narration included).


The following figure illustrates the main idea behind our method: (a) In inversion-based editing, the source image $Z_0^{\mathrm{src}}$ is first mapped to the noise space by solving the forward ODE conditioned on the source prompt (left path). The extracted noise is then used to solve the reverse ODE conditioned on the target prompt, yielding $Z_0^{\mathrm{tar}}$ (right path). The images at the bottom visualize this transition.
(b) We reinterpret inversion as a direct path between the source and target distributions (bottom path). Specifically, the velocities computed during inversion and sampling (green and red arrows) define an editing direction (orange arrow) that drives the evolution of the direct path $Z_t^{\mathrm{inv}}$ through an ODE. The resulting path is noise-free, as demonstrated by the images at the bottom.
(c) FlowEdit traverses a shorter direct path, $Z_t^{\mathrm{FE}}$, without relying on inversion. At each timestep, we add random noise directly to $Z_0^{\mathrm{src}}$ to obtain $Z_t^{\mathrm{src}}$, and use that direction to construct $Z_t^{\mathrm{tar}}$ from $Z_t^{\mathrm{FE}}$ (gray parallelogram). We then compute the corresponding velocities and average over multiple noise realizations (not shown in the figure) to obtain the next ODE step (orange arrow). The images at the bottom demonstrate our noise-free path.
See our paper for more details.
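To make the procedure in panel (c) concrete, here is a minimal PyTorch sketch of one way to implement it. The `velocity(z, t, prompt)` callable is a hypothetical wrapper around the pre-trained flow model's velocity prediction, and `n_avg`, `n_steps`, and the uniform Euler grid are illustrative choices; model-specific details such as guidance and timestep scheduling are omitted (see the official repository below for the actual implementation).

import torch

@torch.no_grad()
def flowedit_step(z_fe, z0_src, t, dt, velocity, src_prompt, tar_prompt, n_avg=4):
    """One Euler step of the direct editing ODE sketched in panel (c).

    `velocity(z, t, prompt)` is a hypothetical wrapper around the
    pre-trained flow model's velocity prediction.
    """
    delta_v = torch.zeros_like(z_fe)
    for _ in range(n_avg):
        noise = torch.randn_like(z0_src)
        # Noisy source sample on the model's straight-line path:
        # Z_t^src = (1 - t) * Z_0^src + t * noise.
        z_src_t = (1.0 - t) * z0_src + t * noise
        # "Parallelogram": shift Z_t^FE by the same noise direction
        # to obtain Z_t^tar.
        z_tar_t = z_fe + (z_src_t - z0_src)
        # Editing direction = difference of conditional velocities.
        delta_v += velocity(z_tar_t, t, tar_prompt) - velocity(z_src_t, t, src_prompt)
    delta_v /= n_avg  # average over noise realizations
    # Euler step toward t = 0 (dt < 0 when stepping from t = 1 to t = 0).
    return z_fe + dt * delta_v


def flowedit(z0_src, velocity, src_prompt, tar_prompt, n_steps=50):
    """Integrate Z_t^FE from t = 1 down to t = 0, starting at the source image."""
    z_fe = z0_src.clone()
    ts = torch.linspace(1.0, 0.0, n_steps + 1)
    for i in range(n_steps):
        t, dt = ts[i].item(), (ts[i + 1] - ts[i]).item()
        z_fe = flowedit_step(z_fe, z0_src, t, dt, velocity, src_prompt, tar_prompt)
    return z_fe  # edited image Z_0^tar

Because the source and target samples share the same noise realization at every step, their velocity difference isolates the change requested by the prompts, and no inversion pass is ever performed.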


Real Image Editing

Each example below lists the source prompt followed by the target prompt:

A bicycle parked next to a red brick building → A vespa parked next to a red brick building
A rabbit sitting in a field with flowers → A puppy sitting in a field with flowers
A glass of milk → A glass of beer
A restaurant called Luna → A restaurant called Sol
A woman meditating → A wooden statue meditating
A cat wearing a crown → A cat wearing a top hat
A coconut shell filled with splashing water → A baseball shell filled with splashing water
A wolf standing on a cliff, howling → A Husky standing on a cliff, looking
A horse in the field → A pink toy horse in the field
Two penguins → Two origami penguins
Clownfish swimming in a reef → Goldfish swimming in a reef
A dog in the snow → A deer in the snow



Comparisons

Qualitative comparisons with competing flow-based editing methods (e.g., [1], [2]), shown separately for Stable Diffusion 3 and FLUX.


Paper

FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models
Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli.

BibTeX

@article{kulikov2024flowedit,
  title   = {FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models},
  author  = {Kulikov, Vladimir and Kleiner, Matan and Huberman-Spiegelglas, Inbar and Michaeli, Tomer},
  journal = {arXiv preprint arXiv:2412.08629},
  year    = {2024}
}

Our official code can be found in the GitHub repository.



References

[1] Xiaofeng Yang, Cheng Chen, Xulei Yang, Fayao Liu and Guosheng Lin. "Text-to-Image Rectified Flow as Plug-and-Play Priors."
[2] Litu Rout, Yujia Chen, Nataniel Ruiz, Constantine Caramanis, Sanjay Shakkottai and Wen-Sheng Chu. "Semantic Image Inversion and Editing using Stochastic Rectified Differential Equations."


Acknowledgements

This webpage was originally made by Matan Kleiner with the help of Hila Manor. The code for the original template can be found here.
Icons are taken from Font Awesome or from Academicons.