Exemplar image pairs

Abstract

In recent years, instruction-based image editing methods have garnered significant attention. However, despite encompassing a wide range of editing priors, these methods struggle with editing tasks that are difficult to describe accurately through language. We propose InstructBrush, an inversion method for instruction-based image editing that bridges this gap. It extracts editing effects from exemplar image pairs as editing instructions, which are then applied to edit new images. InstructBrush introduces two key techniques, Attention-based Instruction Optimization and Transformation-oriented Instruction Initialization, to address the limitations of the previous method in inversion quality and instruction generalization. To explore the ability of instruction inversion methods to guide image editing in open scenarios, we establish the Transformation-Oriented Paired Benchmark (TOP-Bench), which contains a rich set of scenes and editing types and paves the way for further exploration of instruction inversion. Both quantitatively and qualitatively, our approach achieves superior editing performance and is more semantically consistent with the target editing effects.

More Samples

sample5
Our method shows robust performance in local editing. It extracts the editing effect from reference image pairs and then applies it to edit new images.
IN: Input Image. GT: Ground Truth.
sample6
Our method shows robust performance in global editing. It extracts the editing effect from reference image pairs and then applies it to edit new images.
IN: Input Image. GT: Ground Truth.

Comparison to Other Methods

sample5
Our method achieves superior performance in local editing. It effectively avoids introducing edit-irrelevant information from the training images and shows better instruction generalization.
sample6
Our method achieves superior performance in global editing. It effectively avoids introducing edit-irrelevant information from the training images and shows better instruction generalization.

TOP-Bench

sample7
To investigate the editing capabilities of various instruction inversion methods in open scenarios, and to enable a fair comparison among them, we establish a benchmark named TOP-Bench (Transformation-Oriented Paired Benchmark), which supports both qualitative and quantitative evaluation. The benchmark spans the 25 datasets shown above, each corresponding to a different editing effect. Each dataset consists of 10 pairs of training images and 5 pairs of testing images, for 750 images in total. Additionally, we provide a text instruction aligned with the transformation effect of each dataset.
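The stated totals can be checked with a quick back-of-the-envelope calculation (the variable names below are ours, not part of the released benchmark):

```python
# Sanity-check the TOP-Bench statistics: 25 datasets, each with 10 training
# pairs and 5 testing pairs, where every pair is a (source, edited) image duo.
n_datasets = 25
train_pairs, test_pairs = 10, 5
images_per_pair = 2  # source image + edited image

total_images = n_datasets * (train_pairs + test_pairs) * images_per_pair
print(total_images)  # 750
```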

Application

InstructBrush can extract different image tones from a handful of data pairs and apply them to new images. The images on the left show the same input image, and the images on the right show the edited results, each corresponding to a different image tone.

How does it work?

  • InstructBrush inverts instructions from exemplar image pairs by proposing two novel modules: Transformation-oriented Instruction Initialization (a) and Attention-based Instruction Optimization (b). The former initializes the instruction, effectively introducing an editing-related prior that facilitates semantic alignment between the instruction and the exemplar image pairs. The latter introduces the editing instruction into the cross-attention layers of the instruction-based image editing model and directly optimizes the Keys and Values corresponding to the instruction within these layers. After optimization, the learned instruction is used to guide the editing of new images (c).
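The core idea of optimizing an instruction's Keys and Values inside a cross-attention layer can be illustrated with a minimal toy sketch. This is an illustration under simplifying assumptions, not the authors' implementation: the "image features" and "editing target" are random stand-ins, and a real system would backpropagate through the diffusion model rather than use finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_tok, n_q = 8, 4, 6               # feature dim, instruction tokens, image queries

Q = rng.normal(size=(n_q, d))         # frozen queries from image features (toy stand-in)
target = rng.normal(size=(n_q, d))    # toy target output encoding the desired edit

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attn_out(K, V):
    # Scaled dot-product cross-attention where the instruction appears only
    # through its Keys K and Values V.
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def loss(K, V):
    return float(np.mean((attn_out(K, V) - target) ** 2))

# The learnable "instruction": its Keys and Values, optimized directly.
K = rng.normal(size=(n_tok, d))
V = rng.normal(size=(n_tok, d))
loss0 = loss(K, V)

# Plain finite-difference gradient descent, kept deliberately simple.
lr, eps = 0.2, 1e-4
for _ in range(100):
    for P in (K, V):
        g = np.zeros_like(P)
        for i in np.ndindex(P.shape):
            old = P[i]
            P[i] = old + eps; lp = loss(K, V)
            P[i] = old - eps; lm = loss(K, V)
            P[i] = old
            g[i] = (lp - lm) / (2 * eps)
        P -= lr * g

print(loss0, loss(K, V))  # the reconstruction loss decreases as K and V adapt
```

The design choice mirrored here is that nothing about the text encoder is touched: only the Keys and Values that the instruction contributes to cross-attention are treated as free parameters, which is what makes the inverted instruction compact and editing-specific.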

BibTeX

@article{zhao2024instructbrush,
  title={InstructBrush: Learning Attention-based Instruction Optimization for Image Editing},
  author={Zhao, Ruoyu and Fan, Qingnan and Kou, Fei and Qin, Shuai and Gu, Hong and Wu, Wei and Xu, Pengcheng and Zhu, Mingrui and Wang, Nannan and Gao, Xinbo},
  journal={arXiv preprint arXiv:2403.18660},
  year={2024}
}