🤖 Artificial Intelligence OpenAlex

Blended Diffusion for Text-driven Editing of Natural Images

📅 June 1, 2022 👤 Omri Avrahami, Dani Lischinski, Ohad Fried 📖 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 📊 690 citations

🤖 Plain-English Summary

Natural language offers a highly intuitive interface for image editing. The method edits a user-selected region of an image based on a text description. We compare against several baselines and related methods, both qualitatively and quantitatively, and show that our method outperforms these solutions in overall realism, background preservation, and agreement with the text.

🔑 Key Findings

  • In this paper, we introduce the first solution for performing local (region-based) edits in generic natural images, based on a natural language description along with an ROI mask.
  • We achieve our goal by leveraging and combining a pretrained language-image model (CLIP), to steer the edit towards a user-provided text prompt, with a denoising diffusion probabilistic model (DDPM) to generate natural-looking results.
  • To seamlessly fuse the edited region with the unchanged parts of the image, we spatially blend noised versions of the input image with the local text-guided diffusion latent at a progression of noise levels.
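
The blending step in the last finding can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear noise schedule, the function names, and the `guided_step` callback (a stand-in for one CLIP-guided denoising step) are all hypothetical. At each noise level, the text-guided result is kept inside the mask and a freshly noised copy of the input image is kept outside it.

```python
import numpy as np

def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction per step (assumed linear beta schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_image(x0, alpha_bar_t, rng):
    """Forward-diffuse the clean image x0 to a given noise level."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

def blend(foreground, background, mask):
    """Keep the foreground where mask == 1 and the background elsewhere."""
    return mask * foreground + (1.0 - mask) * background

def blended_edit(x0, mask, guided_step, num_steps=50, seed=0):
    """Run the reverse process, blending with the noised input at every level.

    guided_step(x, t) is a hypothetical callback that returns the
    text-guided, less noisy latent for the next level.
    """
    rng = np.random.default_rng(seed)
    alpha_bar = make_alpha_bar(num_steps)
    x = noise_image(x0, alpha_bar[-1], rng)  # start from a heavily noised input
    for t in range(num_steps - 1, -1, -1):
        fg = guided_step(x, t)  # text-guided latent at the next noise level
        # Background: the input image noised to that same level (clean at the end).
        bg = noise_image(x0, alpha_bar[t - 1], rng) if t > 0 else x0
        x = blend(fg, bg, mask)
    return x
```

Because the final iteration blends against the clean input, pixels outside the mask come out identical to the original image, which is the background-preservation property described above.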

💡 Why This Matters

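Describing an edit in natural language is an intuitive interface for image editing. By confining the edit to a user-supplied mask and blending it with the untouched background, the method changes only the selected region of a generic natural image, which the authors present as the first solution of its kind.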


📋 Article Details

Category 🤖 Artificial Intelligence
Published Jun 01, 2022
Journal 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors Omri Avrahami, Dani Lischinski, Ohad Fried
DOI 10.1109/cvpr52688.2022.01767
Citations 690
Source OpenAlex
