🤖 Artificial Intelligence OpenAlex

Blended Diffusion for Text-driven Editing of Natural Images

📅 Published: June 1, 2022 👤 Omri Avrahami, Dani Lischinski, Ohad Fried 📖 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 📊 690 citations
AI-Generated Summary

Natural language offers a highly intuitive interface for image editing. We compare against several baselines and related methods, both qualitatively and quantitatively, and show that our method outperforms them in overall realism, background preservation, and fidelity to the text prompt.

Key Findings
  • 1 We introduce the first solution for performing local (region-based) edits in generic natural images, driven by a natural language description together with a region-of-interest (ROI) mask.
  • 2 We achieve this by combining a pretrained language-image model (CLIP), which steers the edit towards a user-provided text prompt, with a denoising diffusion probabilistic model (DDPM), which generates natural-looking results.
  • 3 To seamlessly fuse the edited region with the unchanged parts of the image, we spatially blend noised versions of the input image with the local text-guided diffusion latent at a progression of noise levels.
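
The blending idea in the third finding can be illustrated with a small sketch. This is not the authors' implementation: images are flat lists of floats, the noise schedule is an assumed linear one, and `placeholder_guided_step` is a hypothetical stand-in for a CLIP-guided diffusion step. Only the per-step blending of a noised copy of the input with the edited latent is the point here.

```python
import math
import random

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def noise_to_level(image, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    scale, sigma = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [scale * p + sigma * rng.gauss(0.0, 1.0) for p in image]

def blend(mask, foreground, background):
    """Per-pixel mix: mask=1 keeps the edited value, mask=0 keeps the background."""
    return [m * f + (1.0 - m) * b for m, f, b in zip(mask, foreground, background)]

def placeholder_guided_step(x, t):
    """Hypothetical stand-in for one text-guided denoising step.
    A real system would call a diffusion model steered by CLIP here."""
    return [0.9 * v for v in x]

def blended_edit(image, mask, num_steps=50, seed=0, step_fn=placeholder_guided_step):
    rng = random.Random(seed)
    alpha_bars = linear_alpha_bar(num_steps)
    # Start from the input noised to the highest level in the schedule.
    x = noise_to_level(image, alpha_bars[-1], rng)
    for t in range(num_steps - 1, -1, -1):
        foreground = step_fn(x, t)  # edited candidate for the next, less noisy level
        if t > 0:
            # Background: the input image noised to that same lower level.
            background = noise_to_level(image, alpha_bars[t - 1], rng)
        else:
            # Final step: use the clean input pixels outside the mask.
            background = list(image)
        x = blend(mask, foreground, background)
    return x

if __name__ == "__main__":
    image = [0.2, 0.4, 0.6, 0.8]
    mask = [1.0, 1.0, 0.0, 0.0]  # edit the first two pixels only
    print(blended_edit(image, mask, num_steps=10))
```

Because the final step blends against the clean input, pixels where the mask is 0 come out identical to the original image, while pixels where the mask is 1 are determined entirely by the (here, placeholder) guided denoising process.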
Why It Matters

Local, text-driven editing lets a user change one region of a photograph by describing the desired result in words, while the rest of the image is preserved.

This summary is based on publicly available metadata and abstract. For the full research paper, visit the original source:

Read Full Paper at OpenAlex
Article Details
Source: OpenAlex
Category: 🤖 Artificial Intelligence
Published: Jun 1, 2022
Journal: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52688.2022.01767
Citations: 690
Authors: Omri Avrahami, Dani Lischinski, Ohad Fried