Scalable Diffusion Models with Transformers

📅 October 1, 2023 👤 William Peebles, Saining Xie 📖 ICCV 2023 📊 1,407 citations

🤖 Plain-English Summary

This paper explores a new class of diffusion models built on the transformer architecture. Its central result is that image quality tracks compute: Diffusion Transformers (DiTs) that spend more Gflops per forward pass produce better samples, as measured by a lower FID (Fréchet Inception Distance, where lower is better).

🔑 Key Findings

  • We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches.
  • We analyze the scalability of our Diffusion Transformers (DiTs) through the lens of forward pass complexity as measured by Gflops.
  • We find that DiTs with higher Gflops—through increased transformer depth/width or increased number of input tokens—consistently have lower FID.
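The two compute knobs named above, token count and transformer depth/width, can be illustrated with a minimal sketch. It assumes a 4×32×32 latent and a standard pre-norm transformer block with an MLP ratio of 4; the function names `patchify` and `approx_transformer_gmacs` are hypothetical, and the cost formula is a rough textbook estimate that ignores embeddings, conditioning, and normalization layers. It is not the paper's own Gflops accounting.

```python
import numpy as np


def patchify(latent: np.ndarray, patch_size: int) -> np.ndarray:
    """Split a (C, H, W) latent into a (T, p*p*C) sequence of flattened patches."""
    c, h, w = latent.shape
    p = patch_size
    if h % p or w % p:
        raise ValueError("latent height and width must be divisible by patch_size")
    # (C, H/p, p, W/p, p) -> (H/p, W/p, p, p, C) -> (T, p*p*C)
    x = latent.reshape(c, h // p, p, w // p, p)
    x = x.transpose(1, 3, 2, 4, 0)
    return x.reshape((h // p) * (w // p), p * p * c)


def approx_transformer_gmacs(num_tokens: int, depth: int, width: int,
                             mlp_ratio: int = 4) -> float:
    """Rough multiply-accumulate count (in billions) for a stack of blocks.

    Assumption: each block has Q/K/V/output projections, full self-attention,
    and a two-layer MLP. Everything else is ignored.
    """
    proj = 4 * num_tokens * width ** 2            # Q, K, V, and output projections
    attn = 2 * num_tokens ** 2 * width            # QK^T scores plus weighted sum of V
    mlp = 2 * mlp_ratio * num_tokens * width ** 2  # two linear layers in the MLP
    return depth * (proj + attn + mlp) / 1e9


if __name__ == "__main__":
    latent = np.random.default_rng(0).normal(size=(4, 32, 32))  # assumed latent shape
    for p in (8, 4, 2):
        tokens = patchify(latent, p)
        # depth=12, width=768 are illustrative values, not taken from the summary
        cost = approx_transformer_gmacs(tokens.shape[0], depth=12, width=768)
        print(f"patch={p}: {tokens.shape[0]} tokens, ~{cost:.2f} GMACs")
```

Halving the patch size quadruples the number of tokens, which raises the estimated cost without adding parameters to the transformer blocks; increasing depth or width raises it through the per-block terms instead.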

💡 Why This Matters

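The results suggest that a transformer can stand in for the commonly used U-Net backbone in latent diffusion models, and that sample quality improves consistently as forward-pass compute grows. That gives practitioners a straightforward way to trade compute for image quality: add depth or width, or feed the model more input tokens.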

📋 Article Details

Category ⚙️ Engineering & Technology
Published Oct 01, 2023
Venue IEEE/CVF International Conference on Computer Vision (ICCV) 2023
Authors William Peebles, Saining Xie
DOI 10.1109/iccv51070.2023.00387
Citations 1,407
Source OpenAlex
