
Scalable Diffusion Models with Transformers

📅 Published: October 1, 2023 👤 William Peebles, Saining Xie 📖 Research Journal 📊 1,407 citations
AI-Generated Summary

We explore a new class of diffusion models based on the transformer architecture, which we call Diffusion Transformers (DiTs). We find that DiTs with higher Gflops, obtained through greater transformer depth or width or a larger number of input tokens, consistently achieve lower (better) FID.


Key Findings
  • We train latent diffusion models of images, replacing the commonly used U-Net backbone with a transformer that operates on latent patches.
  • We analyze the scalability of our Diffusion Transformers (DiTs) in terms of forward-pass complexity, measured in Gflops.
  • We find that DiTs with higher Gflops, obtained through greater transformer depth or width or a larger number of input tokens, consistently achieve lower (better) FID.
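
To make the link between latent patches, token count, and compute concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names (`patchify`, `rough_forward_flops`), the 4-channel 32x32 latent, the width of 384, and the depth of 12 are all illustrative assumptions. The cost formula is a common back-of-the-envelope count of the large matrix multiplies in a standard transformer block and will not reproduce the paper's reported Gflops.

```python
# Illustrative sketch only: every name and size here is an assumption,
# not a value taken from the paper.

def patchify(latent, patch):
    """Split a latent of shape [C][H][W] (nested lists) into a token sequence.

    Each token is the flattened contents of one patch x patch window
    across all channels, so it has C * patch * patch entries.
    """
    channels = len(latent)
    height = len(latent[0])
    width = len(latent[0][0])
    if height % patch or width % patch:
        raise ValueError("patch size must divide the latent height and width")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            token = [
                latent[c][top + dy][left + dx]
                for c in range(channels)
                for dy in range(patch)
                for dx in range(patch)
            ]
            tokens.append(token)
    return tokens


def rough_forward_flops(num_tokens, width, depth, mlp_ratio=4):
    """Very rough multiply-add count for a plain transformer stack.

    Counts only the large matrix multiplies in each block:
      - QKV and output projections: 4 * T * d^2
      - MLP with hidden size mlp_ratio * d: 2 * mlp_ratio * T * d^2
      - attention scores and weighted sum: 2 * T^2 * d
    Embeddings, norms, and conditioning layers are ignored.
    """
    t, d = num_tokens, width
    per_block = (4 + 2 * mlp_ratio) * t * d * d + 2 * t * t * d
    return depth * per_block


if __name__ == "__main__":
    # Hypothetical 4-channel, 32x32 latent filled with zeros.
    latent = [[[0.0] * 32 for _ in range(32)] for _ in range(4)]
    for patch in (8, 4, 2):
        tokens = patchify(latent, patch)
        cost = rough_forward_flops(len(tokens), width=384, depth=12)
        print(f"patch={patch}: {len(tokens)} tokens, ~{cost / 1e9:.2f} G multiply-adds")
```

Halving the patch size quadruples the number of tokens, so under this estimate compute rises even when depth and width stay fixed. This is the "increased number of input tokens" route to higher Gflops described above; increasing `depth` or `width` is the other.
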
Why It Matters

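Because sample quality improves consistently as compute grows, these results suggest that transformer backbones can stand in for U-Nets in latent diffusion models, and that scaling model size or token count is a practical route to better image generation.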

This summary is based on publicly available metadata and abstract. For the full research paper, visit the original source:

Read Full Paper at OpenAlex