Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. As a general sequence model backbone, Mamba achieves advanced performance across several modalities such as language, audio, and genomics.
This research advances how AI systems learn, reason, and solve problems — with direct implications for automation and scientific discovery.
Read the full paper
Access the original peer-reviewed research via OpenAlex.
| Category | 🤖 Artificial Intelligence |
| Published | Dec 01, 2023 |
| Journal | arXiv (Cornell University) |
| Authors | Albert Gu, Tri Dao |
| DOI | 10.48550/arxiv.2312.00752 |
| Citations | 1,001 |
| Source | OpenAlex |