Topic models are useful tools for discovering latent topics in collections of documents. More specifically, BERTopic generates document embeddings with pre-trained transformer-based language models, clusters these embeddings, and finally generates topic representations with the class-based TF-IDF procedure.
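To make the last step more concrete, here is a minimal sketch of a class-based TF-IDF weighting in plain Python. It is not the BERTopic implementation: the embedding and clustering steps are skipped, and the `labels` list is a hypothetical stand-in for the cluster assignments those steps would produce. The weighting used, term frequency within a cluster multiplied by `log(1 + A / tf_t)` (with `A` the average number of words per cluster and `tf_t` the term's frequency across all clusters), follows the formula as commonly stated for c-TF-IDF and should be treated as an assumption here. The function names `c_tf_idf` and `top_terms` are illustrative only.

```python
import math
from collections import Counter, defaultdict


def c_tf_idf(docs, labels):
    """Score terms per cluster with a class-based TF-IDF weighting (sketch)."""
    # Treat all documents in one cluster as a single concatenated "class document".
    class_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        class_counts[label].update(doc.lower().split())

    # Frequency of each term summed over all clusters.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)

    # Average number of words per cluster (A in the lead-in).
    avg_words = sum(total_counts.values()) / len(class_counts)

    # Weight = in-cluster frequency * log(1 + A / overall frequency).
    return {
        label: {
            term: tf * math.log(1 + avg_words / total_counts[term])
            for term, tf in counts.items()
        }
        for label, counts in class_counts.items()
    }


def top_terms(scores, n=3):
    """Return the n highest-scoring terms per cluster (ties broken alphabetically)."""
    return {
        label: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]
        for label, s in scores.items()
    }


# Toy corpus; the labels are assumed cluster assignments, not real model output.
docs = [
    "the cat chased the mouse",
    "the cat slept",
    "the stock market fell",
    "the market rallied",
]
labels = [0, 0, 1, 1]

scores = c_tf_idf(docs, labels)
topics = top_terms(scores, n=2)
print(topics)
```

In this toy run, a word shared by both clusters such as "the" is down-weighted relative to a cluster-specific word such as "cat", even though it occurs more often. A real pipeline would normally also remove stop words and use a proper tokenizer rather than `str.split`.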
| Field | Value |
| --- | --- |
| Category | 🤖 Artificial Intelligence |
| Published | Mar 11, 2022 |
| Journal | arXiv (Cornell University) |
| Authors | Maarten Grootendorst |
| DOI | 10.48550/arxiv.2203.05794 |
| Citations | 1,323 |
| Source | OpenAlex |