Vision transformers have shown great success due to their high model capabilities. Compared to the recent efficient model MobileViT-XXS, EfficientViT-M2 achieves 1.8% higher accuracy while running $5.8\times$/$3.7\times$ faster on the GPU/CPU, and $7.4\times$ faster when converted to ONNX format.