News

  • Check out how the AI community is discussing STORM! From technical podcasts to detailed audio summaries, these resources offer a variety of ways to learn about our latest research in efficient video-LLMs. You can find the full list of videos here.

  • 1 paper accepted to CVPR 2026.

  • 3 papers accepted to NeurIPS 2025.

  • We recently introduced Nemotron Nano V2, a 9B hybrid model that delivers competitive or superior accuracy on reasoning benchmarks while achieving up to 6× higher inference throughput on reasoning workloads (e.g., 8k input and 16k output tokens). The model builds on our Mamba-based hybrid LLM work.

  • Our token-efficient long-video model for multimodal LLMs (STORM) is on arXiv. It improves on the state of the art by more than 5% on MLVU and LongVideoBench while reducing computation costs by up to 8× and decoding latency by 2.4–2.9×. Check the project page for more details!

  • We introduced efficient 2D parallel sequence modeling for image classification and generation. The paper is accepted to CVPR 2025!

  • Another Mamba-Transformer hybrid LLM is on arXiv! Check the blog post. The paper is accepted as a Spotlight at ICLR 2025!

Research Interests

  • Recurrent Neural Networks (RNNs), State-Space Models (SSMs), Linear RNNs

  • Sequence Learning, Spatio-Temporal Learning

Selected Projects