News
- 1 paper accepted to ECCV 2024.
- We’ve released a new 8B Mamba-based Hybrid LLM! The checkpoints and the code are released as part of NVIDIA’s Megatron-LM project.
- 1 paper accepted to CVPR 2024.
- 1 paper accepted to NeurIPS 2023.
Research Interests
- Recurrent Neural Networks (RNNs), State-Space Models (SSMs), Linear RNNs
- Sequence Learning, Spatio-Temporal Learning
- Predictive Learning, Few-shot Learning, Lifelong Learning
Selected Projects
- R. Waleffe, W. Byeon, D. Riach, B. Norick, V. Korthikanti, T. Dao, A. Gu, A. Hatamizadeh, S. Singh, D. Narayanan, G. Kulshreshtha, V. Singh, J. Casper, J. Kautz, M. Shoeybi, B. Catanzaro, “An Empirical Study of Mamba-based Language Models”, arXiv, 2024
- J. T. H. Smith, S. De Mello, J. Kautz, S. W. Linderman, W. Byeon, “Convolutional State Space Models for Long-Range Spatiotemporal Modeling”, NeurIPS, 2023
- J. Su, W. Byeon, F. Huang, “Scaling-up Diverse Orthogonal Convolutional Networks with a Paraunitary Framework”, ICML, 2022
- B. Wu*, O. Hennigh, J. Kautz, S. Choudhry, W. Byeon*, “Physics Informed RNN-DCT Networks for Time-Dependent Partial Differential Equations”, ICCS, 2022 (* equal contribution)
  - Presented at the NeurIPS 2021 Workshop on Machine Learning and the Physical Sciences
  - Released as part of NVIDIA Modulus