| Title | Cited by | Year |
| --- | --- | --- |
| Large language models with controllable working memory. D Li, AS Rawat, M Zaheer, X Wang, M Lukasik, A Veit, F Yu, S Kumar. arXiv preprint arXiv:2211.05110, 2022 | 71 | 2022 |
| One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks. Atish Agarwala, Abhimanyu Das, Brendan Juba, Rina Panigrahy, Vatsal Sharan, ... International Conference on Learning Representations, 2021 | 15* | 2021 |
| A unified cascaded encoder ASR model for dynamic model sizes. S Ding, W Wang, D Zhao, TN Sainath, Y He, R David, R Botros, X Wang, ... arXiv preprint arXiv:2204.06164, 2022 | 14 | 2022 |
| Sketch based memory for neural networks. R Panigrahy, X Wang, M Zaheer. International Conference on Artificial Intelligence and Statistics, 3169-3177, 2021 | 10 | 2021 |
| A theoretical view on sparsely activated networks. C Baykal, N Dikkala, R Panigrahy, C Rashtchian, X Wang. Advances in Neural Information Processing Systems 35, 30071-30084, 2022 | 9 | 2022 |
| Back and forth error compensation and correction method for linear hyperbolic systems with application to the Maxwell's equations. X Wang, Y Liu. Journal of Computational Physics: X 1, 100014, 2019 | 7 | 2019 |
| On the benefits of learning to route in mixture-of-experts models. N Dikkala, N Ghosh, R Meka, R Panigrahy, N Vyas, X Wang. Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023 | 6 | 2023 |
| Improving sampling accuracy of stochastic gradient MCMC methods via non-uniform subsampling of gradients. R Li, X Wang, H Zha, M Tao. arXiv preprint arXiv:2002.08949, 2020 | 4 | 2020 |
| JaxPruner: A concise library for sparsity research. JH Lee, W Park, NE Mitchell, J Pilault, JSO Ceron, HB Kim, N Lee, ... Conference on Parsimony and Learning, 515-528, 2024 | 3 | 2024 |
| LayerNAS: Neural architecture search in polynomial complexity. Y Fan, D Alon, J Shen, D Peng, K Kumar, Y Long, X Wang, F Iliopoulos, ... arXiv preprint arXiv:2304.11517, 2023 | 3 | 2023 |
| Provable hierarchical lifelong learning with a sketch-based modular architecture. Z Deng, Z Fryer, B Juba, R Panigrahy, X Wang. arXiv preprint arXiv:2112.10919, 2021 | 2 | 2021 |
| Alternating updates for efficient transformers. C Baykal, D Cutler, N Dikkala, N Ghosh, R Panigrahy, X Wang. Advances in Neural Information Processing Systems 36, 2024 | 1 | 2024 |
| Sketching based representations for robust image classification with provable guarantees. N Dikkala, SR Karingula, R Meka, J Nelson, R Panigrahy, X Wang. Advances in Neural Information Processing Systems 35, 5459-5470, 2022 | 1 | 2022 |
| Unified Cascaded Encoder ASR Model for Dynamic Model Sizes. S Ding, Y He, X Wang, W Wang, T Strohman, TN Sainath, ... US Patent US20230326461A1, 2023 | | 2023 |
| The Power of External Memory in Increasing Predictive Model Capacity. C Baykal, DJ Cutler, N Dikkala, N Ghosh, R Panigrahy, X Wang. arXiv preprint arXiv:2302.00003, 2023 | | 2023 |
| JAXPruner: A Modular Library for Sparsity Research. ICLR 2023 Workshop on Sparsity in Neural Networks, 2023 | | 2023 |
| One network fits all? Modular versus monolithic task formulations in neural networks. A Das, A Agarwala, B Juba, R Zhang, R Panigrahy, X Wang | | 2021 |
| The back and forth error compensation and correction method for linear hyperbolic systems and a conservative BFECC limiter. X Wang. Georgia Institute of Technology, 2018 | | 2018 |
| Understanding the Capabilities and Limitations of Neural Networks for Multi-task Learning. V Sharan, X Wang, B Juba, R Panigrahy | | |