signSGD with majority vote is communication efficient and fault tolerant J Bernstein, J Zhao, K Azizzadenesheli, A Anandkumar arXiv preprint arXiv:1810.05291, 2018 | 202 | 2018 |
Cost-effective training of deep CNNs with active model adaptation SJ Huang, JW Zhao, ZY Liu Proceedings of the 24th ACM SIGKDD International Conference on Knowledge …, 2018 | 84 | 2018 |
GaLore: Memory-efficient LLM training by gradient low-rank projection J Zhao, Z Zhang, B Chen, Z Wang, A Anandkumar, Y Tian arXiv preprint arXiv:2403.03507, 2024 | 51* | 2024 |
Learning compositional functions via multiplicative weight updates J Bernstein, J Zhao, M Meister, MY Liu, A Anandkumar, Y Yue Advances in neural information processing systems 33, 13319-13330, 2020 | 20 | 2020 |
LNS-Madam: Low-precision training in logarithmic number system using multiplicative weight update J Zhao, S Dai, R Venkatesan, B Zimmer, M Ali, MY Liu, B Khailany, ... IEEE Transactions on Computers 71 (12), 3179-3190, 2022 | 16 | 2022 |
Zero initialization: Initializing neural networks with only zeros and ones J Zhao, F Schäfer, A Anandkumar arXiv preprint arXiv:2110.12661, 2021 | 13 | 2021 |
Incremental Fourier neural operator J Zhao, RJ George, Y Zhang, Z Li, A Anandkumar arXiv preprint arXiv:2211.15188, 2022 | 12 | 2022 |
Zero initialization: Initializing residual networks with only zeros and ones J Zhao, FT Schaefer, A Anandkumar | 11 | 2021 |
InRank: Incremental low-rank learning J Zhao, Y Zhang, B Chen, F Schäfer, A Anandkumar arXiv preprint arXiv:2306.11250, 2023 | 7 | 2023 |
Incremental spectral learning in Fourier neural operator J Zhao, RJ George, Z Li, A Anandkumar arXiv preprint arXiv:2211.15188, 2022 | 3 | 2022 |
Machine learning training in logarithmic number system Z Jiawei, SH Dai, R Venkatesan, MY Liu, WJ Dally, A Anandkumar US Patent App. 17/346,100, 2022 | 3 | 2022 |
Mini-Sequence Transformer: Optimizing Intermediate Memory for Long Sequences Training C Luo, J Zhao, Z Chen, B Chen, A Anandkumar arXiv preprint arXiv:2407.15892, 2024 | 1 | 2024 |
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients A Jaiswal, L Yin, Z Zhang, S Liu, J Zhao, Y Tian, Z Wang arXiv preprint arXiv:2407.11239, 2024 | | 2024 |
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Z Zhang, A Jaiswal, L Yin, S Liu, J Zhao, Y Tian, Z Wang arXiv preprint arXiv:2407.08296, 2024 | | 2024 |
Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs RJ George, J Zhao, J Kossaifi, Z Li, A Anandkumar arXiv preprint arXiv:2211.15188, 2022 | | 2022 |
Incremental Low-Rank Learning J Zhao, Y Zhang, B Chen, FT Schaefer, A Anandkumar Workshop on Efficient Systems for Foundation Models @ ICML 2023 | | |