VEGETA: Vertically-integrated extensions for sparse/dense GEMM tile acceleration on CPUs G Jeong, S Damani, AR Bambhaniya, E Qin, CJ Hughes, S Subramoney, ... 2023 IEEE International Symposium on High-Performance Computer Architecture …, 2023 | 14 | 2023 |
Enabling flexibility for sparse tensor acceleration via heterogeneity E Qin, R Garg, A Bambhaniya, M Pellauer, A Parashar, S Rajamanickam, ... arXiv preprint arXiv:2201.08916, 2022 | 7 | 2022 |
Hardware-Software co-design for real-time latency-accuracy navigation in tinyML applications P Behnam, J Tong, A Khare, Y Chen, Y Pan, P Gadikar, A Bambhaniya, ... IEEE Micro, 2023 | 5 | 2023 |
COMET: A comprehensive cluster design methodology for distributed deep learning training DK Kadiyala, S Rashidi, T Heo, AR Bambhaniya, T Krishna, A Daglis arXiv preprint arXiv:2211.16648, 2022 | 5 | 2022 |
Subgraph stationary hardware-software inference co-design P Behnam, A Tumanov, T Krishna, P Gadikar, Y Chen, J Tong, Y Pan, ... Proceedings of Machine Learning and Systems 5, 563-577, 2023 | 3 | 2023 |
Accelerating Attention Based Models via HW-SW Co-Design using Fine-Grained Sparsification AR Bambhaniya, A Yazdanbakhsh, S Subramanian, T Krishna Architecture and System Support for Transformer Models (ASSYST @ ISCA 2023), 2023 | 2 | 2023 |
Demystifying Platform Requirements for Diverse LLM Inference Use Cases A Bambhaniya, R Raj, G Jeong, S Kundu, S Srinivasan, M Elavazhagan, ... arXiv preprint arXiv:2406.01698, 2024 | 1 | 2024 |
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers AR Bambhaniya, A Yazdanbakhsh, S Subramanian, SC Kao, S Agrawal, ... arXiv preprint arXiv:2402.04744, 2024 | 1 | 2024 |
Enabling Real-time DNN Switching via Weight-Sharing J Tong, Y Chen, Y Pan, A Bambhaniya, A Khare, T Heo, A Tumanov, ... Conference proceedings International Symposium on Computer Architecture, 2022 | 1 | 2022 |
Leveraging Memory Expansion to Accelerate Large-Scale DL Training D Kadiyala, S Rashidi, T Heo, A Bambhaniya, T Krishna, A Daglis 2024 IEEE International Symposium on Performance Analysis of Systems and …, 2024 | | 2024 |
Abstracting Sparse DNN Acceleration via Structured Sparse Tensor Decomposition G Jeong, PA Tsai, AR Bambhaniya, SW Keckler, T Krishna arXiv preprint arXiv:2403.07953, 2024 | | 2024 |
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers A Rajeshkumar Bambhaniya, A Yazdanbakhsh, S Subramanian, SC Kao, ... arXiv e-prints, arXiv:2402.04744, 2024 | | 2024 |
Proteus: HLS-based NoC Generator and Simulator AR Bambhaniya, Y Chen, R Banerjee, T Krishna 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1-6, 2023 | | 2023 |
COMET: A Comprehensive Cluster Design Methodology for Distributed Deep Learning Training D Kiran Kadiyala, S Rashidi, T Heo, A Rajeshkumar Bambhaniya, ... arXiv e-prints, arXiv:2211.16648, 2022 | | 2022 |
MoE-ERAS: Expert Residency Aware Selection AR Bambhaniya, SC Kumar, T Krishna Machine Learning for Computer Architecture and Systems 2024 | | |
Sparsify the Weights but Let the Gradients Flow! A Yazdanbakhsh, AR Bambhaniya, S Subramanian, SC Kao, S Agrawal, ... | | |