SIGMA: A sparse and irregular GEMM accelerator with flexible interconnects for DNN training E Qin, A Samajdar, H Kwon, V Nadella, S Srinivasan, D Das, B Kaul, ... 2020 IEEE International Symposium on High Performance Computer Architecture …, 2020 | 423 | 2020 |
A study of BFLOAT16 for deep learning training D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ... arXiv preprint arXiv:1905.12322, 2019 | 330 | 2019 |
Out-of-distribution detection using an ensemble of self supervised leave-out classifiers A Vyas, N Jammalamadaka, X Zhu, D Das, B Kaul, TL Willke Proceedings of the European conference on computer vision (ECCV), 550-564, 2018 | 279 | 2018 |
Scaledeep: A scalable compute architecture for learning and evaluating deep networks S Venkataramani, A Ranjan, S Banerjee, D Das, S Avancha, ... Proceedings of the 44th Annual International Symposium on Computer …, 2017 | 270 | 2017 |
Distributed deep learning using synchronous stochastic gradient descent D Das, S Avancha, D Mudigere, K Vaidyanathan, S Sridharan, D Kalamkar, ... arXiv preprint arXiv:1602.06709, 2016 | 211 | 2016 |
Method and apparatus to manage network addresses B Kaul, N Tulpule, M Zhu, P Krishnaswamy US Patent App. 10/651,929, 2005 | 203 | 2005 |
Mixed precision training of convolutional neural networks using integer operations D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ... arXiv preprint arXiv:1802.00930, 2018 | 194 | 2018 |
Ternary neural networks with fine-grained quantization N Mellempudi, A Kundu, D Mudigere, D Das, B Kaul, P Dubey arXiv preprint arXiv:1705.01462, 2017 | 134 | 2017 |
Mixed precision training with 8-bit floating point N Mellempudi, S Srinivasan, D Das, B Kaul arXiv preprint arXiv:1905.12334, 2019 | 76 | 2019 |
Apparatuses, methods, and systems for neural networks S Venkataramani, D Das, A Ranjan, S Banerjee, S Avancha, ... US Patent App. 16/317,497, 2019 | 56 | 2019 |
Data structure and movement for lattice-based simulations AG Shet, SH Sorathiya, S Krithivasan, AM Deshpande, B Kaul, ... Physical Review E—Statistical, Nonlinear, and Soft Matter Physics 88 (1 …, 2013 | 52 | 2013 |
X-MANN: A crossbar based architecture for memory augmented neural networks A Ranjan, S Jain, JR Stevens, D Das, B Kaul, A Raghunathan Proceedings of the 56th Annual Design Automation Conference 2019, 1-6, 2019 | 43 | 2019 |
On scale-out deep learning training for cloud and HPC S Sridharan, K Vaidyanathan, D Kalamkar, D Das, ME Smorkalov, ... arXiv preprint arXiv:1801.08030, 2018 | 35 | 2018 |
Mixed low-precision deep learning inference using dynamic fixed point N Mellempudi, A Kundu, D Das, D Mudigere, B Kaul arXiv preprint arXiv:1701.08978, 2017 | 29 | 2017 |
Manna: An accelerator for memory-augmented neural networks JR Stevens, A Ranjan, D Das, B Kaul, A Raghunathan Proceedings of the 52nd Annual IEEE/ACM International Symposium on …, 2019 | 28 | 2019 |
PolyDL: Polyhedral optimizations for creation of high-performance DL primitives S Tavarageri, A Heinecke, S Avancha, B Kaul, G Goyal, R Upadrasta ACM Transactions on Architecture and Code Optimization (TACO) 18 (1), 1-27, 2021 | 26 | 2021 |
RAIL: Risk-averse imitation learning A Santara, A Naik, B Ravindran, D Das, D Mudigere, S Avancha, B Kaul arXiv preprint arXiv:1707.06658, 2017 | 25 | 2017 |
Madras: Multi agent driving simulator A Santara, S Rudra, SA Buridi, M Kaushik, A Naik, B Kaul, B Ravindran Journal of Artificial Intelligence Research 70, 1517-1555, 2021 | 24 | 2021 |
Exploring shared-memory optimizations for an unstructured mesh CFD application on modern parallel systems D Mudigere, S Sridharan, A Deshpande, J Park, A Heinecke, ... 2015 IEEE International Parallel and Distributed Processing Symposium, 723-732, 2015 | 22 | 2015 |
On vectorization for lattice based simulations AG Shet, K Siddharth, SH Sorathiya, AM Deshpande, SD Sherlekar, ... International Journal of Modern Physics C 24 (12), 1340011, 2013 | 21 | 2013 |