S-Caffe: Co-designing MPI runtimes and Caffe for scalable deep learning on modern GPU clusters AA Awan, K Hamidouche, JM Hashmi, DK Panda ACM PPoPP '17, 52(8), 193-205, 2017 | 178 | 2017 |
DeepSpeed-Inference: Enabling efficient inference of transformer models at unprecedented scale RY Aminabadi, S Rajbhandari, AA Awan, C Li, D Li, E Zheng, O Ruwase, ... SC22: International Conference for High Performance Computing, Networking …, 2022 | 165 | 2022 |
DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale S Rajbhandari, C Li, Z Yao, M Zhang, RY Aminabadi, AA Awan, J Rasley, ... International Conference on Machine Learning, 18332-18346, 2022 | 161 | 2022 |
An in-depth performance characterization of CPU- and GPU-based DNN training on modern architectures AA Awan, H Subramoni, DK Panda Proceedings of the Machine Learning on HPC Environments, 1-8, 2017 | 82 | 2017 |
1-bit Adam: Communication-efficient large-scale training with Adam’s convergence speed H Tang, S Gan, AA Awan, S Rajbhandari, C Li, X Lian, J Liu, C Zhang, ... International Conference on Machine Learning, 10118-10129, 2021 | 74 | 2021 |
Phi-3 technical report: A highly capable language model locally on your phone M Abdin, SA Jacobs, AA Awan, J Aneja, A Awadallah, H Awadalla, ... arXiv preprint arXiv:2404.14219, 2024 | 67 | 2024 |
Scalable and efficient MoE training for multitask multilingual models YJ Kim, AA Awan, A Muzio, AFC Salinas, L Lu, A Hendy, S Rajbhandari, ... arXiv preprint arXiv:2109.10465, 2021 | 62 | 2021 |
Scalable distributed DNN training using TensorFlow and CUDA-aware MPI: Characterization, designs, and performance evaluation AA Awan, J Bédorf, CH Chu, H Subramoni, DK Panda 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2019 | 57 | 2019 |
Efficient large message broadcast using NCCL and CUDA-aware MPI for deep learning AA Awan, K Hamidouche, A Venkatesh, DK Panda Proceedings of the 23rd European MPI Users' Group Meeting, 15-22, 2016 | 56 | 2016 |
Optimized broadcast for deep learning workloads on dense-GPU InfiniBand clusters: MPI or NCCL? AA Awan, CH Chu, H Subramoni, DK Panda Proceedings of the 25th European MPI Users' Group Meeting, 1-9, 2018 | 54 | 2018 |
GEMS: GPU-enabled memory-aware model-parallelism system for distributed DNN training A Jain, AA Awan, AM Aljuhani, JM Hashmi, QG Anthony, H Subramoni, ... SC20: International Conference for High Performance Computing, Networking …, 2020 | 47 | 2020 |
Privacy-aware searching with oblivious term matching for cloud storage Z Pervez, AA Awan, AM Khattak, S Lee, EN Huh The Journal of Supercomputing 63, 538-560, 2013 | 47 | 2013 |
NV-Group: Link-efficient reduction for distributed deep learning on modern dense GPU systems CH Chu, P Kousha, AA Awan, KS Khorassani, H Subramoni, DK Panda Proceedings of the 34th ACM International Conference on Supercomputing, 1-12, 2020 | 42 | 2020 |
Performance characterization of DNN training using TensorFlow and PyTorch on modern clusters A Jain, AA Awan, Q Anthony, H Subramoni, DK Panda 2019 IEEE International Conference on Cluster Computing (CLUSTER), 1-11, 2019 | 40 | 2019 |
OC-DNN: Exploiting advanced unified memory capabilities in CUDA 9 and Volta GPUs for out-of-core DNN training AA Awan, CH Chu, H Subramoni, X Lu, DK Panda 2018 IEEE 25th International Conference on High Performance Computing (HiPC …, 2018 | 36 | 2018 |
DeepSpeed-Chat: Easy, fast and affordable RLHF training of ChatGPT-like models at all scales Z Yao, RY Aminabadi, O Ruwase, S Rajbhandari, X Wu, AA Awan, ... arXiv preprint arXiv:2308.01320, 2023 | 33 | 2023 |
1-bit LAMB: Communication-efficient large-scale large-batch training with LAMB’s convergence speed C Li, AA Awan, H Tang, S Rajbhandari, Y He 2022 IEEE 29th International Conference on High Performance Computing, Data …, 2022 | 28 | 2022 |
Scaling TensorFlow, PyTorch, and MXNet using MVAPICH2 for high-performance deep learning on Frontera A Jain, AA Awan, H Subramoni, DK Panda 2019 IEEE/ACM Third Workshop on Deep Learning on Supercomputers (DLS), 76-83, 2019 | 28 | 2019 |
CUDA kernel-based collective reduction operations on large-scale GPU clusters CH Chu, K Hamidouche, A Venkatesh, AA Awan, DK Panda 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2016 | 27 | 2016 |
Exploiting GPUDirect RDMA in designing high performance OpenSHMEM for NVIDIA GPU clusters K Hamidouche, A Venkatesh, AA Awan, H Subramoni, CH Chu, ... 2015 IEEE International Conference on Cluster Computing, 78-87, 2015 | 26 | 2015 |