Shengen Yan - Scholar Profile - Academic Search

Cited by

	All	Since 2019
Citations	2159	1567
h-index	18	15
i10-index	24	24

Citations per year

Year	Citations
2013	9
2014	24
2015	86
2016	143
2017	129
2018	168
2019	235
2020	234
2021	273
2022	314
2023	316
2024	193

Public access

7 articles

3 articles

available

not available

Based on funding mandates

Co-authors

Yun (Eric) Liang	Professor of EECS, Peking University, ACM Distinguished Scientist	Verified email at pku.edu.cn
Yunquan Zhang	Professor of Institute of Computing Technology, CAS	Verified email at ict.ac.cn
Xiuhong Li	Peking University	Verified email at pku.edu.cn
Ren Wu	NovuMind Inc.	Verified email at novumind.com
Huiyang Zhou	Professor of North Carolina State University	Verified email at ncsu.edu
Gang Sun	Momenta	Verified email at momenta.ai
Weiyan Wang	Hong Kong University of Science & Technology	Verified email at connect.ust.hk
Yi Yang	NEC Labs	Verified email at nec-labs.com
Hongwen Dai	Google	Verified email at ncsu.edu
Sun Peng	Shanghai Artificial Intelligence Laboratory	Verified email at pjlab.org.cn

Shengen Yan

The Chinese University of Hong Kong

Verified email at ie.cuhk.edu.hk

Large Scale Deep Learning, Heterogeneous Computing


Title	Authors	Venue	Cited by	Year
Deep image: Scaling up image recognition	R Wu, S Yan, Y Shan, Q Dang, G Sun	arXiv preprint arXiv:1501.02876, 2015	523	2015
Evaluating fast algorithms for convolutional neural networks on FPGAs	L Lu, Y Liang, Q Xiao, S Yan	2017 IEEE 25th annual international symposium on field-programmable custom …, 2017	281	2017
Exploring heterogeneous algorithms for accelerating deep convolutional neural networks on FPGAs	Q Xiao, Y Liang, L Lu, S Yan, YW Tai	Proceedings of the 54th Annual Design Automation Conference 2017, 1-6, 2017	225	2017
yaSpMV: Yet another SpMV framework on GPUs	S Yan, C Li, Y Zhang, H Zhou	ACM SIGPLAN Notices 49 (8), 107-118, 2014	179	2014
Evaluating fast algorithms for convolutional neural networks on FPGAs	Y Liang, L Lu, Q Xiao, S Yan	IEEE Transactions on Computer-Aided Design of Integrated Circuits and …, 2019	148	2019
StreamScan: fast scan algorithms for GPUs without global barrier synchronization	S Yan, G Long, Y Zhang	Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of …, 2013	121	2013
Characterization and prediction of deep learning workloads in large-scale GPU datacenters	Q Hu, P Sun, S Yan, Y Wen, T Zhang	Proceedings of the International Conference for High Performance Computing …, 2021	98	2021
Optimizing network performance for distributed DNN training on GPU clusters: ImageNet/AlexNet training in 1.5 minutes	P Sun, W Feng, R Han, S Yan, Y Wen	arXiv preprint arXiv:1902.06855, 2019	78	2019
A coordinated tiling and batching framework for efficient GEMM on GPUs	X Li, Y Liang, S Yan, L Jia, Y Li	Proceedings of the 24th symposium on principles and practice of parallel …, 2019	59	2019
Towards distributed machine learning in shared clusters: A dynamically-partitioned approach	P Sun, Y Wen, NBD Ta, S Yan	2017 IEEE International Conference on Smart Computing (SMARTCOMP), 1-6, 2017	46	2017
GPURoofline: a model for guiding performance optimizations on GPUs	H Jia, Y Zhang, G Long, J Xu, S Yan, Y Li	Euro-Par 2012 Parallel Processing: 18th International Conference, Euro-Par …, 2012	43	2012
AMOS: enabling automatic mapping for tensor computations on spatial accelerators with hardware abstraction	S Zheng, R Chen, A Wei, Y Jin, Q Han, L Lu, B Wu, X Li, S Yan, Y Liang	Proceedings of the 49th Annual International Symposium on Computer …, 2022	41	2022
Understanding the tradeoffs between software-managed vs. hardware-managed caches in GPUs	C Li, Y Yang, H Dai, S Yan, F Mueller, H Zhou	2014 IEEE International Symposium on Performance Analysis of Systems and …, 2014	41	2014
Diesel: A dataset-based distributed storage and caching system for large-scale deep learning training	L Wang, S Ye, B Yang, Y Lu, H Zhang, S Yan, Q Luo	Proceedings of the 49th International Conference on Parallel Processing, 1-11, 2020	30	2020
GradientFlow: Optimizing network performance for large-scale distributed DNN training	P Sun, Y Wen, R Han, W Feng, S Yan	IEEE Transactions on Big Data 8 (2), 495-507, 2019	29	2019
Parallelization and performance optimization on face detection algorithm with OpenCL: A case study	W Wang, Y Zhang, S Yan, Y Zhang, H Jia	Tsinghua Science and Technology 17 (3), 287-295, 2012	24	2012
Enabling efficient fast convolution algorithms on GPUs via MegaKernels	L Jia, Y Liang, X Li, L Lu, S Yan	IEEE Transactions on Computers 69 (7), 986-997, 2020	20	2020
Timed dataflow: Reducing communication overhead for distributed machine learning systems	P Sun, Y Wen, TNB Duong, S Yan	2016 IEEE 22nd International Conference on Parallel and Distributed Systems …, 2016	19	2016
Elan: Towards generic and efficient elastic training for deep learning	L Xie, J Zhai, B Wu, Y Wang, X Zhang, P Sun, S Yan	2020 IEEE 40th International Conference on Distributed Computing Systems …, 2020	16	2020
A cross-platform SpMV framework on many-core architectures	Y Zhang, S Li, S Yan, H Zhou	ACM Transactions on Architecture and Code Optimization (TACO) 13 (4), 1-25, 2016	16	2016

Articles 1–20