Youjie Li
DistMLSyser | PyTorch Post-doc | UIUC Ph.D.
Verified email at bytedance.com
Title
Cited by
Year
Accelerating Distributed Reinforcement Learning with In-Switch Computing
Y Li, IJ Liu, Y Yuan, D Chen, A Schwing, J Huang
The 46th International Symposium on Computer Architecture (ISCA'19), 2019
Cited by 145 · 2019
Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training
Y Li, M Yu, S Li, S Avestimehr, NS Kim, A Schwing
Advances in Neural Information Processing Systems (NeurIPS'18), 8045-8056, 2018
Cited by 120 · 2018
Energy efficient parallel neuromorphic architectures with approximate arithmetic on FPGA
Q Wang, Y Li, B Shao, S Dey, P Li
Neurocomputing 221, 146-158, 2017
Cited by 106 · 2017
A Network-Centric Hardware/Algorithm Co-Design to Accelerate Distributed Training of Deep Neural Networks
Y Li, J Park, M Alian, Y Yuan, Z Qu, P Pan, R Wang, A Schwing, ...
The 51st International Symposium on Microarchitecture (MICRO'18), 175-188, 2018
Cited by 102 · 2018
GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training
M Yu, Z Lin, K Narra, S Li, Y Li, NS Kim, A Schwing, M Annavaram, ...
Advances in Neural Information Processing Systems (NeurIPS'18), 5123-5133, 2018
Cited by 82 · 2018
DeepStore: In-Storage Acceleration for Intelligent Queries
VS Mailthody, Z Qureshi, W Liang, Z Feng, SG Gonzalo, Y Li, H Franke, ...
The 52nd International Symposium on Microarchitecture (MICRO'19), 2019
Cited by 77 · 2019
BNS-GCN: Efficient full-graph training of graph convolutional networks with partition-parallelism and random boundary node sampling
C Wan, Y Li, A Li, NS Kim, Y Lin
Fifth Conference on Machine Learning and Systems (MLSys'22), 2022
Cited by 67 · 2022
PipeGCN: Efficient full-graph training of graph convolutional networks with pipelined feature communication
C Wan, Y Li, CR Wolfe, A Kyrillidis, NS Kim, Y Lin
arXiv preprint arXiv:2203.10428, 2022
Cited by 65 · 2022
Liquid state machine based pattern recognition on FPGA with firing-activity dependent power gating and approximate computing
Q Wang, Y Li, P Li
The IEEE International Symposium on Circuits and Systems (ISCAS'16), 361-364, 2016
Cited by 60 · 2016
Harmony: Overcoming the hurdles of GPU memory capacity to train massive DNN models on commodity servers
Y Li, A Phanishayee, D Murray, J Tarnawski, NS Kim
arXiv preprint arXiv:2202.01306, 2022
Cited by 20 · 2022
Visage: enabling timely analytics for drone imagery
S Jha, Y Li, S Noghabi, V Ranganathan, P Kumar, A Nelson, M Toelle, ...
The 27th Annual International Conference on Mobile Computing and Networking …, 2021
Cited by 17 · 2021
Accelerating distributed reinforcement learning with in-switch computing. In 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA)
Y Li, IJ Liu, Y Yuan, D Chen, A Schwing, J Huang
IEEE, 279–291, 2019
Cited by 17 · 2019
Doing More with Less: Training Large DNN Models on Commodity Servers for the Masses
Y Li, A Phanishayee, D Murray, NS Kim
Hot Topics in Operating Systems (HotOS’21), 2021
Cited by 6 · 2021
BDS-GCN: Efficient full-graph training of graph convolutional nets with partition-parallelism and boundary sampling
C Wan, Y Li, NS Kim, Y Lin
Cited by 2 · 2020
Energy Efficient Spiking Neuromorphic Architectures for Pattern Recognition
Y Li
Master's Thesis, ECE, Texas A&M University, 2016
Cited by 1 · 2016
Communication-Centric Cross-Stack Acceleration for Distributed Machine Learning
Y Li
Ph.D. Dissertation, ECE, UIUC, 2022
2022
BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Boundary Node Sampling
C Wan, Y Li, A Li, NS Kim, Y Lin
arXiv preprint arXiv:2203.10983, 2022
2022