Software transactional memory for GPU architectures Y Xu, R Wang, N Goswami, T Li, L Gao, D Qian Proceedings of Annual IEEE/ACM International Symposium on Code Generation …, 2014 | 77 | 2014 |
Lock-based synchronization for GPU architectures Y Xu, L Gao, R Wang, Z Luan, W Wu, D Qian Proceedings of the ACM International Conference on Computing Frontiers, 205-213, 2016 | 43 | 2016 |
PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems Y Zhang, L Chen, S Yang, M Yuan, H Yi, J Zhang, J Wang, J Dong, Y Xu, ... 2022 IEEE 38th International Conference on Data Engineering (ICDE), 3453-3466, 2022 | 30 | 2022 |
Improving MapReduce performance by balancing skewed loads Y Fan, W Wu, Y Xu, H Chen China Communications 11 (8), 85-108, 2014 | 28 | 2014 |
Bridging the semantic gaps of GPU acceleration for scale-out CNN-based big data processing: Think big, see small M Song, Y Hu, Y Xu, C Li, H Chen, J Yuan, T Li Proceedings of the 2016 International Conference on Parallel Architectures …, 2016 | 27 | 2016 |
Scheduling tasks with mixed timing constraints in GPU-powered real-time systems Y Xu, R Wang, T Li, M Song, L Gao, Z Luan, D Qian Proceedings of the 2016 International Conference on Supercomputing, 1-13, 2016 | 23 | 2016 |
Approximate nearest neighbor search under neural similarity metric for large-scale recommendation R Chen, B Liu, H Zhu, Y Wang, Q Li, B Ma, Q Hua, J Jiang, Y Xu, H Deng, ... Proceedings of the 31st ACM International Conference on Information …, 2022 | 11 | 2022 |
SRAM- and STT-RAM-based hybrid, shared last-level cache for on-chip CPU–GPU heterogeneous architectures L Gao, R Wang, Y Xu, H Yang, Z Luan, D Qian, H Zhang, J Cai The Journal of Supercomputing 74, 3388-3414, 2018 | 9 | 2018 |
Performance prediction model in heterogeneous MapReduce environments Y Fan, W Wu, Y Xu, Y Cao, Q Li, J Cui, Z Duan 2014 IEEE International Conference on Computer and Information Technology …, 2014 | 5 | 2014 |
Load balancing in heterogeneous MapReduce environments Y Fan, W Wu, D Qian, Y Xu, W Wei 2013 IEEE 10th International Conference on High Performance Computing and …, 2013 | 5 | 2013 |
Thread-level locking for SIMT architectures L Gao, Y Xu, R Wang, Z Luan, Z Yu, D Qian IEEE Transactions on Parallel and Distributed Systems 31 (5), 1121-1136, 2019 | 4 | 2019 |
Towards a general and efficient linked-list hash table on GPUs L Gao, Y Xu, C Xu, R Wang, H Yang, Z Luan, D Qian 2019 IEEE 21st International Conference on High Performance Computing and …, 2019 | 4 | 2019 |
RPPA: A Remote Parallel Program Performance Analysis Tool. Y Xu, Z Zhao, W Wu, Y Zhao J. Softw. 6 (12), 2399-2406, 2011 | 3 | 2011 |
Practice on pruning CTR models for real-world systems R Chen, Y Zheng, G Zhou, X Luo, J Zhuo, X Qiao, Y Xu, X Zhu Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data …, 2021 | 1 | 2021 |
Research and Design of a Remote Visualization Parallel Program Performance Analysis Tool Y Xu, Z Zhao, W Wu, Y Zhao 2010 3rd International Symposium on Parallel Architectures, Algorithms and …, 2010 | 1 | 2010 |
Accelerating in-memory transaction processing using general purpose graphics processing units L Gao, Y Xu, R Wang, H Yang, Z Luan, D Qian Future Generation Computer Systems 97, 836-848, 2019 | | 2019 |
A power estimation method based on performance features in MapReduce environments Y Fan, W Wu, Y Xu, Y Gao | | 2015 |