Title | Authors | Venue | Cited by | Year |
Analytical characterization and design space exploration for optimization of CNNs | R Li, Y Xu, A Sukumaran-Rajam, A Rountev, P Sadayappan | Proceedings of the 26th ACM International Conference on Architectural …, 2021 | 59 | 2021 |
Dependence-aware, unbounded sound predictive race detection | K Genç, J Roemer, Y Xu, MD Bond | Proceedings of the ACM on Programming Languages 3 (OOPSLA), 1-30, 2019 | 19 | 2019 |
Efficient distributed algorithms for convolutional neural networks | R Li, Y Xu, A Sukumaran-Rajam, A Rountev, P Sadayappan | Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and …, 2021 | 4 | 2021 |
Effective Performance Modeling and Domain-Specific Compiler Optimization of CNNs for GPUs | Y Xu, Q Yuan, EC Barton, R Li, P Sadayappan, A Sukumaran-Rajam | Proceedings of the International Conference on Parallel Architectures and …, 2022 | 3 | 2022 |
Training of deep learning pipelines on memory-constrained GPUs via segmented fused-tiled execution | Y Xu, S Raje, A Rountev, G Sabin, A Sukumaran-Rajam, P Sadayappan | Proceedings of the 31st ACM SIGPLAN International Conference on Compiler …, 2022 | 2 | 2022 |
Accelerated Auto-Tuning of GPU Kernels for Tensor Computations | C Li, Y Xu, SM Saravani, P Sadayappan | Proceedings of the 38th ACM International Conference on Supercomputing, 549-561, 2024 | 1 | 2024 |
CoNST: Code Generator for Sparse Tensor Networks | S Raje, Y Xu, A Rountev, EF Valeev, S Sadayappan | arXiv preprint arXiv:2401.04836, 2024 | 1 | 2024 |
PEAK: Generating High-Performance Schedules in MLIR | AM Tavakkoli, S Joshi, S Singh, Y Xu, P Sadayappan, M Hall | | | |