An efficient parallel algorithm for Caputo fractional reaction-diffusion equation with implicit finite-difference method Q Wang, J Liu, C Gong, X Tang, G Fu, Z Xing Advances in Difference Equations 2016 (1), 207, 2016 | 31 | 2016 |
Optimizing FFT-based convolution on ARMv8 multi-core CPUs Q Wang, D Li, X Huang, S Shen, S Mei, J Liu European Conference on Parallel Processing, 248-262, 2020 | 25 | 2020 |
Locality based warp scheduling in GPGPUs Y Zhang, Z Xing, C Liu, C Tang, Q Wang Future Generation Computer Systems 82, 520-527, 2018 | 24 | 2018 |
Parallel convolution algorithm using implicit matrix multiplication on multi-core CPUs Q Wang, S Mei, J Liu, C Gong 2019 International Joint Conference on Neural Networks (IJCNN), 1-7, 2019 | 20 | 2019 |
HPDL: towards a general framework for high-performance distributed deep learning D Li, Z Lai, K Ge, Y Zhang, Z Zhang, Q Wang, H Wang 2019 IEEE 39th International Conference on Distributed Computing Systems …, 2019 | 19 | 2019 |
Evaluating FFT-based algorithms for strided convolutions on ARMv8 architectures X Huang, Q Wang, S Lu, R Hao, S Mei, J Liu Performance Evaluation 152, 102248, 2021 | 18 | 2021 |
FPGA Implementation for the Sigmoid with Piecewise Linear Fitting Method Based on Curvature Analysis Z Li, Y Zhang, B Sui, Z Xing, Q Wang Electronics 11 (9), 1365, 2022 | 17 | 2022 |
Optimizing Winograd-based fast convolution algorithm on Phytium multi-core CPUs Q Wang, D Li, S Mei, Z Lai, Y Dou Journal of Computer Research and Development 57 (6), 1140-1151, 2020 | 17 | 2020 |
Accelerating embarrassingly parallel algorithm on Intel MIC Q Wang, J Liu, X Tang, F Wang, G Fu, Z Xing 2014 IEEE International Conference on Progress in Informatics and Computing …, 2014 | 17 | 2014 |
Optimizing Irregular-Shaped Matrix-Matrix Multiplication on Multi-Core DSPs S Yin, Q Wang, R Hao, T Zhou, S Mei, J Liu arXiv preprint arXiv:2208.05872, 2022 | 16 | 2022 |
TAMM: a new topology-aware mapping method for parallel applications on the Tianhe-2A supercomputer X Chen, J Liu, S Li, P Xie, L Chi, Q Wang Algorithms and Architectures for Parallel Processing: 18th International …, 2018 | 15 | 2018 |
NUMA-aware FFT-based Convolution on ARMv8 Many-core CPUs X Huang, Q Wang, S Lu, R Hao, S Mei, J Liu 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications …, 2021 | 12 | 2021 |
Model provenance management in MLOps Pipeline S Mei, C Liu, Q Wang, H Su Proceedings of the 2022 8th International Conference on Computing and Data …, 2022 | 10 | 2022 |
An Overview on the Convergence of High Performance Computing and Big Data Processing S Mei, H Guan, Q Wang 2018 IEEE 24th International Conference on Parallel and Distributed Systems …, 2018 | 8 | 2018 |
Scalability of 3D deterministic particle transport on the Intel MIC architecture Q Wang, J Liu, Z Xing, C Gong Nuclear Science and Techniques 26 (5), 2015 | 7 | 2015 |
The acceleration of turbo decoder on the newest GPGPU of Kepler architecture Y Zhang, Z Xing, L Yuan, C Liu, Q Wang 2014 14th International Symposium on Communications and Information …, 2014 | 7 | 2014 |
Flexible virtual channel power-gating for high-throughput and low-power network-on-chip F Wang, X Tang, Q Wang, Z Xing, H Liu 2014 17th Euromicro Conference on Digital System Design, 504-511, 2014 | 7 | 2014 |
Parallel 3D deterministic particle transport on Intel MIC architecture Q Wang, Z Xing, J Liu, X Qiang, C Gong, J Jiang 2014 International Conference on High Performance Computing & Simulation …, 2014 | 7 | 2014 |
An Area-Efficient Hybrid Polar Decoder With Pipelined Architecture Y Wang, Q Wang, Y Zhang, S Qiu, Z Xing IEEE Access 8, 68068-68082, 2020 | 6 | 2020 |
Parallel Sn Sweep Scheduling Algorithm on Unstructured Grids for Multigroup Time-Dependent Particle Transport Equations J Liu, C Lihua, W Qing Lin, G Chunye, J Jie, G Xinbiao, L Shengguo, Q Hu, ... Nuclear Science and Engineering 184 (4), 527-536, 2016 | 6 | 2016 |