Accelerating transformer-based deep learning models on FPGAs using column balanced block pruning | H Peng, S Huang, T Geng, A Li, W Jiang, H Liu, S Wang, C Ding | 2021 22nd International Symposium on Quality Electronic Design (ISQED), 142-148, 2021 | 77 | 2021 |
Accommodating transformer onto FPGA: Coupling the balanced model compression and FPGA-implementation optimization | P Qi, Y Song, H Peng, S Huang, Q Zhuge, EHM Sha | Proceedings of the 2021 on Great Lakes Symposium on VLSI, 163-168, 2021 | 40 | 2021 |
A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining | H Peng*, S Huang*, S Chen, B Li, T Geng, A Li, W Jiang, W Wen, J Bi, ... | DAC'2022 (Publicity paper): Proceedings of the 59th ACM/IEEE Design …, 2022 | 39 | 2022 |
Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization | P Qi, EHM Sha, Q Zhuge, H Peng, S Huang, Z Kong, Y Song, B Li | ICCAD'2021: IEEE/ACM International Conference On Computer Aided Design, 2021 | 36* | 2021 |
ET: re-thinking self-attention for transformer models on GPUs | S Chen*, S Huang*, S Pandey, B Li, GR Gao, L Zheng, C Ding, H Liu | SC'2021: Proceedings of the International Conference for High Performance …, 2021 | 33 | 2021 |
Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm | S Huang*, D Xu*, IEH Yen, S Chang, B Li, S Chen, M Xie, H Liu, C Ding | ACL'2022: Proceedings of the 60th Annual Meeting of the Association for …, 2022 | 27 | 2022 |
Towards sparsification of graph neural networks | H Peng, D Gurevin, S Huang, T Geng, W Jiang, O Khan, C Ding | 40th IEEE International Conference on Computer Design (ICCD), 2022 | 26 | 2022 |
An automatic and efficient BERT pruning for edge AI systems | S Huang, N Liu, Y Liang, H Peng, H Li, D Xu, M Xie, C Ding | 2022 23rd International Symposium on Quality Electronic Design (ISQED), 1-6, 2022 | 15 | 2022 |
AutoReP: Automatic ReLU Replacement for Fast Private Network Inference | H Peng*, S Huang*, T Zhou*, Y Luo, C Wang, Z Wang, J Zhao, X Xie, A Li, ... | ICCV'2023: Proceedings of the IEEE/CVF International Conference on Computer …, 2023 | 14 | 2023 |
RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference | H Peng, S Zhou, Y Luo, N Xu, S Duan, R Ran, J Zhao, S Huang, X Xie, ... | AAAI'2023 Workshop on DL-Hardware Co-Design for AI Acceleration, 2023 | 14* | 2023 |
Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off | S Huang, B Lei, D Xu, H Peng, Y Sun, M Xie, C Ding | DAC'2023: Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023 | 13 | 2023 |
HMC-Tran: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU | S Huang, S Chen, H Peng, D Manu, Z Kong, G Yuan, L Yang, S Wang, ... | Proceedings of the 2021 on Great Lakes Symposium on VLSI, 169-174, 2021 | 13* | 2021 |
CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM | Y Luo, P Behnam, K Thorat, Z Liu, H Peng, S Huang, S Zhou, O Khan, ... | 40th IEEE International Conference on Computer Design (ICCD), 2022 | 11 | 2022 |
Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks | X Xie, H Peng, A Hasan, S Huang, J Zhao, H Fang, W Zhang, T Geng, ... | ICCAD'2023: IEEE/ACM International Conference on Computer Aided Design, 2023 | 10 | 2023 |
Co-Exploration of Graph Neural Network and Network-on-Chip Design Using AutoML | D Manu, S Huang, C Ding, L Yang | Proceedings of the 2021 on Great Lakes Symposium on VLSI, 175-180, 2021 | 9 | 2021 |
LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference | H Peng, R Ran, Y Luo, J Zhao, S Huang, K Thorat, T Geng, C Wang, X Xu, ... | NeurIPS'2023: Thirty-seventh Conference on Neural Information Processing Systems, 2023 | 8 | 2023 |
MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training | H Peng, X Xie, K Shivdikar, MA Hasan, J Zhao, S Huang, O Khan, D Kaeli, ... | Proceedings of the 29th ACM International Conference on Architectural …, 2024 | 7* | 2024 |
Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration | S Huang, H Fang, K Mahmood, B Lei, N Xu, B Lei, Y Sun, D Xu, W Wen, ... | DAC'2023: Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023 | 7 | 2023 |
Analyzing and defending against membership inference attacks in natural language processing classification | Y Wang, N Xu, S Huang, K Mahmood, D Guo, C Ding, W Wen, ... | 2022 IEEE International Conference on Big Data (Big Data), 5823-5832, 2022 | 7 | 2022 |
Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM | B Li, G Yuan, Z Wang, S Huang, H Peng, P Behnam, W Wen, H Liu, ... | arXiv preprint arXiv:2401.11664, 2024 | 2 | 2024 |