A survey on model compression for large language models. X Zhu, J Li, Y Liu, C Ma, W Wang. arXiv preprint arXiv:2308.07633, 2023 | 96 | 2023 |
Operation-level progressive differentiable architecture search. X Zhu, J Li, Y Liu, J Liao, W Wang. 2021 IEEE International Conference on Data Mining (ICDM), 1559-1564, 2021 | 7 | 2021 |
Improving Differentiable Architecture Search via self-distillation. X Zhu, J Li, Y Liu, W Wang. Neural Networks 167, 656-667, 2023 | 6 | 2023 |
Robust Neural Architecture Search. X Zhu, J Li, Y Liu, W Wang. arXiv preprint arXiv:2304.02845, 2023 | 1 | 2023 |
Improving Small Language Models' Mathematical Reasoning via Mix Thoughts Distillation. X Zhu, J Li, Y Liu, C Ma, W Wang. arXiv preprint arXiv:2401.11864, 2024 | | 2024 |