| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| SF-Net: Structured feature network for continuous sign language recognition | Z Yang*, Z Shi*, X Shen, YW Tai | arXiv preprint arXiv:1908.01341 | 79 | 2019 |
| A Theoretical Analysis on Feature Learning in Neural Networks: Emergence from Inputs and Advantage over Fixed Features | Z Shi*, J Wei*, Y Liang | ICLR 2022: International Conference on Learning Representations | 45 | 2022 |
| Deep Online Fused Video Stabilization | Z Shi, F Shi, WS Lai, CK Liang, Y Liang | WACV 2022: Winter Conference on Applications of Computer Vision | 22 | 2022 |
| The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning | Z Shi*, J Chen*, K Li, J Raghuram, X Wu, Y Liang, S Jha | ICLR 2023 (Spotlight): International Conference on Learning Representations | 20 | 2023 |
| Attentive walk-aggregating graph neural networks | MF Demirel, S Liu, S Garg, Z Shi, Y Liang | Transactions on Machine Learning Research | 13* | 2022 |
| When and How Does Known Class Help Discover Unknown Ones? Provable Understandings Through Spectral Analysis | Y Sun, Z Shi, Y Liang, Y Li | ICML 2023: International Conference on Machine Learning | 12 | 2023 |
| Domain generalization via nuclear norm regularization | Z Shi, Y Ming, Y Fan, F Sala, Y Liang | Conference on Parsimony and Learning, 179-201 | 11* | 2024 |
| Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning | Z Xu, Z Shi, J Wei, F Mu, Y Li, Y Liang | ICLR 2024: International Conference on Learning Representations | 10* | 2024 |
| Provable Guarantees for Neural Networks via Gradient Feature Learning | Z Shi*, J Wei*, Y Liang | NeurIPS 2023: Neural Information Processing Systems | 5 | 2023 |
| Conv-basis: A new paradigm for efficient attention inference and gradient computation in transformers | J Gu*, Y Liang*, H Liu*, Z Shi*, Z Song*, J Yin* | arXiv preprint arXiv:2405.05219 | 4 | 2024 |
| Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic | J Gu*, C Li*, Y Liang*, Z Shi*, Z Song*, T Zhou* | arXiv preprint arXiv:2402.09469 | 4 | 2024 |
| A Graph-Theoretic Framework for Understanding Open-World Semi-Supervised Learning | Y Sun, Z Shi, Y Li | NeurIPS 2023 (Spotlight): Neural Information Processing Systems | 4 | 2023 |
| Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers | J Gu*, Y Liang*, Z Shi*, Z Song*, Y Zhou* | arXiv preprint arXiv:2405.16411 | 2 | 2024 |
| Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond | J Gu*, C Li*, Y Liang*, Z Shi*, Z Song* | arXiv preprint arXiv:2405.03251 | 2 | 2024 |
| Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability | Z Xu*, Z Shi*, Y Liang | ME-FoMo: Mathematical and Empirical Understanding of Foundation Models | 2 | 2024 |
| DAWN: Dual Augmented Memory Network for Unsupervised Video Object Tracking | Z Shi*, H Fang*, YW Tai, CK Tang | arXiv preprint arXiv:1908.00777 | 2 | 2019 |
| Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective | J Gu*, Y Liang*, Z Shi*, Z Song*, Y Zhou* | arXiv preprint arXiv:2405.16418 | | 2024 |
| Why Larger Language Models Do In-context Learning Differently? | Z Shi, J Wei, Z Xu, Y Liang | ICML 2024: International Conference on Machine Learning | | 2024 |