C-Mixup: Improving generalization in regression | H Yao, Y Wang, L Zhang, JY Zou, C Finn | Advances in Neural Information Processing Systems 35, 3361-3376, 2022 | 43 | 2022 |
Scan and Snap: Understanding training dynamics and token composition in 1-layer transformer | Y Tian, Y Wang, B Chen, SS Du | Advances in Neural Information Processing Systems 36, 71911-71947, 2023 | 36 | 2023 |
JoMA: Demystifying multilayer transformers via joint dynamics of MLP and attention | Y Tian, Y Wang, Z Zhang, B Chen, S Du | arXiv preprint arXiv:2310.00535, 2023 | 17 | 2023 |
Improved active multi-task representation learning via Lasso | Y Wang, Y Chen, K Jamieson, SS Du | International Conference on Machine Learning, 35548-35578, 2023 | 8 | 2023 |
CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning | Y Wang, Y Chen, W Yan, A Fang, W Zhou, K Jamieson, SS Du | arXiv preprint arXiv:2405.19547, 2024 | 1* | 2024 |