Yutao Sun
Verified email at mails.tsinghua.edu.cn - Homepage
Title
Cited by
Year
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers
D Dai, Y Sun, L Dong, Y Hao, S Ma, Z Sui, F Wei
arXiv preprint arXiv:2212.10559, 2022
289 · 2022
Retentive network: A successor to transformer for large language models
Y Sun, L Dong, S Huang, S Ma, Y Xia, J Xue, J Wang, F Wei
arXiv preprint arXiv:2307.08621, 2023
196 · 2023
A length-extrapolatable transformer
Y Sun, L Dong, B Patra, S Ma, S Huang, A Benhaim, V Chaudhary, ...
arXiv preprint arXiv:2212.10554, 2022
112 · 2022
Structured prompting: Scaling in-context learning to 1,000 examples
Y Hao, Y Sun, L Dong, Z Han, Y Gu, F Wei
arXiv preprint arXiv:2212.06713, 2022
32 · 2022
Prototypical calibration for few-shot learning of language models
Z Han, Y Hao, L Dong, Y Sun, F Wei
The Eleventh International Conference on Learning Representations, 2023
29 · 2023
You only cache once: Decoder-decoder architectures for language models
Y Sun, L Dong, Y Zhu, S Huang, W Wang, S Ma, Q Zhang, J Wang, F Wei
arXiv preprint arXiv:2405.05254, 2024
7 · 2024
Articles 1–6