Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 921 | 2022 |
BBQ: A hand-built bias benchmark for question answering A Parrish, A Chen, N Nangia, V Padmakumar, J Phang, J Thompson, ... arXiv preprint arXiv:2110.08193, 2021 | 211 | 2021 |
Pretraining language models with human preferences T Korbak, K Shi, A Chen, RV Bhalerao, C Buckley, J Phang, SR Bowman, ... International Conference on Machine Learning, 17506-17533, 2023 | 138 | 2023 |
Learning from Natural Language Feedback A Chen, J Scheurer, JA Campos, T Korbak, JS Chan, SR Bowman, K Cho, ... Transactions on Machine Learning Research, 2024 | 117* | 2024 |
Seasonal dynamics of bacterial meningitis: a time-series analysis J Paireau, A Chen, H Broutin, B Grenfell, NE Basta The Lancet global health 4 (6), e370-e377, 2016 | 101 | 2016 |
QuALITY: Question Answering with Long Input Texts, Yes! SR Bowman, A Chen, H He, N Joshi, J Ma, N Nangia, V Padmakumar, ... NAACL 2022, 2022 | 93* | 2022 |
EvoPrompting: language models for code-level neural architecture search A Chen, D Dohan, D So Advances in Neural Information Processing Systems 36, 2024 | 46 | 2024 |
Improving code generation by training with natural language feedback A Chen, J Scheurer, T Korbak, JA Campos, JS Chan, SR Bowman, K Cho, ... arXiv preprint arXiv:2303.16749, 2023 | 46 | 2023 |
Generating logical forms from graph representations of text and entities P Shaw, P Massey, A Chen, F Piccinno, Y Altun arXiv preprint arXiv:1905.08407, 2019 | 42 | 2019 |
SQuALITY: Building a long-document summarization dataset the hard way A Wang, RY Pang, A Chen, J Phang, SR Bowman arXiv preprint arXiv:2205.11465, 2022 | 37 | 2022 |
What do NLP researchers believe? Results of the NLP community metasurvey J Michael, A Holtzman, A Parrish, A Mueller, A Wang, A Chen, D Madaan, ... arXiv preprint arXiv:2208.12852, 2022 | 28 | 2022 |
Sudden drops in the loss: Syntax acquisition, phase transitions, and simplicity bias in MLMs A Chen, R Schwartz-Ziv, K Cho, ML Leavitt, N Saphra arXiv preprint arXiv:2309.07311, 2023 | 21 | 2023 |
Reasoning from radically incomplete information: The case of containers E Davis, G Marcus, A Chen Proceedings of the second annual conference on advances in cognitive systems …, 2013 | 20 | 2013 |
Two failures of self-consistency in the multi-step reasoning of LLMs A Chen, J Phang, A Parrish, V Padmakumar, C Zhao, SR Bowman, K Cho arXiv preprint arXiv:2305.14279, 2023 | 14 | 2023 |
Teaching BERT to wait: Balancing accuracy and latency for streaming disfluency detection A Chen, V Zayats, DD Walker, D Padfield arXiv preprint arXiv:2205.00620, 2022 | 12 | 2022 |
Single-turn debate does not help humans answer hard reading-comprehension questions A Parrish, H Trivedi, E Perez, A Chen, N Nangia, J Phang, SR Bowman arXiv preprint arXiv:2204.05212, 2022 | 12 | 2022 |
Adversarially constructed evaluation sets are more challenging, but may not be fair J Phang, A Chen, W Huang, SR Bowman arXiv preprint arXiv:2111.08181, 2021 | 11 | 2021 |
Preference Learning Algorithms Do Not Learn Preference Rankings A Chen, S Malladi, LH Zhang, X Chen, Q Zhang, R Ranganath, K Cho arXiv preprint arXiv:2405.19534, 2024 | 2 | 2024 |
Latent state models of training dynamics MY Hu, A Chen, N Saphra, K Cho arXiv preprint arXiv:2308.09543, 2023 | 2 | 2023 |
Playing large games with oracles and AI debate X Chen, A Chen, D Foster, E Hazan Agentic Markets Workshop at ICML 2024, 2023 | 2 | 2023 |