Language models are few-shot learners. T Brown, B Mann, N Ryder, M Subbiah, JD Kaplan, P Dhariwal, ... Advances in Neural Information Processing Systems 33, 1877-1901, 2020 | 26975 | 2020 |
Evaluating large language models trained on code. M Chen, J Tworek, H Jun, Q Yuan, HPO Pinto, J Kaplan, H Edwards, ... arXiv preprint arXiv:2107.03374, 2021 | 2219 | 2021 |
Dota 2 with large scale deep reinforcement learning. C Berner, G Brockman, B Chan, V Cheung, P Dębiak, C Dennison, ... arXiv preprint arXiv:1912.06680, 2019 | 1758 | 2019 |
GPT-4 technical report. J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, ... arXiv preprint arXiv:2303.08774, 2023 | 1567 | 2023 |
Training verifiers to solve math word problems. K Cobbe, V Kosaraju, M Bavarian, M Chen, H Jun, L Kaiser, M Plappert, ... arXiv preprint arXiv:2110.14168, 2021 | 1247 | 2021 |
OpenAI Baselines. P Dhariwal, C Hesse, O Klimov, A Nichol, M Plappert, A Radford, ... https://github.com/openai/baselines, 2017 | 1005 | 2017 |
Stable Baselines. A Hill, A Raffin, M Ernestus, A Gleave, A Kanervisto, R Traore, P Dhariwal, ... 2018 | 878 | 2018 |
WebGPT: Browser-assisted question-answering with human feedback. R Nakano, J Hilton, S Balaji, J Wu, L Ouyang, C Kim, C Hesse, S Jain, ... arXiv preprint arXiv:2112.09332, 2021 | 749 | 2021 |
Quantifying generalization in reinforcement learning. K Cobbe, O Klimov, C Hesse, T Kim, J Schulman. International Conference on Machine Learning, 1282-1289, 2019 | 678 | 2019 |
Leveraging procedural generation to benchmark reinforcement learning. K Cobbe, C Hesse, J Hilton, J Schulman. International Conference on Machine Learning, 2048-2056, 2020 | 520 | 2020 |
Scaling laws for autoregressive generative modeling. T Henighan, J Kaplan, M Katz, M Chen, C Hesse, J Jackson, H Jun, ... arXiv preprint arXiv:2010.14701, 2020 | 265 | 2020 |
Gotta learn fast: A new benchmark for generalization in RL. A Nichol, V Pfau, C Hesse, O Klimov, J Schulman. arXiv preprint arXiv:1804.03720, 2018 | 205 | 2018 |