Large language models are human-level prompt engineers. Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba. arXiv preprint arXiv:2211.01910, 2022. Cited by 672.
Llemma: An open language model for mathematics. Z Azerbayev, H Schoelkopf, K Paster, MD Santos, S McAleer, AQ Jiang, ... arXiv preprint arXiv:2310.10631, 2023. Cited by 126.
You can't count on luck: Why decision transformers and RvS fail in stochastic environments. K Paster, S McIlraith, J Ba. Advances in Neural Information Processing Systems 35, 38966-38979, 2022. Cited by 52.
Planning from pixels using inverse dynamics models. K Paster, SA McIlraith, J Ba. arXiv preprint arXiv:2012.02419, 2020. Cited by 38.
STEVE-1: A generative model for text-to-behavior in Minecraft. S Lifshitz, K Paster, H Chan, J Ba, S McIlraith. Advances in Neural Information Processing Systems 36, 2024. Cited by 33.
OpenWebMath: An open dataset of high-quality mathematical web text. K Paster, MD Santos, Z Azerbayev, J Ba. arXiv preprint arXiv:2310.06786, 2023. Cited by 18.
Learning domain invariant representations in goal-conditioned block MDPs. B Han, C Zheng, H Chan, K Paster, M Zhang, J Ba. Advances in Neural Information Processing Systems 34, 764-776, 2021. Cited by 15.
BLAST: Latent dynamics models from bootstrapping. K Paster, LE McKinney, SA McIlraith, J Ba. Deep RL Workshop, NeurIPS 2021. Cited by 5.
Equilibrium finding via asymmetric self-play reinforcement learning. J Tang, K Paster, P Abbeel. Deep Reinforcement Learning Workshop, NeurIPS 2018. Cited by 5.
Return augmentation gives supervised RL temporal compositionality. K Paster, S Pitis, SA McIlraith, J Ba. Deep Reinforcement Learning Workshop, NeurIPS 2022. Cited by 4.
Hierarchical deep reinforcement learning agent with counter self-play on competitive games. H Xu, K Paster, Q Chen, H Tang, P Abbeel, T Darrell, S Levine. 2018. Cited by 3.