Jumanji: a diverse suite of scalable reinforcement learning environments in JAX, 2023

Gymnasium: A Standard Interface for Reinforcement Learning Environments

M Towers, A Kwiatkowski, J Terry, JU Balis… - arXiv preprint arXiv …, 2024 - arxiv.org

Gymnasium is an open-source library providing an API for reinforcement learning
environments. Its main contribution is a central abstraction for wide interoperability between …

Cited by 17 - Related articles - All 4 versions

Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning

M Matthews, M Beukman, B Ellis, M Samvelyan… - arXiv preprint arXiv …, 2024 - arxiv.org

Benchmarks play a crucial role in the development and analysis of reinforcement learning
(RL) algorithms. We identify that existing benchmarks used for research into open-ended …

Cited by 12 - Related articles - All 3 versions

BlackJAX: Composable Bayesian inference in JAX

A Cabezas, A Corenflos, J Lao, R Louf - arXiv preprint arXiv:2402.10797, 2024 - arxiv.org

BlackJAX is a library implementing sampling and variational inference algorithms commonly
used in Bayesian computation. It is designed for ease of use, speed, and modularity by …

Cited by 7 - Related articles - All 2 versions

NAVIX: Scaling MiniGrid Environments with JAX

E Pignatelli, J Liesen, RT Lange, C Lu… - arXiv preprint arXiv …, 2024 - arxiv.org

As Deep Reinforcement Learning (Deep RL) research moves towards solving large-scale
worlds, efficient environment simulations become crucial for rapid experimentation …

Cited by 1 - Related articles - All 2 versions

Can we hop in general? A discussion of benchmark selection and design using the Hopper environment

CA Voelcker, M Hussing - arXiv preprint arXiv:2410.08870, 2024 - arxiv.org

Empirical, benchmark-driven testing is a fundamental paradigm in the current RL
community. While using off-the-shelf benchmarks in reinforcement learning (RL) research is …

Cited by 1 - Related articles - All 3 versions

SYMPOL: Symbolic Tree-Based On-Policy Reinforcement Learning

S Marton, T Grams, F Vogt, S Lüdtke, C Bartelt… - arXiv preprint arXiv …, 2024 - arxiv.org

Reinforcement learning (RL) has seen significant success across various domains, but its
adoption is often limited by the black-box nature of neural network policies, making them …

Dantzig-Wolfe Decomposition and Deep Reinforcement Learning

F Chouaki, RS Jeffers - openreview.net

The 3D bin packing problem is an NP-hard optimisation problem. RL solutions found in the
literature tackle simplified versions of the full problem due to its large action space and long …