From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

Walk these ways: Tuning robot control for generalization with multiplicity of behavior

GB Margolis, P Agrawal - Conference on Robot Learning, 2023 - proceedings.mlr.press
Learned locomotion policies can rapidly adapt to diverse environments similar to those
experienced during training but lack a mechanism for fast tuning when they fail in an out-of …

TGRL: An algorithm for teacher guided reinforcement learning

I Shenfeld, ZW Hong, A Tamar… - … on Machine Learning, 2023 - proceedings.mlr.press
We consider solving sequential decision-making problems in the scenario where the agent
has access to two supervision sources: reward signal and a teacher …

Automatic intrinsic reward shaping for exploration in deep reinforcement learning

M Yuan, B Li, X Jin, W Zeng - International Conference on …, 2023 - proceedings.mlr.press
We present AIRS: Automatic Intrinsic Reward Shaping that intelligently and
adaptively provides high-quality intrinsic rewards to enhance exploration in reinforcement …

An invitation to deep reinforcement learning

B Jaeger, A Geiger - arXiv preprint arXiv:2312.08365, 2023 - arxiv.org
Training a deep neural network to maximize a target objective has become the standard
recipe for successful machine learning over the last decade. These networks can be …

A DRL-based path planning method for wheeled mobile robots in unknown environments

T Wen, X Wang, Z Zheng, Z Sun - Computers and Electrical Engineering, 2024 - Elsevier
Deep reinforcement learning-based (DRL-based) path planning in the unknown
environment is studied under continuous action space. We extend the TD3 (twin-delayed …

Automatic Environment Shaping is the Next Frontier in RL

Y Park, GB Margolis, P Agrawal - arXiv preprint arXiv:2407.16186, 2024 - arxiv.org
Many roboticists dream of presenting a robot with a task in the evening and returning the
next morning to find the robot capable of solving the task. What is preventing us from …

TGRL: Teacher guided reinforcement learning algorithm for POMDPs

I Shenfeld, ZW Hong, A Tamar… - … Reinforcement Learning at …, 2023 - openreview.net
In many real-world problems, an agent must operate in an uncertain and partially
observable environment. Due to partial information, a policy directly trained to operate from …

Pareto Envelope Augmented with Reinforcement Learning: Multi-objective reinforcement learning-based approach for Large-Scale Constrained Pressurized Water …

P Seurin, K Seurin - arXiv preprint arXiv:2312.10194, 2023 - arxiv.org
A novel method, the Pareto Envelope Augmented with Reinforcement Learning (PEARL),
has been developed to address the challenges posed by multi-objective problems …

Random Latent Exploration for Deep Reinforcement Learning

S Mahankali, ZW Hong, A Sekhari, A Rakhlin… - arXiv preprint arXiv …, 2024 - arxiv.org
The ability to efficiently explore high-dimensional state spaces is essential for the practical
success of deep Reinforcement Learning (RL). This paper introduces a new exploration …