GEAR: a GPU-centric experience replay system for large reinforcement learning models
International Conference on Machine Learning, 2023 • proceedings.mlr.press
Abstract
This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning (RL) with large sequence models (such as transformers). With such models, existing systems such as Reverb face considerable bottlenecks in memory, computation, and communication. GEAR, however, optimizes memory efficiency by enabling the memory resources on GPU servers (including host memory and device memory) to manage trajectory data. Furthermore, it lets decentralized GPU devices accelerate various trajectory selection strategies, circumventing computational bottlenecks. GEAR is equipped with GPU kernels capable of collecting trajectories using zero-copy access to host memory, along with remote direct memory access (RDMA) over InfiniBand, improving communication efficiency. Cluster experiments show that GEAR can achieve up to 6× higher performance than Reverb when training state-of-the-art large RL models. GEAR is open-sourced at https://github.com/bigrl-team/gear.
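To make the notion of "trajectory selection strategies" concrete, the following is a minimal, CPU-only Python sketch of a replay shard that stores trajectories and selects them either uniformly or in proportion to a priority. It is an illustration under stated assumptions, not GEAR's implementation or API: the names `Trajectory`, `ReplayShard`, `sample_uniform`, and `sample_prioritized` are hypothetical, and the abstract does not say which selection strategies GEAR supports or how it evicts data.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """One stored sequence of (observation, action, reward) steps."""
    steps: list
    priority: float = 1.0


@dataclass
class ReplayShard:
    """Hypothetical in-memory shard (assumption: plain Python lists, FIFO eviction)."""
    capacity: int
    items: list = field(default_factory=list)

    def add(self, traj: Trajectory) -> None:
        # Evict the oldest trajectory once the shard is full.
        if len(self.items) >= self.capacity:
            self.items.pop(0)
        self.items.append(traj)

    def sample_uniform(self, k: int, rng: random.Random) -> list:
        # Every stored trajectory is equally likely; sampling is with replacement.
        return rng.choices(self.items, k=k)

    def sample_prioritized(self, k: int, rng: random.Random) -> list:
        # Selection probability is proportional to each trajectory's priority.
        weights = [t.priority for t in self.items]
        return rng.choices(self.items, weights=weights, k=k)
```

In a distributed setting, one such shard could live on each GPU server so that selection runs where the data is stored; how GEAR actually partitions and schedules this work is described in the paper itself rather than in this sketch.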