Safe reinforcement learning with stability guarantee for motion planning of autonomous vehicles

L. Zhang, R. Zhang, T. Wu, R. Weng, M. Han, Y. Zhao
IEEE Transactions on Neural Networks and Learning Systems, 2021 (ieeexplore.ieee.org)
Reinforcement learning with safety constraints is promising for autonomous vehicles, for which various failures may result in disastrous losses. In general, a safe policy is trained by constrained optimization algorithms, in which the average constraint return, as a function of states and actions, must stay below a predefined bound. However, most existing safe learning-based algorithms capture states via multiple high-precision sensors, which complicates the hardware system and consumes considerable power. This article focuses on safe motion planning with a stability guarantee for autonomous vehicles of limited size and power. To this end, a risk-identification method and a Lyapunov function are integrated with the well-known soft actor-critic (SAC) algorithm. By borrowing the concept of Lyapunov functions from control theory, the learned policy can theoretically guarantee that the state trajectory always stays in a safe region. A novel risk-sensitive learning-based algorithm with a stability guarantee is proposed to train policies for the motion planning of autonomous vehicles. The learned policy is implemented on a differential-drive vehicle in a simulation environment. The experimental results show that the proposed algorithm achieves a higher success rate than SAC.
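
For orientation, the constrained setup the abstract refers to can be sketched as a generic constrained-RL objective with a Lyapunov-style decrease condition; this is an illustrative formulation, not the paper's exact one, and the cost c, bound d, Lyapunov function L, and decay rate \alpha are assumed symbols:

\max_{\pi}\ \mathbb{E}_{\tau\sim\pi}\Big[\textstyle\sum_{t}\gamma^{t}\, r(s_t,a_t)\Big]
\quad \text{s.t.} \quad
\mathbb{E}_{\tau\sim\pi}\Big[\textstyle\sum_{t}\gamma^{t}\, c(s_t,a_t)\Big]\le d,
\qquad
\mathbb{E}_{a_t\sim\pi}\big[L(s_{t+1})\big]-L(s_t)\le -\alpha\, L(s_t)\ \ \text{for all reachable } s_t.

The first constraint keeps the average constraint return below the predefined bound; the second, Lyapunov-style condition forces the safety value to decrease along trajectories, which is the standard mechanism behind guarantees that the state trajectory remains in a safe region.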