Clipped action policy gradient

Y Liu, A Halev, X Liu - The 30th international joint conference on artificial …, 2021 - par.nsf.gov

Reinforcement Learning (RL) algorithms have had tremendous success in simulated
domains. These algorithms, however, often cannot be directly applied to physical systems …

Cited by 144 · Related articles · All 6 versions

Understanding the impact of entropy on policy optimization

Z Ahmed, N Le Roux, M Norouzi… - … on machine learning, 2019 - proceedings.mlr.press

Entropy regularization is commonly used to improve policy optimization in reinforcement
learning. It is believed to help with exploration by encouraging the selection of more …

Cited by 273 · Related articles · All 11 versions

An application of deep reinforcement learning and vendor-managed inventory in perishable supply chain management

N Mohamadi, STA Niaki, M Taher… - Engineering Applications of …, 2024 - Elsevier

This article delves into the challenging supply chain management domain, explicitly
addressing the intricate issue of perishable inventory allocation within a two-echelon supply …

Cited by 25 · Related articles · All 2 versions

IRS-aided energy-efficient secure WBAN transmission based on deep reinforcement learning

L Xiao, S Hong, S Xu, H Yang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Wireless body area networks (WBANs) are vulnerable to active eavesdropping that
simultaneously performs sniffing and jamming to raise the sensor transmit power, and thus …

Cited by 46 · Related articles

Optimizing adaptive notifications in mobile health interventions systems: reinforcement learning from a data-driven behavioral simulator

S Wang, C Zhang, B Kröse, H van Hoof - Journal of medical systems, 2021 - Springer

Mobile health (mHealth) intervention systems can employ adaptive strategies to interact with
users. Instead of designing such complex strategies manually, reinforcement learning (RL) …

Cited by 38 · Related articles · All 14 versions

Reinforcement learning architecture for cyber–physical–social AI: state-of-the-art and perspectives

X Li, P Wang, X Jin, Q Jiang, W Zhou, S Yao - Artificial Intelligence Review, 2023 - Springer

As the extension of cyber–physical systems (CPSs), cyber–physical–social systems
(CPSSs) seamlessly integrate cyber space, physical space, and social space. CPSS provide …

Cited by 4 · Related articles · All 3 versions

Benchmarking actor-critic deep reinforcement learning algorithms for robotics control with action constraints

K Kasaura, S Miura, T Kozuno… - IEEE Robotics and …, 2023 - ieeexplore.ieee.org

This study presents a benchmark for evaluating action-constrained reinforcement learning
(RL) algorithms. In action-constrained RL, each action taken by the learning system must …

Cited by 12 · Related articles · All 3 versions

Striving for simplicity and performance in off-policy DRL: Output normalization and non-uniform sampling

C Wang, Y Wu, Q Vuong… - … Conference on Machine …, 2020 - proceedings.mlr.press

We aim to develop off-policy DRL algorithms that not only exceed state-of-the-art
performance but are also simple and minimalistic. For standard continuous control …

Cited by 48 · Related articles · All 9 versions

Pseudo-labeled auto-curriculum learning for semi-supervised keypoint localization

C Wang, S Jin, Y Guan, W Liu, C Qian, P Luo… - arXiv preprint arXiv …, 2022 - arxiv.org

Localizing keypoints of an object is a basic visual problem. However, supervised learning of
a keypoint localization network often requires a large amount of data, which is expensive …

Cited by 19 · Related articles · All 4 versions

KnowGPT: Knowledge Graph based Prompting for Large Language Models

Q Zhang, J Dong, H Chen, D Zha, Z Yu… - The Thirty-eighth …, 2024 - openreview.net

Large Language Models (LLMs) have demonstrated remarkable capabilities in many real-world
applications. Nonetheless, LLMs are often criticized for their tendency to produce …

Cited by 2 · Related articles

Policy learning with constraints in model-free reinforcement learning: A survey