
Learning human rewards by inferring their latent intelligence levels in multi-agent games: A theory-of-mind approach with application to driving data

R Tian, M Tomizuka, L Sun - 2021 IEEE/RSJ International …, 2021 - ieeexplore.ieee.org

Reward function, as an incentive representation that recognizes humans' agency and
rationalizes humans' actions, is particularly appealing for modeling human behavior in …

Cited by 13 - Related articles - All 4 versions

[PDF] arxiv.org

Expressing diverse human driving behavior with probabilistic rewards and online inference

L Sun, Z Wu, H Ma, M Tomizuka - 2020 IEEE/RSJ International …, 2020 - ieeexplore.ieee.org

In human-robot interaction (HRI) systems, such as autonomous vehicles, understanding and
representing human behavior are important. Human behavior is naturally rich and diverse …

Cited by 8 - Related articles - All 5 versions

[PDF] acm.org

Joint goal and strategy inference across heterogeneous demonstrators via reward network distillation

L Chen, R Paleja, M Ghuy, M Gombolay - Proceedings of the 2020 ACM …, 2020 - dl.acm.org

Reinforcement learning (RL) has achieved tremendous success as a general framework for
learning how to make decisions. However, this success relies on the interactive hand-tuning …

Cited by 42 - Related articles - All 5 versions

Analyzing the suitability of cost functions for explaining and imitating human driving behavior based on inverse reinforcement learning

M Naumann, L Sun, W Zhan… - 2020 IEEE international …, 2020 - ieeexplore.ieee.org

Autonomous vehicles are sharing the road with human drivers. In order to facilitate
interactive driving and cooperative behavior in dense traffic, a thorough understanding and …

Cited by 60 - Related articles

[PDF] google.com

Inferring non-stationary human preferences for human-agent teams

D Hughes, A Agarwal, Y Guo… - 2020 29th IEEE …, 2020 - ieeexplore.ieee.org

One main challenge to robot decision making in human-robot teams involves predicting the
intents of a human team member through observations of the human's behavior. Inverse …

Cited by 9 - Related articles - All 3 versions

Learning task-relevant representations via rewards and real actions for reinforcement learning

L Yuan, X Lu, Y Liu - Knowledge-Based Systems, 2024 - Elsevier

The input of visual reinforcement learning often contains redundant information, which will
reduce the decision efficiency and decrease the performance of the agent. To address this …

Cited by 1 - Related articles

[PDF] arxiv.org

Evaluating agents without rewards

B Matusch, J Ba, D Hafner - arXiv preprint arXiv:2012.11538, 2020 - arxiv.org

Reinforcement learning has enabled agents to solve challenging tasks in unknown
environments. However, manually crafting reward functions can be time consuming …

Cited by 16 - Related articles - All 4 versions

[PDF] arxiv.org

Multiagent inverse reinforcement learning via theory of mind reasoning

H Wu, P Sequeira, DV Pynadath - arXiv preprint arXiv:2302.10238, 2023 - arxiv.org

We approach the problem of understanding how people interact with each other in
collaborative settings, especially when individuals know little about their teammates, via …

Cited by 8 - Related articles - All 6 versions

[PDF] arxiv.org

Human-level reinforcement learning through theory-based modeling, exploration, and planning

PA Tsividis, J Loula, J Burga, N Foss… - arXiv preprint arXiv …, 2021 - arxiv.org

Reinforcement learning (RL) studies how an agent comes to achieve reward in an
environment through interactions over time. Recent advances in machine RL have …

Cited by 64 - Related articles - All 2 versions

[PDF] arxiv.org

Preferences implicit in the state of the world

R Shah, D Krasheninnikov, J Alexander… - arXiv preprint arXiv …, 2019 - arxiv.org

Reinforcement learning (RL) agents optimize only the features specified in a reward function
and are indifferent to anything left out inadvertently. This means that we must not only …

Cited by 71 - Related articles - All 6 versions

