Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning

N Kallus, M Uehara - Advances in Neural Information Processing Systems, 2019 - proceedings.neurips.cc
Off-policy evaluation (OPE) in both contextual bandits and reinforcement learning allows
one to evaluate novel decision policies without needing to conduct exploration, which is …
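For readers unfamiliar with the setting, the sketch below illustrates the problem the abstract describes: estimating the value of a new (target) policy from trajectories logged under a different (behavior) policy. It uses a plain per-decision importance-sampling estimator; all names are illustrative, and this is generic textbook OPE, not the estimator proposed in the paper.

```python
# Minimal sketch of off-policy evaluation (OPE) via per-decision importance
# sampling on logged trajectories. Illustrative only: the function names and
# the toy environment are assumptions, and this is not the intrinsically
# efficient, stable, and bounded estimator developed by Kallus and Uehara.
import numpy as np


def per_decision_is(trajectories, target_policy, behavior_policy, gamma=0.99):
    """Estimate the target policy's value from logged data.

    trajectories: list of episodes, each a list of (state, action, reward)
        tuples collected under the behavior policy.
    target_policy(state, action): probability of `action` under the policy
        we want to evaluate.
    behavior_policy(state, action): probability of `action` under the policy
        that generated the data.
    """
    values = []
    for traj in trajectories:
        rho = 1.0  # cumulative importance weight up to time t
        v = 0.0    # discounted, reweighted return of this trajectory
        for t, (s, a, r) in enumerate(traj):
            rho *= target_policy(s, a) / behavior_policy(s, a)
            v += (gamma ** t) * rho * r
        values.append(v)
    return float(np.mean(values))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy two-action problem: the behavior policy is uniform, the target
    # policy prefers action 1, and action 1 yields higher reward.
    behavior = lambda s, a: 0.5
    target = lambda s, a: 0.8 if a == 1 else 0.2

    def simulate_episode(horizon=5):
        traj = []
        for _ in range(horizon):
            s = rng.normal()
            a = int(rng.integers(2))          # logged action from behavior policy
            r = float(a) + 0.1 * rng.normal()
            traj.append((s, a, r))
        return traj

    data = [simulate_episode() for _ in range(2000)]
    print("Estimated target-policy value:", per_decision_is(data, target, behavior))
```

Plain importance sampling of this kind is known to suffer when the cumulative importance weights grow large, which is the sort of instability and unboundedness that motivates estimators that are, as in the title, stable and bounded.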
