Doubly robust policy evaluation and optimization

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Cited by 1775 · Related articles · All 3 versions

Machine learning methods that economists should know about

S Athey, GW Imbens - Annual Review of Economics, 2019 - annualreviews.org

We discuss the relevance of the recent machine learning (ML) literature for economics and
econometrics. First we discuss the differences in goals, methods, and settings between the …

Cited by 988 · Related articles · All 11 versions

The impact of machine learning on economics

S Athey - The economics of artificial intelligence: An agenda, 2018 - degruyter.com

I believe that machine learning (ML) will have a dramatic impact on the field of economics
within a short time frame. Indeed, the impact of ML on economics is already well underway …

Cited by 963 · Related articles · All 18 versions

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Cited by 1055 · Related articles · All 7 versions

Neurosymbolic programming

S Chaudhuri, K Ellis, O Polozov, R Singh… - … and Trends® in …, 2021 - nowpublishers.com

We survey recent work on neurosymbolic programming, an emerging area that bridges the
areas of deep learning and program synthesis. Like in classic machine learning, the goal …

Cited by 84 · Related articles · All 8 versions

Q-learning: Theory and applications

J Clifton, E Laber - Annual Review of Statistics and Its …, 2020 - annualreviews.org

Q-learning, originally an incremental algorithm for estimating an optimal decision strategy in
an infinite-horizon decision problem, now refers to a general class of reinforcement learning …

Cited by 235 · Related articles · All 4 versions

Constrained Bayesian optimization with noisy experiments

B Letham, B Karrer, G Ottoni, E Bakshy - 2019 - projecteuclid.org

Bayesian Analysis (2019) 14, Number 2, pp. 495–519 …

Cited by 348 · Related articles · All 8 versions

Double reinforcement learning for efficient off-policy evaluation in Markov decision processes

N Kallus, M Uehara - Journal of Machine Learning Research, 2020 - jmlr.org

Off-policy evaluation (OPE) in reinforcement learning allows one to evaluate novel decision
policies without needing to conduct exploration, which is often costly or otherwise infeasible …

Cited by 188 · Related articles · All 7 versions

Kinematic state abstraction and provably efficient rich-observation reinforcement learning

D Misra, M Henaff, A Krishnamurthy… - … on machine learning, 2020 - proceedings.mlr.press

We present an algorithm, HOMER, for exploration and reinforcement learning in rich
observation environments that are summarizable by an unknown latent state space. The …

Cited by 183 · Related articles · All 8 versions

Balanced linear contextual bandits

M Dimakopoulou, Z Zhou, S Athey… - Proceedings of the AAAI …, 2019 - ojs.aaai.org

Contextual bandit algorithms are sensitive to the estimation method of the outcome model as
well as the exploration method used, particularly in the presence of rich heterogeneity or …

Cited by 209 · Related articles · All 12 versions