Behaviour policy estimation in off-policy policy evaluation: Calibration matters

A Raghu, O Gottesman, Y Liu, M Komorowski… - arXiv preprint arXiv …, 2018 - arxiv.org
In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy
Policy Evaluation (OPE) when the true behaviour policy is unknown. Via a series of …

[PDF][PDF] Behaviour Policy Estimation in Off-Policy Policy Evaluation: Calibration Matters

A Raghu, O Gottesman, Y Liu, M Komorowski, A Faisal… - finale.seas.harvard.edu
In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy
Policy Evaluation (OPE) when the true behaviour policy is unknown. Via a series of …

Behaviour Policy Estimation in Off-Policy Policy Evaluation: Calibration Matters

A Raghu, O Gottesman, Y Liu, M Komorowski… - arXiv e …, 2018 - ui.adsabs.harvard.edu
In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy
Policy Evaluation (OPE) when the true behaviour policy is unknown. Via a series of …

[PDF][PDF] Behaviour Policy Estimation in Off-Policy Policy Evaluation: Calibration Matters

A Raghu, O Gottesman, Y Liu, M Komorowski, A Faisal… - scholar.harvard.edu
In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy
Policy Evaluation (OPE) when the true behaviour policy is unknown. Via a series of …
