Behaviour policy estimation in off-policy policy evaluation: Calibration matters
In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy
Policy Evaluation (OPE) when the true behaviour policy is unknown. Via a series of …
A Raghu, O Gottesman, Y Liu, M Komorowski, A Faisal… - finale.seas.harvard.edu
A Raghu, O Gottesman, Y Liu, M Komorowski… - arXiv e …, 2018 - ui.adsabs.harvard.edu