Diagnosing missing always at random in multivariate data

II Bojinov, NS Pillai, DB Rubin - Biometrika, 2020 - academic.oup.com
Summary
Models for analysing multivariate datasets with missing values require strong, often unassessable, assumptions. The most common of these is that the mechanism that created the missing data is ignorable, which is a two-fold assumption dependent on the mode of inference. The first part, which is the focus here, under the Bayesian and direct-likelihood paradigms requires that the missing data be missing at random; in contrast, the frequentist-likelihood paradigm demands that the missing data mechanism always produce missing at random data, a condition known as missing always at random. Under certain regularity conditions, assuming missing always at random leads to a condition that can be tested using the observed data alone, namely that the missing data indicators depend only on fully observed variables. In this note we propose three different diagnostic tests that not only indicate when this assumption is incorrect but also suggest which variables are the most likely culprits. Although missing always at random is not a necessary condition to ensure validity under the Bayesian and direct-likelihood paradigms, it is sufficient, and evidence of its violation should encourage the careful statistician to conduct targeted sensitivity analyses.
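The testable condition in the summary, that the missing data indicators depend only on fully observed variables, can be illustrated with a small check. The sketch below is not one of the three diagnostic tests proposed in the paper; it is a generic stratified permutation test, written under assumptions that are ours: a single fully observed categorical variable (`stratum`), one missingness indicator (`indicator`, 1 for missing), and a second variable (`value`) restricted to the rows where it is observed. The function names and the simulated data are hypothetical. If the indicator depends only on the fully observed variable, then within each stratum it should be unrelated to `value`, so a small p-value is evidence against the condition.

```python
import random


def stratified_permutation_pvalue(indicator, value, stratum, n_perm=1000, seed=0):
    """Permutation p-value for association between a missingness indicator
    and an observed variable, within strata of a fully observed variable.

    Illustrative only: a generic permutation test, not the paper's procedure.
    """
    rng = random.Random(seed)

    # Group row indices by the level of the fully observed variable.
    groups = {}
    for i, s in enumerate(stratum):
        groups.setdefault(s, []).append(i)

    def statistic(ind):
        # Size-weighted sum over strata of the absolute difference in the mean
        # of `value` between rows flagged missing and rows flagged observed.
        total = 0.0
        for idx in groups.values():
            flagged = [value[i] for i in idx if ind[i] == 1]
            unflagged = [value[i] for i in idx if ind[i] == 0]
            if flagged and unflagged:
                diff = sum(flagged) / len(flagged) - sum(unflagged) / len(unflagged)
                total += len(idx) * abs(diff)
        return total

    observed = statistic(indicator)
    permuted = list(indicator)
    exceed = 0
    for _ in range(n_perm):
        # Shuffle the indicator within each stratum, which preserves any
        # dependence of the indicator on the fully observed variable.
        for idx in groups.values():
            shuffled = [permuted[i] for i in idx]
            rng.shuffle(shuffled)
            for i, v in zip(idx, shuffled):
                permuted[i] = v
        if statistic(permuted) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


def simulate(n, depends_on_value, seed):
    """Hypothetical data: x is fully observed, y2 is the checked variable,
    r1 is the missingness indicator of some other variable."""
    rng = random.Random(seed)
    x, y2, r1 = [], [], []
    for _ in range(n):
        xi = rng.randint(0, 1)
        yi = rng.gauss(xi, 1.0)
        if depends_on_value:
            # Missingness driven by y2 itself: violates the condition.
            p = 0.8 if yi > xi else 0.2
        else:
            # Missingness driven only by the fully observed x.
            p = 0.6 if xi == 1 else 0.3
        x.append(xi)
        y2.append(yi)
        r1.append(1 if rng.random() < p else 0)
    return x, y2, r1


if __name__ == "__main__":
    for flag in (False, True):
        x, y2, r1 = simulate(400, depends_on_value=flag, seed=1)
        p = stratified_permutation_pvalue(r1, y2, x)
        print(f"indicator depends on y2: {flag}, p-value: {p:.4f}")
```

Shuffling within strata is the design choice that matters here: it lets the indicator depend freely on the fully observed variable, so only dependence on the other variable is flagged. Running the check separately for each candidate variable gives a rough sense of which ones are associated with the missingness pattern, in the spirit of the summary's remark about identifying likely culprits.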
Oxford University Press