Misguided Use of Observed Covariates to Impute Missing Covariates in Conditional Prediction: A Shrinkage Problem


Authors: Michael Gmeiner, Charles F Manski, Anat Tambur

Language: eng

Classification number: 511.4 Approximations and expansions (formerly also 513.24)

Publication information: 2021

Physical description:

Collection: Metadata

ID: 166333

Researchers regularly perform conditional prediction using imputed values of missing data. However, applications of imputation often lack a firm foundation in statistical theory. This paper originated when we were unable to find analysis substantiating claims that imputation of missing data has good frequentist properties when data are missing at random (MAR). We focus on the use of observed covariates to impute missing covariates when estimating conditional means of the form E(y|x, w). Here y is an outcome whose realizations are always observed, x is a covariate whose realizations are always observed, and w is a covariate whose realizations are sometimes unobserved. We examine the probability limit of simple imputation estimates of E(y|x, w) as sample size goes to infinity. We find that these estimates are not consistent when covariate data are MAR. To the contrary, the estimates suffer from a shrinkage problem. They converge to points intermediate between the conditional mean of interest, E(y|x, w), and the mean E(y|x) that conditions only on x. We use a type of genotype imputation to illustrate.
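The shrinkage described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: w is binary, x is held fixed, w is missing with probability m independently of (y, w) given x (a special case of MAR), and each missing w is imputed as the modal observed value of w. The function names imputation_plim and simulate_imputed_mean and the parameters p, m, mu1, mu0 are hypothetical. Under these assumptions, the imputed-data estimate of E(y|x, w=1) converges to a weighted average of E(y|x, w=1) and E(y|x).

```python
import random


def imputation_plim(p, m, mu1, mu0):
    """Limit of the imputed-data estimate of E(y | x, w=1) for a fixed x.

    Illustrative assumptions (not from the paper):
      p   = P(w=1 | x), with p > 0.5 so the modal imputation is w = 1
      m   = P(w missing | x), independent of (y, w) given x
      mu1 = E(y | x, w=1), mu0 = E(y | x, w=0)
    """
    if not p > 0.5:
        raise ValueError("this sketch assumes p > 0.5 (modal value of w is 1)")
    mean_y_given_x = p * mu1 + (1 - p) * mu0   # E(y | x)
    weight_observed = (1 - m) * p              # units with observed w = 1
    weight_imputed = m                         # all missing units get w = 1
    return (weight_observed * mu1 + weight_imputed * mean_y_given_x) / (
        weight_observed + weight_imputed
    )


def simulate_imputed_mean(n, p, m, mu1, mu0, seed=0):
    """Monte Carlo version of the same estimate, using modal imputation."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < p else 0
        y = (mu1 if w == 1 else mu0) + rng.gauss(0.0, 1.0)
        missing = rng.random() < m
        rows.append((y, w, missing))
    # Impute each missing w with the most common observed value of w.
    observed_w = [w for _, w, missing in rows if not missing]
    modal_w = 1 if 2 * sum(observed_w) >= len(observed_w) else 0
    # Average y over units whose observed-or-imputed w equals 1.
    ys = [y for y, w, missing in rows if (modal_w if missing else w) == 1]
    return sum(ys) / len(ys)


if __name__ == "__main__":
    p, m, mu1, mu0 = 0.7, 0.4, 2.0, 0.0
    print("E(y|x, w=1):      ", mu1)
    print("E(y|x):           ", p * mu1 + (1 - p) * mu0)
    print("imputation limit: ", imputation_plim(p, m, mu1, mu0))
    print("simulated value:  ", simulate_imputed_mean(100_000, p, m, mu1, mu0))
```

In this toy setting the limit lies strictly between E(y|x) and E(y|x, w=1) whenever m > 0 and mu1 differs from mu0, and it equals E(y|x, w=1) only when no data are missing. Other imputation rules would give different weights; the sketch is meant only to show the direction of the effect.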
