19 research outputs found

    Variance estimation after Kernel Ridge Regression Imputation

    Imputation is a popular technique for handling missing data. Variance estimation after imputation is an important practical problem in statistics. In this paper, we consider variance estimation of the imputed mean estimator under kernel ridge regression imputation. We consider a linearization approach which employs the covariate balancing idea to estimate the inverse of the propensity scores. The statistical guarantee of our proposed variance estimation is studied when a Sobolev space is used for the imputation, in which case root-n consistency can be obtained. Synthetic data experiments are presented to confirm our theory.

    Triply robust estimation under missing at random

    Missing data is frequently encountered in many areas of statistics. Imputation and propensity score weighting are two popular methods for handling missing data. These methods employ some model assumptions, either the outcome regression or the response propensity model. However, correct specification of the statistical model can be challenging in the presence of missing data. Doubly robust estimation is attractive as the consistency of the estimator is guaranteed when either the outcome regression model or the propensity score model is correctly specified. In this paper, we first employ information projection to develop an efficient and doubly robust estimator under indirect model calibration constraints. The resulting propensity score estimator can be equivalently expressed as a doubly robust regression imputation estimator by imposing the internal bias calibration condition in estimating the regression parameters. In addition, we generalize the information projection to allow for outlier-robust estimation. Thus, we achieve triply robust estimation by adding the outlier robustness condition to the double robustness condition. Some asymptotic properties are presented. The simulation study confirms that the proposed method allows robust inference against not only the violation of various model assumptions, but also outliers

    Prediction of spatial distribution characteristics of ecosystem functions based on a minimum data set of functional traits of desert plants

    The relationship between plant functional traits and ecosystem function is a hot topic in current ecological research, and community-level traits based on individual plant functional traits play important roles in ecosystem function. In temperate desert ecosystems, which functional trait to use to predict ecosystem function is an important scientific question. In this study, the minimum data sets of functional traits of woody (wMDS) and herbaceous (hMDS) plants were constructed and used to predict the spatial distribution of C, N, and P cycling in ecosystems. The results showed that the wMDS included plant height, specific leaf area, leaf dry weight, leaf water content, diameter at breast height (DBH), leaf width, and leaf thickness, and the hMDS included plant height, specific leaf area, leaf fresh weight, leaf length, and leaf width. The linear regression results based on the cross-validations (FTEIW - L, FTEIA - L, FTEIW - NL, and FTEIA - NL) for the MDS and TDS (total data set) showed that the R² (coefficients of determination) for wMDS were 0.29, 0.34, 0.75, and 0.57, respectively, and those for hMDS were 0.82, 0.75, 0.76, and 0.68, respectively, proving that the MDSs can replace the TDS in predicting ecosystem function. Then, the MDSs were used to predict the C, N, and P cycling in the ecosystem. The results showed that non-linear models RF and BPNN were able to predict the spatial distributions of C, N and P cycling, and the distributions showed inconsistent patterns between different life forms under moisture restrictions. The C, N, and P cycling showed strong spatial autocorrelation and were mainly influenced by structural factors. Based on the non-linear models, the MDSs can be used to accurately predict the C, N, and P cycling, and the predicted values of woody plant functional traits visualized by regression kriging were closer to the kriging results based on raw values. This study provides a new perspective for exploring the relationship between biodiversity and ecosystem function.

    Topics on nonparametric calibration, kernel ridge regression imputation and nonparametric propensity score estimation

    This dissertation focuses on statistical issues arising in survey data and item nonresponse. In particular, it covers topics on nonparametric calibration in survey data, kernel ridge regression imputation and density ratio estimation in the propensity score approach. The first project is about nonparametric calibration in survey sampling. Estimation of a finite population mean or total is important in survey sampling. Calibration estimation is a popular method to address this issue by adjusting the sampling weights to match the unknown population totals of auxiliary variables. When the auxiliary variables are observed for all units in the finite population, one can apply the model calibration using the working outcome model. The traditional parametric calibration approach might not be robust in practice. We develop a nonparametric calibration method employing an infinite-dimensional reproducing kernel Hilbert space (RKHS) that does not require an explicit outcome model. Under mild assumptions, the proposed calibration estimator attains the Godambe-Joshi lower bound asymptotically. The second project is about handling missing data using the kernel ridge regression method. Missing data is frequently encountered in practice. In some cases, missingness is planned to reduce the cost or the response burden. Ignoring the cases with missing values can lead to misleading results. To avoid the potential problem with missing data, imputation is commonly used. Kernel Ridge Regression (KRR) is a modern nonparametric regression technique based on the theory of Reproducing Kernel Hilbert Space, which enjoys model robustness. We apply this method to imputation. Specifically, we establish the root-n consistency of the KRR imputation estimator and show that it is optimal in the sense that it achieves the lower bound of the semiparametric asymptotic variance. We further consider the propensity score weighting method using kernel ridge regression and discuss its asymptotic properties.
The third project is about propensity score estimation using the density ratio function approach. The propensity score approach is also a popular tool for handling item nonresponse. The propensity score is often developed using the model for the response probability. In practice, regression models for binary response, e.g., logistic regression, can be utilized to model the response probability given the observed auxiliary information. An inverse probability weighting estimator can then be constructed to get an unbiased estimation of the target parameter. We consider an alternative approach of estimating the inverse of the propensity scores using the density ratio function. Density ratio estimation can be obtained by applying the maximum entropy method which uses the Kullback-Leibler distance measure. By including the covariates for the outcome regression models only into the density ratio model, we can achieve efficient propensity score estimation. We further extend the proposed approach to handling the multivariate missing case.

    Statistical inference using Regularized M-estimation in the reproducing kernel Hilbert space for handling missing data

    Imputation and propensity score weighting are two popular techniques for handling missing data. We address these problems using the regularized M-estimation techniques in the reproducing kernel Hilbert space. Specifically, we first use the kernel ridge regression to develop imputation for handling item nonresponse. While this nonparametric approach is potentially promising for imputation, its statistical properties are not investigated in the literature. Under some conditions on the order of the tuning parameter, we first establish the root-n consistency of the kernel ridge regression imputation estimator and show that it achieves the lower bound of the semiparametric asymptotic variance. A nonparametric propensity score estimator using the reproducing kernel Hilbert space is also developed by a novel application of the maximum entropy method for the density ratio function estimation. We show that the resulting propensity score estimator is asymptotically equivalent to the kernel ridge regression imputation estimator. Results from a limited simulation study are also presented to confirm our theory. The proposed method is applied to analyze the air pollution data measured in Beijing, China. This is a pre-print of the article Wang, Hengfang, and Jae Kwang Kim. "Statistical inference using Regularized M-estimation in the reproducing kernel Hilbert space for handling missing data." arXiv preprint arXiv:2107.07371 (2021). DOI: 10.48550/arXiv.2107.07371. Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0). Copyright 2021 The Authors. Posted with permission.

    Variance estimation after Kernel Ridge Regression Imputation

    Imputation is a popular technique for handling missing data. Variance estimation after imputation is an important practical problem in statistics. In this paper, we consider variance estimation of the imputed mean estimator under kernel ridge regression imputation. We consider a linearization approach which employs the covariate balancing idea to estimate the inverse of the propensity scores. The statistical guarantee of our proposed variance estimation is studied when a Sobolev space is used for the imputation, in which case root-n consistency can be obtained. Synthetic data experiments are presented to confirm our theory. This paper was presented at the first Workshop on the Art of Learning with Missing Values (Artemiss) hosted by the 37th International Conference on Machine Learning (ICML), July 17, 2020. Posted with permission.

    Vitamin K2 Modulates Mitochondrial Dysfunction Induced by 6-Hydroxydopamine in SH-SY5Y Cells via Mitochondrial Quality-Control Loop

    Vitamin K2, a natural fat-soluble vitamin, is a potent neuroprotective molecule, owing to its antioxidant effect, but its mechanism has not been fully elucidated. Therefore, we stimulated SH-SY5Y cells with 6-hydroxydopamine (6-OHDA) in a proper dose-dependent manner, followed by a treatment of vitamin K2. In the presence of 6-OHDA, cell viability was reduced, the mitochondrial membrane potential was decreased, and the accumulation of reactive oxygen species (ROS) was increased. Moreover, the treatment of 6-OHDA promoted mitochondria-mediated apoptosis and abnormal mitochondrial fission and fusion. However, vitamin K2 significantly suppressed 6-OHDA-induced changes. Vitamin K2 played a significant part in apoptosis by upregulating and downregulating Bcl-2 and Bax protein expressions, respectively, which inhibited mitochondrial depolarization, and ROS accumulation to maintain mitochondrial structure and functional stabilities. Additionally, vitamin K2 significantly inhibited the 6-OHDA-induced downregulation of the MFN1/2 level and upregulation of the DRP1 level, respectively, and this enabled cells to maintain the dynamic balance of mitochondrial fusion and fission. Furthermore, vitamin K2 treatments downregulated the expression level of p62 and upregulated the expression level of LC3A in 6-OHDA-treated cells via the PINK1/Parkin signaling pathway, thereby promoting mitophagy. Moreover, it induced mitochondrial biogenesis in 6-OHDA damaged cells by promoting the expression of PGC-1α, NRF1, and TFAM. These indicated that vitamin K2 can release mitochondrial damage, and that this effect is related to the participation of vitamin K2 in the regulation of the mitochondrial quality-control loop, through the maintenance of the mitochondrial quality-control system, and repair mitochondrial dysfunction, thereby alleviating neuronal cell death mediated by mitochondrial damage