Enhancing Prediction Efficacy with High-Dimensional Input Via Structural Mixture Modeling of Local Linear Mappings

Abstract

Regression is a widely used statistical tool for discovering associations between variables. Estimated relationships can then be used to predict new observations, but obtaining reliable predictions is challenging. When building a regression model, difficulties such as high dimensionality in the predictors, non-linearity of the associations, and outliers can reduce the quality of the results. Furthermore, the prediction error increases if newly acquired data are not processed carefully. In this dissertation, we aim to improve prediction performance by enhancing model robustness at the training stage and by duly handling the query data at the testing stage. We propose two methods to build robust models. The first adopts a parsimonious model to limit the number of parameters and a refinement technique to enhance model robustness. We design the procedure to be carried out on parallel systems and further extend its ability to handle complex and large-scale datasets. The second restricts the parameter space to avoid the singularity issue and uses trimming techniques to limit the influence of outlying observations. We build both approaches on the mixture-modeling principle to accommodate data heterogeneity without uncontrollably increasing model complexity. The proposed procedures for choosing tuning parameters further enhance the ability to determine the sizes of the models according to the richness of the available data. Both methods improve prediction performance, compared to existing approaches, in applications such as magnetic resonance vascular fingerprinting and source separation in single-channel polyphonic music, among others. To evaluate model robustness, we develop an efficient approach to generating adversarial samples, which can induce large prediction errors yet are difficult to detect visually.
Finally, we propose a preprocessing system that detects and repairs different kinds of abnormal testing samples, whether corrupted or adversarially perturbed, to preserve prediction efficacy.

PHD
Statistics
University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/149938/1/timtu_1.pd
