
    Kernel conditional quantile estimation via reduction revisited

    Quantile regression refers to the process of estimating the quantiles of a conditional distribution and has many important applications within econometrics and data mining, among other domains. In this paper, we show how to estimate these conditional quantile functions within a Bayes risk minimization framework using a Gaussian process prior. The resulting non-parametric probabilistic model is easy to implement and allows non-crossing quantile functions to be enforced. Moreover, it can be used directly in combination with tools and extensions of standard Gaussian processes, such as principled hyperparameter estimation, sparsification, and quantile regression with input-dependent noise rates. No existing approach enjoys all of these desirable properties. Experiments on benchmark datasets show that our method is competitive with state-of-the-art approaches.
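
    A minimal sketch of the underlying idea, though not of the paper's Gaussian-process construction: the conditional τ-quantile minimizes the expected pinball (check) loss, so even a simple linear model fit by subgradient descent on that loss yields a quantile estimate. Everything in the Python snippet below, including the simulated data, is an illustrative assumption.

    import numpy as np

    def fit_linear_quantile(X, y, tau, lr=0.05, n_iter=2000):
        # Subgradient descent on the pinball loss
        #   L(beta) = mean( max(tau * r, (tau - 1) * r) ),  r = y - X @ beta.
        n, p = X.shape
        Xb = np.hstack([np.ones((n, 1)), X])           # prepend an intercept
        beta = np.zeros(p + 1)
        for k in range(n_iter):
            r = y - Xb @ beta
            # d/dr of the pinball loss is tau for r >= 0 and tau - 1 otherwise.
            g = -Xb.T @ np.where(r >= 0, tau, tau - 1.0) / n
            beta -= lr / np.sqrt(k + 1.0) * g          # diminishing step size
        return beta

    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 2.0, size=(500, 1))
    y = X[:, 0] + (0.5 + X[:, 0]) * rng.normal(size=500)   # input-dependent noise
    for tau in (0.1, 0.5, 0.9):
        print(tau, fit_linear_quantile(X, y, tau).round(2))

    Fitting each τ separately this way can produce crossing quantile curves; enforcing non-crossing, as the model in the paper does, requires a joint formulation.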

    Contributions to Penalized Estimation

    Penalized estimation is a useful statistical technique for preventing overfitting. In penalized methods, the common objective function takes the form of a loss function measuring goodness of fit plus a penalty function controlling complexity. In this dissertation, we develop several new penalization approaches for various statistical models, aiming for effective model selection and accurate parameter estimation.

    The first part introduces the notion of partially overlapping models across multiple regression models on the same dataset: the underlying models have at least one overlapping structure sharing the same parameter value. To recover the sparse and overlapping structure, we develop adaptive composite M-estimation (ACME), which doubly penalizes a composite loss function formed as a weighted linear combination of the individual loss functions. ACME automatically circumvents the model misspecification issues inherent in other composite-loss-based estimators.

    The second part proposes a new refit method and its applications in the regression setting through model combination: ensemble variable selection (EVS) and ensemble variable selection and estimation (EVE). The refit method estimates the regression parameters restricted to the covariates selected by a penalization method. EVS combines model selection decisions from multiple penalization methods and selects the optimal model via the refit and a model selection criterion. EVE considers a factorizable likelihood-based model whose full likelihood is the product of likelihood factors, and it is shown to be both asymptotically and computationally efficient.

    The third part studies a sparse undirected Gaussian graphical model (GGM) for explaining conditional dependence patterns among variables. The edge set consists of conditionally dependent variable pairs and corresponds to the nonzero elements of the inverse covariance matrix under the Gaussian assumption. We propose a consistent validation method for edge selection (CoVES) in the penalization framework. CoVES selects candidate edge sets along the solution path and finds the optimal set via repeated subsampling; it requires simple computation and delivers excellent performance in our numerical studies.

    Doctor of Philosophy
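
    The "loss plus penalty" template above is easiest to see in the lasso special case. The Python sketch below is an illustrative assumption, not the ACME, EVS/EVE, or CoVES procedures: it minimizes (1/(2n))||y - Xβ||² + λ||β||₁ by coordinate descent, where the soft-thresholding update is the basic building block that penalized methods of this kind iterate.

    import numpy as np

    def soft_threshold(z, t):
        # Proximal operator of t * |.|: shrink z toward zero by t.
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def lasso_coordinate_descent(X, y, lam, n_iter=100):
        # Minimizes (1 / (2n)) * ||y - X @ beta||**2 + lam * ||beta||_1.
        n, p = X.shape
        beta = np.zeros(p)
        col_sq = (X ** 2).sum(axis=0)
        for _ in range(n_iter):
            for j in range(p):
                # Partial residual with feature j excluded.
                r_j = y - X @ beta + X[:, j] * beta[j]
                rho = X[:, j] @ r_j
                beta[j] = soft_threshold(rho, n * lam) / col_sq[j]
        return beta

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    beta_true = np.zeros(10)
    beta_true[0], beta_true[1] = 2.0, -1.5
    y = X @ beta_true + 0.3 * rng.normal(size=200)
    print(lasso_coordinate_descent(X, y, lam=0.1).round(2))   # sparse estimate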

    Novel Computational Methods for Censored Data and Regression

    This dissertation can be divided into three topics. In the first topic, we derived a recursive algorithm for the constrained Kaplan-Meier estimator, which speeds up computation by up to fifty times compared with the current method based on the EM algorithm. We also showed how this leads to a vast improvement in empirical likelihood analysis with right-censored data. After a brief review of regularized regressions, we investigated the computational problems of parametric/non-parametric hybrid accelerated failure time models and their regularization in a high-dimensional setting. We also illustrated that, as the number of pieces increases, the discussed models come close to a nonparametric one. In the last topic, we discussed a semi-parametric approach to a hypothesis testing problem in the binary choice model. The major tools used are a Buckley-James-like algorithm and empirical likelihood. The essential idea, which is similar to that of the first topic, is to iteratively compute the linearly constrained empirical likelihood using optimization algorithms, including EM and the iterative convex minorant algorithm.
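
    For reference, the unconstrained Kaplan-Meier product-limit estimator, the object whose constrained version the recursive algorithm targets, takes only a few lines. The Python sketch below is a plain implementation and an illustrative assumption, not the dissertation's constrained recursive algorithm.

    import numpy as np

    def kaplan_meier(times, events):
        # Product-limit estimate S(t) = prod_{t_i <= t} (1 - d_i / n_i),
        # where events == 1 marks an observed event and 0 a censored time.
        s = 1.0
        curve = []
        for t in np.unique(times[events == 1]):
            at_risk = np.sum(times >= t)                  # n_i: subjects still at risk
            d = np.sum((times == t) & (events == 1))      # d_i: events at time t
            s *= 1.0 - d / at_risk
            curve.append((t, s))
        return curve

    times = np.array([3.0, 5.0, 5.0, 8.0, 10.0, 12.0])
    events = np.array([1, 1, 0, 1, 0, 1])
    for t, s in kaplan_meier(times, events):
        print(f"S({t:g}) = {s:.3f}")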

    Vol. 13, No. 2 (Full Issue)

    Efficiency and Robustness in Individualized Decision Making

    Recent development in data-driven decision science has seen great advances in individualized decision making. Given data with covariates, treatment assignments, and outcomes, one common goal is to find individualized decision rules that map individual characteristics or contextual information to a treatment assignment such that the overall expected outcome is optimized. In this dissertation, we propose several new approaches to learning efficient and robust individualized decision rules.

    In the first project, we consider the robust learning problem when the training and testing distributions can differ. A novel framework, the Distributionally Robust Individualized Treatment Rule (DR-ITR), is proposed to maximize the worst-case value function under distributional changes, so that testing performance over a set of distributions close to the training distribution can be reasonably well guaranteed.

    In the second project, we consider the problems of treatment-free effect misspecification and heteroscedasticity. We propose an Efficient Learning (E-Learning) framework for finding an optimal ITR with improved efficiency in the multiple-treatment setting. The proposed E-Learning is optimal among a regular class of semiparametric estimators that allow for treatment-free effect misspecification and heteroscedasticity, and we demonstrate its effectiveness when either or both are present.

    In the third project, we study the multi-stage, multi-treatment decision problem. A new Backward Change Point Structural Nested Mean Model (BCP-SNMM) is developed to allow an unknown backward change point of the SNMM. We further propose the Dynamic Efficient Learning (DE-Learning) framework, which is optimal under the BCP-SNMM and enjoys more robustness. Compared with existing G-Estimation, DE-Learning is a tractable procedure for rigorous semiparametric efficient estimation, with far fewer nuisance functions to estimate, and it can be implemented in a backward stagewise manner.

    Doctor of Philosophy
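
    A common starting point for evaluating any individualized rule d(x) is the inverse-propensity-weighted value estimate V(d) = E[Y 1{A = d(X)} / π(A | X)]. The Python sketch below computes it on simulated data with a known randomization probability; it is a generic illustration and an assumption on our part, not the DR-ITR, E-Learning, or DE-Learning estimators.

    import numpy as np

    def ipw_value(X, A, Y, rule, propensity):
        # Sample-mean estimate of V(d) = E[ Y * 1{A == d(X)} / pi(A | X) ].
        match = (A == rule(X)).astype(float)
        return np.mean(Y * match / propensity)

    rng = np.random.default_rng(0)
    n = 5000
    X = rng.normal(size=n)
    A = rng.binomial(1, 0.5, size=n)                 # randomized treatment, pi = 0.5
    Y = 1.0 + X * (2 * A - 1) + rng.normal(size=n)   # treatment helps only when X > 0

    treat_if_positive = lambda x: (x > 0).astype(int)
    treat_everyone = lambda x: np.ones_like(x, dtype=int)
    print("V(treat if X > 0):", ipw_value(X, A, Y, treat_if_positive, 0.5))
    print("V(treat everyone):", ipw_value(X, A, Y, treat_everyone, 0.5))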

    Proceedings of the 35th International Workshop on Statistical Modelling: July 20-24, 2020, Bilbao, Basque Country, Spain

    466 p. The International Workshop on Statistical Modelling (IWSM) is a reference workshop for promoting statistical modelling and applications of statistics, in a broad sense, among researchers, academics, and industrialists. Unfortunately, the global COVID-19 pandemic did not allow the 35th edition of the IWSM to be held in Bilbao in July 2020. Despite the situation, and following the spirit of the Workshop and the Statistical Modelling Society, we are delighted to bring you this proceedings book of extended abstracts.