1,698,396 research outputs found

    Sparse Conformal Predictors

    Conformal predictors, introduced by Vovk et al. (2005), serve to build prediction intervals by exploiting a notion of conformity of the new data point with previously observed data. In the present paper, we propose a novel method for constructing prediction intervals for the response variable in multivariate linear models. The main emphasis is on sparse linear models, where only a few of the covariates have a significant influence on the response variable, even if their number is very large. Our approach is based on combining the principle of conformal prediction with the $\ell_1$-penalized least squares estimator (LASSO). The resulting confidence set depends on a parameter $\epsilon > 0$ and has a coverage probability larger than or equal to $1-\epsilon$. The numerical experiments reported in the paper show that the length of the confidence set is small. Furthermore, as a by-product of the proposed approach, we provide a data-driven procedure for choosing the LASSO penalty. The selection power of the method is illustrated on simulated data.
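
The idea of pairing conformal prediction with the LASSO can be illustrated with a minimal sketch. Note that this is not the paper's procedure: it uses the simpler split-conformal variant (fit on one half of the data, calibrate on the other), a plain coordinate-descent LASSO without an intercept, and a fixed penalty. The names `lasso_cd`, `fit_split_conformal`, and the parameter `lam` are assumptions made for illustration only.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1 (no intercept)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = np.asarray(y, dtype=float).copy()  # running residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (b[j] removed)
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            # Soft-thresholding update
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)
            b[j] = new
    return b


def fit_split_conformal(X, y, eps=0.1, lam=0.1, seed=0):
    """Return (b, q): LASSO coefficients and the conformal half-width.

    For a new point x, the interval [x @ b - q, x @ b + q] covers the
    response with probability at least 1 - eps if the data are exchangeable.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.permutation(n)
    train, calib = idx[: n // 2], idx[n // 2:]
    b = lasso_cd(X[train], y[train], lam)
    # Conformity scores: absolute residuals on the held-out calibration half
    scores = np.sort(np.abs(y[calib] - X[calib] @ b))
    m = len(calib)
    k = int(np.ceil((1 - eps) * (m + 1)))
    q = np.inf if k > m else scores[k - 1]
    return b, q


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    beta = np.zeros(20)
    beta[:3] = [2.0, -1.5, 1.0]  # sparse true coefficients (illustrative)
    X = rng.normal(size=(300, 20))
    y = X @ beta + rng.normal(size=300)
    b, q = fit_split_conformal(X, y, eps=0.1, lam=0.1)
    x_new = rng.normal(size=20)
    print("interval:", x_new @ b - q, x_new @ b + q)
```

Splitting the data keeps the sketch short and cheap: the model is fitted once, whereas a full conformal construction would refit for each candidate value of the response.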

    Memory bank predictors

    Cache memories are commonly implemented through multiple memory banks to improve bandwidth and latency. Early knowledge of the data cache bank that an instruction will access can help to improve performance in several ways. One scenario that is likely to become increasingly important is clustered microprocessors with a distributed cache. This work presents a study of different cache bank predictors. We show that effective bank predictors can be implemented at relatively low cost. For instance, a predictor of approximately 4 Kbytes is shown to achieve an average hit rate of 78% for SPECint2000 when used to predict accesses to an 8-bank cache memory in a contemporary superscalar processor. We also show how a predictor can be used to reduce the communication latency caused by memory accesses in a clustered microarchitecture with a distributed cache design.
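
To make the notion of a bank predictor concrete, here is a minimal sketch of one of the simplest possible designs: a table indexed by the instruction address that remembers the bank that instruction accessed last time. The abstract does not say which predictor designs the paper evaluates, so this "last bank" scheme, the class name `LastBankPredictor`, the 32-byte line size, and the line-interleaved bank mapping are all assumptions chosen for illustration.

```python
class LastBankPredictor:
    """Predicts that an instruction will access the same bank as last time."""

    def __init__(self, n_entries=1024, n_banks=8, line_bytes=32):
        self.n_entries = n_entries
        self.n_banks = n_banks
        self.line_bytes = line_bytes
        self.table = [0] * n_entries  # one predicted bank per table entry

    def bank_of(self, address):
        # Assumed mapping: consecutive cache lines go to consecutive banks
        return (address // self.line_bytes) % self.n_banks

    def predict(self, pc):
        return self.table[pc % self.n_entries]

    def update(self, pc, address):
        self.table[pc % self.n_entries] = self.bank_of(address)


def hit_rate(predictor, trace):
    """Fraction of (pc, address) accesses whose bank was predicted correctly."""
    hits = 0
    for pc, address in trace:
        if predictor.predict(pc) == predictor.bank_of(address):
            hits += 1
        predictor.update(pc, address)
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    # A load that always touches the same address: only the first access misses
    same = [(0x400, 0x1020)] * 10
    print(hit_rate(LastBankPredictor(), same))
    # A load striding one cache line per access changes bank every time
    strided = [(0x400, 0x1020 + 32 * i) for i in range(8)]
    print(hit_rate(LastBankPredictor(), strided))
```

The strided trace shows why such a naive scheme is limited: when an instruction moves to a new bank on every access, remembering the last bank is of no use, and a predictor would need to track the stride or other history instead.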

    Multinomial Logit Models with Implicit Variable Selection

    Multinomial logit models, which are most commonly used for modeling unordered multi-category responses, are typically restricted to the use of a few predictors. In the high-dimensional case, maximum likelihood estimates frequently do not exist. In this paper we develop a boosting technique called multinomBoost that performs variable selection and fits the multinomial logit model even when predictors are high-dimensional. Since in multicategory models the effect of one predictor variable is represented by several parameters, one has to distinguish between variable selection and parameter selection. A special feature of the approach is that, in contrast to existing approaches, it selects variables, not parameters. The method can distinguish between mandatory predictors and optional predictors. Moreover, it adapts to metric, binary, nominal and ordinal predictors. Regularization within the algorithm allows the inclusion of nominal and ordinal variables that have many categories. In the case of ordinal predictors, the order information is used. The performance of the boosting technique with respect to mean squared error, prediction error and the identification of relevant variables is investigated in a simulation study. For two real-life data sets, the results are also compared with the Lasso approach, which selects parameters.

    Nonparametric Estimation of the Link Function Including Variable Selection

    Nonparametric methods for the estimation of the link function in generalized linear models are able to avoid bias in the regression parameters. But for the estimation of the link, the full model, which includes all predictors, has typically been used. When the number of predictors is large, these methods fail since the full model cannot be estimated. In the present article, a boosting-type method is proposed that simultaneously selects predictors and estimates the link function. The method performs quite well in simulations and real data examples.

    Simultaneous Prediction of Actual and Average Values of Study Variable Using Stein-rule Estimators

    The simultaneous prediction of average and actual values of the study variable in a linear regression model is considered in this paper. Generally, either the ordinary least squares estimator or Stein-rule estimators are employed for the construction of predictors for the simultaneous prediction. A linear combination of ordinary least squares and Stein-rule predictors is used in this paper to construct improved predictors. Their efficiency properties are derived using the small disturbance asymptotic theory, and dominance conditions for the superiority of predictors over each other are analyzed.

    Bayesian Conditional Tensor Factorizations for High-Dimensional Classification

    In many application areas, data are collected on a categorical response and high-dimensional categorical predictors, with the goals being to build a parsimonious model for classification while performing inference on the important predictors. In settings such as genomics, there can be complex interactions among the predictors. By using a carefully structured Tucker factorization, we define a model that can characterize any conditional probability, while facilitating variable selection and modeling of higher-order interactions. Following a Bayesian approach, we propose a Markov chain Monte Carlo algorithm for posterior computation accommodating uncertainty in the predictors to be included. Under near sparsity assumptions, the posterior distribution for the conditional probability is shown to achieve close to the parametric rate of contraction even in ultra high-dimensional settings. The methods are illustrated using simulation examples and biomedical applications.

    Predictors of Post Prandial Glucose Level in Diabetic Elderly

    Post prandial glucose (PPG) level describes the speed of glucose absorption 2 hours after macronutrient consumption. Knowing this gives an overall picture of insulin regulation and macronutrient metabolism in the body. In the elderly, age-related slowing of glucose metabolism leads to diabetes mellitus (DM) in older age. This study aimed to analyze the predictors of PPG level in the diabetic elderly, which consisted of functional status, self-care activity, sleep quality, and stress level. A cross-sectional study design was applied. Forty-five diabetic elderly participated by filling in the study instruments. Pearson and Spearman rank correlation tests were used in the data analysis (α<.05). Results showed that most respondents were female, 60-74 years old, and had had DM for 1-5 years with no family history, and only 33.33% of respondents reported regular consumption of oral antidiabetic drugs (OAD). Hypertension was found to be a frequent comorbidity. Statistical analysis showed that functional status, self-care activity, sleep quality, and stress level were not significantly correlated with PPG level in the diabetic elderly (all p>α); therefore, these variables could not serve as predictors of PPG level. Other factors may play a more important role in predicting PPG level in the diabetic elderly.