Sparse Conformal Predictors
Conformal predictors, introduced by Vovk et al. (2005), serve to build
prediction intervals by exploiting a notion of conformity of the new data point
with previously observed data. In the present paper, we propose a novel method
for constructing prediction intervals for the response variable in multivariate
linear models. The main emphasis is on sparse linear models, where only a few of
the covariates have significant influence on the response variable even if
their number is very large. Our approach is based on combining the principle of
conformal prediction with the $\ell_1$ penalized least squares estimator
(LASSO). The resulting confidence set depends on a parameter $\epsilon > 0$ and
has a coverage probability larger than or equal to $1 - \epsilon$. The numerical
experiments reported in the paper show that the length of the confidence set is
small. Furthermore, as a by-product of the proposed approach, we provide a
data-driven procedure for choosing the LASSO penalty. The selection power of
the method is illustrated on simulated data.
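
The idea of pairing conformal prediction with the LASSO can be sketched in a few lines. The sketch below uses split conformal prediction, which is a simpler swapped-in variant and not the construction proposed in the paper; the function names (`lasso_cd`, `split_conformal_interval`), the coordinate-descent solver, and the fixed penalty `lam` are all assumptions made for illustration.

```python
import math
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.

    X is a list of rows; this is an illustrative solver, not a tuned one.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = [float(v) for v in y]  # residual y - X beta (beta starts at 0)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes coordinate j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = beta[j] - new
            if delta != 0.0:
                for i in range(n):
                    resid[i] += X[i][j] * delta
                beta[j] = new
    return beta


def predict(x, beta):
    """Linear prediction x'beta."""
    return sum(a * b for a, b in zip(x, beta))


def split_conformal_interval(X, y, x_new, lam, eps, seed=0):
    """Split conformal interval with target coverage 1 - eps (assumed variant)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    train, calib = idx[:half], idx[half:]
    # Fit the LASSO on one half, score absolute residuals on the other half.
    beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
    scores = sorted(abs(y[i] - predict(X[i], beta)) for i in calib)
    m = len(scores)
    # Use the ceil((1 - eps)(m + 1))-th smallest score; infinite if out of range.
    k = math.ceil((1 - eps) * (m + 1))
    q = scores[k - 1] if k <= m else math.inf
    center = predict(x_new, beta)
    return center - q, center + q
```

Under exchangeability of the data, an interval built this way covers the new response with probability at least `1 - eps`. When the calibration half is too small for the requested level, the interval becomes the whole real line rather than silently undercovering.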
Memory bank predictors
Cache memories are commonly implemented through multiple memory banks to improve bandwidth and latency. Early knowledge of the data cache bank that an instruction will access can help to improve performance in several ways. One scenario that is likely to become increasingly important is clustered microprocessors with a distributed cache. This work presents a study of different cache bank predictors. We show that effective bank predictors can be implemented with relatively low cost. For instance, a predictor of approximately 4 Kbytes is shown to achieve an average hit rate of 78% for SPECint2000 when used to predict accesses to an 8-bank cache memory in a contemporary superscalar processor. We also show how a predictor can be used to reduce the communication latency caused by memory accesses in a clustered microarchitecture with a distributed cache design.
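
As a rough illustration of what a bank predictor does, the sketch below implements a very simple "last bank" scheme: a table indexed by the instruction's program counter remembers which bank that instruction accessed last time. This is not one of the predictors studied in the paper; the class name, table size, line size, and bank-interleaving rule are all hypothetical choices.

```python
class LastBankPredictor:
    """PC-indexed table predicting that an instruction reuses its previous bank.

    All parameters are illustrative assumptions, not values from the paper.
    """

    def __init__(self, n_entries=1024, n_banks=8, line_bytes=32):
        self.n_entries = n_entries
        self.n_banks = n_banks
        self.line_bytes = line_bytes
        self.table = [0] * n_entries  # every entry initially predicts bank 0

    def bank_of(self, address):
        """Bank selected by line-interleaving (assumed mapping)."""
        return (address // self.line_bytes) % self.n_banks

    def _index(self, pc):
        """Drop the two low PC bits (4-byte instructions assumed) and wrap."""
        return (pc >> 2) % self.n_entries

    def predict(self, pc):
        return self.table[self._index(pc)]

    def update(self, pc, address):
        self.table[self._index(pc)] = self.bank_of(address)


def hit_rate(predictor, trace):
    """Fraction of (pc, address) accesses whose bank was predicted correctly."""
    hits = 0
    for pc, address in trace:
        if predictor.predict(pc) == predictor.bank_of(address):
            hits += 1
        predictor.update(pc, address)
    return hits / len(trace) if trace else 0.0
```

A load that keeps touching the same cache line is predicted well by this scheme, while a load striding through consecutive lines changes bank on every access and is mispredicted almost every time, which is one reason more elaborate predictors are worth studying.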
Multinomial Logit Models with Implicit Variable Selection
Multinomial logit models, which are most commonly used for modeling unordered multi-category responses, are typically restricted to the use of a few predictors. In the high-dimensional case, maximum likelihood estimates frequently do not exist. In this paper we develop a boosting technique called multinomBoost that performs variable selection and fits the multinomial logit model even when predictors are high-dimensional. Since in multicategory models the effect of one predictor variable is represented by several parameters, one has to distinguish between variable selection and parameter selection. A special feature of the approach is that, in contrast to existing approaches, it selects variables, not parameters. The method can distinguish between mandatory predictors and optional predictors. Moreover, it adapts to metric, binary, nominal and ordinal predictors. Regularization within the algorithm allows the inclusion of nominal and ordinal variables that have many categories. In the case of ordinal predictors, the order information is used. The performance of the boosting technique with respect to mean squared error, prediction error and the identification of relevant variables is investigated in a simulation study. For two real-life data sets, the results are also compared with the Lasso approach, which selects parameters.
Nonparametric Estimation of the Link Function Including Variable Selection
Nonparametric methods for the estimation of the link function in generalized linear models are able to avoid bias in the regression parameters. For the estimation of the link, however, the full model, which includes all predictors, has typically been used. When the number of predictors is large, these methods fail since the full model cannot be estimated. In the present article a boosting-type method is proposed that simultaneously selects predictors and estimates the link function. The method performs quite well in simulations and real data examples.
Simultaneous Prediction of Actual and Average Values of Study Variable Using Stein-rule Estimators
The simultaneous prediction of the average and actual values of the study variable in a linear regression model is considered in this paper. Generally, either the ordinary least squares estimator or Stein-rule estimators are employed to construct predictors for simultaneous prediction. In this paper, a linear combination of ordinary least squares and Stein-rule predictors is used to construct improved predictors. Their efficiency properties are derived using small disturbance asymptotic theory, and dominance conditions for the superiority of the predictors over each other are analyzed.
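
To make the setting concrete, one common way of writing this problem is sketched below. The notation is assumed for illustration and is not taken from the abstract: a linear model $y = X\beta + \sigma u$, the OLS estimator $b$, a weight $\lambda \in [0, 1]$ between actual and average values, a positive scalar $k$ characterizing the Stein-rule estimator, and a hypothetical combination weight $\theta$.

```latex
% Target: weighted combination of actual and average values (within-sample)
T = \lambda\, y + (1 - \lambda)\, \mathrm{E}(y), \qquad 0 \le \lambda \le 1

% OLS and Stein-rule estimators of \beta
b = (X'X)^{-1} X' y, \qquad
\hat{\beta}_S = \left[ 1 - k\, \frac{(y - Xb)'(y - Xb)}{b' X' X b} \right] b

% Corresponding predictors and a linear combination of the two
P_{\mathrm{OLS}} = X b, \qquad
P_S = X \hat{\beta}_S, \qquad
P_\theta = \theta\, P_S + (1 - \theta)\, P_{\mathrm{OLS}}
```

Setting $\lambda = 1$ targets the actual values only and $\lambda = 0$ the average values only; the exact form of the combined predictor analyzed in the paper may differ from $P_\theta$.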
Bayesian Conditional Tensor Factorizations for High-Dimensional Classification
In many application areas, data are collected on a categorical response and
high-dimensional categorical predictors, with the goals being to build a
parsimonious model for classification while doing inferences on the important
predictors. In settings such as genomics, there can be complex interactions
among the predictors. By using a carefully-structured Tucker factorization, we
define a model that can characterize any conditional probability, while
facilitating variable selection and modeling of higher-order interactions.
Following a Bayesian approach, we propose a Markov chain Monte Carlo algorithm
for posterior computation accommodating uncertainty in the predictors to be
included. Under near sparsity assumptions, the posterior distribution for the
conditional probability is shown to achieve close to the parametric rate of
contraction even in ultra high-dimensional settings. The methods are
illustrated using simulation examples and biomedical applications.
Predictors of Post Prandial Glucose Level in Diabetic Elderly
Post prandial glucose (PPG) level describes the speed of glucose absorption 2 hours after macronutrient consumption. Knowing this level gives an overall picture of insulin regulation and macronutrient metabolism in the body. In the elderly, age-related slowing of glucose metabolism leads to diabetes mellitus (DM) in older age. This study aimed to analyze predictors of PPG level in elderly people with diabetes, namely functional status, self-care activity, sleep quality, and stress level. A cross-sectional study design was applied. Forty-five elderly people with diabetes participated by completing the study instruments. Pearson and Spearman rank correlation tests were used in the data analysis (α<.05). Results showed that most respondents were female, 60-74 years old, had had DM for 1-5 years with no family history, and only 33.33% of respondents reported regular consumption of oral antidiabetic drugs (OAD). Hypertension was found to be a frequent comorbidity. Statistical analysis showed that functional status, self-care activity, sleep quality, and stress level were not significantly correlated with PPG level in elderly people with diabetes (all p>α); therefore, these variables could not serve as predictors of PPG level. Other factors may play a more important role in predicting PPG level in this population.
