A group-based approach to the least squares regression for handling multicollinearity from strongly correlated variables
Multicollinearity due to strongly correlated predictor variables is a
long-standing problem in regression analysis. It leads to difficulties in
parameter estimation, inference, variable selection and prediction for the
least squares regression. To deal with these difficulties, we propose a
group-based approach to the least squares regression centered on the collective
impact of the strongly correlated variables. We discuss group effects of such
variables that represent their collective impact, and present the group-based
approach through real and simulated data examples. We also give a condition
more precise than what is available in the literature under which predictions
by the least squares estimated model are accurate. This approach is a natural
way of working with multicollinearity which resolves the difficulties without
altering the least squares method. It has several advantages over alternative
methods such as ridge regression and principal component regression.
Comment: 36 pages, 1 figure
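The core difficulty the abstract describes can be seen in a short simulation (an illustration, not the paper's method): with two nearly collinear predictors, the individual least squares coefficients swing wildly across replications, while the fitted values remain accurate. All names and the correlation level below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 200
b1_hats, pred_errs = [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)     # x2 is almost identical to x1
    X = np.column_stack([np.ones(n), x1, x2])
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    b1_hats.append(beta[1])                 # estimate of the x1 slope (true value 2)
    pred_errs.append(np.mean((X @ beta - (1 + 2 * x1 + 3 * x2)) ** 2))

print(f"sd of x1-slope estimates:     {np.std(b1_hats):.1f}")    # large
print(f"mean squared prediction error: {np.mean(pred_errs):.3f}")  # small
```

The contrast between unstable coefficients and accurate predictions is exactly the tension the group-based approach is designed to resolve.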
Bounds on coverage probabilities of the empirical likelihood ratio confidence regions
This paper studies the least upper bounds on coverage probabilities of the
empirical likelihood ratio confidence regions based on estimating equations.
The implications of the bounds on empirical likelihood inference are also
discussed.
Average group effect of strongly correlated predictor variables is estimable
It is well known that individual parameters of strongly correlated predictor
variables in a linear model cannot be accurately estimated by the least squares
regression due to multicollinearity generated by such variables. Surprisingly,
an average of these parameters can be extremely accurately estimated. We find
this average and briefly discuss its applications in the least squares
regression.
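The abstract's claim can be checked numerically with a small sketch (an illustration, not the paper's derivation): across replications, each individual slope of two strongly correlated predictors has huge variance, yet their average is pinned down tightly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 50, 500
singles, averages = [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)     # strongly correlated pair
    X = np.column_stack([np.ones(n), x1, x2])
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    singles.append(beta[1])                      # true value 2, poorly estimated
    averages.append((beta[1] + beta[2]) / 2)     # true value 2.5, well estimated

print(f"sd of individual slope estimate: {np.std(singles):.2f}")
print(f"sd of average-slope estimate:    {np.std(averages):.2f}")
```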
Empirical likelihood on the full parameter space
We extend the empirical likelihood of Owen [Ann. Statist. 18 (1990) 90-120]
by partitioning its domain into the collection of its contours and mapping the
contours through a continuous sequence of similarity transformations onto the
full parameter space. The resulting extended empirical likelihood is a natural
generalization of the original empirical likelihood to the full parameter
space; it has the same asymptotic properties and identically shaped contours as
the original empirical likelihood. It can also attain the second order accuracy
of the Bartlett corrected empirical likelihood of DiCiccio, Hall and Romano
[Ann. Statist. 19 (1991) 1053-1061]. A simple first order extended empirical
likelihood is found to be substantially more accurate than the original
empirical likelihood. It is also more accurate than available second order
empirical likelihood methods in most small sample situations and competitive in
accuracy in large sample situations. Importantly, in many one-dimensional
applications this first order extended empirical likelihood is accurate for
sample sizes as small as ten, making it a practical and reliable choice for
small sample empirical likelihood inference.Comment: Published in at http://dx.doi.org/10.1214/13-AOS1143 the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
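For context, Owen's original empirical likelihood ratio for a univariate mean can be computed in a few lines (a minimal sketch of the standard construction, not the extended version this paper proposes): solve the scalar Lagrange multiplier equation by bisection and return the log-likelihood-ratio statistic, which is asymptotically chi-squared with one degree of freedom.

```python
import numpy as np

def el_log_ratio(x, mu):
    """Return -2 log R(mu), Owen's empirical likelihood ratio statistic for a mean."""
    d = x - mu
    if d.max() <= 0 or d.min() >= 0:
        return np.inf                        # mu outside the convex hull of the data
    lo = -1.0 / d.max() + 1e-10              # feasible range keeps 1 + lam*d_i > 0
    hi = -1.0 / d.min() - 1e-10
    g = lambda lam: np.sum(d / (1.0 + lam * d))   # strictly decreasing in lam
    for _ in range(200):                     # bisect for the root of g
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * d))

rng = np.random.default_rng(2)
x = rng.normal(loc=0.0, scale=1.0, size=40)
print(f"-2 log R at the sample mean: {el_log_ratio(x, x.mean()):.4f}")  # ~0
print(f"-2 log R at mu = 0.5:        {el_log_ratio(x, 0.5):.3f}")
```

Note the `np.inf` branch: the original empirical likelihood is undefined for mu outside the convex hull of the data, which is precisely the domain limitation the extension above addresses.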
Efficient Portfolio Selection
Merak believed that an efficient frontier analysis method that combined the robustness of the Monte Carlo approach with the confidence of the Markowitz approach would be a very powerful tool for any industry. However, it soon became clear that there are other ways to address the problem that do not require a Monte Carlo component.
Three subgroups were formed, and each developed a different approach to the problem: the Portfolio Selection Algorithm Approach, the Statistical Inference Approach, and the Integer Programming Approach.
Sparse maximum likelihood estimation for regression models
For regression model selection via maximum likelihood estimation, we adopt a
vector representation of candidate models and study the likelihood ratio
confidence region for the regression parameter vector of a full model. We show
that when its confidence level increases with the sample size at a certain
speed, with probability tending to one, the confidence region consists of
vectors representing models containing all active variables, including the true
parameter vector of the full model. Using this result, we examine the
asymptotic composition of models of maximum likelihood and find the subset of
such models that contain all active variables. We then devise a consistent
model selection criterion which has a sparse maximum likelihood estimation
interpretation and certain advantages over popular information criteria.
Comment: 13 pages
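The idea behind the abstract can be sketched in a toy example (an illustration with assumed details, not the paper's criterion): among all submodels of a linear model, keep those whose likelihood-ratio comparison against the full model falls below a cutoff that grows with the sample size, then select the sparsest survivor. The specific threshold below is an assumed BIC-like choice.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])   # active variables: 0 and 2
y = X @ beta_true + rng.normal(size=n)

def rss(cols):
    """Residual sum of squares of the least squares fit on the given columns."""
    if not cols:
        return float(np.sum(y ** 2))
    b, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
    return float(np.sum((y - X[:, cols] @ b) ** 2))

rss_full = rss(list(range(p)))
sigma2 = rss_full / (n - p)          # residual variance estimate from the full model
threshold = 2 * np.log(n)            # assumed sample-size-dependent cutoff per dropped df

candidates = []
for k in range(p + 1):
    for cols in combinations(range(p), k):
        stat = (rss(list(cols)) - rss_full) / sigma2    # LR-type statistic vs full model
        if stat <= threshold * (p - k):
            candidates.append(cols)

sparsest = min(candidates, key=len)
print("selected variables:", sparsest)
```

Submodels missing an active variable incur a residual-sum-of-squares penalty that grows linearly in n, so they fail the test with probability tending to one, while any superset of the active set passes; the sparsest survivor is then the active set itself.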