Flexible Tweedie regression models for continuous data
Tweedie regression models provide a flexible family of distributions to deal
with non-negative, highly right-skewed data as well as symmetric and
heavy-tailed data, and can handle continuous data with probability mass at zero.
Estimation and inference for Tweedie regression models based on the maximum
likelihood method are challenged by the presence of an infinite sum in the
probability function and by non-trivial restrictions on the power parameter space.
In this paper, we propose two approaches for fitting Tweedie regression models,
namely, quasi- and pseudo-likelihood. We discuss the asymptotic properties of
the two approaches and perform simulation studies to compare our methods with
the maximum likelihood method. In particular, we show that the quasi-likelihood
method provides asymptotically efficient estimation for regression parameters.
The computational implementation of the alternative methods is faster and
easier than that of the orthodox maximum likelihood method, relying on a simple
Newton scoring algorithm. Simulation studies show that the quasi- and
pseudo-likelihood approaches yield estimates, standard errors and coverage
rates similar to those of the maximum likelihood method. Furthermore, the
second-moment assumptions required by the quasi- and pseudo-likelihood methods
enable us to extend the Tweedie regression models to the class of quasi-Tweedie
regression models in Wedderburn's style. Moreover, they allow us to eliminate
the non-trivial restriction on the power parameter space, and thus provide a
flexible regression model to deal with continuous data. We provide an \texttt{R}
implementation and illustrate the application of Tweedie regression models
using three data sets. Comment: 34 pages, 8 figures.
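As a rough illustration of the second-moment idea (not the authors' \texttt{R} package), the sketch below simulates compound Poisson-gamma data, i.e., Tweedie with power 1 < p < 2 and hence probability mass at zero, and fits a regression by IRLS, which only uses the mean and the variance function V(mu) = mu^p. Recent Python statsmodels is assumed; the simulation follows the standard compound Poisson-gamma representation of the Tweedie distribution.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(size=n)
X = sm.add_constant(x)
mu = np.exp(0.5 + 1.0 * x)                    # true mean, log link

# Compound Poisson-gamma draw: Tweedie with 1 < p < 2 has mass at zero.
p, phi = 1.5, 1.0
lam = mu ** (2 - p) / (phi * (2 - p))         # Poisson rate
alpha = (2 - p) / (p - 1)                     # gamma shape
scale = phi * (p - 1) * mu ** (p - 1)         # gamma scale (per observation)
N = rng.poisson(lam)
y = np.array([rng.gamma(alpha, s, size=k).sum() if k > 0 else 0.0
              for k, s in zip(N, scale)])

# Quasi-likelihood-style fit: IRLS needs only E[y] and V(mu) = mu^p,
# so the infinite-series Tweedie density is never evaluated.
fit = sm.GLM(y, X, family=sm.families.Tweedie(var_power=p,
                                              link=sm.families.links.Log())).fit()
print(fit.params, fit.bse)
```

In this spirit, the power p can be profiled over a grid using a moment-based criterion, which is how the restriction on the power parameter space becomes harmless under second-moment assumptions.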
An asymptotically superlinearly convergent semismooth Newton augmented Lagrangian method for Linear Programming
Powerful commercial solvers based on interior-point methods (IPM), such as
Gurobi and Mosek, have been hugely successful in solving large-scale linear
programming (LP) problems. The high efficiency of these solvers depends
critically on the sparsity of the problem data and on advanced matrix
factorization techniques. For a large-scale LP problem whose data matrix
is dense (possibly structured) or whose corresponding normal matrix
has a dense Cholesky factor (even with re-ordering), these solvers may require
excessive computational cost and/or extremely heavy memory usage in each
interior-point iteration. Unfortunately, the natural remedy, i.e., IPM solvers
based on iterative methods, although able to avoid the explicit computation of
the coefficient matrix and its factorization, is not practically viable due to
the inherent extreme ill-conditioning of the large-scale normal equation
arising in each interior-point iteration. To provide a better alternative for
solving large-scale LPs with dense data or with normal equations requiring
expensive factorization, we propose a semismooth Newton
based inexact proximal augmented Lagrangian ({\sc Snipal}) method. Different
from classical IPMs, in each iteration of {\sc Snipal}, iterative methods can
be used efficiently to solve the simpler yet better-conditioned semismooth
Newton linear systems. Moreover, {\sc Snipal} not only enjoys fast asymptotic
superlinear convergence but is also proven to possess a finite termination
property. Numerical comparisons with Gurobi have demonstrated the encouraging
potential of {\sc Snipal} for handling large-scale LP problems where the
constraint matrix has a dense representation or has a dense
factorization even with an appropriate re-ordering. Comment: Due to the
limitation "The abstract field cannot be longer than 1,920 characters", the
abstract appearing here is slightly shorter than that in the PDF file.
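To make the mechanism concrete, here is a heavily simplified sketch (not the authors' {\sc Snipal} code) of an augmented Lagrangian method applied to the dual of min c^T x s.t. Ax = b, x >= 0, with each subproblem solved by a semismooth Newton iteration on the piecewise-smooth stationarity equation. The m x m Newton matrix sigma * A diag(d) A^T is the kind of better-conditioned system the abstract refers to; the fixed penalty, tiny diagonal shift, exact inner solves, and absence of a line search are all simplifying assumptions of this sketch.

```python
import numpy as np

def alm_semismooth_newton_lp(A, b, c, sigma=10.0, outer=100, inner=50, tol=1e-8):
    """Sketch: solve min c@x s.t. A@x = b, x >= 0 via an augmented Lagrangian
    on the dual max b@y s.t. A.T@y <= c, with multiplier x (the primal)."""
    m, n = A.shape
    x, y = np.zeros(n), np.zeros(m)
    for _ in range(outer):
        # Inner loop: semismooth Newton on the stationarity equation
        #   F(y) = A @ max(x + sigma*(A.T@y - c), 0) - b = 0,
        # the gradient of the convex, once-differentiable AL subproblem in y.
        for _ in range(inner):
            u = x + sigma * (A.T @ y - c)
            F = A @ np.maximum(u, 0.0) - b
            if np.linalg.norm(F) <= tol:
                break
            d = (u > 0).astype(float)        # generalized Jacobian of max(., 0)
            H = sigma * (A * d) @ A.T        # = sigma * A @ diag(d) @ A.T  (m x m)
            H[np.diag_indices(m)] += 1e-10   # tiny shift: H can be singular
            y -= np.linalg.solve(H, F)       # Newton step (no line search here)
        x = np.maximum(x + sigma * (A.T @ y - c), 0.0)   # multiplier update
        if np.linalg.norm(A @ x - b) <= tol * (1 + np.linalg.norm(b)):
            break
    return x, y
```

In {\sc Snipal} itself the Newton systems are solved inexactly by iterative methods and the penalty parameter is updated adaptively; the point of the sketch is the shape of the linear system that replaces the ill-conditioned IPM normal equation.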
New Quasi-Newton Equation And Method Via Higher Order Tensor Models
This thesis introduces a general approach, proposing a new quasi-Newton
(QN) equation via a fourth-order tensor model. To approximate the curvature
of the objective function, more of the available information from function
values and gradients is employed. The efficiency of the usual QN methods is
improved by accelerating the performance of the algorithms without increasing
the storage demand.
The presented equation allows the modification of several algorithms involving
QN equations for practical optimization so that they possess superior
convergence properties. Using the new equation, the BFGS method is modified in
two ways, employing the strategies proposed by Zhang and Xu (2001) and Wei et
al. (2006) to generate positive definite updates. The superiority of these
methods over the standard BFGS method and over the modification proposed by
Wei et al. (2006) is shown. A convergence analysis establishing the local and
global convergence properties of these methods is presented, together with
numerical results that show the advantage of the modified QN methods.
Moreover, a new limited-memory QN method for large-scale unconstrained
optimization is developed based on the modified BFGS update formula. A
comparison with the method developed by Xiao et al. (2008) shows that the new
method performs better in numerical experiments. The global and local
convergence properties of the new method on uniformly convex problems are
also analyzed.
Finally, the compact limited-memory BFGS method is modified, based on the
proposed new QN update formula, to solve large-scale unconstrained
optimization problems. The new method yields a more efficient algorithm than
the standard limited-memory BFGS with simple bounds (L-BFGS-B) method when
solving unconstrained problems. Its implementation on a set of test problems
highlights that the new method outperforms the standard algorithm.
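Since the thesis itself is not reproduced here, the sketch below shows only the common template such modifications fit: the standard BFGS inverse-Hessian update with a hook for replacing the secant vector y by y + (theta/||s||^2) s, where theta is built from function values and gradients. The tensor-based theta of the thesis is not shown; the theta at the end is one function-value-based correction of the kind appearing in the modified-BFGS literature (e.g., Wei et al., 2006), and Python stands in for whatever environment the thesis used.

```python
import numpy as np

def qn_bfgs(f, grad, x0, theta_fn=None, iters=200, tol=1e-8):
    """BFGS with an optional modified secant vector y* = y + (theta/||s||^2) s.
    theta_fn(f0, f1, g0, g1, s) returns the scalar correction theta;
    theta_fn=None recovers the standard BFGS method."""
    n = x0.size
    H = np.eye(n)                        # inverse-Hessian approximation
    x, g = x0.astype(float), grad(x0)
    for _ in range(iters):
        if np.linalg.norm(g) <= tol:
            break
        d = -H @ g                       # quasi-Newton search direction
        t, f0 = 1.0, f(x)
        for _ in range(50):              # backtracking Armijo line search
            if f(x + t * d) <= f0 + 1e-4 * t * (g @ d):
                break
            t *= 0.5
        s = t * d
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        if theta_fn is not None:         # modified secant condition
            y = y + (theta_fn(f0, f(x_new), g, g_new, s) / (s @ s)) * s
        if s @ y > 1e-12:                # skip update to keep H positive definite
            rho = 1.0 / (s @ y)
            V = np.eye(n) - rho * np.outer(s, y)
            H = V @ H @ V.T + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

# One correction used in this literature; a tensor-model theta slots in
# the same way.
theta_wei = lambda f0, f1, g0, g1, s: 2.0 * (f0 - f1) + (g0 + g1) @ s
```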
Variable selection in semiparametric regression modeling
In this paper, we are concerned with how to select significant variables in
semiparametric modeling. Variable selection for semiparametric regression
models consists of two components: model selection for nonparametric components
and selection of significant variables for the parametric portion. Thus,
semiparametric variable selection is much more challenging than parametric
variable selection (e.g., linear and generalized linear models) because
traditional variable selection procedures including stepwise regression and the
best subset selection now require separate model selection for the
nonparametric components for each submodel. This leads to a very heavy
computational burden. In this paper, we propose a class of variable selection
procedures for semiparametric regression models using nonconcave penalized
likelihood. We establish the rate of convergence of the resulting estimate.
With proper choices of penalty functions and regularization parameters, we show
the asymptotic normality of the resulting estimate and further demonstrate that
the proposed procedures perform as well as an oracle procedure. A
semiparametric generalized likelihood ratio test is proposed to select
significant variables in the nonparametric component. We investigate the
asymptotic behavior of the proposed test and demonstrate that its limiting null
distribution follows a chi-square distribution which is independent of the
nuisance parameters. Extensive Monte Carlo simulation studies are conducted to
examine the finite sample performance of the proposed variable selection
procedures. Comment: Published at http://dx.doi.org/10.1214/009053607000000604
in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
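Nonconcave penalties of the kind referenced here include the SCAD penalty of Fan and Li (2001). As a small self-contained illustration (the semiparametric procedure itself is more involved), the Python sketch below implements the univariate SCAD thresholding rule, the closed-form minimizer of 0.5*(z - b)^2 + p_lambda(|b|), which is the basic building block of the local approximation algorithms used to compute such penalized estimates.

```python
import numpy as np

def scad_threshold(z, lam, a=3.7):
    """Minimizer in b of 0.5*(z - b)**2 + p_lam(|b|) for the SCAD penalty
    (Fan & Li, 2001); a = 3.7 is the value they recommend."""
    z = np.asarray(z, dtype=float)
    az = np.abs(z)
    soft = np.sign(z) * np.maximum(az - lam, 0.0)               # |z| <= 2*lam
    blend = ((a - 1.0) * z - np.sign(z) * a * lam) / (a - 2.0)  # 2*lam < |z| <= a*lam
    return np.where(az <= 2.0 * lam, soft,
                    np.where(az <= a * lam, blend, z))          # |z| > a*lam: no shrinkage

# Small coefficients are set exactly to zero (sparsity) while large ones
# are left unshrunk (near-unbiasedness) -- the two properties behind the
# oracle behavior discussed in the abstract.
print(scad_threshold(np.array([0.3, 1.2, 4.0]), lam=0.5))
```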