A Generic Path Algorithm for Regularized Statistical Estimation
Regularization is widely used in statistics and machine learning to prevent
overfitting and to steer the solution toward prior information. In general, a
regularized estimation problem minimizes the sum of a loss function and a
penalty term. The penalty term is usually weighted by a tuning parameter and
encourages certain constraints on the parameters to be estimated. Particular
choices of constraints lead to the popular lasso, fused-lasso, and other
generalized $\ell_1$ penalized regression methods. Although there has been a lot
of research in this area, developing efficient optimization methods for many
nonseparable penalties remains a challenge. In this article we propose an exact
path solver based on ordinary differential equations (EPSODE) that works for
any convex loss function and can deal with generalized $\ell_1$ penalties as well
as more complicated regularization such as inequality constraints encountered
in shape-restricted regressions and nonparametric density estimation. In the
path following process, the solution path hits, exits, and slides along the
various constraints and vividly illustrates the tradeoffs between goodness of
fit and model parsimony. In practice, the EPSODE can be coupled with AIC, BIC,
or cross-validation to select an optimal tuning parameter. Our
applications to generalized $\ell_1$ regularized generalized linear models,
shape-restricted regressions, Gaussian graphical models, and nonparametric
density estimation showcase the potential of the EPSODE algorithm.
Comment: 28 pages, 5 figures
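To make the idea of a solution path concrete, here is a minimal sketch of the simplest special case: squared-error loss with an orthonormal design and a plain lasso penalty, where each coefficient has a closed-form soft-thresholding solution. This is not the EPSODE algorithm from the abstract; the function names and the numbers in `z` are hypothetical and chosen only for illustration.

```python
def soft_threshold(z, rho):
    """Minimize 0.5 * (z - b)**2 + rho * abs(b) over b (closed form)."""
    if z > rho:
        return z - rho
    if z < -rho:
        return z + rho
    return 0.0


def lasso_path(z, rhos):
    """Return one coefficient vector per tuning-parameter value in rhos."""
    return [[soft_threshold(zj, rho) for zj in z] for rho in rhos]


# Hypothetical values of X^T y for an orthonormal design (made up for illustration).
z = [3.0, -1.5, 0.5]
rhos = [0.0, 1.0, 2.0, 4.0]
path = lasso_path(z, rhos)

for rho, beta in zip(rhos, path):
    active = sum(1 for b in beta if b != 0.0)
    print(f"rho={rho}: beta={beta}, nonzero={active}")
```

As the tuning parameter grows, coefficients reach zero one at a time and stay there, which is the tradeoff between fit and parsimony that the abstract describes. A criterion such as AIC, BIC, or cross-validation would then pick one point on this path.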
Short and long-term wind turbine power output prediction
In the wind energy industry, it is of great importance to develop models that
accurately forecast the power output of a wind turbine, as such predictions are
used for wind farm location assessment, power pricing and bidding,
monitoring, and preventive maintenance. As a first step, and following the
guidelines of the existing literature, we use the supervisory control and data
acquisition (SCADA) data to model the wind turbine power curve (WTPC). We
explore various parametric and non-parametric approaches for the modeling of
the WTPC, such as parametric logistic functions, and non-parametric piecewise
linear, polynomial, or cubic spline interpolation functions. We demonstrate
that all aforementioned classes of models are rich enough (with respect to
their relative complexity) to accurately model the WTPC, as their mean squared
error (MSE) is close to the MSE lower bound calculated from the historical
data. We further enhance the accuracy of our proposed model by incorporating
additional environmental factors that affect the power output, such as the
ambient temperature and the wind direction. However, all aforementioned
models seem to have an intrinsic limitation when it comes to forecasting,
because they cannot capture the inherent autocorrelation of the data. To
overcome this limitation, we show that adding a properly scaled ARMA modeling
layer increases short-term prediction performance while keeping the long-term
prediction capability of the model.
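The combination of a power curve with a time-series layer can be sketched as follows. This is a simplified illustration, not the authors' model: it assumes a logistic power curve with made-up parameters (`p_rated`, `k`, `v_mid`) and replaces the ARMA layer with a plain AR(1) model on the curve's residuals, fitted by lag-1 least squares.

```python
import math


def logistic_power_curve(v, p_rated=2000.0, k=0.8, v_mid=8.0):
    """Hypothetical logistic power curve: power (kW) at wind speed v (m/s)."""
    return p_rated / (1.0 + math.exp(-k * (v - v_mid)))


def fit_ar1(residuals):
    """Least-squares estimate of phi in r[t] = phi * r[t-1] + noise."""
    num = sum(a * b for a, b in zip(residuals[1:], residuals[:-1]))
    den = sum(b * b for b in residuals[:-1])
    return num / den if den else 0.0


def forecast(wind_forecast, last_residual, phi, horizon=1):
    """Power-curve prediction plus the AR(1) residual carried `horizon` steps ahead."""
    return logistic_power_curve(wind_forecast) + (phi ** horizon) * last_residual


# Made-up residuals (observed power minus power-curve prediction), in kW.
residuals = [40.0, 20.0, 10.0, 5.0]
phi = fit_ar1(residuals)

print(forecast(8.0, residuals[-1], phi, horizon=1))   # short-term: residual matters
print(forecast(8.0, residuals[-1], phi, horizon=50))  # long-term: reverts to the curve
```

With a stationary coefficient (absolute value below 1), the residual correction fades as the horizon grows, so the long-horizon forecast falls back to the power curve alone. That is one way to read the abstract's claim that the added layer helps short-term prediction without harming long-term prediction.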
Optimal experiment design revisited: fair, precise and minimal tomography
Given an experimental set-up and a fixed number of measurements, how should
one take data in order to optimally reconstruct the state of a quantum system?
The problem of optimal experiment design (OED) for quantum state tomography was
first broached by Kosut et al. [arXiv:quant-ph/0411093v1]. Here we provide
efficient numerical algorithms for finding the optimal design, and analytic
results for the case of 'minimal tomography'. We also introduce the average
OED, which is independent of the state to be reconstructed, and the optimal
design for tomography (ODT), which minimizes tomographic bias. We find that
these two designs are generally similar. Monte-Carlo simulations confirm the
utility of our results for qubits. Finally, we adapt our approach to deal with
constrained techniques such as maximum likelihood estimation. We find that
these are less amenable to optimization than cruder reconstruction methods,
such as linear inversion.
Comment: 16 pages, 7 figures
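For readers unfamiliar with linear inversion, the following minimal sketch reconstructs a single-qubit state from Pauli measurement counts using the standard identity rho = (I + xX + yY + zZ) / 2. It does not implement the optimal-design algorithms from the abstract; the function names and the counts are hypothetical.

```python
def bloch_component(n_up, n_down):
    """Estimate a Pauli expectation value from +1 / -1 outcome counts."""
    return (n_up - n_down) / (n_up + n_down)


def linear_inversion(x, y, z):
    """Return the 2x2 estimate (I + x*X + y*Y + z*Z) / 2 as nested lists."""
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]


def is_physical(x, y, z):
    """A qubit state is valid only if its Bloch vector has length at most 1."""
    return x * x + y * y + z * z <= 1.0


# Made-up counts from 100 shots in each of the X, Y, and Z bases.
x = bloch_component(90, 10)
y = bloch_component(50, 50)
z = bloch_component(85, 15)

rho = linear_inversion(x, y, z)
print(rho)
print("physical:", is_physical(x, y, z))
```

With these counts the estimated Bloch vector is longer than 1, so the reconstructed matrix has a negative eigenvalue. Finite-sample estimates like this are a common reason to use constrained methods such as maximum likelihood estimation, which the abstract contrasts with linear inversion.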