12,110 research outputs found
Dropout Training as Adaptive Regularization
Dropout and other feature noising schemes control overfitting by artificially
corrupting the training data. For generalized linear models, dropout performs a
form of adaptive regularization. Using this viewpoint, we show that the dropout
regularizer is first-order equivalent to an L2 regularizer applied after
scaling the features by an estimate of the inverse diagonal Fisher information
matrix. We also establish a connection to AdaGrad, an online learning
algorithm, and find that a close relative of AdaGrad operates by repeatedly
solving linear dropout-regularized problems. By casting dropout as
regularization, we develop a natural semi-supervised algorithm that uses
unlabeled data to create a better adaptive regularizer. We apply this idea to
document classification tasks, and show that it consistently boosts the
performance of dropout training, improving on state-of-the-art results on the
IMDB reviews dataset.Comment: 11 pages. Advances in Neural Information Processing Systems (NIPS),
201
Multiplicative Noise Removal Using Variable Splitting and Constrained Optimization
Multiplicative noise (also known as speckle noise) models are central to the
study of coherent imaging systems, such as synthetic aperture radar and sonar,
and ultrasound and laser imaging. These models introduce two additional layers
of difficulties with respect to the standard Gaussian additive noise scenario:
(1) the noise is multiplied by (rather than added to) the original image; (2)
the noise is not Gaussian, with Rayleigh and Gamma being commonly used
densities. These two features of multiplicative noise models preclude the
direct application of most state-of-the-art algorithms, which are designed for
solving unconstrained optimization problems where the objective has two terms:
a quadratic data term (log-likelihood), reflecting the additive and Gaussian
nature of the noise, plus a convex (possibly nonsmooth) regularizer (e.g., a
total variation or wavelet-based regularizer/prior). In this paper, we address
these difficulties by: (1) converting the multiplicative model into an additive
one by taking logarithms, as proposed by some other authors; (2) using variable
splitting to obtain an equivalent constrained problem; and (3) dealing with
this optimization problem using the augmented Lagrangian framework. A set of
experiments shows that the proposed method, which we name MIDAL (multiplicative
image denoising by augmented Lagrangian), yields state-of-the-art results both
in terms of speed and denoising performance.Comment: 11 pages, 7 figures, 2 tables. To appear in the IEEE Transactions on
Image Processing
An update on statistical boosting in biomedicine
Statistical boosting algorithms have triggered a lot of research during the
last decade. They combine a powerful machine-learning approach with classical
statistical modelling, offering various practical advantages like automated
variable selection and implicit regularization of effect estimates. They are
extremely flexible, as the underlying base-learners (regression functions
defining the type of effect for the explanatory variables) can be combined with
any kind of loss function (target function to be optimized, defining the type
of regression setting). In this review article, we highlight the most recent
methodological developments on statistical boosting regarding variable
selection, functional regression and advanced time-to-event modelling.
Additionally, we provide a short overview on relevant applications of
statistical boosting in biomedicine
- …