Image Segmentation by Iterative Inference from Conditional Score Estimation
Inspired by the combination of feedforward and iterative computations in the
visual cortex, and taking advantage of the ability of denoising autoencoders
to estimate the score of a joint distribution, we propose a novel approach to
iterative inference for capturing and exploiting the complex joint distribution
of output variables conditioned on some input variables. This approach is
applied to image pixel-wise segmentation, with the estimated conditional score
used to perform gradient ascent towards a mode of the estimated conditional
distribution. This extends previous work on score estimation by denoising
autoencoders to the case of a conditional distribution, with a novel use of a
corrupted feedforward predictor replacing Gaussian corruption. An advantage of
this approach over more classical ways to perform iterative inference for
structured outputs, like conditional random fields (CRFs), is that it is no
longer necessary to define an explicit energy function linking the output
variables. To keep computations tractable, such energy function
parametrizations are typically fairly constrained, involving only a few
neighbors of each of the output variables in each clique. We experimentally
find that the proposed iterative inference from conditional score estimation by
conditional denoising autoencoders performs better than comparable models based
on CRFs or those not using any explicit modeling of the conditional joint
distribution of outputs.
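
The inference step described in this abstract can be illustrated with a minimal sketch. It assumes that the difference between a denoiser's output and its input is proportional to the estimated score, which is the standard result for denoising autoencoders trained with small Gaussian corruption; the corrupted feedforward predictor used in the paper is not reproduced here. The names `iterative_inference`, `toy_denoiser`, `step_size`, and `n_steps` are hypothetical, and the toy denoiser only stands in for a trained conditional denoising autoencoder.

```python
import numpy as np

def iterative_inference(y_init, denoiser, step_size=0.5, n_steps=50):
    """Refine an initial prediction by following an estimated score.

    Assumption: denoiser(y) - y is proportional to the score of the
    conditional distribution at y, so adding a fraction of it to y is
    a gradient ascent step towards a mode.
    """
    y = np.array(y_init, dtype=float)
    for _ in range(n_steps):
        y = y + step_size * (denoiser(y) - y)
    return y

# Toy stand-in for a trained model: it pulls any input halfway towards
# a fixed mode. A real denoiser would also be conditioned on the image.
mode = np.array([[0.0, 1.0], [1.0, 0.0]])

def toy_denoiser(y):
    return y + 0.5 * (mode - y)

# Start from an uninformative prediction and refine it.
y_start = np.full((2, 2), 0.5)
y_refined = iterative_inference(y_start, toy_denoiser)
```

In this toy setting each step shrinks the distance to the mode by a constant factor, so the refined prediction converges to `mode`. With a learned denoiser the step size and number of steps would need tuning.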
Learning Discriminators as Energy Networks in Adversarial Learning
We propose a novel framework for structured prediction via adversarial
learning. Existing adversarial learning methods involve two separate networks,
i.e., the structured prediction models and the discriminative models, in the
training. The information captured by discriminative models complements that in
the structured prediction models, but few existing studies have examined how
to use such information to improve structured prediction models at the
inference stage. In this work, we propose to refine the predictions of
structured prediction models by effectively integrating discriminative models
into the prediction. Discriminative models are treated as energy-based models.
As in adversarial learning, discriminative models are trained to
estimate scores which measure the quality of predicted outputs, while
structured prediction models are trained to predict contrastive outputs with
maximal energy scores. In this way, the gradient vanishing problem is
ameliorated, and thus we are able to perform inference by following the gradient
ascent directions of discriminative models to refine structured prediction
models. The proposed method is able to handle a range of tasks, e.g.,
multi-label classification and image segmentation. Empirical results on these
two tasks validate the effectiveness of our learning method.
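
The refinement idea in this abstract, improving a prediction by ascending the gradient of a discriminator's score, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: `toy_score` is a hypothetical quadratic stand-in for a trained discriminator, its gradient is written by hand rather than obtained by backpropagation, and `refine_prediction`, `step_size`, and `n_steps` are made-up names.

```python
import numpy as np

# Toy "good" labeling, assumed purely for illustration.
target = np.array([1.0, 0.0, 1.0, 1.0])

def toy_score(y):
    """Hypothetical discriminator score: higher means a better output."""
    return -float(np.sum((y - target) ** 2))

def toy_score_grad(y):
    """Analytic gradient of toy_score with respect to the output y."""
    return -2.0 * (y - target)

def refine_prediction(y_init, score_grad, step_size=0.1, n_steps=100):
    """Gradient ascent on the score, keeping outputs in [0, 1]."""
    y = np.array(y_init, dtype=float)
    for _ in range(n_steps):
        y = np.clip(y + step_size * score_grad(y), 0.0, 1.0)
    return y

# Initial output of a (hypothetical) structured prediction model.
y_initial = np.array([0.6, 0.4, 0.5, 0.7])
y_refined = refine_prediction(y_initial, toy_score_grad)
```

With a neural discriminator, the gradient would come from automatic differentiation with respect to the predicted output, and the refinement only helps if that gradient is informative, which is why the abstract stresses that vanishing gradients are ameliorated.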
AttentionBoost: Learning What to Attend by Boosting Fully Convolutional Networks
Dense prediction models are widely used for image segmentation. One important
challenge is to sufficiently train these models to yield good generalizations
for hard-to-learn pixels. A typical group of such hard-to-learn pixels lies on the
boundaries between instances. Many studies have proposed to give specific
attention to learning the boundary pixels. They include designing multi-task
networks with an additional task of boundary prediction and increasing the
weights of boundary pixels' predictions in the loss function. Such strategies
require defining what to attend beforehand and incorporating this defined
attention into the learning model. However, there may exist other groups of
hard-to-learn pixels and manually defining and incorporating the appropriate
attention for each group may not be feasible. In order to provide a more
attainable and scalable solution, this paper proposes AttentionBoost, which is
a new multi-attention learning model based on adaptive boosting. AttentionBoost
designs a multi-stage network and introduces a new loss adjustment mechanism
for a dense prediction model to adaptively learn what to attend at each stage
directly on image data without necessitating any prior definition about what to
attend. This mechanism modulates the attention of each stage to correct the
mistakes of previous stages, by adjusting the loss weight of each pixel
prediction separately with respect to how accurate the previous stages are on
this pixel. This mechanism enables AttentionBoost to learn different attentions
for different pixels at the same stage, according to the difficulty of learning
these pixels, as well as multiple attentions for the same pixel at different
stages, according to the confidence of these stages in their predictions for this
pixel. Using gland segmentation as a showcase application, our experiments
demonstrate that AttentionBoost improves the results of its counterparts.
Comment: This work has been submitted to the IEEE for possible publication
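
The per-pixel loss adjustment described in the AttentionBoost abstract can be illustrated with a small sketch in the spirit of adaptive boosting. The abstract does not give the update formula, so the exponential rule below is an assumption chosen for illustration, not the paper's mechanism; `adjust_pixel_weights`, `weighted_cross_entropy`, and the toy probabilities are hypothetical.

```python
import numpy as np

def adjust_pixel_weights(prev_weights, p_true):
    """Raise the loss weight of pixels the previous stage got wrong.

    p_true holds, per pixel, the probability the previous stage gave
    to the correct class. Assumed rule: scale each weight by
    exp(1 - p_true), then renormalize so the mean weight is 1.
    """
    raw = prev_weights * np.exp(1.0 - p_true)
    return raw / raw.mean()

def weighted_cross_entropy(p_true, weights, eps=1e-8):
    """Mean per-pixel cross-entropy, scaled by per-pixel weights."""
    return float(np.mean(weights * -np.log(p_true + eps)))

# Stage 1 is confident on three pixels and poor on one (row 1, col 0).
p_stage1 = np.array([[0.9, 0.8], [0.2, 0.95]])
w_stage1 = np.ones_like(p_stage1)

# Stage 2 trains with weights that emphasize stage 1's mistakes.
w_stage2 = adjust_pixel_weights(w_stage1, p_stage1)
```

Under this assumed rule, the poorly predicted pixel receives the largest weight in the next stage, so the same predictions would incur a larger weighted loss and the next stage is pushed to correct them.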