Sharp Attention Network via Adaptive Sampling for Person Re-identification
In this paper, we present novel sharp attention networks by adaptively
sampling feature maps from convolutional neural networks (CNNs) for the person
re-identification (re-ID) problem. Due to the introduction of sampling-based
attention models, the proposed approach can adaptively generate sharper
attention-aware feature masks. This greatly differs from the gating-based
attention mechanism that relies on soft gating functions to select the relevant
features for person re-ID. In contrast, the proposed sampling-based attention
mechanism allows us to effectively trim irrelevant features by enforcing the
resultant feature masks to focus on the most discriminative features. It can
produce sharper attentions that are more assertive in localizing subtle
features relevant to re-identifying people across cameras. For this purpose, a
differentiable Gumbel-Softmax sampler is employed to approximate the Bernoulli
sampling to train the sharp attention networks. Extensive experimental
evaluations demonstrate the superiority of this new sharp attention model for
person re-ID over the other state-of-the-art methods on three challenging
benchmarks including CUHK03, Market-1501, and DukeMTMC-reID.
Comment: accepted by IEEE Transactions on Circuits and Systems for Video
Technology (T-CSVT)
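The abstract above mentions a differentiable Gumbel-Softmax sampler that approximates Bernoulli sampling. The following is a minimal NumPy sketch of that general relaxation, not the authors' implementation: the function name `relaxed_bernoulli`, the two-class formulation, and the temperature value are assumptions made for illustration.

```python
import numpy as np

def relaxed_bernoulli(logits, temperature, rng):
    """Soft (relaxed) Bernoulli sample per logit via the Gumbel-Softmax trick.

    `logits` are log-odds of keeping a feature; the result lies in [0, 1]
    and moves toward a hard 0/1 mask as `temperature` shrinks.
    """
    # Two-class logits per entry: [keep, drop], with "drop" fixed at 0.
    two_class = np.stack([logits, np.zeros_like(logits)], axis=-1)
    # Gumbel(0, 1) noise; the lower bound keeps log(u) finite.
    u = rng.uniform(low=1e-12, high=1.0, size=two_class.shape)
    gumbel = -np.log(-np.log(u))
    # Temperature-scaled softmax over the perturbed logits.
    y = (two_class + gumbel) / temperature
    y = y - y.max(axis=-1, keepdims=True)  # numerical stability
    soft = np.exp(y) / np.exp(y).sum(axis=-1, keepdims=True)
    return soft[..., 0]  # relaxed probability mass on "keep"
```

At a low temperature most mask entries land close to 0 or 1, which is the "sharp" behaviour the abstract describes; in an autodiff framework the same expression lets gradients flow back to `logits`.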
Online Learning to Sample
Stochastic Gradient Descent (SGD) is one of the most widely used techniques
for online optimization in machine learning. In this work, we accelerate SGD by
adaptively learning how to sample the most useful training examples at each
time step. First, we show that SGD can be used to learn the best possible
sampling distribution of an importance sampling estimator. Second, we show that
the sampling distribution of an SGD algorithm can be estimated online by
incrementally minimizing the variance of the gradient. The resulting algorithm
- called Adaptive Weighted SGD (AW-SGD) - maintains a set of parameters to
optimize, as well as a set of parameters to sample learning examples. We show
that AW-SGD yields faster convergence in three different applications: (i) image
classification with deep features, where the sampling of images depends on
their labels, (ii) matrix factorization, where rows and columns are not sampled
uniformly, and (iii) reinforcement learning, where the optimized and
exploration policies are estimated at the same time and our approach
corresponds to an off-policy gradient algorithm.
Comment: Update: removed convergence theorem and proof as there is an error.
Submitted to UAI 201
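The importance-sampling idea in the abstract above can be illustrated with a small sketch. It assumes a least-squares loss and replaces the learned sampling distribution of AW-SGD with a fixed one proportional to per-example gradient norms; all function names are hypothetical. The point shown is only that reweighting by 1 / (n * p_i) keeps the gradient estimate unbiased while a non-uniform distribution can lower its variance.

```python
import numpy as np

def per_example_grads(w, X, y):
    # Gradient of 0.5 * (x_i . w - y_i)^2 with respect to w, one row per example.
    residual = X @ w - y
    return residual[:, None] * X

def norm_proportional_probs(grads, eps=1e-12):
    # Stand-in for a learned sampler: probability proportional to gradient norm.
    norms = np.linalg.norm(grads, axis=1) + eps
    return norms / norms.sum()

def importance_weighted_grad(w, X, y, probs, rng):
    # Draw one example from `probs`; dividing by n * p_i keeps the estimate
    # an unbiased estimate of the mean gradient.
    n = len(y)
    i = rng.choice(n, p=probs)
    g_i = (X[i] @ w - y[i]) * X[i]
    return g_i / (n * probs[i])

def estimator_variance(grads, probs):
    # Exact total variance (trace of the covariance) of the one-sample estimator.
    n = grads.shape[0]
    full = grads.mean(axis=0)
    second_moment = np.sum(np.sum(grads**2, axis=1) / (n**2 * probs))
    return second_moment - full @ full
```

Comparing `estimator_variance` under uniform probabilities and under `norm_proportional_probs` shows the variance reduction that motivates adapting the sampler online.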
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Model-based reinforcement learning (RL) algorithms can attain excellent
sample efficiency, but often lag behind the best model-free algorithms in terms
of asymptotic performance. This is especially true with high-capacity
parametric function approximators, such as deep networks. In this paper, we
study how to bridge this gap, by employing uncertainty-aware dynamics models.
We propose a new algorithm called probabilistic ensembles with trajectory
sampling (PETS) that combines uncertainty-aware deep network dynamics models
with sampling-based uncertainty propagation. Our comparison to state-of-the-art
model-based and model-free deep RL algorithms shows that our approach matches
the asymptotic performance of model-free algorithms on several challenging
benchmark tasks, while requiring significantly fewer samples (e.g., 8 and 125
times fewer samples than Soft Actor Critic and Proximal Policy Optimization
respectively on the half-cheetah task).
Comment: NIPS 2018, video and code available at
https://sites.google.com/view/drl-in-a-handful-of-trials
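Sampling-based uncertainty propagation through an ensemble, as described in the abstract above, can be sketched roughly as follows. This is a toy illustration under strong assumptions, not the PETS code: each ensemble member is a hypothetical linear-Gaussian model `(A, B, noise_std)` rather than a deep network, and each particle is tied to one member for the whole horizon.

```python
import numpy as np

def propagate_particles(s0, actions, models, n_particles, rng):
    """Roll particles through an ensemble of probabilistic dynamics models.

    models: list of (A, B, noise_std) tuples (hypothetical linear-Gaussian
    members). Returns an array of shape (horizon + 1, n_particles, state_dim).
    """
    # Each particle keeps the same ensemble member for the whole rollout.
    assignment = rng.integers(len(models), size=n_particles)
    states = np.tile(np.asarray(s0, dtype=float), (n_particles, 1))
    trajectory = [states.copy()]
    for a in actions:
        nxt = np.empty_like(states)
        for p in range(n_particles):
            A, B, noise_std = models[assignment[p]]
            mean = A @ states[p] + B @ a
            # Sampling from the predictive distribution carries per-model noise;
            # disagreement between members carries model uncertainty.
            nxt[p] = mean + noise_std * rng.standard_normal(mean.shape)
        states = nxt
        trajectory.append(states.copy())
    return np.stack(trajectory)
```

The spread of the particles at each step gives an empirical distribution over future states, which a planner can use to score candidate action sequences.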
Visually-aware Recommendation with Aesthetic Features
Visual information plays a critical role in the human decision-making process.
While recent developments on visually-aware recommender systems have taken the
product image into account, none of them has considered the aesthetic aspect.
We argue that the aesthetic factor is very important in modeling and predicting
users' preferences, especially for some fashion-related domains like clothing
and jewelry. This work addresses the need to model aesthetic information in
visually-aware recommender systems. Technically speaking, we make three key
contributions in leveraging deep aesthetic features: (1) To describe the
aesthetics of products, we introduce the aesthetic features extracted from
product images by a deep aesthetic network. We incorporate these features into
a recommender system to model users' preferences in the aesthetic aspect. (2)
Since time strongly influences users' decisions in clothing recommendation,
we design a new tensor decomposition model for implicit feedback
data. The aesthetic features are then injected into the basic tensor model to
capture the temporal dynamics of aesthetic preferences (e.g., seasonal
patterns). (3) We also use the aesthetic features to optimize the learning
strategy on implicit feedback data. We enrich the pairwise training samples by
considering the similarity among items in the visual space and graph space; the
key idea is that a user is likely to perceive similar items similarly. We
perform extensive experiments on several real-world datasets and demonstrate
the usefulness of aesthetic features and the effectiveness of our proposed
methods.
Comment: Accepted by VLDBJ. arXiv admin note: substantial text overlap with
arXiv:1809.0582
Unsupervised Person Re-identification by Deep Learning Tracklet Association
Most existing person re-identification (re-id) methods rely on supervised model
learning on per-camera-pair manually labelled pairwise training data. This
leads to poor scalability in practical re-id deployment due to the lack of
exhaustive identity labelling of image positive and negative pairs for every
camera pair. In this work, we address this problem by proposing an unsupervised
re-id deep learning approach capable of incrementally discovering and
exploiting the underlying re-id discriminative information from automatically
generated person tracklet data from videos in an end-to-end model optimisation.
We formulate a Tracklet Association Unsupervised Deep Learning (TAUDL)
framework characterised by jointly learning per-camera (within-camera) tracklet
association (labelling) and cross-camera tracklet correlation by maximising the
discovery of most likely tracklet relationships across camera views. Extensive
experiments demonstrate the superiority of the proposed TAUDL model over the
state-of-the-art unsupervised and domain adaptation re-id methods using six
person re-id benchmarking datasets.
Comment: ECCV 2018 Oral
Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-Aware Dynamics Modeling and Control
Koopman theory asserts that a nonlinear dynamical system can be mapped to a
linear system, where the Koopman operator advances observations of the state
forward in time. However, the observable functions that map states to
observations are generally unknown. We introduce the Deep Variational Koopman
(DVK) model, a method for inferring distributions over observations that can be
propagated linearly in time. By sampling from the inferred distributions, we
obtain a distribution over dynamical models, which in turn provides a
distribution over possible outcomes as a modeled system advances in time.
Experiments show that the DVK model is effective at long-term prediction for a
variety of dynamical systems. Furthermore, we describe how to incorporate the
learned models into a control framework, and demonstrate that accounting for
the uncertainty present in the distribution over dynamical models enables more
effective control.
Comment: Accepted to the 2019 International Joint Conference on Artificial
Intelligence (IJCAI). 8 pages, 3 figures
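The Koopman idea in the abstract above, that a nonlinear system can evolve linearly in a suitable space of observations, can be made concrete with a small hand-built example. The system, the observable functions, and the coefficient values below are assumptions chosen because the linear lift is known in closed form; the DVK model instead infers distributions over such observations from data.

```python
import numpy as np

def nonlinear_step(x, a=0.9, b=0.5, c=0.3):
    # Toy discrete-time nonlinear system: the x1**2 term makes it nonlinear.
    x1, x2 = x
    return np.array([a * x1, b * x2 + c * x1**2])

def lift(x):
    # Observables g(x) = (x1, x2, x1^2); in this space the dynamics are linear.
    x1, x2 = x
    return np.array([x1, x2, x1**2])

def koopman_matrix(a=0.9, b=0.5, c=0.3):
    # Finite-dimensional Koopman operator acting on the lifted observables.
    return np.array([
        [a,   0.0, 0.0],
        [0.0, b,   c],
        [0.0, 0.0, a**2],
    ])
```

Stepping the state with `nonlinear_step` and lifting gives the same result as lifting first and multiplying by `koopman_matrix()`, so multi-step prediction reduces to repeated matrix multiplication.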
Estimating Risk and Uncertainty in Deep Reinforcement Learning
Reinforcement learning agents are faced with two types of uncertainty.
Epistemic uncertainty stems from limited data and is useful for exploration,
whereas aleatoric uncertainty arises from stochastic environments and must be
accounted for in risk-sensitive applications. We highlight the challenges
involved in simultaneously estimating both of them, and propose a framework for
disentangling and estimating these uncertainties on learned Q-values. We derive
unbiased estimators of these uncertainties and introduce an uncertainty-aware
DQN algorithm, which we show exhibits safe learning behavior and outperforms
other DQN variants on the MinAtar testbed.
Comment: Work presented at the ICML 2020 Workshop on Uncertainty and
Robustness in Deep Learning
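The distinction between epistemic and aleatoric uncertainty drawn in the abstract above is often illustrated with the law of total variance applied to an ensemble. The sketch below uses that generic decomposition and is not the estimator derived in the paper; it assumes each ensemble member outputs a mean and a variance for every prediction, and the function name is hypothetical.

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Split predictive variance using the law of total variance.

    means, variances: arrays of shape (n_members, n_predictions) holding each
    ensemble member's predicted mean and predicted variance.
    """
    # Aleatoric: average noise the members attribute to the environment.
    aleatoric = variances.mean(axis=0)
    # Epistemic: disagreement between members about the mean.
    epistemic = means.var(axis=0)
    return epistemic, aleatoric
```

Their sum is the total predictive variance of the equally weighted mixture of members; the epistemic term shrinks as members agree with more data, while the aleatoric term does not.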
Deep Learning from Noisy Image Labels with Quality Embedding
There is an emerging trend to leverage noisy image datasets in many visual
recognition tasks. However, the label noise among the datasets severely
degrades the performance of deep learning approaches. Recently, one
mainstream approach is to introduce a latent label to handle label noise, which has
shown promising improvement in the network designs. Nevertheless, the mismatch
between latent labels and noisy labels still affects the predictions in such
methods. To address this issue, we propose a quality embedding model, which
explicitly introduces a quality variable to represent the trustworthiness of
noisy labels. Our key idea is to identify the mismatch between the latent and
noisy labels by embedding the quality variables into different subspaces, which
effectively minimizes the noise effect. At the same time, the high-quality
labels can still be used for training. To instantiate the model, we
further propose a Contrastive-Additive Noise network (CAN), which consists of
two important layers: (1) the contrastive layer estimates the quality variable
in the embedding space to reduce noise effect; and (2) the additive layer
aggregates the prior predictions and noisy labels as the posterior to train the
classifier. Moreover, to tackle the optimization difficulty, we derive an SGD
algorithm with the reparameterization trick, which makes our method scalable
to big data. We evaluate the proposed method experimentally over
a range of noisy image datasets. Comprehensive results demonstrate that CAN
outperforms the state-of-the-art deep learning approaches.
Comment: Under review for Transactions on Image Processing
Non-Intrusive Reduced-Order Modeling Using Uncertainty-Aware Deep Neural Networks and Proper Orthogonal Decomposition: Application to Flood Modeling
Deep Learning research is advancing at a fantastic rate, and there is much to
gain from transferring this knowledge to older fields like Computational Fluid
Dynamics in practical engineering contexts. This work compares state-of-the-art
methods that address uncertainty quantification in Deep Neural Networks,
pushing forward the reduced-order modeling approach of Proper Orthogonal
Decomposition-Neural Networks (POD-NN) with Deep Ensembles and Variational
Inference-based Bayesian Neural Networks on two-dimensional problems in space.
These are first tested on benchmark problems, and then applied to a real-life
application: flooding predictions in the Mille Îles River in the Montreal,
Quebec, Canada metropolitan area. Our setup involves a set of input parameters,
with a potentially noisy distribution, and accumulates the simulation data
resulting from these parameters. The goal is to build a non-intrusive surrogate
model that is able to know when it does not know, which is still an open
research area in Neural Networks (and in AI in general). With the help of this
model, probabilistic flooding maps are generated, aware of the model
uncertainty. These insights on the unknown are also utilized for an uncertainty
propagation task, allowing for flooded area predictions that are broader and
safer than those made with a regular uncertainty-uninformed surrogate model.
Our study of the time-dependent and highly nonlinear case of a dam break is
also presented. Both the ensembles and the Bayesian approach lead to reliable
results for multiple smooth physical solutions, providing the correct warning
when going out-of-distribution. However, the former, referred to as POD-EnsNN,
proved much easier to implement and showed greater flexibility than the latter
in the case of discontinuities, where standard algorithms may oscillate or fail
to converge.
Comment: To be published in the Journal of Computational Physics
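Proper Orthogonal Decomposition, the reduced-order step named in the abstract above, is commonly computed from a snapshot matrix with a singular value decomposition. The following minimal sketch assumes snapshots are stored as columns and uses hypothetical helper names; in a POD-NN style setup a network would then be trained to predict the reduced coefficients from the input parameters, which is not shown here.

```python
import numpy as np

def pod_basis(snapshots, n_modes):
    # snapshots: (n_dofs, n_snapshots), one simulation snapshot per column.
    U, singular_values, _ = np.linalg.svd(snapshots, full_matrices=False)
    # The leading left singular vectors form the orthonormal POD basis.
    return U[:, :n_modes], singular_values

def project(basis, snapshots):
    # Reduced coefficients: coordinates of each snapshot in the POD basis.
    return basis.T @ snapshots

def reconstruct(basis, coeffs):
    # Map reduced coefficients back to the full-order space.
    return basis @ coeffs
```

The decay of the returned singular values indicates how many modes are needed; a surrogate model only has to predict `n_modes` coefficients instead of the full field.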
Label-aware Document Representation via Hybrid Attention for Extreme Multi-Label Text Classification
Extreme multi-label text classification (XMTC) aims at tagging a document
with the most relevant labels from an extremely large-scale label set. It is a
challenging problem, especially for the tail labels, because there are only a few
training documents with which to build a classifier. This paper is motivated to better
explore the semantic relationship between each document and extreme labels by
taking advantage of both document content and label correlation. Our objective
is to establish an explicit label-aware representation for each document with a
hybrid attention deep neural network model (LAHA). LAHA consists of three parts.
The first part adopts a multi-label self-attention mechanism to detect the
contribution of each word to labels. The second part exploits the label
structure and document content to determine the semantic connection between
words and labels in the same latent space. An adaptive fusion strategy is
designed in the third part to obtain the final label-aware document
representation so that the essence of the previous two parts can be sufficiently
integrated. Extensive experiments have been conducted on six benchmark datasets
by comparing with the state-of-the-art methods. The results show the
superiority of our proposed LAHA method, especially on the tail labels
- …