Deep attentive video summarization with distribution consistency learning
This article studies supervised video summarization by formulating it as a sequence-to-sequence learning problem, in which the input and output are sequences of original video frames and their predicted importance scores, respectively. Two critical issues are addressed in this article: short-term contextual attention insufficiency and distribution inconsistency. The former is the failure to capture short-term contextual attention information within the video sequence itself, since existing approaches focus largely on long-term encoder-decoder attention. The latter is the inconsistency between the distribution of the predicted importance score sequence and that of the ground-truth sequence, which may lead to a suboptimal solution. To mitigate the first issue, we incorporate a self-attention mechanism in the encoder to highlight the important keyframes in a short-term context. The proposed approach, alongside the encoder-decoder attention, constitutes our deep attentive models for video summarization. For the second issue, we propose a distribution consistency learning method that employs a simple yet effective regularization loss term, which seeks a consistent distribution for the two sequences. Our final approach is dubbed Attentive and Distribution consistent video Summarization (ADSum). Extensive experiments on benchmark data sets demonstrate the superiority of the proposed ADSum approach over state-of-the-art approaches.
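As a rough illustration of the distribution consistency idea, the sketch below adds a regularization term to a frame-level regression loss so that the predicted and ground-truth importance score sequences have similar distributions. The abstract does not state the form of the regularizer: the KL divergence, the softmax normalization, the weight `lam`, and every function name here are assumptions made for illustration, not the ADSum implementation.

```python
import numpy as np

def softmax(x):
    """Turn a 1-D score sequence into a probability distribution."""
    z = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return z / z.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def summarization_loss(pred, target, lam=0.1):
    """MSE on importance scores plus an assumed KL consistency regularizer."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    mse = float(np.mean((pred - target) ** 2))
    # Hypothetical regularizer: compare the normalized score distributions.
    reg = kl_divergence(softmax(target), softmax(pred))
    return mse + lam * reg
```

With this form, the regularizer vanishes when the two sequences agree and grows as the predicted scores place their mass on different frames than the ground truth.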
A Semi-Supervised Learning Approach for Ranging Error Mitigation Based on UWB Waveform
Localization systems based on ultra-wide band (UWB) measurements can have
unsatisfactory performance in harsh environments due to the presence of
non-line-of-sight (NLOS) errors. Learning-based methods for error mitigation
have shown great performance improvement via directly exploiting the wideband
waveform instead of handcrafted features. However, these methods require data
samples fully labeled with actual measurement errors for training, which leads
to time-consuming data collection. In this paper, we propose a semi-supervised
learning method based on variational Bayes for UWB ranging error mitigation.
Combining deep learning techniques and statistical tools, our method can
efficiently accumulate knowledge from both labeled and unlabeled data samples.
Extensive experiments illustrate the effectiveness of the proposed method under
different supervision rates, and its superiority over fully
supervised methods even at a low supervision rate.
Comment: 5 pages, 3 figures. Published in: MILCOM 2021 - 2021 IEEE Military
Communications Conference (MILCOM)
Deep Generative Model for Simultaneous Range Error Mitigation and Environment Identification
Received waveforms carry rich information about both range and
environment semantics. However, their full potential is hard to exploit under
multipath and non-line-of-sight conditions. This paper proposes a deep
generative model (DGM) for simultaneous range error mitigation and environment
identification. In particular, we present a Bayesian model for the generative
process of the received waveform, composed of latent variables for both
range-related features and environment semantics. The simultaneous range error
mitigation and environment identification is interpreted as an inference
problem based on the DGM, and implemented in a unique end-to-end learning
scheme. Comprehensive experiments on a general Ultra-wideband dataset
demonstrate the superior performance on range error mitigation, scalability to
different environments, and novel capability on simultaneous environment
identification.
Comment: 6 pages, 5 figures. Published in: 2021 IEEE Global Communications
Conference (GLOBECOM)
Variational Bayesian Framework for Advanced Image Generation with Domain-Related Variables
Deep generative models (DGMs) and their conditional counterparts provide a
powerful ability for general-purpose generative modeling of data distributions.
However, it remains challenging for existing methods to address advanced
conditional generative problems without annotations, which can enable multiple
applications like image-to-image translation and image editing. We present a
unified Bayesian framework for such problems, which introduces an inference
stage on latent variables within the learning process. In particular, we
propose a variational Bayesian image translation network (VBITN) that enables
multiple image translation and editing tasks. Comprehensive experiments show
the effectiveness of our method on unsupervised image-to-image translation, and
demonstrate the novel advanced capabilities for semantic editing and mixed
domain translation.
Comment: 5 pages, 2 figures
Generalized Expectation Maximization Framework for Blind Image Super Resolution
Learning-based methods for blind single image super resolution (SISR) conduct
the restoration by a learned mapping between high-resolution (HR) images and
their low-resolution (LR) counterparts degraded with arbitrary blur kernels.
However, these methods mostly require an independent step to estimate the blur
kernel, leading to error accumulation between steps. We propose an end-to-end
learning framework for the blind SISR problem, which enables image restoration
within a unified Bayesian framework with either full- or semi-supervision. The
proposed method, namely SREMN, integrates learning techniques into the
generalized expectation-maximization (GEM) algorithm and infers HR images by
maximum likelihood estimation (MLE). Extensive experiments show the
superiority of the proposed method over existing work and its novelty
in semi-supervised learning.
DeepKriging: Spatially Dependent Deep Neural Networks for Spatial Prediction
In spatial statistics, a common objective is to predict the values of a
spatial process at unobserved locations by exploiting spatial dependence. In
geostatistics, Kriging provides the best linear unbiased predictor using
covariance functions and is often associated with Gaussian processes. However,
when considering non-linear prediction for non-Gaussian and categorical data,
the Kriging prediction is not necessarily optimal, and the associated variance
is often overly optimistic. We propose to use deep neural networks (DNNs) for
spatial prediction. Although DNNs are widely used for general classification
and prediction, they have not been studied thoroughly for data with spatial
dependence. In this work, we propose a novel neural network structure for
spatial prediction by adding an embedding layer of spatial coordinates with
basis functions. We show in theory that the proposed DeepKriging method has
multiple advantages over Kriging and over classical DNNs that use only spatial
coordinates as features. We also provide density prediction for uncertainty
quantification without any distributional assumption and apply the method to
PM2.5 concentrations across the continental United States.
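The embedding of spatial coordinates with basis functions can be sketched as follows. The abstract only says "basis functions"; the Gaussian radial basis, the regular grid of knots on the unit square, the bandwidth, and the names `rbf_embedding` and `grid_knots` are all assumptions for illustration and may differ from the basis used in DeepKriging.

```python
import numpy as np

def rbf_embedding(coords, knots, bandwidth):
    """Map (n, 2) coordinates to (n, k) Gaussian radial basis features."""
    coords = np.asarray(coords, dtype=float)
    knots = np.asarray(knots, dtype=float)
    # Pairwise squared distances between locations and knots: shape (n, k).
    d2 = ((coords[:, None, :] - knots[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def grid_knots(m):
    """m x m regular grid of knots on the unit square (hypothetical choice)."""
    g = np.linspace(0.0, 1.0, m)
    xx, yy = np.meshgrid(g, g)
    return np.column_stack([xx.ravel(), yy.ravel()])

# Three example locations embedded against a 5 x 5 grid of knots.
knots = grid_knots(5)
coords = np.array([[0.0, 0.0], [0.5, 0.5], [0.9, 0.1]])
features = rbf_embedding(coords, knots, bandwidth=0.25)
```

Each row of `features` replaces the raw coordinate pair as network input, so nearby locations receive similar feature vectors; in a full model these features would be passed to the dense layers of the network.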