Context-Aware Deep Sequence Learning with Multi-View Factor Pooling for Time Series Classification
In this paper, we propose an effective, multi-view, multivariate deep classification model for time-series data. Multi-view methods show promise in their ability to learn correlation and exclusivity properties across different independent information resources. However, most current multi-view integration schemes employ only a linear model and, therefore, do not extensively utilize the relationships observed across different view-specific representations. Moreover, the majority of these methods rely exclusively on sophisticated, handcrafted features to capture local data patterns and, thus, depend heavily on large collections of labeled data. The multi-view, multivariate deep classification model proposed in this paper makes important contributions toward addressing these limitations. The proposed model derives an LSTM-based, deep feature descriptor to model both the view-specific data characteristics and the cross-view interactions in an integrated deep architecture while driving the learning phase in a data-driven manner. It employs a compact context descriptor that exploits view-specific affinity information to build a more insightful context representation. Finally, the model uses a multi-view factor-pooling scheme for a context-driven attention learning strategy that weights the most relevant feature dimensions while eliminating noise from the resulting fused descriptor. As shown by experiments, compared to existing multi-view methods, the proposed multi-view deep sequential learning approach improves classification performance by roughly 4% on the UCI multi-view activity recognition dataset, while also showing substantially more robust, generalized representation capacity than its single-view counterparts when classifying several large-scale multi-view light curve collections.
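The architecture this abstract describes (per-view LSTM descriptors, a compact context vector, and context-driven attention over the fused features) can be illustrated with a minimal PyTorch sketch. All module names, sizes, and the exact pooling rule below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: per-view LSTM encoders, a fused context descriptor, and
# context-driven attention over feature dimensions. Illustrative only.
import torch
import torch.nn as nn

class MultiViewLSTMClassifier(nn.Module):
    def __init__(self, view_dims, hidden=64, n_classes=6):
        super().__init__()
        # One LSTM feature descriptor per view (view-specific characteristics).
        self.encoders = nn.ModuleList(
            nn.LSTM(d, hidden, batch_first=True) for d in view_dims
        )
        # Compact context descriptor summarizing all views.
        self.context = nn.Linear(hidden * len(view_dims), hidden)
        # Context-driven attention weights over fused feature dimensions.
        self.attn = nn.Linear(hidden, hidden * len(view_dims))
        self.head = nn.Linear(hidden * len(view_dims), n_classes)

    def forward(self, views):  # views: list of (B, T, d_v) tensors
        feats = [enc(x)[0][:, -1] for enc, x in zip(self.encoders, views)]
        fused = torch.cat(feats, dim=-1)        # cross-view descriptor
        ctx = torch.tanh(self.context(fused))   # context representation
        gate = torch.sigmoid(self.attn(ctx))    # per-dimension relevance
        return self.head(fused * gate)          # attention-pooled logits

model = MultiViewLSTMClassifier(view_dims=[3, 3, 3])
logits = model([torch.randn(8, 50, 3) for _ in range(3)])
```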
A Clustering-guided Contrastive Fusion for Multi-view Representation Learning
The past two decades have seen increasingly rapid advances in multi-view representation learning, which extracts useful information from diverse domains to facilitate the development of multi-view applications. However, the community faces two challenges: i) how to learn robust representations from large amounts of unlabeled data so as to withstand noise or incomplete-view settings, and ii) how to balance view consistency and complementarity for various downstream tasks. To this end, we utilize a deep fusion network to fuse view-specific representations into a view-common representation, extracting high-level semantics to obtain a robust representation. In addition, we employ a clustering task to guide the fusion network and prevent it from producing trivial solutions. To balance consistency and complementarity, we then design an asymmetric contrastive strategy that aligns the view-common representation with each view-specific representation. These modules are incorporated into a unified method known as CLustering-guided cOntrastiVE fusioN (CLOVEN). We quantitatively and qualitatively evaluate the proposed method on five datasets, demonstrating that CLOVEN outperforms 11 competitive multi-view learning methods in clustering and classification. In the incomplete-view scenario, our proposed method resists noise interference better than its competitors. Furthermore, visualization analysis shows that CLOVEN preserves the intrinsic structure of the view-specific representations while also improving the compactness of the view-common representation. Our source code will be available soon at https://github.com/guanzhou-ke/cloven.
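The asymmetric alignment between the view-common and each view-specific representation can be sketched roughly as follows; the InfoNCE-style loss form, temperature, and stop-gradient placement are assumptions rather than the released CLOVEN implementation.

```python
# Minimal sketch of an asymmetric contrastive alignment between a fused
# view-common representation and each view-specific representation.
import torch
import torch.nn.functional as F

def asymmetric_contrastive_loss(common, view_specifics, tau=0.5):
    """common: (B, D); view_specifics: list of (B, D) tensors."""
    common = F.normalize(common, dim=-1)
    loss = 0.0
    for z in view_specifics:
        # Asymmetry (assumed): gradients flow into the view-specific branch
        # only, so alignment does not collapse the common representation.
        z = F.normalize(z, dim=-1)
        logits = z @ common.detach().t() / tau          # (B, B) similarities
        targets = torch.arange(z.size(0), device=z.device)
        loss = loss + F.cross_entropy(logits, targets)  # InfoNCE-style term
    return loss / len(view_specifics)

loss = asymmetric_contrastive_loss(torch.randn(8, 32),
                                   [torch.randn(8, 32) for _ in range(2)])
```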
Object Detection and Classification in Occupancy Grid Maps using Deep Convolutional Networks
Detailed environment perception is a crucial component of automated vehicles. However, to deal with the amount of perceived information, we also require segmentation strategies. Based on a grid map environment representation, which is well suited for sensor fusion, free-space estimation, and machine learning, we detect and classify objects using deep convolutional neural networks. As input for our networks, we use a multi-layer grid map that efficiently encodes 3D range sensor information. The inference output consists of a list of rotated bounding boxes with associated semantic classes. We conduct extensive ablation studies, highlight important design considerations when using grid maps, and evaluate our models on the KITTI Bird's Eye View benchmark. Qualitative and quantitative benchmark results show that we achieve robust detection and state-of-the-art accuracy solely using top-view grid maps from range sensor data.
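As a rough illustration of the input encoding described here, the following sketch rasterizes a 3D point cloud into a multi-layer bird's-eye-view grid (occupancy count, maximum height, mean intensity per cell). Cell size, extent, and the choice of layers are assumptions for illustration, not the paper's exact encoding.

```python
# Minimal sketch: encode a 3D point cloud as a multi-layer BEV grid map.
import numpy as np

def pointcloud_to_grid(points, intensity, extent=40.0, cell=0.2):
    """points: (N, 3) xyz in meters; intensity: (N,) reflectance."""
    n = int(2 * extent / cell)
    grid = np.zeros((3, n, n), dtype=np.float32)  # layers: count, z_max, i_sum
    ix = ((points[:, 0] + extent) / cell).astype(int)
    iy = ((points[:, 1] + extent) / cell).astype(int)
    ok = (ix >= 0) & (ix < n) & (iy >= 0) & (iy < n)
    for x, y, z, i in zip(ix[ok], iy[ok], points[ok, 2], intensity[ok]):
        grid[0, y, x] += 1.0                       # occupancy count
        grid[1, y, x] = max(grid[1, y, x], z)      # max height per cell
        grid[2, y, x] += i                         # intensity sum
    # Normalize the intensity layer to a per-cell mean.
    grid[2] = np.divide(grid[2], grid[0], out=np.zeros_like(grid[2]),
                        where=grid[0] > 0)
    return grid  # (3, n, n) array suitable as CNN input

pts = np.random.uniform(-40, 40, (1000, 3)).astype(np.float32)
g = pointcloud_to_grid(pts, np.random.rand(1000).astype(np.float32))
```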
Multi Branch Siamese Network For Person Re-Identification
To capture robust person features, learning discriminative, style- and view-invariant descriptors is a key challenge in person re-identification (re-id). Most deep re-id models learn single-scale feature representations, which are unable to capture compact, style-invariant representations. In this paper, we present a multi-branch Siamese deep neural network with multiple classifiers to overcome these issues. The multi-branch learning of the network creates a stronger descriptor with fine-grained information from the global features of a person. Camera-to-camera image translation is performed with a generative adversarial network to generate diverse data and add style invariance to the learned features. Experimental results on benchmark datasets demonstrate that the proposed method performs better than other state-of-the-art methods.
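A minimal sketch of such a multi-branch network with one classifier per branch (a global branch plus horizontally striped part branches over a shared feature map) might look as follows; the stand-in backbone, branch layout, and dimensions are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch: multi-branch re-id head with one classifier per branch.
import torch
import torch.nn as nn

class MultiBranchReID(nn.Module):
    def __init__(self, n_ids=751, channels=256, n_parts=2):
        super().__init__()
        self.backbone = nn.Sequential(               # stand-in shared CNN
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.n_parts = n_parts
        # One identity classifier per branch (1 global + n_parts part branches).
        self.classifiers = nn.ModuleList(
            nn.Linear(channels, n_ids) for _ in range(1 + n_parts)
        )

    def forward(self, x):                            # x: (B, 3, H, W)
        fmap = self.backbone(x)                      # (B, C, h, w)
        feats = [fmap.mean(dim=(2, 3))]              # global average pool
        for p in fmap.chunk(self.n_parts, dim=2):    # horizontal stripes
            feats.append(p.mean(dim=(2, 3)))         # part-level pooling
        return [clf(f) for clf, f in zip(self.classifiers, feats)]

logits = MultiBranchReID()(torch.randn(4, 3, 128, 64))
```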
Uncertainty-guided Boundary Learning for Imbalanced Social Event Detection
Real-world social events typically exhibit a severely class-imbalanced distribution, which poses a serious generalization challenge for trained detection models. Most studies address this problem from the frequency perspective and emphasize representation or classifier learning for tail classes. In our observation, however, the calibrated uncertainty estimated from well-trained evidential deep learning networks reflects model performance better than class rarity does. To this end, we propose a novel uncertainty-guided class-imbalance learning framework, UCL, and its variant, UCL-EC, for imbalanced social event detection tasks. We aim to improve overall model performance by enhancing model generalization to uncertain classes. Considering that performance degradation usually comes from misclassifying samples as their confusable neighboring classes, we focus on boundary learning in the latent space and on classifier learning with high-quality uncertainty estimation. First, we design a novel uncertainty-guided contrastive learning loss, UCL, and its variant, UCL-EC, to shape a distinguishable representation distribution for imbalanced data. During training, they force all classes, especially uncertain ones, to adaptively adjust a clearly separable boundary in the feature space. Second, to obtain more robust and accurate class uncertainty, we combine the results of multi-view evidential classifiers via Dempster-Shafer theory under the supervision of an additional calibration method. We conduct experiments on three severely imbalanced social event datasets: Events2012_100, Events2018_100, and CrisisLexT_7. Our model significantly improves social event representation and classification in almost all classes, especially the uncertain ones.
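The multi-view evidence fusion step mentioned above is commonly implemented by mapping each classifier's non-negative evidence to a Dirichlet-based opinion and combining opinions with a reduced form of Dempster's rule. The sketch below follows that standard evidential deep learning formulation and is not the authors' released code.

```python
# Minimal sketch: Dempster-Shafer combination of two evidential classifiers.
import torch

def opinion(evidence):
    """Evidence (B, K) >= 0 -> belief masses b (B, K) and uncertainty u (B, 1)."""
    K = evidence.size(1)
    S = evidence.sum(dim=1, keepdim=True) + K   # Dirichlet strength, alpha = e + 1
    return evidence / S, K / S

def dempster_combine(b1, u1, b2, u2):
    """Reduced Dempster's rule for two multinomial opinions."""
    total = b1.sum(1, keepdim=True) * b2.sum(1, keepdim=True)
    conflict = total - (b1 * b2).sum(1, keepdim=True)  # sum_{i != j} b1_i b2_j
    norm = 1.0 - conflict                              # discard conflicting mass
    b = (b1 * b2 + b1 * u2 + b2 * u1) / norm
    u = (u1 * u2) / norm
    return b, u

# Two views' evidence over 4 classes for a batch of 3 samples.
e1, e2 = torch.rand(3, 4), torch.rand(3, 4)
b, u = dempster_combine(*opinion(e1), *opinion(e2))
```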