    Context-Aware Deep Sequence Learning with Multi-View Factor Pooling for Time Series Classification

    In this paper, we propose an effective, multi-view, multivariate deep classification model for time-series data. Multi-view methods show promise in their ability to learn correlation and exclusivity properties across different independent information resources. However, most current multi-view integration schemes employ only a linear model and, therefore, do not extensively utilize the relationships observed across different view-specific representations. Moreover, the majority of these methods rely exclusively on sophisticated, handcrafted features to capture local data patterns and, thus, depend heavily on large collections of labeled data. The multi-view, multivariate deep classification model proposed in this paper makes important contributions to address these limitations. The proposed model derives an LSTM-based, deep feature descriptor to model both the view-specific data characteristics and the cross-view interactions in an integrated deep architecture while driving the learning phase in a data-driven manner. The model employs a compact context descriptor that exploits view-specific affinity information to design a more insightful context representation. Finally, the model uses a multi-view factor-pooling scheme for a context-driven attention learning strategy to weigh the most relevant feature dimensions while eliminating noise from the resulting fused descriptor. As shown by experiments, compared to existing multi-view methods, the proposed multi-view deep sequential learning approach improves classification performance by roughly 4% on the UCI multi-view activity recognition dataset, while also showing significantly more robust generalization capacity than its single-view counterparts when classifying several large-scale multi-view light curve collections.
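
    The fusion scheme described in this abstract (per-view LSTM descriptors, a compact context descriptor, and context-driven attention pooling) can be sketched roughly as below. This is a hedged illustration, not the authors' implementation: the module names, dimensions, and the exact attention and pooling rules are assumptions.

```python
# Hypothetical sketch (PyTorch): per-view LSTM descriptors fused by a
# context-driven attention pooling. All names and dimensions are illustrative.
import torch
import torch.nn as nn

class MultiViewLSTMPooling(nn.Module):
    def __init__(self, view_dims, hidden_dim=64, num_classes=6):
        super().__init__()
        # One LSTM per view to model view-specific temporal characteristics.
        self.encoders = nn.ModuleList(
            [nn.LSTM(d, hidden_dim, batch_first=True) for d in view_dims]
        )
        # Compact context descriptor summarizing all view features.
        self.context = nn.Linear(hidden_dim * len(view_dims), hidden_dim)
        # Attention score per view, conditioned on the shared context.
        self.attn = nn.Linear(hidden_dim * 2, 1)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, views):
        # views: list of tensors, each (batch, time, view_dim)
        feats = []
        for enc, x in zip(self.encoders, views):
            _, (h, _) = enc(x)           # final hidden state as view descriptor
            feats.append(h[-1])          # (batch, hidden_dim)
        ctx = torch.tanh(self.context(torch.cat(feats, dim=1)))
        # Context-driven attention: weight each view descriptor by relevance.
        scores = torch.stack(
            [self.attn(torch.cat([f, ctx], dim=1)) for f in feats], dim=1
        )                                 # (batch, n_views, 1)
        weights = torch.softmax(scores, dim=1)
        fused = (torch.stack(feats, dim=1) * weights).sum(dim=1)
        return self.classifier(fused)

# Example: three sensor views (e.g. accelerometer, gyroscope, magnetometer)
# over a 100-step sequence.
model = MultiViewLSTMPooling(view_dims=[3, 3, 3])
logits = model([torch.randn(8, 100, 3) for _ in range(3)])
print(logits.shape)  # torch.Size([8, 6])
```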

    A Clustering-guided Contrastive Fusion for Multi-view Representation Learning

    The past two decades have seen increasingly rapid advances in the field of multi-view representation learning, owing to its ability to extract useful information from diverse domains and thereby facilitate the development of multi-view applications. However, the community faces two challenges: i) how to learn robust representations from large amounts of unlabeled data that withstand noise or incomplete-view settings, and ii) how to balance view consistency and complementarity for various downstream tasks. To this end, we utilize a deep fusion network to fuse view-specific representations into a view-common representation, extracting high-level semantics to obtain a robust representation. In addition, we employ a clustering task to guide the fusion network and prevent it from collapsing to trivial solutions. To balance consistency and complementarity, we then design an asymmetrical contrastive strategy that aligns the view-common representation with each view-specific representation. These modules are incorporated into a unified method known as CLustering-guided cOntrastiVE fusioN (CLOVEN). We quantitatively and qualitatively evaluate the proposed method on five datasets, demonstrating that CLOVEN outperforms 11 competitive multi-view learning methods in clustering and classification. In the incomplete-view scenario, the proposed method resists noise interference better than its competitors. Furthermore, visualization analysis shows that CLOVEN preserves the intrinsic structure of the view-specific representations while also improving the compactness of the view-common representation. Our source code will be available soon at https://github.com/guanzhou-ke/cloven. Comment: 13 pages, 9 figures
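
    A minimal sketch of the asymmetrical alignment idea follows, assuming an InfoNCE-style contrastive loss between the fused view-common representation and each view-specific representation. The stop-gradient used here to make the alignment asymmetric is an assumption, not necessarily CLOVEN's exact formulation.

```python
# Hedged sketch: align the view-common representation with each view-specific
# representation using an InfoNCE-style loss. Names and the loss form are
# illustrative assumptions.
import torch
import torch.nn.functional as F

def asymmetric_contrastive_loss(common, view_specifics, temperature=0.5):
    """common: (batch, dim); view_specifics: list of (batch, dim) tensors."""
    common = F.normalize(common, dim=1)
    loss = 0.0
    for spec in view_specifics:
        # Asymmetry (assumed): no gradient flows back into the view branch.
        spec = F.normalize(spec, dim=1).detach()
        logits = common @ spec.t() / temperature  # (batch, batch) similarities
        targets = torch.arange(common.size(0), device=common.device)
        # Positive pair: the same sample's common and view-specific features.
        loss = loss + F.cross_entropy(logits, targets)
    return loss / len(view_specifics)

# Example usage with two views.
z_common = torch.randn(16, 128)
z_views = [torch.randn(16, 128), torch.randn(16, 128)]
print(asymmetric_contrastive_loss(z_common, z_views).item())
```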

    Object Detection and Classification in Occupancy Grid Maps using Deep Convolutional Networks

    A detailed environment perception is a crucial component of automated vehicles. However, to deal with the amount of perceived information, we also require segmentation strategies. Based on a grid map environment representation, well suited for sensor fusion, free-space estimation, and machine learning, we detect and classify objects using deep convolutional neural networks. As input for our networks we use a multi-layer grid map that efficiently encodes 3D range sensor information. The inference output consists of a list of rotated bounding boxes with associated semantic classes. We conduct extensive ablation studies, highlight important design considerations when using grid maps, and evaluate our models on the KITTI Bird's Eye View benchmark. Qualitative and quantitative benchmark results show that we achieve robust detection and state-of-the-art accuracy solely using top-view grid maps from range sensor data. Comment: 6 pages, 4 tables, 4 figures
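
    The multi-layer grid map input can be illustrated as follows: each top-view cell holds a few statistics computed from the 3D range measurements falling into it. The specific channels (max height, log point density, mean intensity) and grid parameters below are assumptions for illustration, not the paper's exact encoding.

```python
# Illustrative sketch: encode 3D range-sensor points into a multi-channel
# top-view grid map suitable as CNN input. Channel choices are assumptions.
import numpy as np

def points_to_grid_map(points, intensity, x_range=(0, 60), y_range=(-30, 30),
                       resolution=0.1):
    """points: (N, 3) array of x, y, z; intensity: (N,) reflectance values."""
    h = int((x_range[1] - x_range[0]) / resolution)
    w = int((y_range[1] - y_range[0]) / resolution)
    grid = np.zeros((3, h, w), dtype=np.float32)  # channels: height, density, intensity
    ix = ((points[:, 0] - x_range[0]) / resolution).astype(int)
    iy = ((points[:, 1] - y_range[0]) / resolution).astype(int)
    valid = (ix >= 0) & (ix < h) & (iy >= 0) & (iy < w)
    for i, j, z, r in zip(ix[valid], iy[valid], points[valid, 2], intensity[valid]):
        grid[0, i, j] = max(grid[0, i, j], z)   # max height per cell
        grid[1, i, j] += 1.0                    # point count (density)
        grid[2, i, j] += r                      # summed intensity
    occupied = grid[1] > 0
    grid[2][occupied] /= grid[1][occupied]      # mean intensity per occupied cell
    grid[1] = np.log1p(grid[1])                 # compress the density range
    return grid

pts = np.random.rand(1000, 3) * [60, 60, 3] + [0, -30, 0]
gm = points_to_grid_map(pts, np.random.rand(1000))
print(gm.shape)  # (3, 600, 600)
```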

    Multi Branch Siamese Network For Person Re-Identification

    To capture robust person features, learning discriminative, style- and view-invariant descriptors is a key challenge in person re-identification (re-id). Most deep re-id models learn single-scale feature representations, which are unable to grasp compact and style-invariant representations. In this paper, we present a multi-branch Siamese deep neural network with multiple classifiers to overcome these issues. The multi-branch learning of the network creates a stronger descriptor with fine-grained information from the global features of a person. Camera-to-camera image translation is performed with a generative adversarial network to generate diverse data and add style invariance to the learned features. Experimental results on benchmark datasets demonstrate that the proposed method performs better than other state-of-the-art methods.
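
    A rough sketch of the multi-branch idea, assuming a shared convolutional trunk feeding a global branch and a part-based branch, each with its own identity classifier; the backbone, branch layout, and dimensions are placeholders rather than the paper's architecture.

```python
# Hypothetical sketch (PyTorch) of a multi-branch re-id head with multiple
# classifiers. The tiny backbone stands in for e.g. a ResNet trunk.
import torch
import torch.nn as nn

class MultiBranchReID(nn.Module):
    def __init__(self, num_ids=751, feat_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.global_pool = nn.AdaptiveAvgPool2d(1)
        self.part_pool = nn.AdaptiveAvgPool2d((2, 1))   # upper / lower body parts
        # One classifier per branch, all trained on identity labels.
        self.global_cls = nn.Linear(feat_dim, num_ids)
        self.part_cls = nn.ModuleList([nn.Linear(feat_dim, num_ids) for _ in range(2)])

    def forward(self, x):
        f = self.backbone(x)
        g = self.global_pool(f).flatten(1)              # (batch, feat_dim)
        parts = self.part_pool(f).squeeze(-1)           # (batch, feat_dim, 2)
        logits = [self.global_cls(g)] + [
            cls(parts[:, :, i]) for i, cls in enumerate(self.part_cls)
        ]
        # Concatenated branch features form the final fine-grained descriptor.
        descriptor = torch.cat([g] + [parts[:, :, i] for i in range(2)], dim=1)
        return logits, descriptor

model = MultiBranchReID()
logits, desc = model(torch.randn(4, 3, 256, 128))
print(desc.shape)  # torch.Size([4, 768])
```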

    Uncertainty-guided Boundary Learning for Imbalanced Social Event Detection

    Real-world social events typically exhibit a severely class-imbalanced distribution, which makes the trained detection model encounter a serious generalization challenge. Most studies solve this problem from the frequency perspective and emphasize representation or classifier learning for tail classes. In our observation, however, compared to the rarity of classes, the calibrated uncertainty estimated from well-trained evidential deep learning networks better reflects model performance. To this end, we propose a novel uncertainty-guided class-imbalance learning framework, UCL_SED, and its variant, UCL-EC_SED, for imbalanced social event detection tasks. We aim to improve overall model performance by enhancing model generalization on uncertain classes. Considering that performance degradation usually comes from misclassifying samples as their confusing neighboring classes, we focus on boundary learning in latent space and classifier learning with high-quality uncertainty estimation. First, we design a novel uncertainty-guided contrastive learning loss, namely UCL, and its variant, UCL-EC, to obtain distinguishable representation distributions for imbalanced data. During training, they force all classes, especially uncertain ones, to adaptively adjust a clearly separable boundary in the feature space. Second, to obtain more robust and accurate class uncertainty, we combine the results of multi-view evidential classifiers via Dempster-Shafer theory under the supervision of an additional calibration method. We conduct experiments on three severely imbalanced social event datasets: Events2012_100, Events2018_100, and CrisisLexT_7. Our model significantly improves social event representation and classification in almost all classes, especially the uncertain ones. Comment: Accepted by TKDE 202
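
    The multi-view evidential combination step can be illustrated with the standard reduced Dempster-Shafer rule used in evidential deep learning: per-view evidence is turned into belief masses plus an uncertainty mass, and two views are combined after discounting their conflict. The sketch below is illustrative and omits the calibration supervision the abstract mentions.

```python
# Hedged sketch: combine two evidential (Dirichlet-based) classifiers with
# Dempster's rule. Variable names are illustrative.
import torch

def evidence_to_belief(evidence):
    """evidence: (batch, K) non-negative evidence -> (belief masses, uncertainty)."""
    K = evidence.size(1)
    S = evidence.sum(dim=1, keepdim=True) + K   # Dirichlet strength
    return evidence / S, K / S                  # b: (batch, K), u: (batch, 1)

def dempster_combine(b1, u1, b2, u2):
    # Conflict: mass the two views assign to different classes.
    total = b1.sum(1, keepdim=True) * b2.sum(1, keepdim=True)
    agree = (b1 * b2).sum(1, keepdim=True)
    conflict = total - agree
    norm = 1.0 - conflict
    b = (b1 * b2 + b1 * u2 + b2 * u1) / norm
    u = (u1 * u2) / norm
    return b, u

# Example: two view-specific classifiers over 4 classes.
e1, e2 = torch.rand(8, 4) * 5, torch.rand(8, 4) * 5
b1, u1 = evidence_to_belief(e1)
b2, u2 = evidence_to_belief(e2)
b, u = dempster_combine(b1, u1, b2, u2)
print(b.sum(1) + u.squeeze(1))  # combined masses plus uncertainty sum to ~1
```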