298 research outputs found

    A Hybrid Siamese Neural Network for Natural Language Inference in Cyber-Physical Systems

    Cyber-Physical Systems (CPS), multi-dimensional complex systems that connect the physical world and the cyber world, have a strong demand for processing large amounts of heterogeneous data. This processing includes Natural Language Inference (NLI) tasks based on text from different sources. However, current research on natural language processing in CPS has not explored this area. Therefore, this study proposes a Siamese network structure that combines stacked residual bidirectional Long Short-Term Memory with an attention mechanism and a Capsule Network as the NLI module in CPS, which is used to infer the relationship between text/language data from different sources. The model is evaluated in detail on three main NLI benchmarks as the basic semantic understanding module in CPS. Comparative experiments show that the proposed method achieves competitive performance, generalizes to some degree, and balances performance against the number of trained parameters.

    Intention Detection Based on Siamese Neural Network With Triplet Loss

    Understanding the user's intention is an essential task for the spoken language understanding (SLU) module in a dialogue system, as it provides vital information for managing and generating future actions and responses. In this paper, we propose a triplet training framework based on the multiclass classification approach to train models for the intention detection task. Specifically, we utilize a Siamese neural network architecture with metric learning to construct a robust and discriminative utterance feature embedding model. We modified the RMCNN model and fine-tuned the BERT model to serve as Siamese encoders trained on utterance triplets from different semantic aspects. The triplet loss can effectively distinguish the details of two inputs by learning a mapping from utterance sequences to a compact Euclidean space. Once the mapping is learned, the intention detection task can be easily implemented using standard techniques with the pre-trained embeddings as feature vectors. In addition, we use a fusion strategy to enhance the utterance feature representation in the downstream intention detection task. We conduct experiments on several benchmark datasets for intention detection: the Snips dataset, the ATIS dataset, the Facebook multilingual task-oriented datasets, the Daily Dialogue dataset, and the MRDA dataset. The results show that the proposed method effectively improves recognition performance on these datasets and achieves new state-of-the-art results on single-turn task-oriented datasets (Snips dataset, Facebook dataset) and a multi-turn dataset (Daily Dialogue dataset).
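
    The triplet objective mentioned in the abstract above can be illustrated with a minimal sketch. This is the generic hinge-style triplet loss over Euclidean distances, not the paper's exact formulation: the function names, the margin value, and the use of unsquared distance are assumptions made for illustration, and the vectors stand in for utterance embeddings that the Siamese encoders would produce.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic hinge-style triplet loss (margin is an assumed hyperparameter).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows with the violation.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings: same-intent utterance nearby, other intent far away.
anchor = [0.0, 0.0]
same_intent = [0.0, 1.0]
other_intent = [3.0, 4.0]

well_separated = triplet_loss(anchor, same_intent, other_intent)
violated = triplet_loss(anchor, other_intent, same_intent)
```

    Training on such triplets pulls same-intent utterances together and pushes different-intent utterances apart, which is what lets a standard classifier work on the resulting embeddings.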

    Leveraging Expert Models for Training Deep Neural Networks in Scarce Data Domains: Application to Offline Handwritten Signature Verification

    This paper introduces a novel approach to leverage the knowledge of existing expert models for training new Convolutional Neural Networks in domains where task-specific data are limited or unavailable. The presented scheme is applied to offline handwritten signature verification (OffSV), which, like other biometric applications, suffers from inherent data limitations due to regulatory restrictions. The proposed Student-Teacher (S-T) configuration utilizes feature-based knowledge distillation (FKD), combining graph-based similarity for local activations with global similarity measures to supervise the student's training, using only handwritten text data. Remarkably, the models trained using this technique exhibit comparable, if not superior, performance to the teacher model across three popular signature datasets. More importantly, these results are attained without employing any signatures during the feature extraction training process. This study demonstrates the efficacy of leveraging existing expert models to overcome data scarcity challenges in OffSV and potentially other related domains.

    TransNFCM: Translation-Based Neural Fashion Compatibility Modeling

    Identifying mix-and-match relationships between fashion items is an urgent task in a fashion e-commerce recommender system, as it can significantly enhance user experience and satisfaction. However, due to the challenges of inferring the rich yet complicated set of compatibility patterns in a large e-commerce corpus of fashion items, this task is still underexplored. Inspired by recent advances in multi-relational knowledge representation learning and deep neural networks, this paper proposes a novel Translation-based Neural Fashion Compatibility Modeling (TransNFCM) framework, which jointly optimizes fashion item embeddings and category-specific complementary relations in a unified space in an end-to-end manner. TransNFCM places items in a unified embedding space where a category-specific relation (category-comp-category) is modeled as a vector translation operating on the embeddings of compatible items from the corresponding categories. In this way, we not only capture the specific notion of compatibility conditioned on a specific pair of complementary categories, but also preserve the global notion of compatibility. We also design a deep fashion item encoder which exploits the complementary characteristics of visual and textual features to represent the fashion products. To the best of our knowledge, this is the first work that uses category-specific complementary relations to model the category-aware compatibility between items in a translation-based embedding space. Extensive experiments demonstrate the effectiveness of TransNFCM over state-of-the-art methods on two real-world datasets. Comment: Accepted in AAAI 2019 conference
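
    The idea of modeling a category-specific relation as a vector translation can be sketched as follows. This is a TransE-style illustration only, not the paper's implementation: the function name, the choice of the L2 norm, and the toy 2-D embeddings are all assumptions. A smaller distance is read as higher compatibility.

```python
import math


def translation_distance(item_a, relation, item_b):
    """Distance ||item_a + relation - item_b|| (L2 norm is an assumed choice).

    `relation` stands for the category-pair vector (for example top -> bottom);
    compatible pairs should land close to zero after the translation.
    """
    return math.sqrt(
        sum((a + r - b) ** 2 for a, r, b in zip(item_a, relation, item_b))
    )


# Hypothetical embeddings and a hypothetical top -> bottom relation vector.
top = [1.0, 0.0]
top_to_bottom = [0.0, 2.0]
matching_bottom = [1.0, 2.0]
clashing_bottom = [4.0, 6.0]

d_match = translation_distance(top, top_to_bottom, matching_bottom)
d_clash = translation_distance(top, top_to_bottom, clashing_bottom)
```

    Because each category pair has its own relation vector while all items share one embedding space, the same item embedding can be scored against candidates from different complementary categories.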