201 research outputs found
Intention Detection Based on Siamese Neural Network With Triplet Loss
Understanding the user's intention is an essential task for the spoken language understanding (SLU) module in a dialogue system, as it provides vital information for managing and generating future actions and responses. In this paper, we propose a triplet training framework based on the multiclass classification approach to conduct the training for the intention detection task. Specifically, we utilize a Siamese neural network architecture with metric learning to construct a robust and discriminative utterance feature embedding model. We modified the RMCNN model and a fine-tuned BERT model to serve as Siamese encoders that train on utterance triplets from different semantic aspects. The triplet loss can effectively distinguish the details of two inputs by learning a mapping from utterance sequences to a compact Euclidean space. After generating the mapping, the intention detection task can be easily implemented using standard techniques with the pre-trained embeddings as feature vectors. In addition, we use a fusion strategy to enhance the utterance feature representation for the downstream intention detection task. We conduct experiments on several benchmark datasets for intention detection: the Snips dataset, the ATIS dataset, the Facebook multilingual task-oriented datasets, the Daily Dialogue dataset, and the MRDA dataset. The results show that the proposed method effectively improves recognition performance on these datasets and achieves new state-of-the-art results on single-turn task-oriented datasets (Snips dataset, Facebook dataset) and a multi-turn dataset (Daily Dialogue dataset).
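The triplet loss mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the margin value, and the use of plain (rather than squared) Euclidean distance are assumptions made for illustration, and the short vectors stand in for utterance embeddings that a Siamese encoder would produce.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (margin is an assumed hyperparameter).

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (hypothetical values, not taken from the paper).
anchor = [0.0, 0.0]
same_intent = [0.0, 1.0]    # distance 1 from the anchor
other_intent = [3.0, 4.0]   # distance 5 from the anchor

satisfied = triplet_loss(anchor, same_intent, other_intent)
violated = triplet_loss(anchor, other_intent, same_intent)
```

In the first call the triplet already respects the margin, so the loss is zero; in the second the roles are swapped and the loss is positive, which is the signal that would push the embeddings apart during training.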
Siamese-Network Based Signature Verification using Self Supervised Learning
Signatures appear in a wide range of public documents, from academic to business documents, which shows how crucial they are to various administrative processes. Their frequent use does not mean the procedure is free of loopholes, and we must remain vigilant against signature falsification carried out for various motives. Therefore, in this study, a signature verification system was developed to prevent the falsification of signatures in public documents by using digital images of existing signatures. This study used neural networks with Siamese network-based architectures, combined with self-supervised learning techniques to improve accuracy when data is limited. In the final evaluation, the method reaches a maximum accuracy of 83%, which is better than the machine learning model that does not involve self-supervised learning.
Optimization Beyond the Convolution: Generalizing Spatial Relations with End-to-End Metric Learning
To operate intelligently in domestic environments, robots require the ability
to understand arbitrary spatial relations between objects and to generalize
them to objects of varying sizes and shapes. In this work, we present a novel
end-to-end approach to generalize spatial relations based on distance metric
learning. We train a neural network to transform 3D point clouds of objects to
a metric space that captures the similarity of the depicted spatial relations,
using only geometric models of the objects. Our approach employs gradient-based
optimization to compute object poses in order to imitate an arbitrary target
relation by reducing the distance to it under the learned metric. Our results
based on simulated and real-world experiments show that the proposed method
enables robots to generalize spatial relations to unknown objects over a
continuous spectrum. Comment: Accepted for publication at ICRA2018. Supplementary Video:
http://spatialrelations.cs.uni-freiburg.de
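The idea of computing object poses by gradient-based optimization under a learned metric can be sketched on a toy problem. Everything below is an assumption for illustration: `embed` is a fixed linear map standing in for the trained network, the pose is a 2D vector rather than a full 3D pose, and the gradient is estimated numerically instead of by backpropagation.

```python
def embed(pose):
    """Hypothetical stand-in for the learned embedding network."""
    x, y = pose
    return (2.0 * x + y, x - y)


def distance(pose, target_embedding):
    """Squared distance to the target relation in the embedding space."""
    e = embed(pose)
    return sum((a - b) ** 2 for a, b in zip(e, target_embedding))


def numeric_grad(pose, target_embedding, eps=1e-6):
    """Central-difference estimate of the gradient of `distance`."""
    grad = []
    for i in range(len(pose)):
        hi, lo = list(pose), list(pose)
        hi[i] += eps
        lo[i] -= eps
        diff = distance(hi, target_embedding) - distance(lo, target_embedding)
        grad.append(diff / (2 * eps))
    return grad


def optimize_pose(start, target_embedding, lr=0.05, steps=500):
    """Plain gradient descent on the pose to imitate the target relation."""
    pose = list(start)
    for _ in range(steps):
        g = numeric_grad(pose, target_embedding)
        pose = [p - lr * gi for p, gi in zip(pose, g)]
    return pose


target = embed((1.0, 2.0))            # embedding of the relation to imitate
result = optimize_pose((0.0, 0.0), target)
```

Because the toy embedding is invertible, the optimized pose converges to the pose that produced the target embedding; with a real learned metric, many poses may realize the same relation.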
Attention Mechanism for Recognition in Computer Vision
It has been proven that humans do not focus their attention on an entire scene at once when they perform a recognition task. Instead, they pay attention to the most important parts of the scene to extract the most discriminative information. Inspired by this observation, in this dissertation, the importance of the attention mechanism in recognition tasks in computer vision is studied by designing novel attention-based models. Specifically, four scenarios are investigated that represent the most important aspects of the attention mechanism. First, an attention-based model is designed to reduce the dimensionality of the visual features by selectively processing only a small subset of the data. We study this aspect of the attention mechanism in a framework based on object recognition in distributed camera networks. Second, an attention-based image retrieval system (i.e., person re-identification) is proposed which learns to focus on the most discriminative regions of the person's image and to process those regions with higher computation power using a deep convolutional neural network. Furthermore, we show how visualizing the attention maps can make deep neural networks more interpretable. In other words, by visualizing the attention maps we can observe the regions of the input image on which the neural network relies in order to make a decision. Third, a model for estimating the importance of the objects in a scene based on a given task is proposed. More specifically, the proposed model estimates the importance of the road users that a driver (or an autonomous vehicle) should pay attention to in a driving scenario in order to navigate safely. In this scenario, the attention estimation is the final output of the model.
Fourth, an attention-based module and a new loss function in a meta-learning based few-shot learning system are proposed in order to incorporate the context of the task into the feature representations of the samples and to increase the few-shot recognition accuracy. In this dissertation, we showed that attention can be multi-faceted and studied the attention mechanism from the perspectives of feature selection, reducing the computational cost, interpretable deep learning models, task-driven importance estimation, and context incorporation. Through the study of four scenarios, we further advanced the field where "attention is all you need".
Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features
In this digital age of news consumption, a news reader has the ability to react, express and share opinions with others in a highly interactive and fast manner. As a consequence, fake news has made its way into our daily life because of the very limited capacity of large companies as well as individuals to verify news on the Internet. In this paper, we focus on solving two problems which are part of the fact-checking ecosystem and that can help to automate fact-checking of claims in an ever increasing stream of content on social media. For the first problem, claim check-worthiness prediction, we explore the fusion of syntactic features and deep transformer Bidirectional Encoder Representations from Transformers (BERT) embeddings, to classify check-worthiness of a tweet, i.e. whether it includes a claim or not. We conduct a detailed feature analysis and present our best performing models for English and Arabic tweets. For the second problem, claim retrieval, we explore the pre-trained embeddings from a Siamese network transformer model (sentence-transformers) specifically trained for semantic textual similarity, and perform KD-search to retrieve verified claims with respect to a query tweet.
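The claim retrieval step can be sketched as a nearest-neighbour lookup over embedding vectors. This is only an illustration under stated assumptions: the abstract's "KD-search" is not specified further, so a brute-force Euclidean search is swapped in here, and the two-dimensional vectors are hypothetical placeholders for sentence embeddings of verified claims and a query tweet.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def retrieve(query_vec, claim_vecs, k=2):
    """Return indices of the k claim vectors closest to the query.

    Brute-force search; a k-d tree or similar index would replace this
    for large claim collections.
    """
    ranked = sorted(
        range(len(claim_vecs)),
        key=lambda i: euclidean(query_vec, claim_vecs[i]),
    )
    return ranked[:k]


# Hypothetical embeddings of three verified claims and one query tweet.
claims = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.3]]
query = [1.0, 0.0]

top = retrieve(query, claims, k=2)
```

The returned indices are ordered from closest to farthest, so the first entry is the verified claim whose embedding best matches the query.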