52,752 research outputs found
Spoken Language Intent Detection using Confusion2Vec
Decoding speaker's intent is a crucial part of spoken language understanding
(SLU). The presence of noise or errors in the text transcriptions, in real life
scenarios make the task more challenging. In this paper, we address the spoken
language intent detection under noisy conditions imposed by automatic speech
recognition (ASR) systems. We propose to employ confusion2vec word feature
representation to compensate for the errors made by ASR and to increase the
robustness of the SLU system. The confusion2vec, motivated from human speech
production and perception, models acoustic relationships between words in
addition to the semantic and syntactic relations of words in human language. We
hypothesize that ASR often makes errors relating to acoustically similar words,
and the confusion2vec with inherent model of acoustic relationships between
words is able to compensate for the errors. We demonstrate through experiments
on the ATIS benchmark dataset, the robustness of the proposed model to achieve
state-of-the-art results under noisy ASR conditions. Our system reduces
classification error rate (CER) by 20.84% and improves robustness by 37.48%
(lower CER degradation) relative to the previous state-of-the-art going from
clean to noisy transcripts. Improvements are also demonstrated when training
the intent detection models on noisy transcripts
DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases
Keyphrase extraction from documents is useful to a variety of applications
such as information retrieval and document summarization. This paper presents
an end-to-end method called DivGraphPointer for extracting a set of diversified
keyphrases from a document. DivGraphPointer combines the advantages of
traditional graph-based ranking methods and recent neural network-based
approaches. Specifically, given a document, a word graph is constructed from
the document based on word proximity and is encoded with graph convolutional
networks, which effectively capture document-level word salience by modeling
long-range dependency between words in the document and aggregating multiple
appearances of identical words into one node. Furthermore, we propose a
diversified point network to generate a set of diverse keyphrases out of the
word graph in the decoding process. Experimental results on five benchmark data
sets show that our proposed method significantly outperforms the existing
state-of-the-art approaches.Comment: Accepted to SIGIR 201
Deep Generative Modeling of LiDAR Data
Building models capable of generating structured output is a key challenge
for AI and robotics. While generative models have been explored on many types
of data, little work has been done on synthesizing lidar scans, which play a
key role in robot mapping and localization. In this work, we show that one can
adapt deep generative models for this task by unravelling lidar scans into a 2D
point map. Our approach can generate high quality samples, while simultaneously
learning a meaningful latent representation of the data. We demonstrate
significant improvements against state-of-the-art point cloud generation
methods. Furthermore, we propose a novel data representation that augments the
2D signal with absolute positional information. We show that this helps
robustness to noisy and imputed input; the learned model can recover the
underlying lidar scan from seemingly uninformative dataComment: Presented at IROS 201
- …