11,041 research outputs found
Image-based Text Classification using 2D Convolutional Neural Networks
We propose a new approach to text classification
in which we consider the input text as an image and apply
2D Convolutional Neural Networks to learn the local and
global semantics of the sentences from the variations of the
visual patterns of words. Our approach demonstrates that
it is possible to get semantically meaningful features from
images with text without using optical character recognition
and sequential processing pipelines, techniques that traditional
natural language processing algorithms require. To validate
our approach, we present results for two applications: text
classification and dialog modeling. Using a 2D Convolutional
Neural Network, we were able to outperform the state-ofart
accuracy results for a Chinese text classification task and
achieved promising results for seven English text classification
tasks. Furthermore, our approach outperformed the memory
networks without match types when using out of vocabulary
entities from Task 4 of the bAbI dialog dataset
Multi-Tier Annotations in the Verbmobil Corpus
In very large and diverse scientific projects where as different groups as linguists and engineers with different intentions work on the same signal data or its orthographic transcript and annotate new valuable information, it will not be easy to build a homogeneous corpus. We will describe how this can be achieved, considering the fact that some of these annotations have not been updated properly, or are based on erroneous or deliberately changed versions of the basis transcription. We used an algorithm similar to dynamic programming to detect differences between the transcription on which the annotation depends and the reference transcription for the whole corpus. These differences are automatically mapped on a set of repair operations for the transcriptions such as splitting compound words and merging neighbouring words. On the basis of these operations the correction process in the annotation is carried out. It always depends on the type of the annotation as well as on the position and the nature of the difference, whether a correction can be carried out automatically or has to be fixed manually. Finally we present a investigation in which we exploit the multi-tier annotations of the Verbmobil corpus to find out how breathing is correlated with prosodic-syntactic boundaries and dialog acts. 1
The Microsoft 2017 Conversational Speech Recognition System
We describe the 2017 version of Microsoft's conversational speech recognition
system, in which we update our 2016 system with recent developments in
neural-network-based acoustic and language modeling to further advance the
state of the art on the Switchboard speech recognition task. The system adds a
CNN-BLSTM acoustic model to the set of model architectures we combined
previously, and includes character-based and dialog session aware LSTM language
models in rescoring. For system combination we adopt a two-stage approach,
whereby subsets of acoustic models are first combined at the senone/frame
level, followed by a word-level voting via confusion networks. We also added a
confusion network rescoring step after system combination. The resulting system
yields a 5.1\% word error rate on the 2000 Switchboard evaluation set
- …