17,694 research outputs found
Learning Dynamic Feature Selection for Fast Sequential Prediction
We present paired learning and inference algorithms for significantly
reducing computation and increasing speed of the vector dot products in the
classifiers that are at the heart of many NLP components. This is accomplished
by partitioning the features into a sequence of templates which are ordered
such that high confidence can often be reached using only a small fraction of
all features. Parameter estimation is arranged to maximize accuracy and early
confidence in this sequence. Our approach is simpler and better suited to NLP
than other related cascade methods. We present experiments in left-to-right
part-of-speech tagging, named entity recognition, and transition-based
dependency parsing. On the typical benchmarking datasets we can preserve POS
tagging accuracy above 97% and parsing LAS above 88.5% both with over a
five-fold reduction in run-time, and NER F1 above 88 with more than 2x increase
in speed.Comment: Appears in The 53rd Annual Meeting of the Association for
Computational Linguistics, Beijing, China, July 201
Deep Learning: Our Miraculous Year 1990-1991
In 2020, we will celebrate that many of the basic ideas behind the deep
learning revolution were published three decades ago within fewer than 12
months in our "Annus Mirabilis" or "Miraculous Year" 1990-1991 at TU Munich.
Back then, few people were interested, but a quarter century later, neural
networks based on these ideas were on over 3 billion devices such as
smartphones, and used many billions of times per day, consuming a significant
fraction of the world's compute.Comment: 37 pages, 188 references, based on work of 4 Oct 201
Multitask Learning with CTC and Segmental CRF for Speech Recognition
Segmental conditional random fields (SCRFs) and connectionist temporal
classification (CTC) are two sequence labeling methods used for end-to-end
training of speech recognition models. Both models define a transcription
probability by marginalizing decisions about latent segmentation alternatives
to derive a sequence probability: the former uses a globally normalized joint
model of segment labels and durations, and the latter classifies each frame as
either an output symbol or a "continuation" of the previous label. In this
paper, we train a recognition model by optimizing an interpolation between the
SCRF and CTC losses, where the same recurrent neural network (RNN) encoder is
used for feature extraction for both outputs. We find that this multitask
objective improves recognition accuracy when decoding with either the SCRF or
CTC models. Additionally, we show that CTC can also be used to pretrain the RNN
encoder, which improves the convergence rate when learning the joint model.Comment: 5 pages, 2 figures, camera ready version at Interspeech 201
Fast N-Gram Language Model Look-Ahead for Decoders With Static Pronunciation Prefix Trees
Decoders that make use of token-passing restrict their search space by various types of token pruning. With use of the Language Model Look-Ahead (LMLA) technique it is possible to increase the number of tokens that can be pruned without loss of decoding precision. Unfortunately, for token passing decoders that use single static pronunciation prefix trees, full n-gram LMLA increases the needed number of language model probability calculations considerably. In this paper a method for applying full n-gram LMLA in a decoder with a single static pronunciation tree is introduced. The experiments show that this method improves the speed of the decoder without an increase of search errors.\u
A quick search method for audio signals based on a piecewise linear representation of feature trajectories
This paper presents a new method for a quick similarity-based search through
long unlabeled audio streams to detect and locate audio clips provided by
users. The method involves feature-dimension reduction based on a piecewise
linear representation of a sequential feature trajectory extracted from a long
audio stream. Two techniques enable us to obtain a piecewise linear
representation: the dynamic segmentation of feature trajectories and the
segment-based Karhunen-L\'{o}eve (KL) transform. The proposed search method
guarantees the same search results as the search method without the proposed
feature-dimension reduction method in principle. Experiment results indicate
significant improvements in search speed. For example the proposed method
reduced the total search time to approximately 1/12 that of previous methods
and detected queries in approximately 0.3 seconds from a 200-hour audio
database.Comment: 20 pages, to appear in IEEE Transactions on Audio, Speech and
Language Processin
Speeding up Simplification of Polygonal Curves using Nested Approximations
We develop a multiresolution approach to the problem of polygonal curve
approximation. We show theoretically and experimentally that, if the
simplification algorithm A used between any two successive levels of resolution
satisfies some conditions, the multiresolution algorithm MR will have a
complexity lower than the complexity of A. In particular, we show that if A has
a O(N2/K) complexity (the complexity of a reduced search dynamic solution
approach), where N and K are respectively the initial and the final number of
segments, the complexity of MR is in O(N).We experimentally compare the
outcomes of MR with those of the optimal "full search" dynamic programming
solution and of classical merge and split approaches. The experimental
evaluations confirm the theoretical derivations and show that the proposed
approach evaluated on 2D coastal maps either shows a lower complexity or
provides polygonal approximations closer to the initial curves.Comment: 12 pages + figure
- …