Accelerating recurrent neural network training using sequence bucketing and multi-GPU data parallelization
An efficient algorithm for recurrent neural network training is presented.
The approach increases the training speed for tasks where the length of the input
sequence may vary significantly. The proposed approach is based on optimal
batch bucketing by input sequence length and data parallelization on multiple
graphics processing units. The baseline training performance without sequence
bucketing is compared with the proposed solution for different numbers of
buckets. An example is given for the online handwriting recognition task using
an LSTM recurrent neural network. The evaluation is performed in terms of the
wall clock time, number of epochs, and validation loss value.
Comment: 4 pages, 5 figures, Comments, 2016 IEEE First International
Conference on Data Stream Mining & Processing (DSMP), Lviv, 201
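The bucketing idea in the abstract above can be illustrated with a minimal sketch. It assumes equal-count buckets formed after sorting by length, which may differ from the paper's "optimal" bucketing, and it leaves out the multi-GPU part entirely. The function names (`sequential_batches`, `bucketed_batches`, `padded_steps`) are hypothetical and chosen only for this illustration. The sketch compares how many padded time steps a baseline and a bucketed batching would process.

```python
from typing import List, Sequence


def sequential_batches(n: int, batch_size: int) -> List[List[int]]:
    """Baseline: batch indices in their original order, ignoring length."""
    return [list(range(s, min(s + batch_size, n))) for s in range(0, n, batch_size)]


def bucketed_batches(seqs: Sequence[Sequence], num_buckets: int,
                     batch_size: int) -> List[List[int]]:
    """Sort indices by sequence length, cut them into equal-count buckets
    (an assumption of this sketch), and form batches inside each bucket."""
    order = sorted(range(len(seqs)), key=lambda i: len(seqs[i]))
    bucket_size = max(1, -(-len(order) // num_buckets))  # ceiling division
    batches = []
    for start in range(0, len(order), bucket_size):
        bucket = order[start:start + bucket_size]
        for b in range(0, len(bucket), batch_size):
            batches.append(bucket[b:b + batch_size])
    return batches


def padded_steps(seqs: Sequence[Sequence], batches: List[List[int]]) -> int:
    """Total time steps processed when each batch is padded to its longest item."""
    return sum(len(batch) * max(len(seqs[i]) for i in batch) for batch in batches)


# Toy data: eight dummy sequences whose lengths alternate short and long.
lengths = [1, 10, 2, 9, 3, 8, 4, 7]
seqs = [[0] * n for n in lengths]

baseline = padded_steps(seqs, sequential_batches(len(seqs), 4))
bucketed = padded_steps(seqs, bucketed_batches(seqs, 2, 4))
print(baseline, bucketed)  # prints: 72 56
```

The toy data holds 44 real time steps, so the baseline pads to 72 and the two-bucket version to 56. In practice the batches would usually be shuffled between epochs so that training does not see lengths in sorted order.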
The Wavelet Trie: Maintaining an Indexed Sequence of Strings in Compressed Space
An indexed sequence of strings is a data structure for storing a string
sequence that supports random access, searching, range counting and analytics
operations, both for exact matches and prefix search. String sequences lie at
the core of column-oriented databases, log processing, and other storage and
query tasks. In these applications each string can appear several times and the
order of the strings in the sequence is relevant. The prefix structure of the
strings is relevant as well: common prefixes are sought in strings to extract
interesting features from the sequence. Moreover, space-efficiency is highly
desirable as it translates directly into higher performance, since more data
can fit in fast memory.
We introduce and study the problem of compressed indexed sequence of strings,
representing indexed sequences of strings in nearly-optimal compressed space,
both in the static and dynamic settings, while preserving provably good
performance for the supported operations.
We present a new data structure for this problem, the Wavelet Trie, which
combines the classical Patricia Trie with the Wavelet Tree, a succinct data
structure for storing a compressed sequence. The resulting Wavelet Trie
smoothly adapts to a sequence of strings that changes over time. It improves on
the state-of-the-art compressed data structures by supporting a dynamic
alphabet (i.e. the set of distinct strings) and prefix queries, both crucial
requirements in the aforementioned applications, and on traditional indexes by
reducing space occupancy to close to the entropy of the sequence.
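To make the combination of a Patricia trie and per-node bitvectors concrete, here is a small static, uncompressed sketch. It assumes the strings are binary (`'0'`/`'1'`) and that the set of distinct strings is prefix-free. It stores the bitvectors as plain Python lists with linear-time rank, so it shows only the navigation logic and has none of the space or time guarantees claimed in the abstract. The class and method names are hypothetical.

```python
import os.path
from typing import List


class WaveletTrieNode:
    """One node: a common prefix (alpha) plus, for internal nodes, a bitvector
    recording which branch each string of the sequence takes."""

    def __init__(self, strings: List[str]):
        # commonprefix compares character by character, so it works on any strings.
        self.alpha = os.path.commonprefix(strings)
        self.bits: List[int] = []
        self.children = None
        if len(set(strings)) > 1:
            k = len(self.alpha)
            # Prefix-freeness guarantees every string has a character at index k.
            self.bits = [int(s[k]) for s in strings]
            self.children = [
                WaveletTrieNode([s[k + 1:] for s in strings if s[k] == b])
                for b in "01"
            ]

    def access(self, pos: int) -> str:
        """Return the string stored at position pos of the sequence."""
        if self.children is None:
            return self.alpha
        b = self.bits[pos]
        child_pos = self.bits[:pos].count(b)  # naive rank on the bitvector
        return self.alpha + str(b) + self.children[b].access(child_pos)

    def rank_prefix(self, p: str, pos: int) -> int:
        """Count strings among the first pos entries that start with p."""
        if self.alpha.startswith(p):
            return pos  # every string below this node starts with p
        if self.children is None or not p.startswith(self.alpha):
            return 0
        k = len(self.alpha)
        b = int(p[k])
        return self.children[b].rank_prefix(p[k + 1:], self.bits[:pos].count(b))


seq = ["0001", "0011", "0100", "00100", "0100", "00100", "0100"]
trie = WaveletTrieNode(seq)
print(trie.access(3))             # prints: 00100
print(trie.rank_prefix("00", 7))  # prints: 4
```

Because the set is prefix-free, calling `rank_prefix` with a full string also gives its exact occurrence count in the chosen range. A compressed, dynamic version would replace the lists with succinct bitvectors that support rank in constant or logarithmic time.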
Gradient-based Inference for Networks with Output Constraints
Practitioners apply neural networks to increasingly complex problems in
natural language processing, such as syntactic parsing and semantic role
labeling that have rich output structures. Many such structured-prediction
problems require deterministic constraints on the output values; for example,
in sequence-to-sequence syntactic parsing, we require that the sequential
outputs encode valid trees. While hidden units might capture such properties,
the network is not always able to learn such constraints from the training data
alone, and practitioners must then resort to post-processing. In this paper, we
present an inference method for neural networks that enforces deterministic
constraints on outputs without performing rule-based post-processing or
expensive discrete search. Instead, in the spirit of gradient-based training,
we enforce constraints with gradient-based inference (GBI): for each input at
test-time, we nudge continuous model weights until the network's unconstrained
inference procedure generates an output that satisfies the constraints. We
study the efficacy of GBI on three tasks with hard constraints: semantic role
labeling, syntactic parsing, and sequence transduction. In each case, the
algorithm not only satisfies constraints but improves accuracy, even when the
underlying network is state-of-the-art.
Comment: AAAI 201
AER Neuro-Inspired interface to Anthropomorphic Robotic Hand
Address-Event-Representation (AER) is a
communication protocol for transferring asynchronous events
between VLSI chips, originally developed for neuro-inspired
processing systems (for example, image processing). Such
systems may consist of a complicated hierarchical structure
with many chips that transmit data among them in real time,
while performing some processing (for example, convolutions).
The information transmitted is a sequence of spikes coded using
high speed digital buses. These multi-layer and multi-chip AER
systems actually perform not only image processing, but also
audio processing, filtering, learning, locomotion, etc. This paper
presents an AER interface for controlling an anthropomorphic
robotic hand with a neuro-inspired system.
Unión Europea IST-2001-34124 (CAVIAR)
Ministerio de Ciencia y Tecnología TIC-2003-08164-C03-02
Ministerio de Ciencia y Tecnología TIC2000-0406-P4- 0
