Recurrent Highway Networks
Many sequential processing tasks require complex nonlinear transition
functions from one step to the next. However, recurrent neural networks with
'deep' transition functions remain difficult to train, even when using Long
Short-Term Memory (LSTM) networks. We introduce a novel theoretical analysis of
recurrent networks based on Gersgorin's circle theorem that illuminates several
modeling and optimization issues and improves our understanding of the LSTM
cell. Based on this analysis we propose Recurrent Highway Networks, which
extend the LSTM architecture to allow step-to-step transition depths larger
than one. Several language modeling experiments demonstrate that the proposed
architecture results in powerful and efficient models. On the Penn Treebank
corpus, solely increasing the transition depth from 1 to 10 improves word-level
perplexity from 90.6 to 65.4 using the same number of parameters. On the larger
Wikipedia datasets for character prediction (text8 and enwik8), RHNs outperform
all previous results and achieve an entropy of 1.27 bits per character.
Comment: 12 pages, 6 figures, 3 tables
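The idea of a step-to-step transition depth larger than one can be illustrated with a minimal sketch. It assumes highway-style layers stacked inside a single time step, with the carry gate tied to the transform gate (carry = 1 - transform) as a simplification. The function names, dictionary keys, and random initialisation are hypothetical and are not taken from the abstract.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, v, b):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def make_layers(n, m, depth, seed=0):
    """Build `depth` illustrative layers for state size n and input size m."""
    rng = random.Random(seed)
    layers = []
    for d in range(depth):
        layer = {
            "R_h": rand_matrix(rng, n, n), "b_h": [0.0] * n,  # candidate state
            "R_t": rand_matrix(rng, n, n), "b_t": [0.0] * n,  # transform gate
        }
        if d == 0:
            # Only the first layer of the step sees the external input x.
            layer["W_h"] = rand_matrix(rng, n, m)
            layer["W_t"] = rand_matrix(rng, n, m)
        layers.append(layer)
    return layers


def rhn_step(x, s, layers):
    """Advance the state s by one time step; transition depth = len(layers)."""
    zero = [0.0] * len(s)
    for d, p in enumerate(layers):
        pre_h = affine(p["R_h"], s, p["b_h"])
        pre_t = affine(p["R_t"], s, p["b_t"])
        if d == 0:
            pre_h = [a + b for a, b in zip(pre_h, affine(p["W_h"], x, zero))]
            pre_t = [a + b for a, b in zip(pre_t, affine(p["W_t"], x, zero))]
        h = [math.tanh(v) for v in pre_h]
        t = [sigmoid(v) for v in pre_t]
        # Assumed coupling: carry gate = 1 - t, so each layer output is a
        # convex combination of the candidate h and the incoming state s.
        s = [hi * ti + si * (1.0 - ti) for hi, ti, si in zip(h, t, s)]
    return s


layers = make_layers(n=4, m=3, depth=10)
state = rhn_step([0.5, -0.2, 0.1], [0.0] * 4, layers)
```

In this sketch, changing `depth` only changes how many gated layers are applied between consecutive time steps. The sequence length and the input interface stay the same.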
Modeling Worldwide Highway Networks
This letter addresses the problem of modeling the highway systems of
different countries by using complex networks formalism. More specifically, we
compare two traditional geographical models with a modified geometrical network
model where paths, rather than edges, are incorporated at each step between the
origin and destination nodes. Optimal configurations of parameters are obtained
for each model and used in the comparison. The highway networks of Brazil, the
US and England are considered and shown to be properly modeled by the modified
geographical model. The Brazilian highway network yielded small deviations that
are potentially attributable to specific development-related and sociogeographic features
of that country.
Comment: 5 pages, 3 figures, 1 table
Semi-tied Units for Efficient Gating in LSTM and Highway Networks
Gating is a key technique used for integrating information from multiple
sources by long short-term memory (LSTM) models and has recently also been
applied to other models such as the highway network. Although gating is
powerful, it is rather expensive in terms of both computation and storage as
each gating unit uses a separate full weight matrix. This issue can be severe
since several gates can be used together in e.g. an LSTM cell. This paper
proposes a semi-tied unit (STU) approach to solve this efficiency issue, which
uses one shared weight matrix to replace those in all the units in the same
layer. The approach is termed "semi-tied" since extra parameters are used to
separately scale each of the shared output values. These extra scaling factors
are associated with the network activation functions and result in the use of
parameterised sigmoid, hyperbolic tangent, and rectified linear unit functions.
Speech recognition experiments using British English multi-genre broadcast data
showed that using STUs can reduce the calculation and storage cost by a factor
of three for highway networks and four for LSTMs, while giving similar word
error rates to the original models.
Comment: To appear in Proc. INTERSPEECH 2018, September 2-6, 2018, Hyderabad, India
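A rough sketch of the weight-sharing idea described above: one affine transform is computed once per layer, and each gating unit applies its own element-wise scaling before the activation. The form `sigmoid(alpha * z)` is an assumed parameterisation for illustration only; the function names, the `scales` argument, and the layer sizes are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def semi_tied_gates(W, b, x, scales):
    """Compute several gates from ONE shared affine transform.

    W, b   : shared weight matrix (list of rows) and bias
    scales : one scaling vector per gating unit (assumed parameterisation)
    """
    z = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return [[sigmoid(a * zi) for a, zi in zip(alpha, z)] for alpha in scales]


def weight_counts(n_out, n_in, n_units):
    """Weights needed with separate matrices vs. one shared matrix (biases ignored)."""
    separate = n_units * n_out * n_in
    semi_tied = n_out * n_in + n_units * n_out  # shared matrix + per-unit scales
    return separate, semi_tied


separate, semi_tied = weight_counts(n_out=500, n_in=500, n_units=4)
print(separate / semi_tied)  # close to the number of units when n_in is large
```

With four units sharing one matrix, the weight count drops by a factor close to four. This mirrors the order of saving the abstract reports for LSTMs, under the simplifying assumptions stated here.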
A characteristic particle method for traffic flow simulations on highway networks
A characteristic particle method for the simulation of first-order
macroscopic traffic models on road networks is presented. The approach is based
on the method "particleclaw", which solves scalar one-dimensional hyperbolic
conservation laws exactly, except for a small error right around shocks. The
method is generalized to nonlinear network flows, where particle approximations
on the edges are suitably coupled together at the network nodes. It is
demonstrated in numerical examples that the resulting particle method can
approximate traffic jams accurately, while only devoting a few degrees of
freedom to each edge of the network.
Comment: 15 pages, 5 figures. Accepted to the proceedings of the Sixth
International Workshop Meshfree Methods for PDE 201
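As a hedged illustration of what moving particles along characteristics means for a scalar conservation law, the sketch below advances density-carrying particles on a single edge. It assumes a normalised Greenshields flux f(rho) = rho (1 - rho), which the abstract does not specify, and it does not attempt the shock handling or the node coupling of the method described above. All function names are hypothetical.

```python
def char_speed(rho):
    """Characteristic speed f'(rho) for the assumed flux f(rho) = rho * (1 - rho)."""
    return 1.0 - 2.0 * rho


def time_to_first_collision(xs, rhos):
    """Time until two neighbouring particles would meet (inf if none approach)."""
    best = float("inf")
    for i in range(len(xs) - 1):
        closing = char_speed(rhos[i]) - char_speed(rhos[i + 1])
        if closing > 0.0:
            best = min(best, (xs[i + 1] - xs[i]) / closing)
    return best


def advance(xs, rhos, dt):
    """Move each particle along its characteristic; its density stays constant."""
    return [x + char_speed(r) * dt for x, r in zip(xs, rhos)]


xs, rhos = [0.0, 1.0], [0.25, 0.75]    # light traffic running into dense traffic
t_hit = time_to_first_collision(xs, rhos)
xs_half = advance(xs, rhos, 0.5 * t_hit)
```

Up to the first collision time the particle motion follows the characteristics exactly for smooth data. A complete scheme would need an additional rule for what happens when particles meet.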