3,134 research outputs found

    Recurrent Highway Networks

    Many sequential processing tasks require complex nonlinear transition functions from one step to the next. However, recurrent neural networks with 'deep' transition functions remain difficult to train, even when using Long Short-Term Memory (LSTM) networks. We introduce a novel theoretical analysis of recurrent networks based on Geršgorin's circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell. Based on this analysis we propose Recurrent Highway Networks (RHNs), which extend the LSTM architecture to allow step-to-step transition depths larger than one. Several language modeling experiments demonstrate that the proposed architecture results in powerful and efficient models. On the Penn Treebank corpus, solely increasing the transition depth from 1 to 10 improves word-level perplexity from 90.6 to 65.4 using the same number of parameters. On the larger Wikipedia datasets for character prediction (text8 and enwik8), RHNs outperform all previous results and achieve an entropy of 1.27 bits per character. Comment: 12 pages, 6 figures, 3 tables
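
    The circle theorem named in the abstract above states that every eigenvalue of a square matrix lies in at least one disc centred on a diagonal entry, with radius equal to the sum of the absolute off-diagonal entries in that row. The following is a minimal sketch of that statement only, not of the paper's analysis; the 2x2 matrix `A` and all function names are illustrative assumptions.

```python
import cmath


def gershgorin_discs(matrix):
    """Return one (center, radius) pair per row of a square matrix."""
    discs = []
    for i, row in enumerate(matrix):
        # Radius: sum of absolute values of the off-diagonal entries in row i.
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        discs.append((row[i], radius))
    return discs


def eigenvalues_2x2(m):
    """Closed-form eigenvalues of a 2x2 matrix via trace and determinant."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return [(trace + root) / 2, (trace - root) / 2]


def in_some_disc(value, discs, tol=1e-12):
    """True if value lies inside (or on the edge of) at least one disc."""
    return any(abs(value - center) <= radius + tol for center, radius in discs)


# Hypothetical recurrent weight matrix, chosen only for illustration.
A = [[0.5, 0.2], [-0.1, 0.3]]
discs = gershgorin_discs(A)
eigs = eigenvalues_2x2(A)
all_inside = all(in_some_disc(e, discs) for e in eigs)
```

    Here `all_inside` evaluates to true, as the theorem guarantees. The discs bound where the eigenvalues of a recurrent matrix can lie without computing them, which is the kind of bound such an analysis can build on.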

    Modeling Worldwide Highway Networks

    This letter addresses the problem of modeling the highway systems of different countries by using complex networks formalism. More specifically, we compare two traditional geographical models with a modified geometrical network model where paths, rather than edges, are incorporated at each step between the origin and destination nodes. Optimal configurations of parameters are obtained for each model and used in the comparison. The highway networks of Brazil, the US and England are considered and shown to be properly modeled by the modified geographical model. The Brazilian highway network yielded small deviations that are potentially attributable to specific developing and sociogeographic features of that country. Comment: 5 pages, 3 figures, 1 table

    Semi-tied Units for Efficient Gating in LSTM and Highway Networks

    Gating is a key technique used for integrating information from multiple sources by long short-term memory (LSTM) models and has recently also been applied to other models such as the highway network. Although gating is powerful, it is rather expensive in terms of both computation and storage, as each gating unit uses a separate full weight matrix. This issue can be severe since several gates can be used together, e.g. in an LSTM cell. This paper proposes a semi-tied unit (STU) approach to solve this efficiency issue, which uses one shared weight matrix to replace those in all the units in the same layer. The approach is termed "semi-tied" since extra parameters are used to separately scale each of the shared output values. These extra scaling factors are associated with the network activation functions and result in the use of parameterised sigmoid, hyperbolic tangent, and rectified linear unit functions. Speech recognition experiments using British English multi-genre broadcast data showed that using STUs can reduce the calculation and storage cost by a factor of three for highway networks and four for LSTMs, while giving similar word error rates to the original models. Comment: To appear in Proc. INTERSPEECH 2018, September 2-6, 2018, Hyderabad, India
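
    The factors of three and four quoted in the abstract above are consistent with a simple weight count, sketched below. The sketch assumes a highway layer with three full matrices and an LSTM cell with four, one learned scaling vector per tied unit, and no biases; the layer sizes, the `scales_per_unit` parameter, and the function names are illustrative assumptions and are not taken from the paper.

```python
def full_gating_params(n_in, n_out, n_units):
    """Weights when every gating unit owns a full n_in x n_out matrix."""
    return n_units * n_in * n_out


def semi_tied_params(n_in, n_out, n_units, scales_per_unit=1):
    """Weights with one shared matrix plus per-unit scaling vectors.

    scales_per_unit is an assumption: one scale vector of length n_out
    per unit, applied inside that unit's activation function.
    """
    return n_in * n_out + n_units * scales_per_unit * n_out


def reduction_factor(n_in, n_out, n_units):
    """Ratio of untied to semi-tied weight counts."""
    return full_gating_params(n_in, n_out, n_units) / semi_tied_params(
        n_in, n_out, n_units
    )


# Hypothetical layer sizes; 3 units for a highway layer, 4 for an LSTM cell.
highway_ratio = reduction_factor(1000, 1000, 3)
lstm_ratio = reduction_factor(1000, 1000, 4)
```

    For wide layers the scaling vectors are negligible next to the shared matrix, so the ratio approaches the number of tied units: just under 3 for the highway case and just under 4 for the LSTM case.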

    A characteristic particle method for traffic flow simulations on highway networks

    A characteristic particle method for the simulation of first order macroscopic traffic models on road networks is presented. The approach is based on the method "particleclaw", which solves scalar one-dimensional hyperbolic conservation laws exactly, except for a small error right around shocks. The method is generalized to nonlinear network flows, where particle approximations on the edges are suitably coupled together at the network nodes. It is demonstrated in numerical examples that the resulting particle method can approximate traffic jams accurately, while only devoting a few degrees of freedom to each edge of the network. Comment: 15 pages, 5 figures. Accepted to the proceedings of the Sixth International Workshop Meshfree Methods for PDE 201
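
    For a scalar conservation law u_t + f(u)_x = 0, a particle that carries the value u moves along a characteristic with velocity f'(u), and the solution stays exact until two characteristics cross. The sketch below illustrates only that movement step on a single edge. It is not the "particleclaw" method or its network coupling, and the flux f(u) = u(1 - u) (a common traffic model choice) and all function names are assumptions made for illustration.

```python
def flux_derivative(u):
    """Characteristic velocity f'(u) for the assumed flux f(u) = u * (1 - u)."""
    return 1.0 - 2.0 * u


def advance(particles, dt):
    """Move each (position, density) particle along its characteristic."""
    return [(x + flux_derivative(u) * dt, u) for x, u in particles]


def time_to_first_crossing(particles):
    """Time until two neighbouring characteristics meet (a shock forms)."""
    times = []
    for (x1, u1), (x2, u2) in zip(particles, particles[1:]):
        # Positive closing speed means the left particle catches the right one.
        closing = flux_derivative(u1) - flux_derivative(u2)
        if closing > 0:
            times.append((x2 - x1) / closing)
    return min(times) if times else float("inf")


# Light traffic behind dense traffic: the two characteristics converge.
particles = [(0.0, 0.25), (1.0, 0.75)]
t_shock = time_to_first_crossing(particles)
moved = advance(particles, 0.5 * t_shock)
```

    A full method must also handle what happens at and after `t_shock`, for example by merging particles, which is where the small error around shocks mentioned in the abstract would arise.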