Recurrent Highway Networks

Authors: Koutník Jan; Schmidhuber Jürgen; Srivastava Rupesh Kumar; Zilly Julian Georg
Publication date: 04/07/2017

Many sequential processing tasks require complex nonlinear transition functions from one step to the next. However, recurrent neural networks with 'deep' transition functions remain difficult to train, even when using Long Short-Term Memory (LSTM) networks. We introduce a novel theoretical analysis of recurrent networks based on Geršgorin's circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell. Based on this analysis we propose Recurrent Highway Networks, which extend the LSTM architecture to allow step-to-step transition depths larger than one. Several language modeling experiments demonstrate that the proposed architecture results in powerful and efficient models. On the Penn Treebank corpus, solely increasing the transition depth from 1 to 10 improves word-level perplexity from 90.6 to 65.4 using the same number of parameters. On the larger Wikipedia datasets for character prediction (text8 and enwik8), RHNs outperform all previous results and achieve an entropy of 1.27 bits per character.

Comment: 12 pages, 6 figures, 3 tables
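The abstract's analysis rests on Geršgorin's circle theorem: every eigenvalue of a square matrix lies in at least one disc centered at a diagonal entry, with radius equal to the sum of the absolute off-diagonal entries in that row. The following is a minimal Python sketch of that theorem only, not the authors' code; the function names and the example matrix `R` are hypothetical and chosen purely for illustration.

```python
def gershgorin_discs(matrix):
    """Return one (center, radius) pair per row of a square matrix."""
    discs = []
    for i, row in enumerate(matrix):
        center = row[i]
        # Radius: sum of absolute values of the off-diagonal entries in row i.
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        discs.append((center, radius))
    return discs


def spectral_radius_bound(matrix):
    """Upper bound on |eigenvalue|: the farthest any disc reaches from 0."""
    return max(abs(c) + r for c, r in gershgorin_discs(matrix))


# Hypothetical 3x3 recurrent weight matrix (illustration only).
R = [
    [0.5, 0.1, -0.2],
    [0.0, 0.3, 0.4],
    [-0.1, 0.2, 0.6],
]

discs = gershgorin_discs(R)
bound = spectral_radius_bound(R)
print(discs)
print(bound)
```

If the bound is below 1, every eigenvalue of the matrix has magnitude below 1. Geršgorin discs give a cheap way to reason about where a matrix's spectrum can lie without computing eigenvalues directly.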

arXiv.org e-Print Archive

Repository for Publications and Research Data

Interpretable deep learning for guided structure-property explorations in photovoltaics

Authors: Ganapathysubramanian Baskar; Ghosal Sambuddha; Kokate Apurva; Pokuri Balaji Sesha Sarath; Sarkar Soumik
Publication date: 11/12/2018

The performance of an organic photovoltaic device is intricately connected to its active layer morphology. This connection between the active layer and device performance is very expensive to evaluate, either experimentally or computationally. Hence, designing morphologies to achieve higher performances is non-trivial and often intractable. To solve this, we first introduce a deep convolutional neural network (CNN) architecture that can serve as a fast and robust surrogate for the complex structure-property map. Several tests were performed to gain trust in this trained model. Then, we utilize this fast framework to perform robust microstructural design to enhance device performance.

Comment: Workshop on Machine Learning for Molecules and Materials (MLMM), Neural Information Processing Systems (NeurIPS) 2018, Montreal, Canada

arXiv.org e-Print Archive