Interpretable Structure-Evolving LSTM

Authors: Feng Jiashi, Liang Xiaodan, Lin Liang, Shen Xiaohui, Xing Eric P., Yan Shuicheng
Publication date: 08/03/2017

This paper develops a general framework for learning interpretable data representations via Long Short-Term Memory (LSTM) recurrent neural networks over hierarchical graph structures. Instead of learning LSTM models over pre-fixed structures, we propose to further learn the intermediate interpretable multi-level graph structures in a progressive and stochastic way from data during the LSTM network optimization. We thus call this model the structure-evolving LSTM. In particular, starting with an initial element-level graph representation where each node is a small data element, the structure-evolving LSTM gradually evolves the multi-level graph representations by stochastically merging the graph nodes with high compatibilities along the stacked LSTM layers. In each LSTM layer, we estimate the compatibility of two connected nodes from their corresponding LSTM gate outputs, which is used to generate a merging probability. The candidate graph structures are accordingly generated, where the nodes are grouped into cliques according to their merging probabilities. We then produce the new graph structure with a Metropolis-Hastings algorithm, which alleviates the risk of getting stuck in local optima by stochastic sampling with an acceptance probability. Once a graph structure is accepted, a higher-level graph is then constructed by taking the partitioned cliques as its nodes. During the evolving process, the representation becomes more abstract at higher levels, where redundant information is filtered out, allowing more efficient propagation of long-range data dependencies. We evaluate the effectiveness of the structure-evolving LSTM in the application of semantic object parsing and demonstrate its advantage over state-of-the-art LSTM models on standard benchmarks.

Comment: To appear in CVPR 2017 as a spotlight paper
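
The merge-and-accept step described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the edge probabilities, the function names `sample_partition` and `mh_accept`, and the log-score arguments are assumptions made for illustration, and the acceptance test is the generic Metropolis-Hastings rule rather than the paper's specific formula.

```python
import math
import random


def sample_partition(num_nodes, edge_probs, rng):
    """Switch each edge 'on' with its merge probability (hypothetical values);
    the connected components of the 'on' edges become the merged groups."""
    parent = list(range(num_nodes))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (u, v), p in edge_probs.items():
        if rng.random() < p:
            parent[find(u)] = find(v)

    groups = {}
    for node in range(num_nodes):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


def mh_accept(log_target_new, log_target_old, log_q_forward, log_q_backward, rng):
    """Generic Metropolis-Hastings test: accept with probability min(1, ratio),
    where ratio = target(new) * q(old | new) / (target(old) * q(new | old))."""
    log_ratio = (log_target_new + log_q_backward) - (log_target_old + log_q_forward)
    return rng.random() < math.exp(min(0.0, log_ratio))


if __name__ == "__main__":
    rng = random.Random(0)
    # A 4-node chain with assumed merge probabilities on its edges.
    edges = {(0, 1): 0.9, (1, 2): 0.2, (2, 3): 0.8}
    print("candidate groups:", sample_partition(4, edges, rng))
    print("accepted:", mh_accept(-1.0, -2.0, 0.0, 0.0, rng))
```

In this sketch each accepted partition would become the node set of the next, coarser graph; how the target and proposal scores are computed from LSTM gate outputs is left out because the abstract does not specify it.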

arXiv.org e-Print Archive

Crossref

Evolving neural networks with genetic algorithms to study the String Landscape

Author: Ruehle Fabian
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

We study possible applications of artificial neural networks to examine the string landscape. Since the field of application is rather versatile, we propose to dynamically evolve these networks via genetic algorithms. This means that we start from basic building blocks and combine them such that the neural network performs best for the application we are interested in. We study three areas in which neural networks can be applied: to classify models according to a fixed set of (physically) appealing features, to find a concrete realization for a computation for which the precise algorithm is known in principle but very tedious to actually implement, and to predict or approximate the outcome of some involved mathematical computation that is too inefficient to apply directly, e.g. in model scans within the string landscape. We present simple examples that arise in string phenomenology for all three types of problems and discuss how they can be addressed by evolving neural networks with genetic algorithms.

Comment: 17 pages, 7 figures, references added, typos corrected, extended introductory section
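
The idea of starting from basic building blocks and evolving a network with a genetic algorithm can be sketched in a few lines of Python. Everything here is an assumption for illustration: the block names, the `toy_fitness` function (a stand-in for actually training and scoring a network), and the selection, crossover, and mutation choices are not taken from the paper.

```python
import random

# Hypothetical building blocks a genome can be assembled from.
BLOCKS = ["dense", "relu", "tanh", "sigmoid"]


def toy_fitness(genome):
    """Stand-in for 'train the network and score it': rewards similarity
    to an arbitrary target layout and penalises length mismatch."""
    target = ["dense", "relu", "dense", "tanh"]
    matches = sum(g == t for g, t in zip(genome, target))
    return matches - abs(len(genome) - len(target))


def mutate(genome, rng):
    """Return a copy with one block replaced, inserted, or deleted."""
    genome = list(genome)
    op = rng.choice(["replace", "insert", "delete"])
    if op == "replace":
        genome[rng.randrange(len(genome))] = rng.choice(BLOCKS)
    elif op == "insert":
        genome.insert(rng.randrange(len(genome) + 1), rng.choice(BLOCKS))
    elif len(genome) > 1:
        del genome[rng.randrange(len(genome))]
    return genome


def crossover(a, b, rng):
    """Join a non-empty prefix of one parent to a suffix of the other."""
    return a[: rng.randrange(1, len(a) + 1)] + b[rng.randrange(len(b) + 1):]


def evolve(pop_size=20, generations=30, seed=0, fitness=toy_fitness):
    rng = random.Random(seed)
    population = [
        [rng.choice(BLOCKS) for _ in range(rng.randint(1, 6))]
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elite = population[: pop_size // 4]  # elitism: the best genomes survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = elite + children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("best genome:", best)
    print("best fitness per generation:", history)
```

Because the elite genomes are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next; in a real application the fitness call would be the expensive part, since it involves training each candidate network.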

arXiv.org e-Print Archive

Directory of Open Access Journals

Oxford University Research Archive

Modelling and control of chaotic processes through their Bifurcation Diagrams generated with the help of Recurrent Neural Network models: Part 1—simulation studies

Authors: C S Kumar, Faruqi M Aslam, Jallu Krishnaiah
Publication date: 01/01/2006

Many real-world processes tend to be chaotic and do not lend themselves to satisfactory analytical modelling. It is shown here that for such chaotic processes, represented through short, noisy chaotic time series, a multi-input, multi-output recurrent neural network model can be built that is capable of capturing the process trends and predicting future values from any given starting condition. It is further shown that the recurrent neural network model achieves this capability when it is trained to a very low mean squared error. Such a model can then be used to construct the bifurcation diagram of the process, leading to the determination of desirable operating conditions. Further, this multi-input, multi-output model makes the process accessible for control using open-loop or closed-loop approaches, bifurcation control, etc. All these studies have been carried out using a low-dimensional discrete chaotic system, the Hénon map, as a representative of some real-world processes.
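
For orientation, the Hénon map named in the abstract above is x_{n+1} = 1 - a x_n^2 + y_n, y_{n+1} = b x_n, and a bifurcation diagram plots the long-run x values against the parameter a. The Python sketch below iterates the map directly rather than through a trained recurrent network as in the paper; the function names, the starting point, the transient length, and the choice b = 0.3 (a commonly used value) are assumptions for illustration.

```python
def henon_orbit(a, b=0.3, x0=0.1, y0=0.1, n_transient=500, n_keep=64):
    """Iterate the Hénon map and return the x values seen after the
    first n_transient steps have been discarded."""
    x, y = x0, y0
    xs = []
    for i in range(n_transient + n_keep):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= n_transient:
            xs.append(x)
    return xs


def bifurcation_points(a_values, **kwargs):
    """Pair every parameter value with its long-run x values;
    scatter-plotting these pairs gives the bifurcation diagram."""
    return [(a, x) for a in a_values for x in henon_orbit(a, **kwargs)]


def count_distinct(xs, ndigits=6):
    """Rough period estimate: number of distinct long-run values after rounding."""
    return len({round(x, ndigits) for x in xs})


if __name__ == "__main__":
    # Small parameters settle on a few values; larger ones visit many.
    for a in (0.2, 0.5, 1.0, 1.4):
        print(a, count_distinct(henon_orbit(a)))
```

Reading such a diagram for regions where the orbit settles on one or a few values is one way to pick out the "desirable operating conditions" the abstract refers to.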

CogPrints Cognitive Sciences Eprint Archive