Deep Markov Random Field for Image Modeling
Markov Random Fields (MRFs), a formulation widely used in generative image
modeling, have long been plagued by a lack of expressive power. This issue
arises primarily because conventional MRF formulations tend to use
simplistic factors to capture local patterns. In this paper, we move beyond
such limitations, and propose a novel MRF model that uses fully-connected
neurons to express the complex interactions among pixels. Through theoretical
analysis, we reveal an inherent connection between this model and recurrent
neural networks, and from this connection derive an approximate feed-forward network that
couples multiple RNNs along opposite directions. This formulation combines the
expressive power of deep neural networks and the cyclic dependency structure of
MRF in a unified model, bringing the modeling capability to a new level. The
feed-forward approximation also allows it to be efficiently learned from data.
Experimental results on a variety of low-level vision tasks show notable
improvements over the state of the art.
Comment: Accepted at ECCV 2016
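As a rough illustration of the coupling idea described above (not the paper's actual architecture), the sketch below sweeps two simple tanh RNNs over the rows of an image in opposite directions and sums their hidden states, so every row receives context from both sides; all sizes and weights are hypothetical.

```python
import numpy as np

def rnn_sweep(rows, W_in, W_h, reverse=False):
    """Sweep a simple tanh RNN over a sequence of pixel rows,
    one row per time step, carrying context along the sweep."""
    if reverse:
        rows = rows[::-1]
    h = np.zeros(W_h.shape[0])
    states = []
    for x in rows:
        h = np.tanh(W_in @ x + W_h @ h)
        states.append(h)
    return np.stack(states[::-1] if reverse else states)

# Hypothetical setup: a 16x16 image, 32 hidden units.
rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
W_in = 0.1 * rng.standard_normal((32, 16))
W_h = 0.1 * rng.standard_normal((32, 32))

# Couple sweeps along opposite directions: a feed-forward stand-in
# for the cyclic (undirected) dependencies of the MRF.
down = rnn_sweep(image, W_in, W_h)
up = rnn_sweep(image, W_in, W_h, reverse=True)
context = down + up  # shape (16, 32): each row sees both directions
```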
PNNARMA model: an alternative to phenomenological models in chemical reactors
This paper focuses on the development of non-linear neural models able to provide appropriate predictions when acting as process simulators. Parallel identification models can be used for this purpose. However, this work shows that when the parameters of parallel identification models are estimated using multilayer feed-forward networks, the resulting approximation of dynamic systems may be unsuitable. The solution proposed here consists of building parallel models with a particular recurrent neural network, which makes it possible to identify the parameter sets of the parallel model and so generate process simulators. Hence, better dynamic predictions can be guaranteed. The dynamic behaviour of the heat-transfer fluid temperature in a jacketed chemical reactor has been selected as a case study. The results suggest that parallel models based on the recurrent neural network proposed in this work can be seen as an alternative to phenomenological models for simulating the dynamic behaviour of heating/cooling circuits.
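To make the parallel (closed-loop) mode of operation concrete, here is a minimal sketch of running a one-step recurrent model as a simulator, feeding back its own past predictions rather than plant measurements; the tanh map `one_step` is a hypothetical stand-in for the trained network, not the paper's PNNARMA model.

```python
import numpy as np

def simulate_parallel(f, u_seq, y0, n_lags=2):
    """Closed-loop (parallel) simulation: the model's own past
    predictions, not plant measurements, are fed back at every step."""
    y_hist = [y0] * n_lags          # most recent prediction last
    preds = []
    for u in u_seq:
        y_next = f(np.array(y_hist), u)
        preds.append(y_next)
        y_hist = y_hist[1:] + [y_next]
    return np.array(preds)

# Hypothetical one-step map standing in for the trained recurrent
# network: a tanh combination of lagged outputs and the current input.
coeffs = np.array([0.25, 0.6])
def one_step(y_lags, u):
    return np.tanh(coeffs @ y_lags + 0.4 * u)

u_profile = np.sin(np.linspace(0.0, 6.0, 50))   # e.g. a jacket-input profile
y_sim = simulate_parallel(one_step, u_profile, y0=0.0)
```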
Liquid Time-constant Networks
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities,
we construct networks of linear first-order dynamical systems modulated via
nonlinear interlinked gates. The resulting models represent dynamical systems
with varying (i.e., liquid) time-constants coupled to their hidden state, with
outputs being computed by numerical differential equation solvers. These neural
networks exhibit stable and bounded behavior, yield superior expressivity
within the family of neural ordinary differential equations, and give rise to
improved performance on time-series prediction tasks. To demonstrate these
properties, we first take a theoretical approach to find bounds over their
dynamics and compute their expressive power by the trajectory length measure in
latent trajectory space. We then conduct a series of time-series prediction
experiments to manifest the approximation capability of Liquid Time-Constant
Networks (LTCs) compared to classical and modern RNNs. Code and data are
available at https://github.com/raminmh/liquid_time_constant_networks
Comment: Accepted to the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)
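A minimal sketch of the kind of update the abstract describes: a first-order linear system whose decay rate and attractor drive are both modulated by a nonlinear gate, integrated here with a fused implicit-Euler step. Parameter shapes and values are hypothetical; the authoritative formulation and solver are in the paper and the linked repository.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ltc_step(x, I, tau, A, W, U, b, dt=0.1):
    """One fused implicit-Euler step of a liquid time-constant unit.
    A nonlinear gate f modulates both the decay rate (a state- and
    input-dependent, i.e. 'liquid', time constant) and the drive
    toward the attractor A:  dx/dt = -(1/tau + f) * x + f * A."""
    f = sigmoid(W @ x + U @ I + b)   # interlinked nonlinear gate
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

# Hypothetical sizes and parameters, for illustration only.
rng = np.random.default_rng(1)
n, m = 8, 3
x = np.zeros(n)
tau = np.ones(n)
A = rng.standard_normal(n)
W = 0.1 * rng.standard_normal((n, n))
U = 0.1 * rng.standard_normal((n, m))
b = np.zeros(n)

for t in range(100):                 # unroll over a toy input sequence
    I = np.sin(0.1 * t) * np.ones(m)
    x = ltc_step(x, I, tau, A, W, U, b)
```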
Echo State Networks with Self-Normalizing Activations on the Hyper-Sphere
Among the various architectures of Recurrent Neural Networks, Echo State
Networks (ESNs) emerged due to their simplified and inexpensive training
procedure. These networks are known to be sensitive to the setting of
hyper-parameters, which critically affect their behaviour. Results show that
their performance is usually maximized in a narrow region of hyper-parameter
space called edge of chaos. Finding such a region requires searching in
hyper-parameter space in a sensible way: hyper-parameter configurations
marginally outside such a region might yield networks exhibiting fully
developed chaos, hence producing unreliable computations. The performance gain
due to optimizing hyper-parameters can be studied by considering the
memory--nonlinearity trade-off, i.e., the fact that increasing the nonlinear
behavior of the network degrades its ability to remember past inputs, and
vice-versa. In this paper, we propose a model of ESNs that eliminates critical
dependence on hyper-parameters, resulting in networks that provably cannot
enter a chaotic regime and, at the same time, exhibit nonlinear behaviour in
phase space characterised by a large memory of past inputs, comparable to
that of linear networks. Our contribution is supported by experiments
corroborating our theoretical findings, showing that the proposed model
displays dynamics that are rich enough to approximate many common nonlinear
systems used for benchmarking.
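A minimal sketch of the self-normalizing idea, under the assumption that the reservoir state is simply projected back onto the unit hypersphere after each update (hypothetical weights and sizes, not the paper's exact activation); fixing the state norm keeps the dynamics bounded by construction, however the weights are scaled.

```python
import numpy as np

def run_reservoir(u_seq, W, W_in):
    """Drive a reservoir whose state is renormalized onto the unit
    hypersphere after every update, so its norm is fixed by
    construction regardless of how the weights are scaled."""
    x = np.ones(W.shape[0]) / np.sqrt(W.shape[0])   # start on the sphere
    states = []
    for u in u_seq:
        pre = W @ x + W_in @ u
        x = pre / np.linalg.norm(pre)               # self-normalizing step
        states.append(x)
    return np.stack(states)

# Hypothetical reservoir; only a linear readout would be trained.
rng = np.random.default_rng(2)
N, m = 100, 1
W = rng.standard_normal((N, N)) / np.sqrt(N)
W_in = rng.standard_normal((N, m))
u_seq = rng.standard_normal((200, m))
X = run_reservoir(u_seq, W, W_in)    # collected states for the readout
```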