Deep Markov Random Field for Image Modeling
Markov Random Fields (MRFs), a formulation widely used in generative image
modeling, have long been plagued by a lack of expressive power. This issue
arises primarily because conventional MRF formulations tend to use
simplistic factors to capture local patterns. In this paper, we move beyond
such limitations, and propose a novel MRF model that uses fully-connected
neurons to express the complex interactions among pixels. Through theoretical
analysis, we reveal an inherent connection between this model and recurrent
neural networks, and from this connection derive an approximate feed-forward network that
couples multiple RNNs along opposite directions. This formulation combines the
expressive power of deep neural networks and the cyclic dependency structure of
MRF in a unified model, bringing the modeling capability to a new level. The
feed-forward approximation also allows it to be efficiently learned from data.
Experimental results on a variety of low-level vision tasks show notable
improvements over the state of the art.
Comment: Accepted at ECCV 2016
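As a rough illustration of the coupling idea described above (not the paper's actual architecture), the sketch below sweeps two simple tanh RNNs over the rows of an image in opposite directions and sums their hidden states, so every row receives context from both sides; all sizes and weights are hypothetical.

```python
import numpy as np

def rnn_sweep(rows, W_in, W_h, reverse=False):
    """Sweep a simple tanh RNN over a sequence of pixel rows,
    one row per time step, carrying context along the sweep."""
    if reverse:
        rows = rows[::-1]
    h = np.zeros(W_h.shape[0])
    states = []
    for x in rows:
        h = np.tanh(W_in @ x + W_h @ h)
        states.append(h)
    return np.stack(states[::-1] if reverse else states)

# Hypothetical setup: a 16x16 image, 32 hidden units.
rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
W_in = 0.1 * rng.standard_normal((32, 16))
W_h = 0.1 * rng.standard_normal((32, 32))

# Couple sweeps along opposite directions: a feed-forward stand-in
# for the cyclic (undirected) dependencies of the MRF.
down = rnn_sweep(image, W_in, W_h)
up = rnn_sweep(image, W_in, W_h, reverse=True)
context = down + up  # shape (16, 32): each row sees both directions
```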
PNNARMA model: an alternative to phenomenological models in chemical reactors
This paper focuses on the development of non-linear neural models able to provide appropriate predictions when acting as process simulators. Parallel identification models can be used for this purpose. However, this work shows that when the parameters of parallel identification models are estimated using multilayer feed-forward networks, the resulting approximation of dynamic systems may be unsuitable. The solution proposed here consists of building parallel models with a particular recurrent neural network, which makes it possible to identify the parameter sets of the parallel model and so generate process simulators. Hence, better dynamic predictions can be guaranteed. The dynamic behaviour of the heat-transfer fluid temperature in a jacketed chemical reactor has been selected as a case study. The results suggest that parallel models based on the recurrent neural network proposed in this work can be seen as an alternative to phenomenological models for simulating the dynamic behaviour of heating/cooling circuits.
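To make the parallel (closed-loop) mode of operation concrete, here is a minimal sketch of running a one-step recurrent model as a simulator, feeding back its own past predictions rather than plant measurements; the tanh map `one_step` is a hypothetical stand-in for the trained network, not the paper's PNNARMA model.

```python
import numpy as np

def simulate_parallel(f, u_seq, y0, n_lags=2):
    """Closed-loop (parallel) simulation: the model's own past
    predictions, not plant measurements, are fed back at every step."""
    y_hist = [y0] * n_lags          # most recent prediction last
    preds = []
    for u in u_seq:
        y_next = f(np.array(y_hist), u)
        preds.append(y_next)
        y_hist = y_hist[1:] + [y_next]
    return np.array(preds)

# Hypothetical one-step map standing in for the trained recurrent
# network: a tanh combination of lagged outputs and the current input.
coeffs = np.array([0.25, 0.6])
def one_step(y_lags, u):
    return np.tanh(coeffs @ y_lags + 0.4 * u)

u_profile = np.sin(np.linspace(0.0, 6.0, 50))   # e.g. a jacket-input profile
y_sim = simulate_parallel(one_step, u_profile, y0=0.0)
```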
Liquid Time-constant Networks
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities,
we construct networks of linear first-order dynamical systems modulated via
nonlinear interlinked gates. The resulting models represent dynamical systems
with varying (i.e., liquid) time-constants coupled to their hidden state, with
outputs being computed by numerical differential equation solvers. These neural
networks exhibit stable and bounded behavior, yield superior expressivity
within the family of neural ordinary differential equations, and give rise to
improved performance on time-series prediction tasks. To demonstrate these
properties, we first take a theoretical approach to find bounds over their
dynamics and compute their expressive power by the trajectory length measure in
latent trajectory space. We then conduct a series of time-series prediction
experiments to manifest the approximation capability of Liquid Time-Constant
Networks (LTCs) compared to classical and modern RNNs. Code and data are
available at https://github.com/raminmh/liquid_time_constant_networks
Comment: Accepted to the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)
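A minimal sketch of the kind of update the abstract describes: a first-order linear system whose decay rate and attractor drive are both modulated by a nonlinear gate, integrated here with a fused implicit-Euler step. Parameter shapes and values are hypothetical; the authoritative formulation and solver are in the paper and the linked repository.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ltc_step(x, I, tau, A, W, U, b, dt=0.1):
    """One fused implicit-Euler step of a liquid time-constant unit.
    A nonlinear gate f modulates both the decay rate (a state- and
    input-dependent, i.e. 'liquid', time constant) and the drive
    toward the attractor A:  dx/dt = -(1/tau + f) * x + f * A."""
    f = sigmoid(W @ x + U @ I + b)   # interlinked nonlinear gate
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

# Hypothetical sizes and parameters, for illustration only.
rng = np.random.default_rng(1)
n, m = 8, 3
x = np.zeros(n)
tau = np.ones(n)
A = rng.standard_normal(n)
W = 0.1 * rng.standard_normal((n, n))
U = 0.1 * rng.standard_normal((n, m))
b = np.zeros(n)

for t in range(100):                 # unroll over a toy input sequence
    I = np.sin(0.1 * t) * np.ones(m)
    x = ltc_step(x, I, tau, A, W, U, b)
```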
Echo State Networks with Self-Normalizing Activations on the Hyper-Sphere
Among the various architectures of Recurrent Neural Networks, Echo State
Networks (ESNs) emerged due to their simplified and inexpensive training
procedure. These networks are known to be sensitive to the setting of
hyper-parameters, which critically affect their behaviour. Results show that
their performance is usually maximized in a narrow region of hyper-parameter
space called edge of chaos. Finding such a region requires searching in
hyper-parameter space in a sensible way: hyper-parameter configurations
marginally outside such a region might yield networks exhibiting fully
developed chaos, hence producing unreliable computations. The performance gain
due to optimizing hyper-parameters can be studied by considering the
memory--nonlinearity trade-off, i.e., the fact that increasing the nonlinear
behavior of the network degrades its ability to remember past inputs, and
vice-versa. In this paper, we propose a model of ESNs that eliminates critical
dependence on hyper-parameters, resulting in networks that provably cannot
enter a chaotic regime and, at the same time, exhibit nonlinear behaviour in
phase space characterised by a large memory of past inputs, comparable to
that of linear networks. Our contribution is supported by experiments
corroborating our theoretical findings, showing that the proposed model
displays dynamics that are rich enough to approximate many common nonlinear
systems used for benchmarking.
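A minimal sketch of the self-normalizing idea, under the assumption that the reservoir state is simply projected back onto the unit hypersphere after each update (hypothetical weights and sizes, not the paper's exact activation); fixing the state norm keeps the dynamics bounded by construction, however the weights are scaled.

```python
import numpy as np

def run_reservoir(u_seq, W, W_in):
    """Drive a reservoir whose state is renormalized onto the unit
    hypersphere after every update, so its norm is fixed by
    construction regardless of how the weights are scaled."""
    x = np.ones(W.shape[0]) / np.sqrt(W.shape[0])   # start on the sphere
    states = []
    for u in u_seq:
        pre = W @ x + W_in @ u
        x = pre / np.linalg.norm(pre)               # self-normalizing step
        states.append(x)
    return np.stack(states)

# Hypothetical reservoir; only a linear readout would be trained.
rng = np.random.default_rng(2)
N, m = 100, 1
W = rng.standard_normal((N, N)) / np.sqrt(N)
W_in = rng.standard_normal((N, m))
u_seq = rng.standard_normal((200, m))
X = run_reservoir(u_seq, W, W_in)    # collected states for the readout
```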