NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations
This paper introduces Non-Autonomous Input-Output Stable Network (NAIS-Net),
a very deep architecture where each stacked processing block is derived from a
time-invariant non-autonomous dynamical system. Non-autonomy is implemented by
skip connections from the block input to each of the unrolled processing stages
and allows stability to be enforced so that blocks can be unrolled adaptively
to a pattern-dependent processing depth. NAIS-Net induces non-trivial,
Lipschitz input-output maps, even for an infinite unroll length. We prove that
the network is globally asymptotically stable so that for every initial
condition there is exactly one input-dependent equilibrium assuming tanh units,
and multiple stable equilibria for ReLU units. An efficient implementation that
enforces the stability under derived conditions for both fully-connected and
convolutional layers is also presented. Experimental results show how NAIS-Net
exhibits stability in practice, yielding a significant reduction in
generalization gap compared to ResNets.
Comment: NIPS 201
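The unrolled block can be pictured as a discretised non-autonomous dynamical system in which the block input is re-injected at every stage. The following is a minimal NumPy sketch, not the paper's exact formulation: the state matrix A is made negative definite by an illustrative construction so that the unroll converges to a single input-dependent equilibrium with tanh units, as the abstract states; all dimensions and constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 4          # state and input dimensions (illustrative)

# Illustrative stability construction: A = -(R^T R + eps*I) is negative
# definite, a stand-in for the paper's derived fully-connected conditions.
R = rng.normal(size=(n, n)) / np.sqrt(n)
A = -(R.T @ R + 0.5 * np.eye(n))
B = rng.normal(size=(n, m)) * 0.5
b = rng.normal(size=n) * 0.1

def nais_block(u, steps, h=0.1):
    """Unroll one non-autonomous block: the block input u is re-injected
    through a skip connection at every unrolled processing stage."""
    x = np.zeros(n)
    for _ in range(steps):
        x = x + h * np.tanh(A @ x + B @ u + b)
    return x

u = rng.normal(size=m)
# With tanh units and a negative-definite A, the unroll settles to one
# input-dependent equilibrium, so doubling the depth barely moves the state.
gap = np.linalg.norm(nais_block(u, 400) - nais_block(u, 800))
print(gap)
```

Because the equilibrium depends on the input u, the block still computes a non-trivial input-output map even at infinite unroll length.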
Machine Learning for Informed Representation Learning
The way we view reality and reason about the processes surrounding us is intimately connected to our perception and the representations we form about our observations and experiences. The popularity of machine learning and deep learning techniques in that regard stems from their ability to form useful representations by learning from large sets of observations. Typical application examples include image recognition or language processing for which artificial neural networks are powerful tools to extract regularity patterns or relevant statistics. In this thesis, we leverage and further develop this representation learning capability to address relevant but challenging real-world problems in geoscience and chemistry, to learn representations in an informed manner relevant to the task at hand, and reason about representation learning in neural networks, in general.
Firstly, we develop an approach for efficient and scalable semantic segmentation of degraded soil in alpine grasslands in remotely-sensed images based on convolutional neural networks. To this end, we consider different grassland erosion phenomena in several Swiss valleys. We find that we are able to monitor soil degradation consistent with state-of-the-art methods in geoscience and can improve detection of affected areas. Furthermore, our approach provides a scalable method for large-scale analysis which is infeasible with established methods.
Secondly, we address the question of how to identify suitable latent representations to enable generation of novel objects with selected properties. For this, we introduce a new deep generative model in the context of manifold learning and disentanglement. Our model improves targeted generation of novel objects by making use of property cycle consistency in property-relevant and property-invariant latent subspaces. We demonstrate the improvements on the generation of molecules with desired physical or chemical properties. Furthermore, we show that our model facilitates interpretability and exploration of the latent representation.
Thirdly, in the context of recent advances in deep learning theory and the neural tangent kernel, we empirically investigate the learning of feature representations in standard convolutional neural networks and corresponding random feature models given by the linearisation of the neural networks. We find that performance differences between standard and linearised networks generally increase with the difficulty of the task but decrease with the considered width or over-parametrisation of these networks. Our results indicate interesting implications for feature learning and random feature models as well as the generalisation performance of highly over-parametrised neural networks.
In summary, we employ and study feature learning in neural networks and review how we may use informed representation learning for challenging tasks.
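The linearisation studied in the third part can be illustrated concretely: a "linearised network" is the first-order Taylor expansion of the network output in its parameters around initialisation, i.e. a random feature model whose features (the parameter gradients) stay fixed. A minimal sketch on a toy one-hidden-layer network, with all sizes chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
d, h = 3, 16         # input and hidden widths (illustrative)

W1 = rng.normal(size=(h, d)) / np.sqrt(d)
w2 = rng.normal(size=h) / np.sqrt(h)

def f(W1, w2, x):
    """Toy one-hidden-layer network with a scalar output."""
    return w2 @ np.tanh(W1 @ x)

def f_lin(W1, w2, dW1, dw2, x):
    """Linearised network: first-order Taylor expansion of f in the
    parameters around (W1, w2); its features (the gradients) stay fixed."""
    a = np.tanh(W1 @ x)              # hidden activations at initialisation
    g = 1.0 - a**2                   # tanh' at the pre-activations
    grad_W1 = np.outer(w2 * g, x)    # df/dW1
    grad_w2 = a                      # df/dw2
    return f(W1, w2, x) + np.sum(grad_W1 * dW1) + grad_w2 @ dw2

x = rng.normal(size=d)
dW1 = 1e-3 * rng.normal(size=(h, d))
dw2 = 1e-3 * rng.normal(size=h)

# Near initialisation the network and its linearisation agree to second order.
print(abs(f(W1 + dW1, w2 + dw2, x) - f_lin(W1, w2, dW1, dw2, x)))
```

The comparison in the thesis is between training the full network and training only the linear coefficients of such an expansion.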
Using Linear Regression for Iteratively Training Neural Networks
We present a simple linear regression based approach for learning the weights
and biases of a neural network, as an alternative to standard gradient based
backpropagation. The present work is exploratory in nature, and we restrict the
description and experiments to (i) simple feedforward neural networks, (ii)
scalar (single output) regression problems, and (iii) invertible activation
functions. However, the approach is intended to be extensible to larger, more
complex architectures. The key idea is the observation that the input to every
neuron in a neural network is a linear combination of the activations of
neurons in the previous layer, as well as the parameters (weights and biases)
of the layer. If we are able to compute the ideal total input values to every
neuron by working backwards from the output, we can formulate the learning
problem as a linear least squares problem which iterates between updating the
parameters and the activation values. We present an explicit algorithm that
implements this idea, and we show that (at least for simple problems) the
approach is more stable and faster than gradient-based backpropagation.
Comment: 9 pages
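A heavily simplified sketch of the idea, not the paper's exact algorithm: for a one-hidden-layer tanh network on scalar regression, invert the output activation to obtain ideal total inputs, fit the output weights by linear least squares, back out ideal hidden activations via a pseudo-inverse correction, and alternate. The teacher data, sizes, iteration count, and clipping constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N, d, h = 200, 5, 20            # samples, inputs, hidden units (illustrative)

X = rng.normal(size=(N, d))
# Scalar targets kept inside (-1, 1) so the output tanh is invertible on them.
y = np.tanh(0.5 * X @ rng.normal(size=d))

W1 = 0.1 * rng.normal(size=(d, h))
w2 = 0.1 * rng.normal(size=h)

clip = lambda v: np.clip(v, -0.999, 0.999)
lstsq = lambda A, B: np.linalg.lstsq(A, B, rcond=None)[0]

for _ in range(20):
    A1 = np.tanh(X @ W1)            # hidden activations
    z2 = np.arctanh(clip(y))        # ideal total input to the output neuron
    w2 = lstsq(A1, z2)              # output weights: a linear least-squares fit
    # Ideal hidden activations: correct A1 along the pseudo-inverse of w2
    # so that the corrected activations reproduce z2 exactly.
    A1_t = A1 + np.outer(z2 - A1 @ w2, w2 / (w2 @ w2))
    Z1 = np.arctanh(clip(A1_t))     # ideal total inputs to the hidden neurons
    W1 = lstsq(X, Z1)               # hidden weights: another least-squares fit

pred = np.tanh(np.tanh(X @ W1) @ w2)
print(np.mean((pred - y) ** 2))     # training MSE after the alternating fits
```

Each update is a closed-form least-squares solve, which is where the claimed stability relative to gradient steps comes from.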
Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions
Recent studies of the computational power of recurrent neural networks (RNNs)
reveal a hierarchy of RNN architectures, given real-time and finite-precision
assumptions. Here we study auto-regressive Transformers with linearised
attention, a.k.a. linear Transformers (LTs) or Fast Weight Programmers (FWPs).
LTs are special in the sense that they are equivalent to RNN-like sequence
processors with a fixed-size state, while they can also be expressed as the
now-popular self-attention networks. We show that many well-known results for
the standard Transformer directly transfer to LTs/FWPs. Our formal language
recognition experiments demonstrate how recently proposed FWP extensions such
as recurrent FWPs and self-referential weight matrices successfully overcome
certain limitations of the LT, e.g., allowing for generalisation on the parity
problem. Our code is public.
Comment: Accepted to EMNLP 2023 (short paper)
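The equivalence the abstract relies on, linear attention computed either as kernelised attention or as an RNN-like processor whose entire state is a fixed-size fast weight matrix, can be checked directly. The feature map below is an illustrative positive stand-in for the kernels used in the literature:

```python
import numpy as np

rng = np.random.default_rng(3)
T, d = 6, 4                      # sequence length, head dimension (illustrative)

Q = rng.normal(size=(T, d))
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))
phi = lambda x: np.maximum(x, 0.0) + 1e-2   # positive feature map (illustrative)

def linear_attention(Q, K, V):
    """Causal attention with the softmax replaced by the kernel phi."""
    out = np.zeros_like(V)
    for t in range(T):
        w = phi(K[:t + 1]) @ phi(Q[t])      # unnormalised scores
        out[t] = (w / w.sum()) @ V[:t + 1]
    return out

def fast_weight_programmer(Q, K, V):
    """The same computation as an RNN-like sequence processor whose entire
    state is a fixed-size fast weight matrix W plus a normaliser z."""
    W, z = np.zeros((d, d)), np.zeros(d)
    out = np.zeros_like(V)
    for t in range(T):
        k, q = phi(K[t]), phi(Q[t])
        W += np.outer(k, V[t])              # "program" the fast weights
        z += k
        out[t] = (q @ W) / (q @ z)
    return out

# The two views agree exactly (up to floating point).
print(np.max(np.abs(linear_attention(Q, K, V) - fast_weight_programmer(Q, K, V))))
```

The fixed-size state (W, z) is what places LTs/FWPs in the RNN hierarchy under real-time, finite-precision assumptions.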
Wide Field Imaging. I. Applications of Neural Networks to object detection and star/galaxy classification
[Abridged] Astronomical Wide Field Imaging performed with new large-format CCD
detectors poses data-reduction problems of unprecedented scale which are
difficult to handle with traditional interactive tools. We present here NExt
(Neural Extractor): a new Neural Network (NN) based package capable of detecting
objects and of performing both deblending and star/galaxy classification in an
automatic way. Traditionally, in astronomical images, objects are first
discriminated from the noisy background by searching for sets of connected
pixels having brightnesses above a given threshold and then they are classified
as stars or as galaxies through diagnostic diagrams whose variables are chosen
according to the astronomer's taste and experience. In the extraction step,
assuming that images are well sampled, NExt requires only the simplest a priori
definition of "what an object is" (i.e., it keeps all structures composed of
more than one pixel) and performs the detection via an unsupervised NN
approaching detection as a clustering problem which has been thoroughly studied
in the artificial intelligence literature. In order to obtain an objective and
reliable classification, instead of using an arbitrarily defined set of
features, we use a NN to select the most significant features among the large
number of measured ones, and then we use the selected features to perform the
classification task. In order to optimise the performance of the system, we
implemented and tested several different NN models. Comparing the performance
of NExt with that of the best detection and classification package
known to the authors (SExtractor) shows that NExt is at least as effective as
the best traditional packages.
Comment: MNRAS, in press. A paper with higher-resolution images is available at
http://www.na.astro.it/~andreon/listapub.htm
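The classical extraction step the abstract contrasts itself with, grouping connected pixels brighter than a threshold and keeping structures of more than one pixel, can be sketched with a plain flood fill. The synthetic image, source brightnesses, and threshold below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic "sky": unit-variance Gaussian background plus two bright sources.
img = rng.normal(0.0, 1.0, size=(32, 32))
img[5:8, 5:8] += 10.0
img[20:24, 18:21] += 9.0

def detect(img, threshold, min_pixels=2):
    """Group connected pixels brighter than `threshold` into candidate
    objects, keeping only structures of more than one pixel."""
    mask = img > threshold
    seen = np.zeros_like(mask)
    objects = []
    for i, j in zip(*np.nonzero(mask)):
        if seen[i, j]:
            continue
        stack, comp = [(i, j)], []
        seen[i, j] = True
        while stack:                        # flood fill, 4-connectivity
            a, b = stack.pop()
            comp.append((a, b))
            for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                na, nb = a + da, b + db
                if (0 <= na < img.shape[0] and 0 <= nb < img.shape[1]
                        and mask[na, nb] and not seen[na, nb]):
                    seen[na, nb] = True
                    stack.append((na, nb))
        if len(comp) >= min_pixels:
            objects.append(comp)
    return objects

objs = detect(img, threshold=5.0)
print(len(objs))    # the two injected sources
```

NExt's contribution is to replace the fixed threshold and hand-chosen diagnostics of this pipeline with an unsupervised NN clustering step and NN-selected features.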
An intelligent system for risk classification of stock investment projects
This paper demonstrates that a hybrid fuzzy neural network can serve as a risk classifier for stock investment projects. The training algorithm for the regular part of the network is based on bidirectional incremental evolution, which proves more efficient than direct evolution. The approach is compared with other crisp and soft investment appraisal and trading techniques, while building a multimodel domain representation for an intelligent decision support system. The advantages of each model are thus utilised while looking at the investment problem from different perspectives. The empirical results are based on UK companies traded on the London Stock Exchange.
Stationary solution of the ring-spinning balloon in zero air drag using a RBFN based mesh-free method
A technique for numerical analysis of the dynamics of the ring-spinning balloon based on Radial Basis Function Networks (RBFNs) is presented in this paper. The method uses a 'universal approximator' based on neural-network methodology to solve the governing differential equations, which are derived from the conditions of dynamic equilibrium of the yarn, in order to determine the shape of the balloon. It needs only a coarse set of collocation points, without any finite-element-type discretisation of the domain and its boundary, for the numerical solution of the governing differential equations. This paper reports a first assessment of the validity and efficiency of the present mesh-free method in predicting the balloon shape across a wide range of spinning conditions.
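The collocation idea can be illustrated on a toy linear ODE rather than the balloon equations: expand the unknown in radial basis functions centred at a coarse set of points, then enforce the differential equation and the boundary condition at those same points, yielding one linear system with no mesh. The ODE, shape parameter, and point count below are illustrative assumptions:

```python
import numpy as np

# Toy problem: u'(t) = -u(t), u(0) = 1 on [0, 2]; exact solution exp(-t).
t = np.linspace(0.0, 2.0, 15)      # coarse collocation points, no mesh needed
eps = 3.0                          # RBF shape parameter (illustrative)

phi  = lambda r: np.exp(-(eps * r) ** 2)                       # Gaussian RBF
dphi = lambda s: -2.0 * eps**2 * s * np.exp(-(eps * s) ** 2)   # d/dt phi(t - t_j)

D = t[:, None] - t[None, :]        # pairwise differences t_i - t_j
Phi, dPhi = phi(D), dphi(D)

# Collocation: enforce u'(t_i) + u(t_i) = 0 at every point, and replace the
# first row by the boundary condition u(0) = 1.
A = dPhi + Phi
A[0] = Phi[0]
rhs = np.zeros(len(t))
rhs[0] = 1.0

c = np.linalg.solve(A, rhs)        # RBF coefficients
u = Phi @ c                        # approximate solution at the points
print(np.max(np.abs(u - np.exp(-t))))   # small deviation from exp(-t)
```

For the balloon problem the governing equations are nonlinear boundary-value equations in the yarn coordinates, so the same machinery is applied with an iterative solve in place of the single linear system here.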