    NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations

    This paper introduces Non-Autonomous Input-Output Stable Network (NAIS-Net), a very deep architecture where each stacked processing block is derived from a time-invariant non-autonomous dynamical system. Non-autonomy is implemented by skip connections from the block input to each of the unrolled processing stages and allows stability to be enforced so that blocks can be unrolled adaptively to a pattern-dependent processing depth. NAIS-Net induces non-trivial, Lipschitz input-output maps, even for an infinite unroll length. We prove that the network is globally asymptotically stable so that for every initial condition there is exactly one input-dependent equilibrium assuming tanh units, and multiple stable equilibria for ReL units. An efficient implementation that enforces the stability under derived conditions for both fully-connected and convolutional layers is also presented. Experimental results show how NAIS-Net exhibits stability in practice, yielding a significant reduction in generalization gap compared to ResNets. Comment: NIPS 201
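
    The unrolling idea in the abstract above can be illustrated with a scalar toy block. This is only a sketch under stated assumptions, not the authors' implementation: the function name unroll_block, the scalar parameters a, b, c, the step size h, and the stopping tolerance are all hypothetical, and a < 0 is chosen here simply so that the scalar iteration contracts (the paper's actual stability conditions on weight matrices are not reproduced). The block input u is fed into every unrolled stage, and unrolling stops once the state stops changing.

```python
import math

def unroll_block(u, x0=0.0, a=-1.0, b=1.0, c=0.0, h=0.5,
                 tol=1e-9, max_steps=1000):
    """Unroll x <- x + h * tanh(a*x + b*u + c) until the update is tiny.

    u is the block input, re-injected at every stage (the skip connection).
    All parameter values are illustrative assumptions.
    """
    x = x0
    for step in range(1, max_steps + 1):
        delta = h * math.tanh(a * x + b * u + c)
        x += delta
        if abs(delta) < tol:      # adaptive, input-dependent depth
            return x, step
    return x, max_steps

# The fixed point solves a*x + b*u + c = 0, so it depends on the input u
# but not on the initial state x0.
x_from_zero, depth_a = unroll_block(u=1.0, x0=0.0)
x_from_five, depth_b = unroll_block(u=1.0, x0=5.0)
print(x_from_zero, depth_a)
print(x_from_five, depth_b)
```

    In this toy setting both initial states reach the same equilibrium for a given input, while the number of unrolled stages differs, which mirrors the pattern-dependent processing depth described in the abstract.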

    Machine Learning for Informed Representation Learning

    The way we view reality and reason about the processes surrounding us is intimately connected to our perception and the representations we form about our observations and experiences. The popularity of machine learning and deep learning techniques in that regard stems from their ability to form useful representations by learning from large sets of observations. Typical application examples include image recognition or language processing for which artificial neural networks are powerful tools to extract regularity patterns or relevant statistics. In this thesis, we leverage and further develop this representation learning capability to address relevant but challenging real-world problems in geoscience and chemistry, to learn representations in an informed manner relevant to the task at hand, and reason about representation learning in neural networks, in general. Firstly, we develop an approach for efficient and scalable semantic segmentation of degraded soil in alpine grasslands in remotely-sensed images based on convolutional neural networks. To this end, we consider different grassland erosion phenomena in several Swiss valleys. We find that we are able to monitor soil degradation consistent with state-of-the-art methods in geoscience and can improve detection of affected areas. Furthermore, our approach provides a scalable method for large-scale analysis which is infeasible with established methods. Secondly, we address the question of how to identify suitable latent representations to enable generation of novel objects with selected properties. For this, we introduce a new deep generative model in the context of manifold learning and disentanglement. Our model improves targeted generation of novel objects by making use of property cycle consistency in property-relevant and property-invariant latent subspaces. We demonstrate the improvements on the generation of molecules with desired physical or chemical properties. 
    Furthermore, we show that our model facilitates interpretability and exploration of the latent representation. Thirdly, in the context of recent advances in deep learning theory and the neural tangent kernel, we empirically investigate the learning of feature representations in standard convolutional neural networks and corresponding random feature models given by the linearisation of the neural networks. We find that performance differences between standard and linearised networks generally increase with the difficulty of the task but decrease with the considered width or over-parametrisation of these networks. Our results indicate interesting implications for feature learning and random feature models as well as the generalisation performance of highly over-parametrised neural networks. In summary, we employ and study feature learning in neural networks and review how we may use informed representation learning for challenging tasks.

    Using Linear Regression for Iteratively Training Neural Networks

    We present a simple linear regression based approach for learning the weights and biases of a neural network, as an alternative to standard gradient based backpropagation. The present work is exploratory in nature, and we restrict the description and experiments to (i) simple feedforward neural networks, (ii) scalar (single output) regression problems, and (iii) invertible activation functions. However, the approach is intended to be extensible to larger, more complex architectures. The key idea is the observation that the input to every neuron in a neural network is a linear combination of the activations of neurons in the previous layer, as well as the parameters (weights and biases) of the layer. If we are able to compute the ideal total input values to every neuron by working backwards from the output, we can formulate the learning problem as a linear least squares problem which iterates between updating the parameters and the activation values. We present an explicit algorithm that implements this idea, and we show that (at least for simple problems) the approach is more stable and faster than gradient-based backpropagation. Comment: 9 page
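
    The key idea in the abstract above can be illustrated on the smallest possible case: a single tanh neuron with one input. This is a hedged sketch, not the paper's algorithm (which iterates over layers and activation values); the function name fit_neuron and the toy data are assumptions. Because tanh is invertible, the ideal total input of the neuron is recovered by applying atanh to the targets, after which the weight and bias follow from ordinary least squares.

```python
import math

def fit_neuron(xs, ys):
    """Fit y = tanh(w*x + b) without gradients.

    Step 1: invert the activation to get the ideal pre-activations z.
    Step 2: solve the linear least squares problem z ~ w*x + b in closed form.
    """
    zs = [math.atanh(y) for y in ys]          # ideal total inputs
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    w = sxz / sxx
    b = mean_z - w * mean_x
    return w, b

# Toy data generated by a neuron with w = 2.0 and b = -0.5 (assumed values).
xs = [i / 10 - 1.0 for i in range(21)]
ys = [math.tanh(2.0 * x - 0.5) for x in xs]
w, b = fit_neuron(xs, ys)
print(w, b)
```

    With noise-free targets the fit recovers the generating parameters up to rounding error; targets at or beyond ±1 would fall outside the domain of atanh and would need clipping.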

    Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions

    Recent studies of the computational power of recurrent neural networks (RNNs) reveal a hierarchy of RNN architectures, given real-time and finite-precision assumptions. Here we study auto-regressive Transformers with linearised attention, a.k.a. linear Transformers (LTs) or Fast Weight Programmers (FWPs). LTs are special in the sense that they are equivalent to RNN-like sequence processors with a fixed-size state, while they can also be expressed as the now-popular self-attention networks. We show that many well-known results for the standard Transformer directly transfer to LTs/FWPs. Our formal language recognition experiments demonstrate how recently proposed FWP extensions such as recurrent FWPs and self-referential weight matrices successfully overcome certain limitations of the LT, e.g., allowing for generalisation on the parity problem. Our code is public. Comment: Accepted to EMNLP 2023 (short paper
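
    The equivalence mentioned in the abstract above, between the attention form and an RNN-like form with a fixed-size state, can be checked numerically. The sketch below is not the authors' code: it assumes unnormalised causal linear attention and an elu(x) + 1 feature map (the abstract does not fix either choice), and all function names are hypothetical.

```python
import math
import random

def phi(x):
    # elu(x) + 1, a positive feature map (an assumption, see lead-in).
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_form(qs, ks, vs):
    """y_t = sum over s <= t of (phi(q_t) . phi(k_s)) * v_s."""
    outs = []
    for t, q in enumerate(qs):
        fq = phi(q)
        y = [0.0] * len(vs[0])
        for s in range(t + 1):                 # causal: only past and present
            weight = dot(fq, phi(ks[s]))
            y = [yi + weight * vi for yi, vi in zip(y, vs[s])]
        outs.append(y)
    return outs

def recurrent_form(qs, ks, vs):
    """Same outputs using one fixed-size fast weight matrix W (d_v x d_k)."""
    d_k, d_v = len(ks[0]), len(vs[0])
    W = [[0.0] * d_k for _ in range(d_v)]
    outs = []
    for q, k, v in zip(qs, ks, vs):
        fk, fq = phi(k), phi(q)
        for i in range(d_v):                   # W <- W + v (outer) phi(k)
            for j in range(d_k):
                W[i][j] += v[i] * fk[j]
        outs.append([dot(row, fq) for row in W])   # y = W phi(q)
    return outs

random.seed(0)
seq_len, d_k, d_v = 5, 3, 2
qs = [[random.gauss(0, 1) for _ in range(d_k)] for _ in range(seq_len)]
ks = [[random.gauss(0, 1) for _ in range(d_k)] for _ in range(seq_len)]
vs = [[random.gauss(0, 1) for _ in range(d_v)] for _ in range(seq_len)]
a_out = attention_form(qs, ks, vs)
r_out = recurrent_form(qs, ks, vs)
print(max(abs(x - y) for ra, rr in zip(a_out, r_out) for x, y in zip(ra, rr)))
```

    The recurrent form keeps only the d_v x d_k matrix W regardless of sequence length, whereas the attention form revisits all earlier keys and values at every step.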

    Wide Field Imaging. I. Applications of Neural Networks to object detection and star/galaxy classification

    [Abridged] Astronomical Wide Field Imaging performed with new large-format CCD detectors poses data reduction problems of unprecedented scale which are difficult to handle with traditional interactive tools. We present here NExt (Neural Extractor): a new Neural Network (NN) based package capable of detecting objects and performing both deblending and star/galaxy classification automatically. Traditionally, in astronomical images, objects are first discriminated from the noisy background by searching for sets of connected pixels with brightness above a given threshold, and they are then classified as stars or galaxies through diagnostic diagrams whose variables are chosen according to the astronomer's taste and experience. In the extraction step, assuming that images are well sampled, NExt requires only the simplest a priori definition of "what an object is" (i.e., it keeps all structures composed of more than one pixel) and performs the detection via an unsupervised NN, approaching detection as a clustering problem, which has been thoroughly studied in the artificial intelligence literature. In order to obtain an objective and reliable classification, instead of using an arbitrarily defined set of features, we use a NN to select the most significant features among the large number of measured ones, and then we use the selected features to perform the classification task. In order to optimise the performance of the system, we implemented and tested several different NN models. The comparison of the performance of NExt with that of the best detection and classification package known to the authors (SExtractor) shows that NExt is at least as effective as the best traditional packages. Comment: MNRAS, in press. Paper with higher resolution images is available at http://www.na.astro.it/~andreon/listapub.htm

    Stationary solution of the ring-spinning balloon in zero air drag using a RBFN based mesh-free method

    A technique for numerical analysis of the dynamics of the ring-spinning balloon based on Radial Basis Function Networks (RBFNs) is presented in this paper. This method uses a 'universal approximator' based on neural network methodology to solve the governing differential equations, which are derived from the conditions of dynamic equilibrium of the yarn, to determine the shape of the balloon yarn. The method needs only a coarse, finite set of collocation points, without any finite element-type discretisation of the domain and its boundary, for the numerical solution of the governing differential equations. This paper reports a first assessment of the validity and efficiency of the present mesh-less method in predicting the balloon shape across a wide range of spinning conditions.