63 research outputs found

    Function Approximation With Multilayered Perceptrons Using L1 Criterion

    Get PDF
    The least squares error, or L2 criterion, approach has been commonly used for functional approximation and generalization in the error backpropagation algorithm. The purpose of this study is to present a least absolute error criterion for sigmoidal backpropagation as an alternative to the usual least squares error criterion. We present the structure of the error function to be minimized and its derivatives with respect to the weights to be updated. The study focuses on the multilayer perceptron (MLP) with a single hidden layer, but the implementation may be extended to models with two or more hidden layers.
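The core change the abstract describes, replacing the squared-error gradient (y − t) with the subgradient sign(y − t) of the absolute error, can be sketched as follows. This is an illustrative NumPy reimplementation, not the authors' code; the network size, learning rate, and the sin(x) target are arbitrary choices for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_l1_mlp(X, T, hidden=8, lr=0.05, epochs=3000, seed=0):
    """Backprop for a single-hidden-layer sigmoidal MLP under the L1
    criterion E = sum |y - t|: the only change from the L2 case is that
    the output-layer error signal is sign(y - t) instead of (y - t)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, T.shape[1]))
    b2 = np.zeros(T.shape[1])
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)        # hidden activations
        Y = H @ W2 + b2                 # linear output layer
        dY = np.sign(Y - T)             # L1 criterion: subgradient of |e|
        dW2, db2 = H.T @ dY / n, dY.mean(axis=0)
        dH = (dY @ W2.T) * H * (1 - H)  # sigmoid derivative
        dW1, db1 = X.T @ dH / n, dH.mean(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return W1, b1, W2, b2

# Approximate f(x) = sin(x) on [0, pi].
X = np.linspace(0, np.pi, 50).reshape(-1, 1)
T = np.sin(X)
W1, b1, W2, b2 = train_l1_mlp(X, T)
pred = sigmoid(X @ W1 + b1) @ W2 + b2
print(np.mean(np.abs(pred - T)))  # mean absolute error after training
```

With the L2 criterion the magnitude of the error signal shrinks as the fit improves; under L1 it stays at ±1 per sample, which is what makes the criterion more robust to outliers in the targets.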

    Learning Smooth Pattern Transformation Manifolds

    Get PDF
    Manifold models provide low-dimensional representations that are useful for processing and analyzing data in a transformation-invariant way. In this paper, we study the problem of learning smooth pattern transformation manifolds from image sets that are observations of geometrically transformed signals. In order to construct a manifold, we build a representative pattern whose transformations accurately fit various input images. The pattern is formed by selecting a good common sparse approximation of the images with parametric and smooth atoms. We examine two aspects of the manifold building problem, where we first target an accurate transformation-invariant approximation of the input images, and then extend this solution for their classification. For the approximation problem, we propose a greedy method that constructs a representative pattern by selecting analytic atoms from a continuous dictionary manifold. We present a DC (Difference-of-Convex) optimization scheme which is applicable for a wide range of transformation and dictionary models, and demonstrate its application to transformation manifolds generated by the rotation, translation and anisotropic scaling of a reference pattern. Then, we generalize this approach to a setting with multiple transformation manifolds, where each manifold represents a different class of signals. We present an iterative multiple manifold building algorithm such that the classification accuracy is promoted in the joint selection of atoms. Experimental results suggest that the proposed methods yield high accuracy in the approximation and classification of data in comparison with some reference methods, while achieving invariance to geometric transformations due to the transformation manifold model.
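As a toy illustration of the transformation-manifold idea (not the paper's DC-optimization or atom-selection method), the sketch below generates a manifold by rotating a fixed anisotropic Gaussian pattern and projects a noisy observation onto it by brute-force search over the rotation parameter. The pattern and all parameter values are invented for the example; the paper learns the reference pattern itself from the data.

```python
import numpy as np

def anisotropic_gaussian(size=32, sx=3.0, sy=8.0, theta=0.0):
    """Reference pattern: a unit-norm anisotropic Gaussian rotated by theta."""
    c = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size] - c
    xr = np.cos(theta) * x + np.sin(theta) * y
    yr = -np.sin(theta) * x + np.cos(theta) * y
    g = np.exp(-(xr**2 / (2 * sx**2) + yr**2 / (2 * sy**2)))
    return g / np.linalg.norm(g)

def project_onto_rotation_manifold(image, n_angles=90):
    """Return the manifold point (rotation angle) closest to `image`.
    The elliptical Gaussian has pi-periodic symmetry, so [0, pi) suffices."""
    angles = np.linspace(0, np.pi, n_angles, endpoint=False)
    errs = [np.linalg.norm(image - anisotropic_gaussian(theta=t))
            for t in angles]
    k = int(np.argmin(errs))
    return angles[k], errs[k]

# Observe a rotated, noisy version of the pattern and recover the angle.
rng = np.random.default_rng(1)
true_theta = 0.6
obs = anisotropic_gaussian(theta=true_theta) \
      + 0.005 * rng.normal(size=(32, 32))
est_theta, err = project_onto_rotation_manifold(obs)
print(est_theta)
```

The grid search stands in for the continuous parametrization used in the paper; the key property illustrated is that the projection is invariant to the geometric transformation rather than to the pixel values themselves.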

    Neural networks: analog VLSI implementation and learning algorithms

    Get PDF

    Learning to process with spikes and to localise pulses

    Get PDF
    In the last few decades, deep learning with artificial neural networks (ANNs) has emerged as one of the most widely used techniques in tasks such as classification and regression, achieving competitive results and in some cases even surpassing human-level performance. Nonetheless, as ANN architectures are optimised towards empirical performance and have departed from their biological precursors, how exactly human brains process information using short electrical pulses called spikes remains a mystery. Hence, in this thesis, we explore the problem of learning to process with spikes and to localise pulses. We first consider spiking neural networks (SNNs), a type of ANN that more closely mimics biological neural networks in that neurons communicate with one another using spikes. This unique architecture allows us to look into the role of heterogeneity in learning. Since it is conjectured that information is encoded in the timing of spikes, we are particularly interested in the heterogeneity of the time constants of neurons. We then train SNNs for classification tasks on a range of visual and auditory neuromorphic datasets, which contain streams of events (spike times) instead of conventional frame-based data, and show that the overall performance is improved by allowing the neurons to have different time constants, especially on tasks with richer temporal structure. We also find that the learned time constants are distributed similarly to those experimentally observed in some mammalian cells. In addition, we demonstrate that learning with heterogeneity improves robustness against hyperparameter mistuning. These results suggest that heterogeneity may be more than a byproduct of noisy processes and perhaps plays a key role in learning in changing environments, yet heterogeneity has been overlooked in basic artificial models.
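The kind of heterogeneity discussed above can be made concrete with a small, hypothetical sketch: a layer of leaky integrate-and-fire (LIF) neurons in which each neuron has its own membrane time constant. In the thesis the time constants are learned jointly with the weights via gradient-based training; here they are simply sampled to show the mechanism, and all numerical values are invented.

```python
import numpy as np

def lif_layer(spike_train, taus, w=0.6, dt=1e-3, v_th=1.0):
    """Simulate one LIF neuron per time constant in `taus`.
    Returns a binary output spike raster of shape (n_steps, n_neurons)."""
    n_steps, n = spike_train.shape
    v = np.zeros(n)                         # membrane potentials
    out = np.zeros((n_steps, n))
    decay = np.exp(-dt / taus)              # per-neuron leak factor
    for t in range(n_steps):
        v = decay * v + w * spike_train[t]  # leak, then integrate input
        fired = v >= v_th
        out[t] = fired
        v[fired] = 0.0                      # reset after an output spike
    return out

rng = np.random.default_rng(0)
taus = rng.uniform(5e-3, 50e-3, size=4)           # heterogeneous taus
inp = (rng.random((200, 4)) < 0.1).astype(float)  # Poisson-like input
spikes = lif_layer(inp, taus)
print(spikes.sum(axis=0))  # per-neuron output spike counts
```

A neuron with a long time constant integrates inputs over a wide window, while one with a short time constant only responds to tightly clustered spikes; a layer mixing both can therefore represent temporal structure at multiple scales, which is the intuition behind the reported gains on temporally rich tasks.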
While neuromorphic datasets, which are often captured by neuromorphic devices that closely model the corresponding biological systems, have enabled us to explore the more biologically plausible SNNs, there still exists a gap in understanding how spike times encode information in actual biological neural networks like human brains, as such data is difficult to acquire due to the trade-off between timing precision and the number of cells that can be recorded electrically at the same time. Instead, what we usually obtain is low-rate discrete samples of trains of filtered spikes. Hence, in the second part of the thesis, we focus on a different type of problem involving pulses, that is, to retrieve the precise pulse locations from these low-rate samples. We make use of the finite rate of innovation (FRI) sampling theory, which states that perfect reconstruction is possible for classes of continuous non-bandlimited signals that have a small number of free parameters. However, existing FRI methods break down under very noisy conditions due to the so-called subspace swap event. Thus, we present two novel model-based learning architectures: Deep Unfolded Projected Wirtinger Gradient Descent (Deep Unfolded PWGD) and the FRI Encoder-Decoder Network (FRIED-Net). The former is based on an existing iterative denoising algorithm for subspace-based methods, while the latter directly models the relationship between the samples and the locations of the pulses using an autoencoder-like network. Using a stream of K Diracs as an example, we show that both algorithms are able to overcome the breakdown inherent in the existing subspace-based methods. Moreover, we extend our FRIED-Net framework beyond conventional FRI methods by considering the case where the pulse shape is unknown, and show that the pulse shape can be learned using backpropagation. This coincides with the application of spike detection in real-world calcium imaging data, where we achieve competitive results.
Finally, we explore beyond canonical FRI signals and demonstrate that FRIED-Net is able to reconstruct streams of pulses with different shapes.
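For context, the classical subspace-based FRI baseline whose breakdown under heavy noise motivates the learned architectures above can be sketched in a few lines: recovering a stream of K Diracs from 2K+1 Fourier-series coefficients via the annihilating filter (Prony's method). The locations and amplitudes below are invented, and the learned architectures themselves are not reproduced here.

```python
import numpy as np

K, tau = 2, 1.0
t_true = np.array([0.23, 0.71])   # Dirac locations in [0, tau)
a = np.array([1.0, 0.8])          # Dirac amplitudes

# Fourier coefficients s[m] = sum_k a_k exp(-2j pi m t_k / tau), m = -K..K.
m = np.arange(-K, K + 1)
s = (a * np.exp(-2j * np.pi * np.outer(m, t_true) / tau)).sum(axis=1)

# Annihilating filter h (length K+1): sum_l h[l] s[m-l] = 0 for all m.
# Build the Toeplitz system and take its nullspace via the SVD.
A = np.array([[s[i - j + K] for j in range(K + 1)] for i in range(K + 1)])
h = np.linalg.svd(A)[2][-1].conj()

# The roots of h are u_k = exp(-2j pi t_k / tau); unwrap to locations.
u = np.roots(h)
t_est = np.sort(np.mod(-np.angle(u) * tau / (2 * np.pi), tau))
print(t_est)  # recovers [0.23, 0.71] to numerical precision
```

In the noiseless case the Toeplitz matrix has an exact one-dimensional nullspace and recovery is perfect; under noise, the estimated noise subspace can swap with the signal subspace (the "subspace swap" event mentioned in the abstract), which is precisely where the model-based learning approaches take over.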

    On the use of Neural Networks to solve Differential Equations

    Get PDF
    Artificial neural networks are parametric models, generally fitted to solve regression and classification problems. For a long time, the question has lingered of whether these models can be used to approximate the solutions of initial and boundary value problems, as a means of numerical integration. Recent advances in deep learning have made this approach far more attainable, and integration methods based on training (fitting) artificial neural networks have begun to emerge, motivated mostly by their mesh-free nature and their scalability to high dimensions. In this work, we go all the way from the most basic elements, such as the definition of artificial neural networks and the well-posedness of the problems, to solving several linear and quasi-linear PDEs with this approach. Throughout this work we explain general theory concerning artificial neural networks, including topics such as vanishing gradients, non-convex optimization and regularization, and we adapt them to the nature of initial and boundary value problems. Some of the original contributions of this work include: an analysis of the vanishing gradient problem with respect to the input derivatives, a custom regularization technique based on the derivatives of the network's parameters, and a method to rescale the subgradients of the multi-objective loss function used to optimize the network.
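The basic recipe, building a trial solution that satisfies the initial condition by construction and minimising the ODE residual at collocation points, can be sketched as below. This follows the classical Lagaris-style approach rather than the specific techniques contributed in the work; the tiny network, finite-difference gradients, and the test problem u'(x) = -u(x), u(0) = 1 are all assumptions chosen to keep the sketch short.

```python
import numpy as np

def net(params, x, hidden=8):
    """Tiny one-hidden-layer tanh network N(x) with flat parameter vector."""
    W1 = params[:hidden]
    b1 = params[hidden:2 * hidden]
    w2 = params[2 * hidden:3 * hidden]
    return np.tanh(np.outer(x, W1) + b1) @ w2

def residual_loss(params, x, eps=1e-4):
    """Squared residual of u' = -u for the trial solution u = 1 + x N(x),
    which satisfies u(0) = 1 by construction."""
    u = lambda t: 1.0 + t * net(params, t)
    du = (u(x + eps) - u(x - eps)) / (2 * eps)  # du/dx by central difference
    return np.mean((du + u(x)) ** 2)

rng = np.random.default_rng(0)
params = 0.1 * rng.normal(size=3 * 8)
x = np.linspace(0, 1, 32)  # collocation points

# Plain gradient descent; gradients via central finite differences.
lr, delta = 0.05, 1e-5
loss0 = residual_loss(params, x)
for _ in range(1500):
    grad = np.zeros_like(params)
    for i in range(len(params)):
        e = np.zeros_like(params)
        e[i] = delta
        grad[i] = (residual_loss(params + e, x)
                   - residual_loss(params - e, x)) / (2 * delta)
    params -= lr * grad
loss1 = residual_loss(params, x)

u1 = 1.0 + 1.0 * net(params, np.array([1.0]))[0]
print(loss0, loss1, u1)  # exact solution gives u(1) = exp(-1) ≈ 0.368
```

In practice one would use automatic differentiation for both the input derivative and the parameter gradients; the finite differences here only stand in to keep the sketch dependency-free.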

    Role of biases in neural network models

    Get PDF