
    Neural network for processing both spatial and temporal data with time-based back-propagation

    Neural networks are computing systems modeled after the paradigm of the biological brain. For years, researchers using various forms of neural networks have attempted to model the brain's information processing and decision-making capabilities. Neural network algorithms have impressively demonstrated the capability of modeling spatial information. The application of parallel distributed models to the processing of temporal data, on the other hand, has been severely restricted. The invention introduces a novel technique which adds the dimension of time to the well-known back-propagation neural network algorithm. In the space-time neural network disclosed herein, the synaptic weights between two artificial neurons (processing elements) are replaced with an adaptable digital filter. Instead of a single synaptic weight, the invention provides a plurality of weights representing not only association but also temporal dependencies. In this case, the synaptic weights are the coefficients of the adaptable digital filters. Novelty is believed to lie in the disclosure of a processing element, and a network of such processing elements, capable of processing temporal as well as spatial data.
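    The core idea, replacing each scalar synaptic weight with a small adaptive digital filter, is easy to sketch. Below is a minimal, illustrative Python model of such a processing element, in which every synapse is a finite-impulse-response (FIR) filter whose tap coefficients play the role of the patent's "plurality of weights"; the names, sizes, and sigmoid activation are assumptions for illustration, not details from the disclosure.

```python
import numpy as np

class SpaceTimeNeuron:
    """Sketch of a processing element whose synapses are FIR filters.

    Each synapse stores n_taps coefficients instead of a single weight,
    so the neuron responds to temporal patterns as well as spatial
    (cross-input) associations. Illustrative, not the patented design.
    """

    def __init__(self, n_inputs, n_taps, seed=0):
        rng = np.random.default_rng(seed)
        # One row of filter coefficients per input synapse.
        self.coeffs = rng.normal(scale=0.1, size=(n_inputs, n_taps))
        # Delay lines: the last n_taps samples seen on each input.
        self.history = np.zeros((n_inputs, n_taps))

    def step(self, x):
        """Advance one time step with input vector x; return the output."""
        # Shift the per-synapse delay lines and insert the new sample.
        self.history = np.roll(self.history, 1, axis=1)
        self.history[:, 0] = x
        # Each synapse contributes the dot product of its coefficients
        # with its delay line, i.e. an FIR filter output.
        activation = np.sum(self.coeffs * self.history)
        return 1.0 / (1.0 + np.exp(-activation))  # sigmoid squashing

# Feed a short sequence through a 3-input neuron with 4 taps per synapse.
neuron = SpaceTimeNeuron(n_inputs=3, n_taps=4)
for t, sample in enumerate(np.sin(np.linspace(0, 2, 8))[:, None] * np.ones(3)):
    print(t, neuron.step(sample))
```

    Training such a network with time-based back-propagation would additionally propagate error gradients through these delay lines, which is precisely the temporal dimension the invention adds to standard back-propagation.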

    Synergies between Numerical Methods for Kinetic Equations and Neural Networks

    The overarching theme of this work is the efficient computation of large-scale systems. Here we deal with two types of mathematical challenges, which appear quite different at first glance but offer similar opportunities and challenges upon closer examination. Physical phenomena are described and mathematically modeled on diverse scales, ranging from nano-scale interactions of single atoms to the macroscopic dynamics of the earth's atmosphere. We consider such systems of interacting particles and explore methods to simulate them efficiently and accurately, with a focus on the kinetic and macroscopic descriptions of interacting particle systems. Macroscopic governing equations describe the evolution of a system in time and space, whereas the more fine-grained kinetic description additionally takes the particle velocity into account. Discretizing kinetic equations that depend on space, time, and velocity variables is challenging because one must preserve physical solution bounds (e.g., positivity), avoid spurious artifacts, and remain computationally efficient.

    In the pursuit of computability in both kinetic and multi-scale modeling, a wide variety of approximate methods have been established in the realms of reduced-order modeling, surrogate modeling, and model compression. For kinetic models, these may take the form of hybrid numerical solvers that switch between macroscopic and mesoscopic simulation, asymptotic-preserving schemes that bridge the gap between both physical resolution levels, or surrogate models that operate on the kinetic level but replace computationally heavy operations of the simulation with fast approximations. Thus, for the simulation of kinetic and multi-scale systems with high spatial resolution over a long temporal horizon, Paul Dirac's remark, that the governing equations are known but far too complicated to be soluble, is as relevant as it was almost a century ago. The first goal of the dissertation is therefore the development of acceleration strategies for kinetic discretization methods that preserve the structure of their governing equations. In particular, we investigate the use of convex neural networks to accelerate the minimal entropy closure method. Further, we develop a neural network-based hybrid solver for multi-scale systems, where kinetic and macroscopic methods are chosen based on local flow conditions.

    Furthermore, we deal with the compression and efficient computation of neural networks. Neural networks are by now successfully used in many forms in countless scientific works and technical systems, with well-known applications in image recognition and computer-aided language translation, but also as surrogate models for numerical mathematics. Although the first neural networks were presented as early as the 1950s, the discipline has enjoyed its surge in popularity mainly during the last 15 years, since only now is sufficient computing capacity available. Remarkably, the increasing availability of computing resources is accompanied by a hunger for larger models, fueled by the common conception among machine-learning practitioners and researchers that more trainable parameters mean higher performance and better generalization. The increase in model size exceeds the growth of available computing resources by orders of magnitude: since 2012, the computational resources used in the largest neural network models have doubled every 3.4 months (https://openai.com/blog/ai-and-compute/), as opposed to the two-year doubling period in available computing power proposed by Moore's law. To some extent, Dirac's statement also applies to the recent computational challenges in the machine-learning community. The desire to train and evaluate on resource-limited devices has sparked interest in model compression, where neural networks are sparsified or factorized, typically after training. The second goal of this dissertation is thus a low-rank method, originating from numerical methods for kinetic equations, that compresses neural networks by low-rank factorization already during training. This dissertation thus considers synergies between kinetic models, neural networks, and numerical methods in both disciplines to develop time-, memory-, and energy-efficient computational methods for both research areas.
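    The first goal mentions convex neural networks as a surrogate for the minimal entropy closure. One common way to build a network that is provably convex in its input is the input-convex architecture of Amos et al. (2017), in which the hidden-to-hidden weights are kept non-negative and the activations are convex and non-decreasing; the sketch below is a minimal NumPy version under that assumption, with illustrative names and sizes, and is not the dissertation's actual implementation.

```python
import numpy as np

def softplus(x):
    # Convex, non-decreasing activation; also used to keep weights >= 0.
    return np.log1p(np.exp(x))

class InputConvexNet:
    """Minimal input-convex network: the output is convex in the input u.

    Convexity holds because the weights acting on previous hidden layers
    are mapped through softplus (hence non-negative) and the activations
    are convex and non-decreasing; direct affine passthroughs of u are
    always allowed. Sizes and names are illustrative assumptions.
    """

    def __init__(self, dim_in, hidden, seed=1):
        rng = np.random.default_rng(seed)
        # Affine passthrough weights for the raw input at every layer.
        self.W_u = [rng.normal(size=(hidden, dim_in)),
                    rng.normal(size=(hidden, dim_in)),
                    rng.normal(size=(1, dim_in))]
        # Raw hidden-to-hidden parameters; softplus keeps them non-negative.
        self.W_z_raw = [rng.normal(size=(hidden, hidden)),
                        rng.normal(size=(1, hidden))]
        self.b = [np.zeros(hidden), np.zeros(hidden), np.zeros(1)]

    def __call__(self, u):
        z = softplus(self.W_u[0] @ u + self.b[0])
        z = softplus(softplus(self.W_z_raw[0]) @ z + self.W_u[1] @ u + self.b[1])
        # Final layer: non-negative combination of z plus affine term in u.
        return softplus(self.W_z_raw[1]) @ z + self.W_u[2] @ u + self.b[2]

net = InputConvexNet(dim_in=4, hidden=8)
print(net(np.array([0.5, -0.2, 0.1, 0.3])))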
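    The second goal, compressing networks by low-rank factorization during training rather than after it, can likewise be illustrated compactly. The sketch below trains a dense layer directly in factorized form W ≈ UV, so the full weight matrix is never materialized; the plain-SGD update is a stand-in assumption, as the dissertation's method builds on dynamical low-rank techniques from kinetic theory and treats the factor updates far more carefully.

```python
import numpy as np

class LowRankLinear:
    """A dense layer stored as W ≈ U @ V with rank r << min(m, n).

    Training updates U and V directly, so the full m*n weight matrix is
    never formed; the parameter count drops from m*n to r*(m + n).
    A plain-SGD sketch, not the dissertation's dynamical low-rank scheme.
    """

    def __init__(self, m, n, r, seed=2):
        rng = np.random.default_rng(seed)
        self.U = rng.normal(scale=1 / np.sqrt(r), size=(m, r))
        self.V = rng.normal(scale=1 / np.sqrt(n), size=(r, n))

    def forward(self, x):
        # Cache intermediates needed for the backward pass.
        self.x, self.Vx = x, self.V @ x
        return self.U @ self.Vx

    def backward(self, grad_out, lr=1e-2):
        # Gradients of the loss w.r.t. U and V, given dL/d(output).
        grad_U = np.outer(grad_out, self.Vx)
        grad_V = np.outer(self.U.T @ grad_out, self.x)
        self.U -= lr * grad_U
        self.V -= lr * grad_V
        return self.V.T @ (self.U.T @ grad_out)  # dL/dx for earlier layers

# A 512x512 layer at rank 16 keeps r*(m+n)/(m*n) ≈ 6% of the parameters.
layer = LowRankLinear(512, 512, 16)
y = layer.forward(np.random.default_rng(3).normal(size=512))
```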

    Single-Input Signature Register-Based Time Delay Reservoir

    Machine learning continues to play a critical role in our society. The ability to automatically identify intricate relationships in large volumes of data has proven incredibly useful for problems such as automatic speech recognition and image processing. In particular, neural networks have become increasingly popular in a wide range of application domains, given their ability to solve complex problems and process high-dimensional data. However, the impressive performance of state-of-the-art neural networks comes at the cost of large area and power consumption for the computational resources used in training and inference. As a result, a growing area of research concerns hardware implementations of neural networks. This work proposes a hardware-friendly design for a time-delay reservoir (TDR), a type of recurrent neural network. TDRs represent one class of reservoir computing topologies, which employ random spatio-temporal feature extraction from time series data to produce a linearly separable set of features. Reservoir computing topologies differ from traditional recurrent neural networks in that their recurrent weights are fixed, and only the feedforward output weights need to be trained, usually with linear regression. Previous work on TDRs includes photonic, software, and both digital and analog electronic implementations. This work adds to that body of research by exploring the design space of a novel TDR based on single-input signature registers (SISRs), common digital circuits used for built-in self-test. The work is motivated by the structural similarity (a delayed feedback loop) between TDRs and SISRs, and by the possibility of dual-purposing SISRs for conventional testing as well as machine learning within a single chip. The proposed designs can perform classification on multivariate datasets and outperform a traditional TDR with quantized reservoir states on parity-check, MNIST classification, and temperature prediction tasks. Classification accuracies of up to 100% were observed for some SISR configurations on the parity-check task, and accuracies of up to 85% were observed for MNIST classification. We also observe overfitting on a temperature prediction task with longer data sequences and provide analyses of the results based on the reservoir dynamics, as measured by the rate of divergence between SISR states and by the SISR period.
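    An SISR is essentially a linear-feedback shift register that XORs a serial input into its feedback path, which is what makes it usable as a delayed-feedback reservoir. The toy Python simulation below shows the idea: the register states serve as reservoir features, and a ridge-regression readout is trained on them for a small sliding-window parity task. The register length, tap positions, and task are illustrative assumptions, not the configurations studied in this work.

```python
import numpy as np

def sisr_states(bits, n_cells=16, taps=(0, 2, 3, 5)):
    """Run a single-input signature register over a bit stream and
    collect its state after each shift.

    The feedback bit is the XOR of the tapped cells and the incoming
    input bit, as in built-in self-test signature analysis. The taps
    here define an arbitrary illustrative feedback polynomial.
    """
    state = np.zeros(n_cells, dtype=int)
    states = []
    for b in bits:
        fb = (int(b) + state[list(taps)].sum()) % 2  # XOR reduction
        state = np.roll(state, 1)                    # shift register
        state[0] = fb                                # insert feedback bit
        states.append(state.copy())
    return np.array(states)

# Reservoir-computing readout: ridge regression from SISR states to a
# causal 3-bit sliding-window parity target over a random bit stream.
rng = np.random.default_rng(4)
bits = rng.integers(0, 2, size=2000)
targets = np.convolve(bits, np.ones(3, dtype=int))[:len(bits)] % 2
X = sisr_states(bits)
w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ targets)
pred = (X @ w > 0.5).astype(int)
print("train accuracy:", (pred == targets).mean())
```

    As in any reservoir computer, only the readout weights w are trained; the register's feedback structure stays fixed, which is what allows the same SISR hardware to keep serving its original self-test role.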