
    Wasserstein Introspective Neural Networks

    We present Wasserstein introspective neural networks (WINN), which are both a generator and a discriminator within a single model. WINN provides a significant improvement over the recent introspective neural networks (INN) method by enhancing INN's generative modeling capability. WINN has three interesting properties: (1) a mathematical connection is made between the formulation of the INN algorithm and that of Wasserstein generative adversarial networks (WGAN); (2) the explicit adoption of the Wasserstein distance into INN results in a large enhancement to INN, achieving compelling results even with a single classifier --- e.g., providing nearly a 20-times reduction in model size over INN for unsupervised generative modeling; (3) when applied to supervised classification, WINN also gives rise to improved robustness against adversarial examples in terms of error reduction. In the experiments, we report encouraging results on unsupervised learning problems including texture, face, and object modeling, as well as a supervised classification task against adversarial attacks. Comment: Accepted to CVPR 2018 (Oral).
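    As a hedged illustration of the WGAN formulation the abstract connects INN to (not the paper's actual WINN training loop), the Wasserstein critic objective can be sketched in PyTorch as follows; critic, real, and fake are placeholder names:

        import torch

        def wgan_critic_loss(critic, real, fake):
            # The critic is trained to maximize E[f(real)] - E[f(fake)],
            # an estimate of the Wasserstein-1 distance under a Lipschitz
            # constraint (weight clipping or a gradient penalty, omitted
            # here for brevity). Returning the negation lets a standard
            # minimizer be used.
            return critic(fake).mean() - critic(real).mean()

    WINN, as the abstract notes, folds generator and discriminator into a single introspective classifier rather than keeping them as separate networks.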

    STARE: Spatio-Temporal Attention Relocation for multiple structured activities detection

    We present the spatio-temporal attention relocation (STARE) method, an information-theoretic approach for efficient detection of simultaneously occurring structured activities. Given multiple human activities in a scene, our method dynamically focuses on the currently most informative activity. Each activity can be detected without complete observation, as the structure of sequential actions plays an important role in making the system robust to unattended observations. For such systems, the ability to decide where and when to focus is crucial to achieving high detection performance under resource-bounded conditions. Our main contributions can be summarized as follows: 1) an information-theoretic dynamic attention relocation framework that allows efficient detection of multiple activities by exploiting activity structure information, and 2) a new high-resolution dataset of temporally structured concurrent activities. Our experiments show that the STARE method performs efficiently while maintaining a reasonable level of accuracy.
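    The abstract does not give the exact attention criterion, so the following Python sketch only illustrates the general idea with a crude proxy: attend to the activity whose current belief over action labels is most uncertain. All names are hypothetical; the actual STARE criterion is an information-theoretic measure defined in the paper:

        import numpy as np

        def entropy(p):
            p = np.clip(np.asarray(p, dtype=float), 1e-12, None)
            p = p / p.sum()
            return float(-np.sum(p * np.log(p)))

        def relocate_attention(beliefs):
            # Pick the activity whose posterior over action labels is most
            # uncertain -- a stand-in for "most informative to observe next".
            return int(np.argmax([entropy(b) for b in beliefs]))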

    Inference And Learning In Spiking Neural Networks For Neuromorphic Systems

    Neuromorphic computing is a computing field that takes inspiration from the biological and physical characteristics of the neocortex to motivate a new paradigm of highly parallel and distributed computing, aimed at the ever-increasing scale and computational complexity of machine intelligence, especially in energy-limited systems such as edge devices, the Internet of Things (IoT), and cyber-physical systems (CPS). The spiking neural network (SNN) is often studied together with neuromorphic computing as the underlying computational model. Like the biological neural system, an SNN is an inherently dynamic and stateful network: its state and output depend not only on the current input but also on past history. Another distinct property of SNNs is that information is represented, transmitted, and processed as discrete spike events, also referred to as action potentials. All processing happens in the neurons, so the computation itself is massively distributed and parallel, enabling low-power information transmission and processing. However, it is inefficient to implement SNNs on a traditional von Neumann architecture due to the performance gap between memory and processor. This has led to the advent of energy-efficient large-scale neuromorphic hardware, such as IBM's TrueNorth and Intel's Loihi, that enables low-power implementation of large-scale neural networks for real-time applications.

    Although spiking networks have theoretically been shown to have Turing-equivalent computing power, training deep SNNs remains a challenge: the threshold functions that generate spikes are discontinuous, so they have no derivatives and cannot directly use gradient-based optimization algorithms for training. The biologically plausible learning mechanism spike-timing-dependent plasticity (STDP) and its variants are local in synapse and time, but they are unstable during training and make multi-layer SNNs difficult to train. To better exploit the energy-saving features of SNNs in neuromorphic hardware, such as spike-domain representation and stochastic computing, and to address hardware limitations such as limited data precision and neuron fan-in/fan-out constraints, the neural network, including its structure and computation, must be redesigned. Our work focuses on low-level (activations, weights) and high-level (alternative learning algorithms) redesign techniques to enable inference and learning with SNNs in neuromorphic hardware.

    First, we focused on transforming a trained artificial neural network (ANN) into a form suitable for neuromorphic hardware implementation. Here, we tackle the Long Short-Term Memory (LSTM), a version of the recurrent neural network (RNN) whose recurrent connectivity enables learning of long temporal patterns. This is a particularly difficult challenge because of the inherent natures of RNNs and SNNs: the recurrent connectivity in RNNs induces temporal dynamics that require synchronicity, especially with the added complexity of LSTMs, while SNNs are asynchronous in nature. In addition, the constraints of the neuromorphic hardware posed a massive challenge for this realization. We therefore invented a store-and-release circuit using integrate-and-fire neurons that provides the needed synchronization, and developed modules based on that circuit to replicate the various parts of the LSTM. These modules enabled the implementation of LSTMs with spiking neurons on IBM's TrueNorth Neurosynaptic processor. This is the first work to realize such LSTM networks with spiking neurons and to implement them on neuromorphic hardware, opening avenues for the use of neuromorphic hardware in applications involving temporal patterns.

    Moving beyond mapping a pretrained ANN, we next work on training networks on the neuromorphic hardware itself. We first looked at the biologically plausible learning algorithm STDP, a Hebbian learning rule for learning without supervision. Simplified computational interpretations of STDP are either unstable or so complex that they are costly to implement in hardware. We therefore proposed a stable version of STDP with intentional approximations for low-cost hardware implementation, called the Quantized 2-Power Shift (Q2PS) rule. With this rule, we performed both unsupervised learning for feature extraction and supervised learning for classification in a multi-layer SNN, achieving comparable or better accuracy on the MNIST dataset than manually labelled two-layer networks.

    Next, we approached training multi-layer SNNs on neuromorphic hardware with backpropagation, the gradient-based optimization algorithm that forms the backbone of deep neural networks (DNNs). Although STDP is biologically plausible, it is not as robust for learning deep networks as backpropagation is for DNNs. Backpropagation, however, is not biologically plausible, is not suitable for direct application to SNNs, and cannot be implemented on neuromorphic hardware as-is. Thus, in the first part of this work, we devise a set of approximations to transform backpropagation into the spike domain so that it is suitable for SNNs. After these approximations, we adapted the connectivity and weight-update rule of backpropagation so that learning relies solely on locally available information, resembling a rate-based STDP algorithm. We call this Error-Modulated STDP (EMSTDP). In the final part of this work, we implemented EMSTDP on Intel's Loihi neuromorphic chip to realize online, in-hardware supervised learning of deep SNNs. This is the first realization of a fully spike-based approximation of the backpropagation algorithm implemented on a neuromorphic processor, and a first step towards building an autonomous machine that learns continuously from its environment and experiences.
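    The abstract describes STDP only at a high level. As a point of reference, the classic pair-based STDP rule that the thesis's Q2PS rule stabilizes and approximates can be sketched as follows; the parameter values are illustrative textbook defaults, not the thesis's, and the Q2PS rule itself is not specified in the abstract:

        import numpy as np

        def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012,
                        tau_plus=20.0, tau_minus=20.0):
            # Pair-based STDP: a presynaptic spike shortly before a
            # postsynaptic spike potentiates the synapse (LTP); the
            # reverse order depresses it (LTD). Spike times are in ms.
            dt = t_post - t_pre
            if dt > 0:
                w = w + a_plus * np.exp(-dt / tau_plus)
            else:
                w = w - a_minus * np.exp(dt / tau_minus)
            return float(np.clip(w, 0.0, 1.0))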

    Firmware Development and Integration for ALICE TPC and PHOS Front-end Electronics: A Trigger Based Readout and Control System operating in a Radiation Environment

    The readout electronics in PHOS and TPC - two of the major detectors of the ALICE experiment at the LHC - consist of a set of Front End Cards (FECs) that digitize, process, and buffer the data from the detector sensors. The FECs are connected to a Readout Control Unit (RCU) via two sets of custom-made PCB backplanes. For PHOS, 28 FECs are connected to one RCU, while for TPC the number varies from 18 to 25 FECs depending on location. The RCU is in charge of the data readout, including reception and distribution of triggers and moving the data from the FECs to the Data Acquisition System. In addition, it performs low-level control tasks. The RCU consists of an RCU Motherboard that hosts a Detector Control System (DCS) board and a Source Interface Unit. The DCS board is an embedded computer running Linux that controls the readout electronics. All the mentioned devices are implemented in commercial-grade SRAM-based Field Programmable Gate Arrays (FPGAs). Even though these devices are not very radiation tolerant, they were chosen for their cost and flexibility, and most importantly for the possibility of easily upgrading the electronics in the future. Since physical shielding of the electronics is not possible in ALICE due to the architecture of the detector, radiation-related errors need to be handled with other techniques, such as firmware mitigation. The main objective of this thesis has been to design firmware modules for the FPGAs residing in different parts of the readout electronics. Because of the flexibility of the designs, some of them have, with minor adaptations, been applied in different devices surrounding the readout electronics. Additionally, effort has been put into testing and integration of the system. In detail, the work presented in this thesis can be summarized as follows:
    - Firmware design for radiation environments. All firmware modules designed here are to be used in a radiation environment, so special precautions need to be taken. Additionally, a state-of-the-art solution has been designed for protecting the main FPGA on the RCU Motherboard against radiation-induced functional failures.
    - Implementation of Trigger Handling for the TPC/PHOS Readout Electronics. The triggers are received from the global trigger system via an optical link and are handled by an Application Specific Integrated Circuit (ASIC) on the DCS board. Because the DCS board might have occasional downtime due to radiation-related errors, a special interface module was designed for the main FPGA on the RCU Motherboard. This module decodes and verifies the information received from the trigger system. As it is a generic design, it has also been implemented as part of the BusyBox, an important device in the trigger path of the TPC and PHOS sub-detectors.
    - Testing and Verification of all firmware modules. All firmware modules have been extensively verified with computer simulation before being tested in real hardware.
    - Maintenance of the DCS board for TPC/PHOS and of the different Fee firmware modules in general.
    - System Integration and System Level Tests. A significant contribution has been made in integrating and testing all the modules and sub-systems, both locally on the RCU and the BusyBox, and in making all the devices work together on a larger scale.
    As the presented electronics are located in a radiation environment and are physically inaccessible after commissioning, effort has been put into making designs that are reliable, scalable, and upgradeable. This has been ensured by following a systematic design approach where testability, version management, and documentation are key elements. Parts of the work described in this thesis have been published and presented in international peer-reviewed publications and conferences.
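    The abstract does not detail the thesis's mitigation scheme, but a standard firmware technique against radiation-induced bit flips in SRAM-based FPGAs is triple modular redundancy (TMR). A minimal Python sketch of the voting idea follows (the real implementation would be HDL logic, not software):

        def tmr_vote(a: int, b: int, c: int) -> int:
            # Bitwise majority vote over three redundant copies of a value:
            # any single upset copy is outvoted by the two intact ones.
            return (a & b) | (b & c) | (a & c)

        # One copy has a flipped bit; the vote recovers the original value.
        assert tmr_vote(0b1011, 0b1111, 0b1011) == 0b1011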

    In Vitro Studies of Neuronal Networks and Synaptic Plasticity in Invertebrates and in Mammals Using Multielectrode Arrays

    Brain functions strictly depend on neural connections formed during development and modified throughout life. The cellular and molecular mechanisms underlying synaptogenesis and the plastic changes involved in learning and memory have been analyzed in detail in simple animals such as invertebrates, and in circuits of mammalian brains, mainly by intracellular recordings of neuronal activity. In recent decades, techniques such as microelectrode arrays (MEAs), which allow simultaneous, long-lasting, noninvasive, extracellular recordings from a large number of neurons, have proven very useful for studying long-term processes in neuronal networks in vivo and in vitro. In this work, we start by briefly reviewing microelectrode array technology and the optimization of the coupling between neurons and microtransducers to detect subthreshold synaptic signals. We then report MEA studies of circuit formation and activity in invertebrate models such as Lymnaea, Aplysia, and Helix. In the following sections, we analyze plasticity and connectivity in cultures of dissociated mammalian neurons, focusing on spontaneous activity and electrical stimulation. We conclude by discussing plasticity in closed-loop experiments.

    Learning meaning representations for text generation with deep generative models

    This thesis explores conditioning a language generation model on auxiliary variables. By doing so, we hope to gain better control over the output of the language generator. We explore several kinds of auxiliary variables, from unstructured continuous, to discrete, to structured discrete variables, and evaluate their advantages and disadvantages. We consider three primary axes of variation: how interpretable the auxiliary variables are, how much control they provide over the generated text, and whether the variables can be induced from unlabelled data. The latter consideration is particularly interesting: if we can show that induced latent variables correspond to the semantics of the generated utterance, then by manipulating the variables we gain fine-grained control over the meaning of the generated utterance, thereby learning simple meaning representations for text generation. We investigate three language generation tasks: open-domain conversational response generation, sentence generation from a semantic topic, and generating surface-form realisations of meaning representations. We use a different type of auxiliary variable for each task, describe the reasons for choosing that type of variable, and critically discuss how much the task benefited from an auxiliary-variable decomposition. All of the models we use combine a high-level graphical model with a neural language model text generator. The graphical model lets us specify the structure of the text-generating process, while the neural text generator can learn to generate fluent text from a large corpus of examples. We aim to show the utility of such deep generative models of text for text generation in the work that follows.
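    As a hedged illustration of the architecture described above (a high-level auxiliary variable conditioning a neural text generator), a minimal PyTorch sketch might look like the following; all module and variable names are placeholders, not the thesis's actual models:

        import torch
        import torch.nn as nn

        class LatentConditionedGenerator(nn.Module):
            # Toy decoder: an auxiliary latent vector z conditions every
            # step of an RNN language model, loosely matching the
            # "graphical model + neural text generator" combination.
            def __init__(self, vocab_size, emb_dim=64, z_dim=16, hid_dim=128):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, emb_dim)
                self.rnn = nn.GRU(emb_dim + z_dim, hid_dim, batch_first=True)
                self.out = nn.Linear(hid_dim, vocab_size)

            def forward(self, tokens, z):
                # tokens: (batch, seq) token ids; z: (batch, z_dim), e.g.
                # sampled from a prior or inferred from unlabelled data.
                emb = self.embed(tokens)
                z_rep = z.unsqueeze(1).expand(-1, emb.size(1), -1)
                h, _ = self.rnn(torch.cat([emb, z_rep], dim=-1))
                return self.out(h)  # next-token logits

    Manipulating z at generation time is then the handle for controlling the output, which is the control axis the thesis evaluates.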