
    Moving Learning Machine Towards Fast Real-Time Applications: A High-Speed FPGA-based Implementation of the OS-ELM Training Algorithm

    Several emerging online learning applications process data streams in real time. The On-line Sequential Extreme Learning Machine (OS-ELM) has been used successfully in real-time condition prediction applications because it combines good generalization performance with extremely fast learning, but the number of trainings per second (training frequency) achieved in these continuous learning applications still needs to be improved. This paper proposes a performance-optimized implementation of the OS-ELM training algorithm for real-time applications. In this setting, the natural way to feed the training of the neural network is one-by-one, i.e., training the network on each new incoming training input vector. Under this restriction, the computational needs are drastically reduced. An FPGA-based implementation of the tailored OS-ELM algorithm is used to analyze, in a parameterized way, the level of optimization achieved. We observed that the tailored algorithm drastically reduces the number of clock cycles consumed by the training execution, to as little as approximately 1%. This performance enables high-speed sequential training rates, such as a sequential training frequency of 14 kHz for an SLFN with 40 hidden neurons, or 180 Hz for an SLFN with 500 hidden neurons. In practice, the proposed implementation computes the training almost 100 times faster, or more, than other implementations reported in the literature. In addition, the clock-cycle count follows quadratic complexity, O(N^2), where N is the number of hidden neurons, and is only weakly influenced by the number of input neurons. However, the implementation shows a pronounced sensitivity to data type precision even for small problems, which forces the use of double-precision floating-point data types to avoid finite-precision arithmetic effects. Distributed memory was also found to be the limiting resource, so current FPGA devices can support OS-ELM-based on-chip learning with up to 500 hidden neurons. In conclusion, the proposed hardware implementation of the OS-ELM offers great possibilities for on-chip learning in portable systems and real-time applications where frequent and fast training is required.
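    To illustrate why one-by-one training reduces the computation, here is a minimal software sketch in Python with NumPy. It is not the FPGA design from the paper. It assumes the standard recursive least-squares form of the OS-ELM update, a sigmoid hidden layer, and a regularized start (P = I/reg, beta = 0) in place of the usual initial batch phase. The class name `OneByOneOSELM` and the parameter `reg` are hypothetical. With a single sample per update, the matrix inversion of the chunked update collapses to one scalar division, and the remaining work is dominated by O(N^2) operations on the N x N matrix P.

```python
import numpy as np


class OneByOneOSELM:
    """Illustrative OS-ELM trained on one sample at a time (assumed RLS form)."""

    def __init__(self, n_inputs, n_hidden, n_outputs, reg=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Random, fixed input weights and biases (never trained in an ELM).
        self.W = rng.uniform(-1.0, 1.0, size=(n_hidden, n_inputs))
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)
        # Assumption: regularized start instead of an initial batch phase.
        self.P = np.eye(n_hidden) / reg
        self.beta = np.zeros((n_hidden, n_outputs))

    def hidden(self, x):
        # Sigmoid activation of the hidden layer for one input vector.
        return 1.0 / (1.0 + np.exp(-(self.W @ x + self.b)))

    def train_one(self, x, t):
        h = self.hidden(x)                   # shape (N,)
        Ph = self.P @ h                      # O(N^2) matrix-vector product
        denom = 1.0 + h @ Ph                 # scalar: replaces a matrix inverse
        err = t - h @ self.beta              # prediction error before the update
        self.P -= np.outer(Ph, Ph) / denom   # O(N^2) rank-one update (P is symmetric)
        # The updated P applied to h equals Ph / denom, so no second product is needed.
        self.beta += np.outer(Ph / denom, err)

    def predict(self, x):
        return self.hidden(x) @ self.beta


# Usage: stream samples through the model one at a time.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
T = rng.normal(size=(50, 2))
model = OneByOneOSELM(n_inputs=3, n_hidden=6, n_outputs=2, reg=0.1)
for x, t in zip(X, T):
    model.train_one(x, t)
```

    Under these assumptions, the streamed result matches the batch ridge-regression solution on the same hidden-layer outputs, which gives a simple way to check the recursion in software.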

    Analogue neuromorphic systems.

    This thesis addresses a new area of science and technology, that of neuromorphic systems, and specifically the problems and prospects of analogue neuromorphic systems. The subject is divided into three chapters. Chapter 1 is an introduction. It formulates the emerging problem of creating highly computationally costly systems for nonlinear information processing (such as artificial neural networks and artificial intelligence systems). It shows that analogue technology could make a vital contribution to the creation of such systems. The basic principles for creating analogue neuromorphic systems are formulated, and the importance of the principle of orthogonality for future highly efficient complex information processing systems is emphasised. Chapter 2 reviews the basics of neural and neuromorphic systems and describes the present state of this field of research, including both the experimental and the theoretical knowledge gained to date. The chapter provides the background needed to interpret the results reported in Chapter 3 correctly and to make a realistic decision on the direction of future work. Chapter 3 describes my own experimental and computational results within the framework of the subject, obtained at De Montfort University. These include the building of (i) an analogue polynomial approximator/interpolator/extrapolator, (ii) a synthesiser of orthogonal functions, (iii) an analogue real-time video filter (performing homomorphic filtration), (iv) an adaptive polynomial compensator of geometrical distortions of CRT monitors, and (v) an analogue parallel-learning neural network (backpropagation algorithm). Thus, this thesis makes a dual contribution to the chosen field: it summarises the present knowledge on the possibility of utilising analogue technology in current and future computational systems, and it reports new results within the framework of the subject. The main conclusion is that, owing to their promising power characteristics, small size and high tolerance to degradation, analogue neuromorphic systems will play an increasingly important role in future computational systems (in particular in systems of artificial intelligence).

    FEEDFORWARD ARTIFICIAL NEURAL NETWORK DESIGN UTILISING SUBTHRESHOLD MODE CMOS DEVICES

    This thesis reviews various previously reported techniques for simulating artificial neural networks and investigates the design of fully-connected feedforward networks based on MOS transistors operating in the subthreshold mode of conduction, as they are suitable for implementing compact, low-power, implantable pattern recognition systems. The principal objective is to demonstrate that the transfer characteristic of the devices can be fully exploited to design basic processing modules which overcome the problems of linearity range, weight resolution, processing speed, noise and component mismatch associated with weak inversion conduction, and so can be used to implement networks that can be trained to perform practical tasks. A new four-quadrant analogue multiplier, one of the most important cells in the design of artificial neural networks, is developed. Analytical as well as simulation results suggest that the new scheme can efficiently be used to emulate both the synaptic and thresholding functions. To complement this thresholding-synapse, a novel current-to-voltage converter is also introduced. The characteristics of the well-known sample-and-hold circuit as a weight memory scheme are analytically derived, and simulation results suggest that a dummy-compensated technique is needed to obtain the required minimum of 8 bits of weight resolution. The performance of the combined load and thresholding-synapse arrangement, as well as an on-chip update/refresh mechanism, is analytically evaluated, and simulation studies on the Exclusive OR network as a benchmark problem are provided and indicate a useful level of functionality. Experimental results on the Exclusive OR network and a 'QRS' complex detector based on a 10:6:3 multilayer perceptron are also presented and demonstrate the potential of the proposed design techniques in emulating feedforward neural networks.

    Neural network inference in digital signal processing under hard real-time constraints (Neuroverkon inferenssi digitaalisessa signaalikäsittelyssä kovien reaaliaikavaatimusten alaisuudessa)

    The main objective of this thesis is to investigate how neural network inference can be implemented efficiently, in terms of execution speed, on a digital signal processor under hard real-time constraints. The theory of digital signal processors, software optimization and neural networks is discussed. A neural network model is designed for a specific use case, and a digital signal processor implementation is created from that model. The model is built from data produced by a Matlab simulation model, and it is trained and validated in the Python programming language with the Keras package. The model is implemented on the CEVA-XC4500 digital signal processor. The implementation is written in C++ with processor-specific vector-processing intrinsics. The neural network model is evaluated on its accuracy, precision, recall and F1-score. Its performance is compared to the conventional use case implementation by calculating the 3GPP-specified metrics of misdetection probability, false alarm rate and bit error rate. The execution speed of the digital signal processor implementation is evaluated with the profiling tool of the CEVA integrated development environment and with the Lauterbach PowerTrace profiling module attached to a real base station product. As a result of this thesis, an optimized CEVA-XC4500 digital signal processor implementation was created for the specific neural network architecture. The optimized implementation consumed 88 percent fewer cycles than the conventional implementation. The neural network model performance also fulfills the 3GPP specification requirements.
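    The classification metrics named in this abstract (accuracy, precision, recall and F1-score) can be computed from binary labels as in the minimal Python sketch below. This is a generic illustration with made-up labels, not the thesis's evaluation code or data, and it does not implement the 3GPP-specified metrics. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Usage with illustrative labels only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```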

    Research reports: 1990 NASA/ASEE Summer Faculty Fellowship Program

    Reports on the research projects performed under the NASA/ASEE Summer Faculty Fellowship Program are presented. The program was conducted by The University of Alabama and MSFC during the period from June 4, 1990 through August 10, 1990. Some of the topics covered include: (1) Space Shuttles; (2) Space Station Freedom; (3) information systems; (4) materials and processes; (5) Space Shuttle main engine; (6) aerospace sciences; (7) mathematical models; (8) mission operations; (9) systems analysis and integration; (10) systems control; (11) structures and dynamics; (12) aerospace safety; and (13) remote sensing.

    An investigation into adaptive power reduction techniques for neural hardware

    In light of the growing applicability of Artificial Neural Networks (ANNs) in the signal processing field [1] and the present thrust of the semiconductor industry towards low-power SoCs for mobile devices [2], the power consumption of ANN hardware has become a very important implementation issue. Adaptability is a powerful and useful feature of neural networks. All current approaches to low-power ANN hardware are 'non-adaptive' with respect to the power consumption of the network (i.e. power reduction is not an objective of the adaptation/learning process). The research presented in this thesis investigates possible adaptive power reduction techniques, which attempt to exploit the adaptability of neural networks in order to reduce power consumption. Three separate approaches to such adaptive power reduction are proposed: adaptation of size, adaptation of network weights and adaptation of calculation precision. Initial case studies exhibit promising results with significant power reduction.

    Computational aspects of parvalbumin-positive interneuron function

    The activity of neurons depends on the manner in which they process synaptic inputs from other cells. In the event of clustered synaptic input, neurons can respond in a nonlinear manner through synaptic and dendritic mechanisms. Such mechanisms are well established in principal excitatory neurons throughout the brain, where they increase neuronal computational ability and information storage capacity. In contrast, for parvalbumin-positive (PV+) interneurons, the most common cortical class of inhibitory interneuron, synaptic integration is thought to be either linear or sub-linear in nature, facilitating their role as mediators of precise and fast inhibition. This thesis addresses situations in which PV+ interneurons integrate synaptic inputs in a nonlinear manner, and explores the functions of this synaptic processing. First, I describe a form of cooperative supralinear synaptic integration by local excitatory inputs onto PV+ interneurons, and I extend these results to show how this augments the computational capability of PV+ cells within spiking neuron networks. I also explore the importance of polyamine modulation of synaptic receptors in mediating sublinear synaptic integration, and discuss how this expands the array of mechanisms known to perform similar functions in PV+ cells. Finally, I present work manipulating PV+ cells experimentally during epilepsy. I consider these findings together with recent scientific advances and suggest how they account for a number of open questions and previously contradictory theories of PV+ interneuron function.