1,403 research outputs found
Moving Learning Machine Towards Fast Real-Time Applications: A High-Speed FPGA-based Implementation of the OS-ELM Training Algorithm
Several emerging online learning applications handle data streams in real time. The On-line Sequential Extreme Learning Machine (OS-ELM) has been used successfully in real-time condition prediction applications because of its good generalization performance at an extremely fast learning speed, but the number of trainings per second (training frequency) achieved in these continuous learning applications has to be further increased. This paper proposes a performance-optimized implementation of the OS-ELM training algorithm for real-time applications. In this case, the natural way of feeding the training of the neural network is one-by-one, i.e., training the network on each new incoming input vector. Under this restriction, the computational needs are drastically reduced. An FPGA-based implementation of the tailored OS-ELM algorithm is used to analyze, in a parameterized way, the level of optimization achieved. We observed that the tailored algorithm drastically reduces the number of clock cycles consumed by the training execution, to approximately 1%. This performance enables high-speed sequential training rates, such as a sequential training frequency of 14 kHz for an SLFN with 40 hidden neurons, or 180 Hz for an SLFN with 500 hidden neurons. In practice, the proposed implementation computes the training almost 100 times faster, or more, than other implementations reported in the literature. Moreover, the clock-cycle count follows a quadratic complexity O(N^2), with N the number of hidden neurons, and is only weakly influenced by the number of input neurons. However, the implementation shows a pronounced sensitivity to data type precision even for small-size problems, which forces the use of double-precision floating-point data types to avoid finite-precision arithmetic effects.
In addition, distributed memory was found to be the limiting resource; thus, it can be stated that current FPGA devices can support OS-ELM-based on-chip learning of up to 500 hidden neurons. In conclusion, the proposed hardware implementation of the OS-ELM offers great possibilities for on-chip learning in portable systems and real-time applications where frequent and fast training is required.
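The one-by-one training restriction described in this abstract can be illustrated in software. The following is a minimal sketch, not the authors' FPGA design: it assumes a single output neuron, a sigmoid hidden layer, and a regularised starting matrix P (for example a large multiple of the identity). The function names `hidden_output` and `oselm_update` are hypothetical. With one sample per step, the matrix inverse of the general OS-ELM update collapses to a scalar division, and the remaining work is dominated by N x N operations, which is consistent with the quadratic cycle count the abstract reports.

```python
import math

def hidden_output(x, W, b):
    """Sigmoid hidden-layer output h for one input vector x.

    W is a list of N weight rows, b a list of N biases (both fixed at random
    in an ELM; how they are drawn is left to the caller in this sketch).
    """
    return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + bi)))
            for w, bi in zip(W, b)]

def oselm_update(P, beta, h, t):
    """One-sample OS-ELM step (a rank-one recursive least-squares update).

    P    : symmetric N x N matrix as a list of lists, updated in place
    beta : list of N output weights (single output), updated in place
    h    : hidden-layer output for the new sample, length N
    t    : scalar target for the new sample
    """
    n = len(h)
    # P h costs O(N^2) multiply-accumulates.
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
    # With a single sample the "matrix inverse" is just this scalar.
    denom = 1.0 + sum(h[i] * Ph[i] for i in range(n))
    # Prediction error before the update.
    err = t - sum(h[i] * beta[i] for i in range(n))
    # P <- P - (P h)(P h)^T / denom, valid because P is symmetric.
    for i in range(n):
        for j in range(n):
            P[i][j] -= Ph[i] * Ph[j] / denom
    # The updated P times h equals Ph / denom, so beta moves along that direction.
    for i in range(n):
        beta[i] += Ph[i] / denom * err
```

In this sketch the number of input neurons only affects `hidden_output`, while the update itself depends on N alone, which matches the abstract's remark that the cycle count is only weakly influenced by the input size.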
Analogue neuromorphic systems.
This thesis addresses a new area of science and technology, that of neuromorphic
systems, namely the problems and prospects of analogue neuromorphic systems. The
subject is subdivided into three chapters.
Chapter 1 is an introduction. It formulates the oncoming problem of the creation
of highly computationally costly systems of nonlinear information processing (such as
artificial neural networks and artificial intelligence systems). It shows that an analogue
technology could make a vital contribution to the creation of such systems. The basic principles
of the creation of analogue neuromorphic systems are formulated. The importance
of the principle of orthogonality for future highly efficient complex
information processing systems is emphasised.
Chapter 2 reviews the basics of neural and neuromorphic systems and reports on
the present state of this field of research, including both the experimental and theoretical
knowledge gained to date. The chapter provides the necessary background for
correct interpretation of the results reported in Chapter 3 and for a realistic decision on
the direction for future work.
Chapter 3 describes my own experimental and computational results within the
framework of the subject, obtained at De Montfort University. These include: the
building of (i) Analogue Polynomial Approximator/Interpolator/Extrapolator, (ii) Synthesiser
of orthogonal functions, (iii) analogue real-time video filter (performing the
homomorphic filtration), (iv) Adaptive polynomial compensator of geometrical distortions
of CRT monitors, (v) analogue parallel-learning neural network (backpropagation
algorithm).
Thus, this thesis makes a dual contribution to the chosen field: it summarises the
present knowledge on the possibility of utilising analogue technology in up-to-date and
future computational systems, and it reports new results within the framework of the
subject. The main conclusion is that, owing to their promising power characteristics, small
size and high tolerance to degradation, analogue neuromorphic systems will play an
increasingly important role in future computational systems (in particular in systems
of artificial intelligence).
FEEDFORWARD ARTIFICIAL NEURAL NETWORK DESIGN UTILISING SUBTHRESHOLD MODE CMOS DEVICES
This thesis reviews various previously reported techniques for simulating artificial
neural networks and investigates the design of fully-connected feedforward networks
based on MOS transistors operating in the subthreshold mode of conduction as they are
suitable for implementing compact, low-power, implantable pattern recognition systems.
The principal objective is to demonstrate that the transfer characteristic of the devices
can be fully exploited to design basic processing modules which overcome the problems
of linearity range, weight resolution, processing speed, noise and component mismatch
associated with weak inversion conduction, and so can be used to implement
networks which can be trained to perform practical tasks.
A new four-quadrant analogue multiplier, one of the most important cells in the
design of artificial neural networks, is developed. Analytical as well as simulation
results suggest that the new scheme can efficiently be used to emulate both the synaptic
and thresholding functions. To complement this thresholding-synapse, a novel
current-to-voltage converter is also introduced. The characteristics of the well-known
sample-and-hold circuit as a weight memory scheme are analytically derived, and
simulation results suggest that a dummy-compensated technique is needed to obtain the
required minimum of 8 bits of weight resolution. Performance of the combined load and
thresholding-synapse arrangement as well as an on-chip update/refresh mechanism are
analytically evaluated and simulation studies on the Exclusive OR network as a
benchmark problem are provided and indicate a useful level of functionality.
Experimental results on the Exclusive OR network and a 'QRS' complex detector
based on a 10:6:3 multilayer perceptron are also presented and demonstrate the potential
of the proposed design techniques in emulating feedforward neural networks.
Neural network inference in digital signal processing under hard real-time constraints
The main objective of this thesis is to investigate how neural network inference can be efficiently implemented on a digital signal processor under hard real-time constraints from the execution speed point of view. Theories on digital signal processors and software optimization as well as neural networks are discussed. A neural network model for the specific use case is designed and a digital signal processor implementation is created based on the neural network model.
A neural network model for the use case is created based on data from the Matlab simulation model. The neural network model is trained and validated using the Python programming language with the Keras package. The neural network model is implemented on the CEVA-XC4500 digital signal processor. The digital signal processor implementation is written in C++ with processor-specific vector-processing intrinsics. The neural network model is evaluated based on model accuracy, precision, recall and F1-score. The model performance is compared to the conventional use case implementation by calculating the 3GPP-specified metrics of misdetection probability, false alarm rate and bit error rate. The execution speed of the digital signal processor implementation is evaluated with the profiling tool of the CEVA integrated development environment and also with the Lauterbach PowerTrace profiling module attached to the real base station product.
As a result of this thesis, an optimized CEVA-XC4500 digital signal processor implementation was created for the specific neural network architecture. The optimized implementation consumed 88 percent fewer cycles than the conventional implementation. The neural network model performance also fulfills the 3GPP specification requirements.
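The classification metrics named in this abstract (accuracy, precision, recall and F1-score) follow from the four counts of a binary confusion matrix. The sketch below uses their standard definitions; it is not taken from the thesis, and the function name `classification_metrics` and its argument names are hypothetical. The 3GPP-specified metrics mentioned in the abstract are defined by the specification and are not reproduced here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp, fp, fn, tn: counts of true positives, false positives,
    false negatives and true negatives. Ratios with an empty
    denominator are reported as 0.0 in this sketch.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For example, `classification_metrics(8, 2, 2, 8)` evaluates a detector that produced 8 correct detections, 2 false detections, 2 misses and 8 correct rejections.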
The use of neural networks to help facilitate the accurate prediction of electricity demand on Crete
Research reports: 1990 NASA/ASEE Summer Faculty Fellowship Program
Reports on the research projects performed under the NASA/ASEE Summer Faculty Fellowship Program are presented. The program was conducted by The University of Alabama and MSFC during the period from June 4, 1990 through August 10, 1990. Some of the topics covered include: (1) Space Shuttles; (2) Space Station Freedom; (3) information systems; (4) materials and processes; (5) Space Shuttle main engine; (6) aerospace sciences; (7) mathematical models; (8) mission operations; (9) systems analysis and integration; (10) systems control; (11) structures and dynamics; (12) aerospace safety; and (13) remote sensing.
An investigation into adaptive power reduction techniques for neural hardware
In light of the growing applicability of Artificial Neural Networks (ANNs) in the signal processing field [1] and the present thrust of the semiconductor industry towards low-power SOCs for mobile devices [2], the power consumption of ANN hardware has become a very important implementation issue. Adaptability is a powerful and useful feature of neural networks. All current low-power ANN hardware techniques are ‘non-adaptive’ with respect to the power consumption of the network (i.e. power reduction is not an objective of the adaptation/learning process). The research presented in this thesis investigates possible adaptive power reduction techniques, which attempt to exploit the adaptability of neural networks in order to reduce power consumption. Three separate approaches to such adaptive power reduction are proposed: adaptation of size, adaptation of network weights and adaptation of calculation precision. Initial case studies show promising results with significant power reduction.
Computational aspects of parvalbumin-positive interneuron function
The activity of neurons depends on the manner in which they process synaptic inputs from other cells. In the event of clustered synaptic input, neurons can respond in a nonlinear manner through synaptic and dendritic mechanisms. Such mechanisms are well established in principal excitatory neurons throughout the brain, where they increase neuronal computational ability and information storage capacity. In contrast, for parvalbumin-positive (PV+) interneurons, the most common cortical class of inhibitory interneuron, synaptic integration is thought to be either linear or sub-linear in nature, facilitating their role as mediators of precise and fast inhibition. This thesis addresses situations in which PV+ interneurons integrate synaptic inputs in a nonlinear manner, and explores the functions of this synaptic processing. First, I describe a form of cooperative supralinear synaptic integration by local excitatory inputs onto PV+ interneurons, and I extend these results to show how this augments the computational capability of PV+ cells within spiking neuron networks. I also explore the importance of polyamine modulation of synaptic receptors in mediating sublinear synaptic integration, and discuss how this expands the array of mechanisms known to perform similar functions in PV+ cells. Finally, I present work manipulating PV+ cells experimentally during epilepsy. I consider these findings together with recent scientific advances and suggest how they account for a number of open questions and previously contradictory theories of PV+ interneuron function.