156 research outputs found

    Autonomously Reconfigurable Artificial Neural Network on a Chip

    The artificial neural network (ANN), an established bio-inspired computing paradigm, has proved very effective in a variety of real-world problems and particularly useful for various emerging biomedical applications using specialized ANN hardware. Unfortunately, as CMOS technology scaling continues, these ANN-based systems are increasingly vulnerable to both transient and permanent faults, which can sometimes be catastrophic. Their considerable resource and energy consumption and lack of dynamic adaptability make conventional fault-tolerant techniques unsuitable for future portable medical solutions. Inspired by the self-healing and self-recovery mechanisms of the human nervous system, this research addresses the reliability issues of ANN-based hardware by proposing an Autonomously Reconfigurable Artificial Neural Network (ARANN) architectural framework. Leveraging the homogeneous structural characteristics of neural networks, ARANN is capable of adapting its structure and operation, both algorithmically and microarchitecturally, to react to unexpected neuron failures. Specifically, we propose three key techniques --- Distributed ANN, Decoupled Virtual-to-Physical Neuron Mapping, and Dual-Layer Synchronization --- to achieve cost-effective structural adaptation and ensure accurate system recovery. Moreover, an ARANN-enabled self-optimizing workflow is presented to adaptively explore a "Pareto-optimal" neural network structure for a given application on the fly. Implemented and demonstrated on a Virtex-5 FPGA, ARANN can cover and adapt 93% of the chip area (neurons) with less than 1% chip overhead and O(n) reconfiguration latency. A detailed performance analysis has been completed based on various recovery scenarios.
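
    To make the Decoupled Virtual-to-Physical Neuron Mapping idea concrete, the following is a minimal software sketch of how failed physical neurons might be remapped onto spares; the NeuronMapper class, its spare-pool policy, and the remap_failed_neuron routine are illustrative assumptions, not the paper's actual hardware mechanism.

        # Illustrative sketch (assumed, not the ARANN implementation): a mapping table
        # routes virtual neurons to physical neurons so that a failed physical neuron
        # can be replaced by a spare without changing the logical network.
        class NeuronMapper:
            def __init__(self, num_virtual, num_physical):
                assert num_physical >= num_virtual
                self.v2p = {v: v for v in range(num_virtual)}          # identity mapping at start
                self.spares = list(range(num_virtual, num_physical))   # unused physical neurons

            def remap_failed_neuron(self, physical_id):
                """Reassign every virtual neuron currently mapped to a failed physical neuron."""
                victims = [v for v, p in self.v2p.items() if p == physical_id]
                for v in victims:
                    if self.spares:                                    # prefer an unused spare
                        self.v2p[v] = self.spares.pop()
                    else:                                              # otherwise share a healthy neuron
                        self.v2p[v] = next(p for p in self.v2p.values() if p != physical_id)
                return victims

        mapper = NeuronMapper(num_virtual=8, num_physical=10)
        print(mapper.remap_failed_neuron(3))   # virtual neuron 3 is moved onto a spare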

    Intrinsically Evolvable Artificial Neural Networks

    Dedicated hardware implementations of neural networks promise faster, lower-power operation compared to software implementations executing on processors. Unfortunately, most custom hardware implementations do not support intrinsic on-chip training of these networks. Training is typically done using offline software simulations, and the obtained network is synthesized and targeted to the hardware offline. The FPGA design presented here facilitates on-chip intrinsic training of artificial neural networks. Block-based neural networks (BbNNs), the type of artificial neural network implemented here, are grid-based networks of neuron blocks. These networks are trained using genetic algorithms to simultaneously optimize the network structure and the internal synaptic parameters. The design supports online structure and parameter updates and is an intrinsically evolvable BbNN platform supporting functional-level hardware evolution. Functional-level evolvable hardware (EHW) uses evolutionary algorithms to evolve the interconnections and internal parameters of functional modules in reconfigurable computing (RC) systems such as FPGAs. Functional modules can be any hardware modules, such as multipliers, adders, and trigonometric functions. In the implementation presented, the functional module is a neuron block. The designed platform is suitable for applications in dynamic environments and can be adapted and retrained online. The online training capability has been demonstrated using a case study. A performance characterization model for RC implementations of BbNNs is also presented.
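
    As a rough illustration of the evolutionary training described above, the toy loop below co-evolves per-block structure codes and synaptic weights with a genetic algorithm; the chromosome encoding, grid size, and placeholder fitness function are assumptions made for this sketch and do not reproduce the BbNN platform.

        # Toy genetic loop (assumed encoding and fitness, not the BbNN design):
        # each individual carries structure codes and weights for a grid of neuron blocks.
        import random

        GRID = 4                  # assumed 4x4 grid of neuron blocks
        WEIGHTS_PER_BLOCK = 4     # assumed number of synaptic weights per block

        def random_individual():
            structure = [random.randint(0, 3) for _ in range(GRID * GRID)]
            weights = [random.uniform(-1, 1) for _ in range(GRID * GRID * WEIGHTS_PER_BLOCK)]
            return structure, weights

        def fitness(individual):
            structure, weights = individual
            # Placeholder: a real fitness would evaluate the candidate network on training data.
            return -sum(w * w for w in weights) - structure.count(0)

        def mutate(individual, rate=0.05):
            structure, weights = individual
            structure = [random.randint(0, 3) if random.random() < rate else s for s in structure]
            weights = [w + random.gauss(0, 0.1) if random.random() < rate else w for w in weights]
            return structure, weights

        population = [random_individual() for _ in range(20)]
        for generation in range(50):
            population.sort(key=fitness, reverse=True)
            parents = population[:10]                                  # truncation selection
            population = parents + [mutate(random.choice(parents)) for _ in range(10)]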

    A Novel Systolic Parallel Hardware Architecture for the FPGA Acceleration of Feedforward Neural Networks

    New chips for machine learning applications keep appearing, but they are typically tuned for a specific topology, achieving efficiency through highly parallel designs at the cost of high power or large, complex devices. However, the computational demands of deep neural networks require flexible and efficient hardware architectures able to fit different applications, neural network types, and numbers of inputs, outputs, layers, and units per layer, making the migration from software to hardware easy. This paper describes novel hardware implementing any feedforward neural network (FFNN): multilayer perceptrons, autoencoders, and logistic regression. The architecture admits an arbitrary number of inputs, outputs, units per layer, and layers. The hardware combines matrix algebra concepts with serial-parallel computation. It is based on a systolic ring of neural processing elements (NPEs), requiring only as many NPEs as there are neurons in the largest layer, regardless of the number of layers. Resource usage grows linearly with the number of NPEs. This versatile architecture serves as an accelerator in real-time applications, and its size does not affect the system clock frequency. Unlike most approaches, only a single activation function block (AFB) is required for the whole FFNN. Performance, resource usage, and accuracy for several network topologies and activation functions are evaluated. The architecture reaches a 550 MHz clock speed on a Virtex-7 FPGA. The proposed implementation uses 18-bit fixed-point arithmetic, achieving classification performance similar to a floating-point approach. A reduced weight bit size does not affect the accuracy, allowing more weights to fit in the same memory. Different FFNNs for the Iris and MNIST datasets were evaluated and, for a real-time abnormal cardiac detection application, a 256x acceleration was achieved. The proposed architecture can perform up to 1980 giga operations per second (GOPS), implementing multilayer FFNNs of up to 3600 neurons per layer in a single chip. The architecture can be extended to larger-capacity devices or multiple chips by simply extending the NPE ring.
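
    The ring idea can be mimicked in a few lines of behavioural code: each NPE holds one output neuron's weights and accumulates a partial sum while the input values rotate past it, so one fully connected layer of any width needs only as many NPEs as output neurons. The function below is a software sketch under that assumption, not the paper's hardware design; fixed-point details and the activation function block are omitted.

        # Behavioural model (assumed) of a systolic ring computing one fully connected layer:
        # inputs circulate around the ring while each NPE accumulates one output neuron.
        import numpy as np

        def systolic_layer(x, W):
            """x: (n_in,) inputs, W: (n_out, n_in) weights. Returns W @ x."""
            n_out, n_in = W.shape
            acc = np.zeros(n_out)
            for step in range(n_in):
                for npe in range(n_out):
                    idx = (step + npe) % n_in      # input value currently passing this NPE
                    acc[npe] += W[npe, idx] * x[idx]
            return acc

        x = np.random.randn(5)
        W = np.random.randn(3, 5)
        assert np.allclose(systolic_layer(x, W), W @ x)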

    Data Mining Applications to Fault Diagnosis in Power Electronic Systems: A Systematic Review


    An investigation into adaptive power reduction techniques for neural hardware

    In light of the growing applicability of artificial neural networks (ANNs) in the signal processing field [1] and the present thrust of the semiconductor industry towards low-power SoCs for mobile devices [2], the power consumption of ANN hardware has become a very important implementation issue. Adaptability is a powerful and useful feature of neural networks. All current approaches to low-power ANN hardware are ‘non-adaptive’ with respect to the power consumption of the network (i.e. power reduction is not an objective of the adaptation/learning process). The research presented in this thesis investigates adaptive power reduction techniques that attempt to exploit the adaptability of neural networks in order to reduce power consumption. Three separate approaches to such adaptive power reduction are proposed: adaptation of size, adaptation of network weights, and adaptation of calculation precision. Initial case studies show promising results with significant power reduction.
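
    Of the three approaches, adaptation of calculation precision is the easiest to sketch: quantize the weights to progressively narrower bit widths and stop once accuracy degrades beyond a tolerance. The quantizer, the tolerance, and the toy evaluation function below are assumptions for illustration and are not taken from the thesis.

        # Illustrative precision-adaptation loop (assumed method, not the thesis's):
        # reduce the weight bit width while the evaluated accuracy stays within tolerance.
        import numpy as np

        def quantize(weights, bits):
            """Uniform symmetric quantization of weights to the given bit width."""
            scale = np.max(np.abs(weights)) / (2 ** (bits - 1) - 1)
            return np.round(weights / scale) * scale

        def choose_bit_width(weights, evaluate, tolerance=0.01):
            """Scan downward and return the smallest bit width still within tolerance.
            `evaluate(w)` is assumed to return a validation accuracy for weights `w`."""
            baseline = evaluate(weights)
            for bits in range(16, 1, -1):
                if evaluate(quantize(weights, bits)) < baseline - tolerance:
                    return bits + 1                # last bit width that still met the tolerance
            return 2

        rng = np.random.default_rng(0)
        w = rng.normal(size=100)
        toy_eval = lambda ws: 1.0 - np.mean((ws - w) ** 2)   # toy stand-in for accuracy
        print(choose_bit_width(w, toy_eval))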

    Fault Diagnostic System for Cascaded H-bridge Multilevel Inverter Drives Based on Artificial Intelligent Approaches Incorporating a Reconfiguration Technique

    A fault diagnostic and reconfiguration system for a multilevel inverter drive (MLID) based on artificial intelligence techniques is developed in this dissertation. The output phase voltages of an MLID can be used as valuable information to diagnose faults and their locations. It is difficult to diagnose an MLID system using a mathematical model because MLID systems consist of many switching devices and their behaviour is complex and nonlinear. Therefore, neural network (NN) classification is applied to the fault diagnosis of an MLID system. Multilayer perceptron (MLP) networks are used to identify the type and location of faults. Principal component analysis (PCA) is used in the feature extraction process to reduce the NN input size. A lower-dimensional input space usually also reduces the time necessary to train an NN, and the reduced noise may improve the mapping performance. A genetic algorithm is also applied to select the most valuable principal components. A comparison among the MLP neural network (NN), the principal component neural network (PC-NN), and the genetic-algorithm-based selective principal component neural network (PC-GA-NN) is performed. The proposed neural networks are evaluated with both a simulation test set and an experimental test set. The PC-NN improves overall classification performance over the NN by about 5 percentage points, whereas the PC-GA-NN improves it by about 7.5 percentage points; thus, applying the genetic algorithm improves classification over the PC-NN by about 2.5 percentage points. The overall classification performance of the proposed networks is more than 90%. A reconfiguration technique is also developed, and the effects of using it at a high modulation index are addressed. The developed fault diagnostic system is validated with experimental results. It requires about 6 cycles at 60 Hz to clear an open-circuit fault and about 9 cycles at 60 Hz to clear a short-circuit fault. The experimental results show that the developed system performs satisfactorily in detecting the fault type and location and in reconfiguring the drive.
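
    The PC-NN stage of the comparison can be illustrated with a short sketch in which PCA compresses the phase-voltage features before an MLP classifies the fault type and location; the data shapes, class encoding, and scikit-learn components below are assumptions, and the GA-based selection of principal components is not reproduced.

        # Sketch of the PC-NN idea (assumed shapes and libraries): PCA feature
        # reduction feeding an MLP fault classifier.
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.neural_network import MLPClassifier
        from sklearn.pipeline import make_pipeline

        # Assumed data: rows are output-phase-voltage features, labels encode
        # (fault type, fault location) as a single class index.
        X = np.random.randn(600, 64)
        y = np.random.randint(0, 10, size=600)

        pc_nn = make_pipeline(
            PCA(n_components=8),                                     # reduce the NN input size
            MLPClassifier(hidden_layer_sizes=(20,), max_iter=500),
        )
        pc_nn.fit(X[:500], y[:500])
        print("held-out accuracy:", pc_nn.score(X[500:], y[500:]))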

    Run-time reconfiguration for efficient tracking of implanted magnets with a myokinetic control interface applied to robotic hands

    Doctoral thesis, Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Mecânica, 2021. This work introduces the application of embedded machine learning solutions to the problem of magnetic-sensor-based limb tracking. Namely, we employ a data-driven strategy to create mathematical models that can translate the measured magnetic information into usable inputs for prosthetic devices. These models are implemented in FPGAs using customized floating-point operations to optimize hardware and energy consumption, which are important in wearable devices. The hardware architecture is proposed to be implemented as a system with dynamic partial reconfiguration, potentially reducing the resource utilization and power consumption of the FPGA. The proposed data-driven strategy and its hardware implementation can achieve latency on the order of microseconds and low energy consumption, which encourages further research on improving the methods devised here for other applications. Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).
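
    As a rough sketch of the data-driven mapping described above, the snippet below trains a small regressor to translate magnetic-sensor readings into magnet positions; the synthetic forward model, sensor count, and choice of regressor are assumptions, and the thesis's FPGA implementation with customized floating-point operations is not represented here.

        # Sketch (assumed data and model) of a data-driven magnetic-tracking mapping:
        # sensor readings -> magnet position, learned from examples.
        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(42)
        positions = rng.uniform(-1, 1, size=(2000, 3))              # assumed 3-D magnet positions
        # Assumed forward model: each of 8 sensors reads a nonlinear function of position.
        centers = rng.uniform(-1, 1, size=(8, 3))
        sensors = np.stack([np.exp(-np.linalg.norm(positions - c, axis=1)) for c in centers], axis=1)

        model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=1000)
        model.fit(sensors[:1500], positions[:1500])
        print("test MSE:", np.mean((model.predict(sensors[1500:]) - positions[1500:]) ** 2))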