542 research outputs found

    NeuroFlow: A General Purpose Spiking Neural Network Simulation Platform using Customizable Processors

    Get PDF
    © 2016 Cheung, Schultz and Luk. NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high-performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation and deliver optimized performance, for example in the degree of parallelism employed. The compilation process supports PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current- or conductance-based neuronal models, such as the integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and achieves real-time performance for up to 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times over an 8-core processor, or 2.83 times over GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation.
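The integrate-and-fire model named in the abstract can be illustrated with a minimal, self-contained sketch; the Euler step and all parameter values below are illustrative assumptions, not NeuroFlow's actual hardware implementation:

```python
def lif_step(v, i_in, dt=1.0, tau_m=20.0, v_rest=-65.0,
             v_thresh=-50.0, v_reset=-65.0, r_m=10.0):
    """One Euler step of a leaky integrate-and-fire neuron (mV, ms, nA).

    Returns (new_membrane_potential, spiked). Constants are
    illustrative defaults, not values taken from NeuroFlow.
    """
    # Membrane potential leaks toward v_rest and integrates input current.
    v = v + dt * (-(v - v_rest) + r_m * i_in) / tau_m
    if v >= v_thresh:
        return v_reset, True   # fire a spike and reset
    return v, False

# Drive one neuron with a constant 2 nA current for 200 ms.
v, spikes = -65.0, 0
for t in range(200):
    v, fired = lif_step(v, 2.0)
    spikes += fired
```

Because the steady-state potential (v_rest + r_m * i_in = -45 mV) lies above the -50 mV threshold, the neuron fires periodically under this drive.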

    Automatic Nested Loop Acceleration on FPGAs Using Soft CGRA Overlay

    Get PDF
    Session 1: HLS Tooling. Postprint.

    κΈ°κΈ° μƒμ—μ„œμ˜ 심측 신경망 κ°œμΈν™” 방법

    Get PDF
    Master's thesis, Department of Computer Science and Engineering, College of Engineering, Seoul National University, February 2019. Advisor: Bernhard Egger. There exist several deep neural network (DNN) architectures suitable for embedded inference; however, little work has focused on training neural networks on-device. User customization of DNNs is desirable because collecting a training set representative of real-world scenarios is difficult, and inter-user variation limits the accuracy a general model can achieve. In this thesis, a DNN architecture that allows low-power on-device user customization is proposed. The approach is applied to handwritten character recognition of both the Latin and Korean alphabets. Experiments show a 3.5-fold reduction in prediction error after user customization for both alphabets, compared to a DNN trained on general data. The architecture is additionally evaluated on a number of embedded processors, demonstrating its practical applicability.

    Making a case for an ARM Cortex-A9 CPU interlay replacing the NEON SIMD unit

    Get PDF