Search CORE

6 research outputs found

VLSI Implementation of Deep Neural Network Using Integral Stochastic Computing

Author: Ardakani Arash
Gross Warren J.
Hanyu Takahiro
Leduc-Primeau François
Onizawa Naoya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

The hardware implementation of deep neural networks (DNNs) has recently received tremendous attention: many applications in fact require high-speed operations that suit a hardware implementation. However, numerous elements and complex interconnections are usually required, leading to a large area occupation and copious power consumption. Stochastic computing has shown promising results for low-power area-efficient hardware implementations, even though existing stochastic algorithms require long streams that cause long latencies. In this paper, we propose an integer form of stochastic computation and introduce some elementary circuits. We then propose an efficient implementation of a DNN based on integral stochastic computing. The proposed architecture has been implemented on a Virtex7 FPGA, resulting in 45% and 62% average reductions in area and latency compared to the best reported architecture in literature. We also synthesize the circuits in a 65 nm CMOS technology and we show that the proposed integral stochastic architecture results in up to 21% reduction in energy consumption compared to the binary radix implementation at the same misclassification rate. Due to fault-tolerant nature of stochastic architectures, we also consider a quasi-synchronous implementation which yields 33% reduction in energy consumption w.r.t. the binary radix implementation without any compromise on performance.Comment: 11 pages, 12 figure

arXiv.org e-Print Archive

Crossref

HAL-Université de Bretagne Occidentale

HAL Descartes

PolyPublie

Hal-Diderot

A Fully Pipelined FPGA Architecture of a Factored Restricted Boltzmann Machine Artificial Neural Network

Author: Holt J.
Lindsey C.
Lok-Won Kim
Memisevic R.
Memisevic R.
Ralph Linsker
Ranzato M.
Sameh Asaad
Zhu J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date
Field of study

Crossref

Heterogeneous computing system with field programmable gate array coprocessor for decision tree learning

Author: Škoda Peter
Publication venue
Publication date: 01/01/2014
Field of study

U ovom radu prikazan je heterogeni računalni sustav i novi hibridni algoritam za učenje stabla odluke Dataflow decision tree construction – DF‑DTC. Algoritam DF‑DTC zasnovan je na algoritmu C4.5. Heterogeni sustav sadrži koprocesor izveden programirljivim poljem logičkih elemenata (FPGA, engl. field programmable gate array). Razrada arhitekture koprocesora i hibridnog algoritma DF‑DTC provedena je metodologijom programsko-sklopovskog suobliokovanja. U koprocesoru je izvedena obrada nominalnih atributa skupa za učenje, a u algoritam su uvedene prilagodbe podatkovnih struktura, te podrška za višedtretveno izvođenje. Vrednovanje performansi provedeno je mjerenjem ukupnog vremena izvršavanja rada programa, te mjerenjem vremena izvršavanja ključnih dijelova algoritma. Pri vrednovanju su korišteni sintetički skupovi za učenje, te skupovi za učenje javno dostupni na UCI repozitoriju. Performanse DF‑DTC-a uspoređene su s performansama postojeće programske implementacije algoritma EC4.5. Ubrzanje obrade nominalnih atributa na DF‑DTC-u iznosi u prosjeku 3, 00 puta u usporedbi s programskom implementacijom EC4.5. Za cjelokupno izvršavanje programa najbolje ubrzanje iznosi 1, 18 puta. Izvedba DF‑DTC-a za pokazala je potencijal FPGA-a kao platforme za ubrzanje učenja stabla odluke

Croatian Digital Dissertations Repository

University of Zagreb Repository

Full-text Institutional Repository of the Ruđer Bošković Institute

FER Repository