35 research outputs found

Selective Decoding in Associative Memories Based on Sparse-Clustered Networks

Author: Gross Warren J.
Jarollahi Hooman
Onizawa Naoya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

Associative memories are structures that can retrieve previously stored information given a partial input pattern instead of an explicit address as in indexed memories. A few hardware approaches have recently been introduced for a new family of associative memories based on Sparse-Clustered Networks (SCNs) that show attractive features. These architectures are suitable for implementations with low retrieval latency, but are limited to small networks that store a few hundred data entries. In this paper, a new hardware architecture of SCNs is proposed that features a new data-storage technique as well as a method we refer to as Selective Decoding (SD-SCN). The SD-SCN has been implemented on an FPGA similar to the one used in previous efforts and achieves two orders of magnitude higher capacity, with no error-performance penalty but at the cost of a few extra clock cycles per data access. Comment: 4 pages, accepted in IEEE Global SIP 2013 conference
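As a rough software illustration of retrieval from a partial pattern, the sketch below stores each message as a fully connected group of nodes, one node per cluster, and fills in an erased symbol by looking for the single node linked to every known symbol. The class and method names are hypothetical, and the one-pass decoder is an assumption made for illustration; it does not reproduce the SD-SCN hardware architecture or its selective decoding.

```python
from itertools import combinations


class SparseClusteredNetwork:
    """Toy software model: one cluster per message symbol, binary links only."""

    def __init__(self, clusters, nodes_per_cluster):
        self.clusters = clusters
        self.nodes_per_cluster = nodes_per_cluster
        self.edges = set()  # undirected links between (cluster, node) pairs

    def store(self, message):
        if len(message) != self.clusters:
            raise ValueError("message needs one symbol per cluster")
        nodes = list(enumerate(message))  # (cluster index, symbol)
        for u, v in combinations(nodes, 2):
            self.edges.add((u, v))
            self.edges.add((v, u))

    def retrieve(self, partial):
        """Fill in symbols given as None; leave None where the answer is ambiguous."""
        known = [(i, s) for i, s in enumerate(partial) if s is not None]
        result = list(partial)
        for i, symbol in enumerate(partial):
            if symbol is not None:
                continue
            # A candidate node must be linked to every known node.
            candidates = [
                n for n in range(self.nodes_per_cluster)
                if all(((i, n), k) in self.edges for k in known)
            ]
            if len(candidates) == 1:
                result[i] = candidates[0]
        return result


network = SparseClusteredNetwork(clusters=4, nodes_per_cluster=8)
network.store([1, 5, 2, 7])
network.store([3, 5, 0, 6])
recovered = network.retrieve([1, None, 2, None])  # two symbols erased
```

With only two stored messages the erased positions are unambiguous; as more messages are stored, several candidate nodes can survive in a cluster, which is why the sketch leaves such positions as `None` instead of guessing.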

arXiv.org e-Print Archive

CiteSeerX

Crossref

VLSI Implementation of Deep Neural Network Using Integral Stochastic Computing

Author: Ardakani Arash
Gross Warren J.
Hanyu Takahiro
Leduc-Primeau François
Onizawa Naoya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

The hardware implementation of deep neural networks (DNNs) has recently received tremendous attention: many applications in fact require high-speed operations that suit a hardware implementation. However, numerous elements and complex interconnections are usually required, leading to a large area occupation and copious power consumption. Stochastic computing has shown promising results for low-power area-efficient hardware implementations, even though existing stochastic algorithms require long streams that cause long latencies. In this paper, we propose an integer form of stochastic computation and introduce some elementary circuits. We then propose an efficient implementation of a DNN based on integral stochastic computing. The proposed architecture has been implemented on a Virtex7 FPGA, resulting in 45% and 62% average reductions in area and latency compared to the best reported architecture in the literature. We also synthesize the circuits in a 65 nm CMOS technology and we show that the proposed integral stochastic architecture results in up to 21% reduction in energy consumption compared to the binary radix implementation at the same misclassification rate. Due to the fault-tolerant nature of stochastic architectures, we also consider a quasi-synchronous implementation which yields 33% reduction in energy consumption w.r.t. the binary radix implementation without any compromise on performance. Comment: 11 pages, 12 figures
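To make the stochastic-computing idea concrete, here is a minimal sketch in Python. The first part shows the standard unipolar encoding, where a value in [0, 1] is the probability of a 1 in a bit stream and a bitwise AND of two independent streams multiplies the encoded values. The second part is one possible reading of an "integer form": a stream whose elements are sums of `m` binary streams, so each element lies in 0..m. The function names and that encoding are assumptions for illustration, not the circuits from the paper.

```python
import random


def bernoulli_stream(p, length, rng):
    """Unipolar stochastic stream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def integer_stream(value, m, length, rng):
    """Assumed integer encoding: element-wise sum of m binary streams,
    each encoding value / m, so the stream represents a value in [0, m]."""
    streams = [bernoulli_stream(value / m, length, rng) for _ in range(m)]
    return [sum(bits) for bits in zip(*streams)]


def decode(stream):
    """The encoded value is the mean of the stream."""
    return sum(stream) / len(stream)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
length = 4096

a = bernoulli_stream(0.5, length, rng)
b = bernoulli_stream(0.4, length, rng)
product = [x & y for x, y in zip(a, b)]  # AND gate multiplies the probabilities
est_product = decode(product)  # estimate of 0.5 * 0.4

s = integer_stream(1.5, 2, length, rng)
est_value = decode(s)  # estimate of 1.5, a value outside [0, 1]
```

The estimates are noisy and tighten only as the stream grows, which is the latency cost of long streams that the abstract refers to.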

arXiv.org e-Print Archive

Crossref

HAL-Université de Bretagne Occidentale

PolyPublie

Stochastic Simulated Quantum Annealing for Fast Solution of Combinatorial Optimization Problems

Author: Gross Warren J.
Hanyu Takahiro
Onizawa Naoya
Sasaki Ryoma
Shin Duckgyu
Publication venue
Publication date: 28/06/2023

In this paper, we introduce stochastic simulated quantum annealing (SSQA) for large-scale combinatorial optimization problems. SSQA is designed based on stochastic computing and quantum Monte Carlo, which can simulate quantum annealing (QA) by using multiple replicas of spins (probabilistic bits) in classical computing. The use of stochastic computing leads to an efficient parallel spin-state update algorithm, enabling quick search for a solution around the global minimum energy. Therefore, SSQA realizes quantum-like annealing for large-scale problems and can handle fully connected models in combinatorial optimization, unlike QA. The proposed method is evaluated in MATLAB on graph isomorphism problems, which are typical combinatorial optimization problems. The proposed method achieves a convergence speed an order of magnitude faster than a conventional stochastic simulated annealing method. Additionally, it can handle a 100-times larger problem size compared to QA and a 25-times larger problem size compared to a traditional SA method, for similar convergence probabilities. Comment: 14 pages, 8 figures

arXiv.org e-Print Archive

A Design Framework for Invertible Logic

Author: Fujita Hiroyuki
Gross Warren J.
Hanyu Takahiro
Meyer Brett H.
Nishino Kaito
Onizawa Naoya
Smithson Sean C.
Yamagata Hitoshi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/06/2020

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Algorithm and Architecture of Fully-Parallel Associative Memories Based on Sparse Clustered Networks

Author: Gripon Vincent
Gross Warren
Jarollahi Hooman
Onizawa Naoya
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/09/2014

Associative memories retrieve stored information given partial or erroneous input patterns. A new family of associative memories based on Sparse Clustered Networks (SCNs) has been recently introduced that can store many more messages than classical Hopfield Neural Networks (HNNs). In this paper, we propose fully-parallel hardware architectures of such memories for partial or erroneous inputs. The proposed architectures eliminate winner-take-all modules and thus reduce the hardware complexity by consuming 65% fewer FPGA lookup tables and increase the operating frequency by approximately 1.9 times compared to that of previous work. Furthermore, the scaling behaviour of the implemented architectures for various design choices is investigated. We explore the effect of varying design variables such as the number of clusters, network nodes, and erased symbols on the error performance and the hardware resources.

HAL Descartes

HAL-Université de Bretagne Occidentale

Algorithm and Architecture for a Low-Power Content-Addressable Memory Based on Sparse-Clustered Networks

Author: Gripon Vincent
Gross Warren
Jarollahi Hooman
Onizawa Naoya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2015

We propose a low-power content-addressable memory (CAM) employing a new algorithm for associativity between the input tag and the corresponding address of the output data. The proposed architecture is based on a recently developed sparse clustered network using binary connections that, on average, eliminates most of the parallel comparisons performed during a search. Therefore, the dynamic energy consumption of the proposed design is significantly lower compared with that of a conventional low-power CAM design. Given an input tag, the proposed architecture computes a few possibilities for the location of the matched tag and performs the comparisons on them to locate a single valid match. TSMC 65-nm CMOS technology was used for simulation purposes. Following a selection of design parameters, such as the number of CAM entries, the energy consumption and the search delay of the proposed design are 8% and 26% of those of the conventional NAND architecture, respectively, with a 10% area overhead. A design methodology based on the silicon area and power budgets, and performance requirements is discussed.
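The search flow described here (narrow the tag down to a few candidate locations, then compare only those) can be sketched in software as follows. This is a loose analogy under stated assumptions: the class `ScnFilteredCam`, the use of the low-order tag bits as cluster symbols, and the grouping of entries into fixed-size blocks are all hypothetical choices for illustration, not the circuit from the paper.

```python
class ScnFilteredCam:
    """Toy CAM that compares only entries in blocks selected by binary links."""

    def __init__(self, entries, block_size=2, clusters=2, bits_per_cluster=2):
        self.entries = list(entries)
        self.block_size = block_size
        self.clusters = clusters
        self.bits = bits_per_cluster
        self.comparisons = 0  # counts full tag comparisons performed
        self.links = set()    # (cluster, symbol, block) binary connections
        for address, tag in enumerate(self.entries):
            block = address // block_size
            for cluster, symbol in enumerate(self._symbols(tag)):
                self.links.add((cluster, symbol, block))

    def _symbols(self, tag):
        """Split the low-order tag bits into one small symbol per cluster."""
        mask = (1 << self.bits) - 1
        return [(tag >> (i * self.bits)) & mask for i in range(self.clusters)]

    def search(self, tag):
        symbols = self._symbols(tag)
        n_blocks = -(-len(self.entries) // self.block_size)  # ceiling division
        # A block is a candidate only if every cluster symbol links to it.
        active = [
            b for b in range(n_blocks)
            if all((c, s, b) in self.links for c, s in enumerate(symbols))
        ]
        for b in active:
            start = b * self.block_size
            stop = min(start + self.block_size, len(self.entries))
            for address in range(start, stop):
                self.comparisons += 1
                if self.entries[address] == tag:
                    return address
        return None


tags = [0b10110001, 0b01101110, 0b11000111, 0b00011010,
        0b10101100, 0b01010011, 0b11110000, 0b00100101]
cam = ScnFilteredCam(tags)
hit_address = cam.search(0b10101100)
hit_comparisons = cam.comparisons  # far fewer than one comparison per entry
```

A conventional search would compare the input against all eight stored tags; here the links activate a single block, so only the entries inside it are compared. False candidates are still possible, which is why the full comparison step remains.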

HAL-Université de Bretagne Occidentale

Application of stochastic computing in brainware

Author: Kazumichi Matsumiya
Naoya Onizawa
Takahiro Hanyu
Warren Gross
Publication venue: 'Institute of Electronics, Information and Communications Engineers (IEICE)'
Publication date: 01/10/2018

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Multiple-valued duplex asynchronous data transfer scheme for interleaving in LDPC decoders

Author: Gaudet Vincent C.
Hanyu Takahiro
Mochizuki Akira
Onizawa Naoya
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 17/04/2009

Institutional Repositories DataBase (IRDB)