Experiments on autonomous Boolean networks

Author: Gauthier Daniel J.
Rontani Damien
Rosin David P.
Schöll Eckehard
Publication venue: 'AIP Publishing'
Publication date: 16/04/2013

We realize autonomous Boolean networks by using logic gates in their autonomous mode-of-operation on a field-programmable gate array. This allows us to implement time-continuous systems with complex dynamical behaviors that can be conveniently interconnected into large-scale networks with flexible topologies that consist of time-delay links and a large number of nodes. We demonstrate how we realize networks with periodic, chaotic, and excitable dynamics and study their properties. Field-programmable gate arrays define a new experimental paradigm that holds great potential to test a large body of theoretical results on the dynamics of complex networks, which has been beyond reach of traditional experimental approaches. Comment: 10 pages, 6 figures

arXiv.org e-Print Archive

Multigrid Solvers in Reconfigurable Hardware

Author: Damaj Issam
Haraty Ramzi
Kasbah Safaa
Publication venue: 'Elsevier BV'
Publication date: 01/04/2019

The problem of finding the solution of Partial Differential Equations (PDEs) plays a central role in modeling real-world problems. Over the past years, Multigrid solvers have shown their robustness over other techniques, owing to a high convergence rate that is independent of the problem size. For this reason, many attempts to exploit the inherent parallelism of Multigrid have been made to achieve the desired efficiency and scalability of the method. Yet, most efforts fail in this respect due to many factors (time, resources) governed by software implementations. In this paper, we present a hardware implementation of the V-cycle Multigrid method for finding the solution of a 2D Poisson equation. We use Handel-C to implement our hardware design, which we map onto available Field Programmable Gate Arrays (FPGAs). We analyze the implementation performance using the FPGA vendor's tools. We demonstrate the robustness of Multigrid over other iterative solvers, such as Jacobi and Successive Over Relaxation (SOR), in both hardware and software. We compare our findings with a C++ version of each algorithm. The obtained results show better performance when compared to existing software versions. Comment: 24 Pages, 11 Figures, 10 Tables

arXiv.org e-Print Archive

Reconfigurable Hardware Implementation of the Successive Overrelaxation Method

Author: Damaj Issam
Haraty Ramzi
Kasbah Safaa
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/05/2019

In this chapter, we study the feasibility of implementing SOR in reconfigurable hardware. We use Handel-C, a higher-level design tool, to code our design, which is analyzed, synthesized, and placed and routed using the FPGA vendors' proprietary software (DK Design Suite, Xilinx ISE 8.1i, and Quartus II 5.1). We target Virtex II Pro, Altera Stratix, and Spartan3L, which is embedded in the RC10 FPGA-based system from Celoxica. We report our timing results when targeting Virtex II Pro and compare them with results from a software version written in C++ and running on a general-purpose processor (GPP). Comment: 15 pages, 5 figures, 4 tables. arXiv admin note: substantial text overlap with arXiv:1904.0062

arXiv.org e-Print Archive

A High-Performance HOG Extractor on FPGA

Author: Carrabina Jordi
Casadevall Arnau
Castells-Rufas David
Codina Marc
Ngo Vinh
Publication date: 12/01/2018

Pedestrian detection is one of the key problems in the emerging self-driving car industry, and the HOG algorithm has proven to provide good accuracy for pedestrian detection. Many research works have accelerated the HOG algorithm on FPGAs because of their low-power and high-throughput characteristics. In this paper, we present a high-performance HOG architecture for pedestrian detection on a low-cost FPGA platform. It achieves a maximum throughput of 526 FPS with 640x480 input images, which is 3.25 times faster than the state-of-the-art design. The accelerator is integrated with SVM-based prediction to realize a pedestrian detection system. The power consumption of the whole system is comparable with the best existing implementations. Comment: Presented at HIP3ES, 201

arXiv.org e-Print Archive

Infrastructure for Usable Machine Learning: The Stanford DAWN Project

Author: Bailis Peter
Olukotun Kunle
Re Christopher
Zaharia Matei
Publication date: 08/06/2017

Despite incredible recent advances in machine learning, building machine learning applications remains prohibitively time-consuming and expensive for all but the best-trained, best-funded engineering organizations. This expense comes not from a need for new and improved statistical models but instead from a lack of systems and tools for supporting end-to-end machine learning application development, from data preparation and labeling to productionization and monitoring. In this document, we outline opportunities for infrastructure supporting usable, end-to-end machine learning applications in the context of the nascent DAWN (Data Analytics for What's Next) project at Stanford.

arXiv.org e-Print Archive

A Survey of Methods For Analyzing and Improving GPU Energy Efficiency

Author: Mittal Sparsh
Vetter Jeffrey S.
Publication date: 18/04/2014

Recent years have witnessed a phenomenal growth in the computational capabilities and applications of GPUs. However, this trend has also led to a dramatic increase in their power consumption. This paper surveys research works on analyzing and improving the energy efficiency of GPUs. It also provides a classification of these techniques on the basis of their main research idea. Further, it attempts to synthesize research works which compare the energy efficiency of GPUs with other computing systems, e.g. FPGAs and CPUs. The aim of this survey is to provide researchers with knowledge of the state of the art in GPU power management and motivate them to architect highly energy-efficient GPUs of tomorrow. Comment: Accepted with minor revision in ACM Computing Survey Journal (impact factor 3.85, five year impact of 7.85)

arXiv.org e-Print Archive

Software-defined Radios: Architecture, State-of-the-art, and Challenges

Author: Akeela Rami
Dezfouli Behnam
Publication date: 18/04/2018

Software-defined Radio (SDR) is a programmable transceiver with the capability of operating various wireless communication protocols without the need to change or update the hardware. Progress in the SDR field has led to the escalation of protocol development and a wide spectrum of applications, with more emphasis on programmability, flexibility, portability, and energy efficiency, in cellular, WiFi, and M2M communication. Consequently, SDR has earned a lot of attention and is of great significance to both academia and industry. SDR designers intend to simplify the realization of communication protocols while enabling researchers to experiment with prototypes on deployed networks. This paper is a survey of the state-of-the-art SDR platforms in the context of wireless communication protocols. We offer an overview of SDR architecture and its basic components, then discuss the significant design trends and development tools. In addition, we highlight key contrasts between SDR architectures with regard to energy, computing power, and area, based on a set of metrics. We also review existing SDR platforms and present an analytical comparison as a guide to developers. Finally, we recognize a few of the related research topics and summarize potential solutions.

arXiv.org e-Print Archive

A Hardware Friendly Unsupervised Memristive Neural Network with Weight Sharing Mechanism

Author: Chang Sheng
He Jin
Huang Qijun
Lin Peng
Ma Qiming
Tang Zhiri
Wang Hao
Zhu Ruohua
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

Memristive neural networks (MNNs), which use memristors as neurons or synapses, have recently become a hot research topic. However, most memristors are not compatible with mainstream integrated circuit technology, and their stability at large scale is not yet good. In this paper, a hardware-friendly MNN circuit is introduced, in which the memristive characteristics are implemented by a digital integrated circuit. Through this method, spike timing dependent plasticity (STDP) and unsupervised learning are realized. A weight sharing mechanism is proposed to bridge the gap between network scale and hardware resources. Experimental results show that it saves significant hardware resources while maintaining good recognition accuracy and high speed. Moreover, resource usage grows more slowly than the network scale, which suggests the method's potential for realizing large-scale neuromorphic networks. Comment: 10 pages, 11 figures

arXiv.org e-Print Archive

High-level Synthesis

Author: Damaj Issam
Publication venue: 'Wiley'
Publication date: 03/05/2019

Hardware synthesis is a general term used to refer to the processes involved in automatically generating a hardware design from its specification. High-level synthesis (HLS) could be defined as the translation from a behavioral description of the intended hardware circuit into a structural description, similar to the compilation of programming languages (such as C and Pascal) into assembly language. The chained synthesis tasks at each level of the design process include system synthesis, register-transfer synthesis, logic synthesis, and circuit synthesis. The development of hardware solutions for complex applications is no longer a complicated task with the emergence of various HLS tools. Many areas of application have benefited from the modern advances in hardware design, such as automotive and aerospace industries, computer graphics, signal and image processing, security, complex simulations like molecular modeling, and DNA matching. The field of HLS is continuing its rapid growth to facilitate the creation of hardware and to blur more and more the border separating the processes of designing hardware and software. Comment: 19 Pages, 16 Figures. arXiv admin note: text overlap with arXiv:1905.02075, arXiv:1905.0207

arXiv.org e-Print Archive

Recent Advances in Physical Reservoir Computing: A Review

Author: Hirose Akira
Héroux Jean Benoit
Kanazawa Naoki
Nakane Ryosho
Nakano Daiju
Numata Hidetoshi
Takeda Seiji
Tanaka Gouhei
Yamane Toshiyuki
Publication venue: 'Elsevier BV'
Publication date: 15/04/2019

Reservoir computing is a computational framework suited for temporal/sequential data processing. It is derived from several recurrent neural network models, including echo state networks and liquid state machines. A reservoir computing system consists of a reservoir for mapping inputs into a high-dimensional space and a readout for pattern analysis from the high-dimensional states in the reservoir. The reservoir is fixed and only the readout is trained with a simple method such as linear regression and classification. Thus, the major advantage of reservoir computing compared to other recurrent neural networks is fast learning, resulting in low training cost. Another advantage is that the reservoir without adaptive updating is amenable to hardware implementation using a variety of physical systems, substrates, and devices. In fact, such physical reservoir computing has attracted increasing attention in diverse fields of research. The purpose of this review is to provide an overview of recent advances in physical reservoir computing by classifying them according to the type of the reservoir. We discuss the current issues and perspectives related to physical reservoir computing, in order to further expand its practical applications and develop next-generation machine learning systems. Comment: 62 pages, 13 figures

arXiv.org e-Print Archive