Experiments on autonomous Boolean networks
We realize autonomous Boolean networks by using logic gates in their
autonomous mode-of-operation on a field-programmable gate array. This allows us
to implement time-continuous systems with complex dynamical behaviors that can
be conveniently interconnected into large-scale networks with flexible
topologies that consist of time-delay links and a large number of nodes. We
demonstrate how we realize networks with periodic, chaotic, and excitable
dynamics and study their properties. Field-programmable gate arrays define a
new experimental paradigm that holds great potential to test a large body of
theoretical results on the dynamics of complex networks, which has been beyond
reach of traditional experimental approaches.
Comment: 10 pages, 6 figures
Multigrid Solvers in Reconfigurable Hardware
The problem of finding the solution of Partial Differential Equations (PDEs)
plays a central role in modeling real world problems. Over the past years,
Multigrid solvers have shown their robustness over other techniques, due to
their high convergence rate, which is independent of the problem size. For this
reason, many attempts for exploiting the inherent parallelism of Multigrid have
been made to achieve the desired efficiency and scalability of the method. Yet,
most efforts fail in this respect due to many factors (time, resources)
governed by software implementations. In this paper, we present a hardware
implementation of the V-cycle Multigrid method for finding the solution of a
2D-Poisson equation. We use Handel-C to implement our hardware design, which we
map onto available Field Programmable Gate Arrays (FPGAs). We analyze the
implementation performance using the FPGA vendor's tools. We demonstrate the
robustness of Multigrid over other iterative solvers, such as Jacobi and
Successive Over Relaxation (SOR), in both hardware and software. We compare our
findings with a C++ version of each algorithm. The obtained results show better
performance when compared to existing software versions.
Comment: 24 Pages, 11 Figures, 10 Tables
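As a rough illustration of the V-cycle idea named in this abstract, here is a minimal software sketch. It is not the authors' Handel-C design: it assumes a 1D Poisson problem (the paper treats the 2D case), Gauss-Seidel smoothing, full-weighting restriction, and linear interpolation, and all function names are hypothetical.

```python
import math


def smooth(u, f, h, sweeps):
    """Gauss-Seidel sweeps for -u'' = f with fixed boundary values."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def residual(u, f, h):
    """Residual r = f - A u at interior points (zero at the boundary)."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def v_cycle(u, f, h):
    """One V-cycle, in place. Assumes len(u) - 2 == 2**k - 1 interior points."""
    n = len(u) - 2
    if n == 1:
        # Coarsest grid: a single unknown, solved exactly.
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    smooth(u, f, h, 2)                      # pre-smoothing
    r = residual(u, f, h)
    nc = (n - 1) // 2                       # interior points on the coarse grid
    rc = [0.0] * (nc + 2)
    for i in range(1, nc + 1):              # full-weighting restriction
        rc[i] = 0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
    ec = v_cycle([0.0] * (nc + 2), rc, 2.0 * h)   # coarse-grid error equation
    for i in range(1, nc + 1):              # linear interpolation of the correction
        u[2 * i] += ec[i]
    for i in range(0, nc + 1):
        u[2 * i + 1] += 0.5 * (ec[i] + ec[i + 1])
    smooth(u, f, h, 2)                      # post-smoothing
    return u


def max_residual(u, f, h):
    return max(abs(v) for v in residual(u, f, h))


if __name__ == "__main__":
    n = 63
    h = 1.0 / (n + 1)
    f = [math.pi ** 2 * math.sin(math.pi * i * h) for i in range(n + 2)]
    u = [0.0] * (n + 2)
    for cycle in range(10):
        v_cycle(u, f, h)
        print(cycle + 1, max_residual(u, f, h))
```

Each cycle does work proportional to the number of grid points, and the residual reduction per cycle does not depend strongly on the grid size, which is the property the abstract refers to.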
Reconfigurable Hardware Implementation of the Successive Overrelaxation Method
In this chapter, we study the feasibility of implementing SOR in
reconfigurable hardware. We use Handel-C, a higher level design tool, to code
our design, which is analyzed, synthesized, and placed and routed using the
FPGA vendors' proprietary software (DK Design Suite, Xilinx ISE 8.1i, and Quartus II
5.1). We target Virtex II Pro, Altera Stratix, and Spartan3L, which is embedded
in the RC10 FPGA-based system from Celoxica. We report our timing results when
targeting Virtex II Pro and compare them to results from a software version written in
C++ and running on a general-purpose processor (GPP).
Comment: 15 pages, 5 figures, 4 tables. arXiv admin note: substantial text
overlap with arXiv:1904.0062
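For readers unfamiliar with SOR, the following is a minimal software sketch of the method, not the chapter's hardware design. It assumes a 2D Poisson model problem on the unit square with zero boundary values and a 5-point stencil; the function names and the test problem are illustrative choices.

```python
import math


def sor_poisson(n, omega, tol=1e-8, max_iter=10_000):
    """Solve -Laplace(u) = f on the unit square with u = 0 on the boundary.

    n is the number of interior grid points per side. The right-hand side is
    chosen so that the exact continuous solution is sin(pi x) * sin(pi y).
    Returns the grid (including boundary rows and columns) and the sweep count.
    """
    h = 1.0 / (n + 1)
    f = [[2.0 * math.pi ** 2 * math.sin(math.pi * i * h) * math.sin(math.pi * j * h)
          for j in range(n + 2)] for i in range(n + 2)]
    u = [[0.0] * (n + 2) for _ in range(n + 2)]
    for sweep in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                # Gauss-Seidel value from the 5-point stencil
                gs = 0.25 * (u[i - 1][j] + u[i + 1][j]
                             + u[i][j - 1] + u[i][j + 1] + h * h * f[i][j])
                # Over-relax: move past the Gauss-Seidel value by factor omega
                new = (1.0 - omega) * u[i][j] + omega * gs
                max_change = max(max_change, abs(new - u[i][j]))
                u[i][j] = new
        if max_change < tol:
            return u, sweep
    return u, max_iter


def optimal_omega(n):
    """Classical optimal relaxation factor for this model problem."""
    return 2.0 / (1.0 + math.sin(math.pi / (n + 1)))


if __name__ == "__main__":
    _, sweeps_gs = sor_poisson(15, 1.0)                # omega = 1 is Gauss-Seidel
    _, sweeps_sor = sor_poisson(15, optimal_omega(15))
    print("Gauss-Seidel sweeps:", sweeps_gs)
    print("SOR sweeps:", sweeps_sor)
```

Setting `omega` to 1 reduces the update to plain Gauss-Seidel, so the same routine can be used to compare the two methods on the same grid.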
A High-Performance HOG Extractor on FPGA
Pedestrian detection is one of the key problems in the emerging self-driving car
industry, and the HOG algorithm has proven to provide good accuracy for pedestrian
detection. Much research has been done on accelerating the
HOG algorithm on FPGAs because of their low-power and high-throughput
characteristics. In this paper, we present a high-performance HOG architecture
for pedestrian detection on a low-cost FPGA platform. It achieves a maximum
throughput of 526 FPS with 640x480 input images, which is 3.25 times faster
than the state-of-the-art design. The accelerator is integrated with SVM-based
prediction to realize a pedestrian detection system. The power
consumption of the whole system is comparable with the best existing
implementations.
Comment: Presented at HIP3ES, 201
Infrastructure for Usable Machine Learning: The Stanford DAWN Project
Despite incredible recent advances in machine learning, building machine
learning applications remains prohibitively time-consuming and expensive for
all but the best-trained, best-funded engineering organizations. This expense
comes not from a need for new and improved statistical models but instead from
a lack of systems and tools for supporting end-to-end machine learning
application development, from data preparation and labeling to
productionization and monitoring. In this document, we outline opportunities
for infrastructure supporting usable, end-to-end machine learning applications
in the context of the nascent DAWN (Data Analytics for What's Next) project at
Stanford
A Survey of Methods For Analyzing and Improving GPU Energy Efficiency
Recent years have witnessed a phenomenal growth in the computational
capabilities and applications of GPUs. However, this trend has also led to
a dramatic increase in their power consumption. This paper surveys research works
on analyzing and improving energy efficiency of GPUs. It also provides a
classification of these techniques on the basis of their main research idea.
Further, it attempts to synthesize research works which compare energy
efficiency of GPUs with other computing systems, e.g. FPGAs and CPUs. The aim
of this survey is to provide researchers with knowledge of state-of-the-art in
GPU power management and motivate them to architect highly energy-efficient
GPUs of tomorrow.
Comment: Accepted with minor revision in ACM Computing Survey Journal (impact
factor 3.85, five year impact of 7.85)
Software-defined Radios: Architecture, State-of-the-art, and Challenges
Software-defined Radio (SDR) is a programmable transceiver with the
capability of operating various wireless communication protocols without the
need to change or update the hardware. Progress in the SDR field has led to the
escalation of protocol development and a wide spectrum of applications, with
more emphasis on programmability, flexibility, portability, and energy
efficiency, in cellular, WiFi, and M2M communication. Consequently, SDR has
earned a lot of attention and is of great significance to both academia and
industry. SDR designers intend to simplify the realization of communication
protocols while enabling researchers to experiment with prototypes on deployed
networks. This paper is a survey of the state-of-the-art SDR platforms in the
context of wireless communication protocols. We offer an overview of SDR
architecture and its basic components, then discuss the significant design
trends and development tools. In addition, we highlight key contrasts between
SDR architectures with regard to energy, computing power, and area, based on a
set of metrics. We also review existing SDR platforms and present an analytical
comparison as a guide to developers. Finally, we recognize a few of the related
research topics and summarize potential solutions
A Hardware Friendly Unsupervised Memristive Neural Network with Weight Sharing Mechanism
Memristive neural networks (MNNs), which use memristors as neurons or
synapses, have become a hot research topic recently. However, most memristors
are not compatible with mainstream integrated circuit technology and their
stability at large scale is not yet satisfactory. In this paper, a hardware
friendly MNN circuit is introduced, in which the memristive characteristics are
implemented by a digital integrated circuit. Through this method, spike timing
dependent plasticity (STDP) and unsupervised learning are realized. A weight
sharing mechanism is proposed to bridge the gap between network scale and hardware
resources. Experimental results show that it significantly reduces hardware resource
usage while maintaining good recognition accuracy and high speed. Moreover,
resource usage grows more slowly than the network scale,
which suggests our method's potential for realizing large-scale neuromorphic
networks.
Comment: 10 pages, 11 figures
High-level Synthesis
Hardware synthesis is a general term used to refer to the processes involved
in automatically generating a hardware design from its specification.
High-level synthesis (HLS) could be defined as the translation from a
behavioral description of the intended hardware circuit into a structural
description, similar to the compilation of programming languages (such as C and
Pascal) into assembly language. The chained synthesis tasks at each level of the
design process include system synthesis, register-transfer synthesis, logic
synthesis, and circuit synthesis. The development of hardware solutions for
complex applications is no longer a complicated task with the emergence of
various HLS tools. Many areas of application have benefited from the modern
advances in hardware design, such as automotive and aerospace industries,
computer graphics, signal and image processing, security, complex simulations
like molecular modeling, and DNA matching. The field of HLS is continuing its
rapid growth to facilitate the creation of hardware and to blur more and more
the border separating the processes of designing hardware and software.
Comment: 19 Pages, 16 Figures. arXiv admin note: text overlap with
arXiv:1905.02075, arXiv:1905.0207
Recent Advances in Physical Reservoir Computing: A Review
Reservoir computing is a computational framework suited for
temporal/sequential data processing. It is derived from several recurrent
neural network models, including echo state networks and liquid state machines.
A reservoir computing system consists of a reservoir for mapping inputs into a
high-dimensional space and a readout for pattern analysis from the
high-dimensional states in the reservoir. The reservoir is fixed and only the
readout is trained with a simple method such as linear regression and
classification. Thus, the major advantage of reservoir computing compared to
other recurrent neural networks is fast learning, resulting in low training
cost. Another advantage is that the reservoir without adaptive updating is
amenable to hardware implementation using a variety of physical systems,
substrates, and devices. In fact, such physical reservoir computing has
attracted increasing attention in diverse fields of research. The purpose of
this review is to provide an overview of recent advances in physical reservoir
computing by classifying them according to the type of the reservoir. We
discuss the current issues and perspectives related to physical reservoir
computing, in order to further expand its practical applications and develop
next-generation machine learning systems.
Comment: 62 pages, 13 figures
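The split between a fixed reservoir and a trained linear readout described in this abstract can be sketched in software as follows. This is an illustrative echo-state-style toy, not a model of any physical reservoir from the review: the network size, the weight scaling, the ridge parameter, and the one-step memory task are all assumptions, and every function name is hypothetical.

```python
import math
import random


def run_reservoir(inputs, n_nodes=10, seed=0):
    """Drive a fixed random tanh reservoir and return the state at each step."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_nodes)]
    # Each row's absolute sum is at most 0.9, so the state update is a
    # contraction (a sufficient condition for fading memory with tanh nodes).
    w = [[rng.uniform(-1.0, 1.0) * 0.9 / n_nodes for _ in range(n_nodes)]
         for _ in range(n_nodes)]
    x = [0.0] * n_nodes
    states = []
    for u in inputs:
        x = [math.tanh(sum(w[i][j] * x[j] for j in range(n_nodes)) + w_in[i] * u)
             for i in range(n_nodes)]
        states.append(x + [1.0])  # the constant 1.0 acts as a bias feature
    return states


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def train_readout(states, targets, ridge=1e-4):
    """Ridge regression: only these readout weights are trained."""
    d = len(states[0])
    gram = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(s[i] * t for s, t in zip(states, targets)) for i in range(d)]
    return solve(gram, rhs)


def predict(states, weights):
    return [sum(w * s for w, s in zip(weights, row)) for row in states]


if __name__ == "__main__":
    rng = random.Random(1)
    u = [rng.uniform(-1.0, 1.0) for _ in range(300)]
    states = run_reservoir(u)
    # Memory task: recover the previous input u(t-1) from the current state x(t).
    x_train, y_train = states[1:], u[:-1]
    weights = train_readout(x_train, y_train)
    pred = predict(x_train, weights)
    mse = sum((p - t) ** 2 for p, t in zip(pred, y_train)) / len(y_train)
    print("training MSE:", mse)
```

The reservoir weights are drawn once and never updated; training reduces to one linear solve, which is the low training cost the abstract highlights.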