Reconfigurable Hardware Accelerators: Opportunities, Trends, and Challenges
With the emergence of big data applications in machine learning, speech
recognition, artificial intelligence, and DNA sequencing in recent years, the
computer architecture research community is facing explosive growth in the
scale of data. To achieve high efficiency in data-intensive computing,
studies of heterogeneous accelerators that target the latest applications have
become a hot topic in the computer architecture domain. At present, the
implementation of heterogeneous accelerators mainly relies on heterogeneous
computing units such as the Application-Specific Integrated Circuit (ASIC),
Graphics Processing Unit (GPU), and Field Programmable Gate Array (FPGA). Among
these typical heterogeneous architectures, FPGA-based reconfigurable
accelerators have two merits. First, the FPGA architecture contains a
large number of reconfigurable circuits, which satisfy the requirements of high
performance and low power consumption when specific applications are running.
Second, FPGA-based reconfigurable architectures allow prototype systems to be
built rapidly and feature excellent customizability and reconfigurability.
A batch of acceleration works based on FPGAs or other reconfigurable
architectures has recently emerged at top-tier computer architecture
conferences. To better review recent related work on reconfigurable computing
accelerators, this survey takes the latest high-level research on
reconfigurable accelerator architectures and algorithm applications as its
basis. We compare hot research issues and domains of concern, and analyze and
illustrate the advantages, disadvantages, and challenges of reconfigurable
accelerators. Finally, we discuss the prospective development trends of
accelerator architectures, hoping to provide a reference for computer
architecture researchers
An Integrated Design and Verification Methodology for Reconfigurable Multimedia Systems
Recently, many multimedia applications have been emerging on portable
appliances. They require both the flexibility of upgradeable devices
(traditionally software based) and a powerful computing engine (typically
hardware). In this context, programmable HW and dynamic reconfiguration allow
novel approaches to the migration of algorithms from SW to HW. Thus, in the
frame of the Symbad project, we propose an industrial design flow for
reconfigurable SoCs. The goal of Symbad is to develop a system-level
design platform for hardware and software SoC systems, including formal and
semi-formal verification techniques.
Comment: Submitted on behalf of EDAA (http://www.edaa.com/
End-to-End Design for Self-Reconfigurable Heterogeneous Robotic Swarms
More widespread adoption requires swarms of robots to be more flexible for
real-world applications. Multiple challenges remain in complex scenarios where
a large amount of data needs to be processed in real time and high degrees of
situational awareness are required. The options in this direction are limited
in existing robotic swarms, mostly homogeneous robots with limited operational
and reconfiguration flexibility. We address this by bringing elastic computing
techniques and dynamic resource management from the edge-cloud computing domain
to the swarm robotics domain. This enables the dynamic provisioning of
collective capabilities in the swarm for different applications. Therefore, we
transform a swarm into a distributed sensing and computing platform capable of
complex data processing tasks, which can then be offered as a service. In
particular, we discuss how this can be applied to adaptive resource management
in a heterogeneous swarm of drones, and how we are implementing the dynamic
deployment of distributed data processing algorithms. With an elastic drone
swarm built on reconfigurable hardware and containerized services, it will be
possible to raise the self-awareness, degree of intelligence, and level of
autonomy of heterogeneous swarms of robots. We describe novel directions for
collaborative perception, and new ways of interacting with a robotic swarm
Collective Tuning Initiative
Computing systems rarely deliver the best possible performance due to
ever-increasing hardware and software complexity and limitations of current
optimization technology. Additional code and architecture optimizations are
often required to improve execution time, size, power consumption, reliability
and other important characteristics of computing systems. However, this is often
a tedious, repetitive, isolated and time-consuming process. In order to
automate, simplify and systematize program optimization and architecture
design, we are developing an open-source, modular, plugin-based Collective
Tuning Infrastructure (CTI, http://cTuning.org) that can distribute the
optimization process and leverage the optimization experience of multiple
users. CTI provides a novel, fully integrated, collaborative, "one button"
approach to improve existing underperforming computing systems ranging from
embedded architectures
to high-performance servers based on systematic iterative compilation,
statistical collective optimization and machine learning. Our experimental
results show that it is possible to reduce execution time (and code size) of
some programs from SPEC2006 and EEMBC among others by more than a factor of 2
automatically. It can also reduce development and testing time considerably.
Together with the first production-quality, machine-learning-enabled interactive
research compiler (MILEPOST GCC), this infrastructure opens up many research
opportunities to study and develop future realistic self-tuning and
self-organizing adaptive intelligent computing systems based on systematic
statistical performance evaluation and benchmarking. Finally, using a common
optimization repository is intended to improve the quality and reproducibility
of the research on architecture and code optimization.
Comment: GCC Developers' Summit'09, 14 June 2009, Montreal, Canad
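A minimal sketch of the iterative compilation idea mentioned in the abstract above: try many combinations of compiler flags and keep the one with the lowest measured cost. This is not CTI's actual interface. The function names, the random-search strategy, and the toy cost model are all assumptions for illustration; in practice `measure` would compile and time a real program.

```python
import random

# Real GCC flag names, but this particular selection is only illustrative.
FLAGS = ["-funroll-loops", "-ftree-vectorize",
         "-finline-functions", "-fomit-frame-pointer"]


def iterative_search(measure, flags=FLAGS, trials=20, seed=0):
    """Randomly sample flag combinations and keep the cheapest one seen.

    `measure` is a caller-supplied function mapping a tuple of flags to a
    cost (for example, execution time). The empty combination is the baseline.
    """
    rng = random.Random(seed)
    best_combo, best_cost = (), measure(())
    for _ in range(trials):
        # Include each flag independently with probability 0.5.
        combo = tuple(f for f in flags if rng.random() < 0.5)
        cost = measure(combo)
        if cost < best_cost:
            best_combo, best_cost = combo, cost
    return best_combo, best_cost


def toy_measure(combo):
    """Hypothetical stand-in for 'compile with these flags and time the run'."""
    cost = 10.0
    if "-funroll-loops" in combo:
        cost -= 2.0
    if "-ftree-vectorize" in combo:
        cost -= 3.0
    if "-finline-functions" in combo:
        cost += 1.0
    return cost


best_combo, best_cost = iterative_search(toy_measure)
```

Because the baseline is measured first, the returned cost is never worse than compiling with no extra flags. A shared repository of such measurements, as the abstract proposes, would let later searches start from combinations that worked for other users.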
High Level Hardware/Software Embedded System Design with Redsharc
As tools for designing multiple processor systems-on-chips (MPSoCs) continue
to evolve to meet the demands of developers, there exist systematic gaps that
must be bridged to provide a more cohesive hardware/software development
environment. We present Redsharc to address these problems and enable: system
generation, software/hardware compilation and synthesis, run-time control and
execution of MPSoCs. The efforts presented in this paper extend our previous
work to provide a rich API, build infrastructure, and runtime enabling
developers to design a system of simultaneously executing kernels in software
or hardware that communicate seamlessly. In this work we take Redsharc further
to support a broader class of applications across a larger number of devices
requiring a more unified system development environment and build
infrastructure. To accomplish this, we leverage existing tools and extend
Redsharc with build and control infrastructure to relieve the burden of system
development, allowing software programmers to focus their efforts on application
and kernel development.
Comment: Presented at First International Workshop on FPGAs for Software
Programmers (FSP 2014) (arXiv:1408.4423
High-level Synthesis
Hardware synthesis is a general term used to refer to the processes involved
in automatically generating a hardware design from its specification.
High-level synthesis (HLS) could be defined as the translation from a
behavioral description of the intended hardware circuit into a structural
description, similar to the compilation of programming languages (such as C and
Pascal) into assembly language. The chained synthesis tasks at each level of the
design process include system synthesis, register-transfer synthesis, logic
synthesis, and circuit synthesis. The development of hardware solutions for
complex applications is no longer a complicated task with the emergence of
various HLS tools. Many areas of application have benefited from the modern
advances in hardware design, such as automotive and aerospace industries,
computer graphics, signal and image processing, security, complex simulations
like molecular modeling, and DNA matching. The field of HLS is continuing its
rapid growth to facilitate the creation of hardware and to blur more and more
the border separating the processes of designing hardware and software.
Comment: 19 Pages, 16 Figures. arXiv admin note: text overlap with
arXiv:1905.02075, arXiv:1905.0207
High Performance Reconfigurable Computing Systems
The rapid progress and advancement in electronic chip technology provides a
variety of new implementation options for system engineers. The choice varies
between flexible programs running on a general-purpose processor (GPP) and
a fixed hardware implementation using an application-specific integrated
circuit (ASIC). Many other implementation options exist, for instance, a
system with a RISC processor and a DSP core. Other options include graphics
processors and microcontrollers. Specialist processors certainly improve
performance over general-purpose ones, but this comes at the cost of
flexibility. Combining the flexibility of GPPs and the high performance of
ASICs leads to the introduction of reconfigurable computing (RC) as a new
implementation option with a balance between versatility and speed. The focus
of this chapter is on introducing reconfigurable computers as modern
supercomputing architectures. The chapter also investigates the main reasons behind
the current advancement in the development of RC-systems. Furthermore, a
technical survey of various RC-systems is included laying common grounds for
comparisons. In addition, this chapter mainly presents case studies implemented
under the MorphoSys RC-system. The selected case studies belong to different
areas of application, such as computer graphics and information coding.
Parallel versions of the studied algorithms are developed to match the
topologies supported by the MorphoSys. Performance evaluation and results
analyses are included for implementations with different characteristics.
Comment: 53 pages, 14 tables, 15 figure
Criteria and Approaches for Virtualization on Modern FPGAs
Modern field programmable gate arrays (FPGAs) can produce high performance in
a wide range of applications, and their computational capacity is becoming
abundant in personal computers. Nevertheless, FPGA virtualization is still
an emerging research field. Challenges in this research area come not only
from technical difficulties but also from the ambiguous standards of
virtualization. In this paper, we introduce novel criteria for FPGA
virtualization and discuss several approaches to meeting those criteria. In
addition, we present and describe in detail the specific FPGA virtualization
architecture that we developed on an Intel Arria 10 FPGA. We evaluate our
solution with a combination of applications and microbenchmarks. The results
show that our virtualization solution can provide a full abstraction of the
FPGA device from both user and developer perspectives while maintaining
reasonable performance compared to native FPGA
Self-Partial and Dynamic Reconfiguration Implementation for AES using FPGA
This paper addresses efficient hardware/software implementation approaches for the AES (Advanced Encryption Standard) algorithm and describes the design and performance testing of the algorithm for embedded systems. With the spread of reconfigurable hardware such as FPGAs (Field Programmable Gate Arrays), embedded cryptographic hardware has become cost-effective. Nevertheless, it is worth noting that nowadays even hardwired cryptographic algorithms are not entirely safe. In addition, a self-reconfiguring platform is reported that enables an FPGA to dynamically reconfigure itself under the control of an embedded microprocessor. Hardware acceleration significantly increases the performance of embedded systems built on programmable logic. Allowing an FPGA-based MicroBlaze processor to self-select the coprocessors it uses can help reduce area requirements and increase a system's versatility. The architecture proposed in this paper is an optimal hardware implementation of the algorithm that uses dynamic partial reconfiguration of the FPGA. This implementation is a good solution for preserving the confidentiality and accessibility of information in digital communication
Performance Improvement by Changing Modulation Methods for Software Defined Radios
This paper describes automatic switching of the modulation method to
reconfigure the transceivers of a Software Defined Radio (SDR) based wireless
communication system. The programmable architecture of Software Radio promotes
a flexible implementation of modulation methods. This flexibility also
translates into adaptivity, which is used here to optimize the throughput of a
wireless network operating under varying channel conditions. It is robust and
efficient, with a processing time overhead that still allows the SDR to maintain
its real-time operating objectives. This technique is studied for digital
wireless communication systems. Tests and simulations using an AWGN channel
show that the SNR threshold is 5dB for the case study.
Comment: IJACS
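The threshold-based switching described in the abstract above can be sketched as follows. Only the 5 dB threshold comes from the abstract. The two scheme names (BPSK below the threshold, QPSK at or above it) and both function names are assumptions for illustration, since the abstract does not say which modulation methods are switched.

```python
SNR_THRESHOLD_DB = 5.0  # threshold reported for the case study


def select_modulation(snr_db, threshold_db=SNR_THRESHOLD_DB):
    """Pick a modulation scheme from the estimated channel SNR.

    Assumed policy: the more robust scheme (BPSK) below the threshold,
    the higher-throughput scheme (QPSK) at or above it.
    """
    return "QPSK" if snr_db >= threshold_db else "BPSK"


def switch_trace(snr_trace_db):
    """Return (sample index, scheme) for every point where the scheme changes."""
    current = None
    events = []
    for i, snr_db in enumerate(snr_trace_db):
        chosen = select_modulation(snr_db)
        if chosen != current:
            events.append((i, chosen))
            current = chosen
    return events


events = switch_trace([2.0, 4.9, 5.0, 7.5, 3.0])
# events == [(0, "BPSK"), (2, "QPSK"), (4, "BPSK")]
```

A practical transceiver would likely add hysteresis around the threshold so that an SNR hovering near 5 dB does not cause rapid back-and-forth reconfiguration; that refinement is omitted here.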