Search CORE

4,945 research outputs found

Reconfigurable Hardware Accelerators: Opportunities, Trends, and Challenges

Author: Gong Lei
Hu Yahui
Jin Lihui
Li Xi
Lou Wenqi
Tan Luchao
Wang Chao
Zhou Xuehai
Publication venue
Publication date: 13/12/2017
Field of study

With the emerging big data applications of Machine Learning, Speech Recognition, Artificial Intelligence, and DNA Sequencing in recent years, computer architecture research communities are facing the explosive scale of various data explosion. To achieve high efficiency of data-intensive computing, studies of heterogeneous accelerators which focus on latest applications, have become a hot issue in computer architecture domain. At present, the implementation of heterogeneous accelerators mainly relies on heterogeneous computing units such as Application-specific Integrated Circuit (ASIC), Graphics Processing Unit (GPU), and Field Programmable Gate Array (FPGA). Among the typical heterogeneous architectures above, FPGA-based reconfigurable accelerators have two merits as follows: First, FPGA architecture contains a large number of reconfigurable circuits, which satisfy requirements of high performance and low power consumption when specific applications are running. Second, the reconfigurable architectures of employing FPGA performs prototype systems rapidly and features excellent customizability and reconfigurability. Nowadays, in top-tier conferences of computer architecture, emerging a batch of accelerating works based on FPGA or other reconfigurable architectures. To better review the related work of reconfigurable computing accelerators recently, this survey reserves latest high-level research products of reconfigurable accelerator architectures and algorithm applications as the basis. In this survey, we compare hot research issues and concern domains, furthermore, analyze and illuminate advantages, disadvantages, and challenges of reconfigurable accelerators. In the end, we prospect the development tendency of accelerator architectures in the future, hoping to provide a reference for computer architecture researchers

arXiv.org e-Print Archive

An Integrated Design and Verification Methodology for Reconfigurable Multimedia Systems

Author: Borgatti M.
Capello A.
Fummi F.
Lambert J. -L.
Moussa I.
Pravadelli G.
Rossi U.
Publication venue
Publication date: 25/10/2007
Field of study

Recently a lot of multimedia applications are emerging on portable appliances. They require both the flexibility of upgradeable devices (traditionally software based) and a powerful computing engine (typically hardware). In this context, programmable HW and dynamic reconfiguration allow novel approaches to the migration of algorithms from SW to HW. Thus, in the frame of the Symbad project, we propose an industrial design flow for reconfigurable SoC's. The goal of Symbad consists of developing a system level design platform for hardware and software SoC systems including formal and semi-formal verification techniques.Comment: Submitted on behalf of EDAA (http://www.edaa.com/

arXiv.org e-Print Archive

End-to-End Design for Self-Reconfigurable Heterogeneous Robotic Swarms

Author: Gia Tuan Nguyen
Qingqing Li
Queralta Jorge Peña
Truong Hong-Linh
Westerlund Tomi
Publication venue
Publication date: 29/04/2020
Field of study

More widespread adoption requires swarms of robots to be more flexible for real-world applications. Multiple challenges remain in complex scenarios where a large amount of data needs to be processed in real-time and high degrees of situational awareness are required. The options in this direction are limited in existing robotic swarms, mostly homogeneous robots with limited operational and reconfiguration flexibility. We address this by bringing elastic computing techniques and dynamic resource management from the edge-cloud computing domain to the swarm robotics domain. This enables the dynamic provisioning of collective capabilities in the swarm for different applications. Therefore, we transform a swarm into a distributed sensing and computing platform capable of complex data processing tasks, which can then be offered as a service. In particular, we discuss how this can be applied to adaptive resource management in a heterogeneous swarm of drones, and how we are implementing the dynamic deployment of distributed data processing algorithms. With an elastic drone swarm built on reconfigurable hardware and containerized services, it will be possible to raise the self-awareness, degree of intelligence, and level of autonomy of heterogeneous swarms of robots. We describe novel directions for collaborative perception, and new ways of interacting with a robotic swarm

arXiv.org e-Print Archive

Collective Tuning Initiative

Author: Fursin Grigori
Publication venue
Publication date: 13/07/2014
Field of study

Computing systems rarely deliver best possible performance due to ever increasing hardware and software complexity and limitations of the current optimization technology. Additional code and architecture optimizations are often required to improve execution time, size, power consumption, reliability and other important characteristics of computing systems. However, it is often a tedious, repetitive, isolated and time consuming process. In order to automate, simplify and systematize program optimization and architecture design, we are developing open-source modular plugin-based Collective Tuning Infrastructure (CTI, http://cTuning.org) that can distribute optimization process and leverage optimization experience of multiple users. CTI provides a novel fully integrated, collaborative, "one button" approach to improve existing underperfoming computing systems ranging from embedded architectures to high-performance servers based on systematic iterative compilation, statistical collective optimization and machine learning. Our experimental results show that it is possible to reduce execution time (and code size) of some programs from SPEC2006 and EEMBC among others by more than a factor of 2 automatically. It can also reduce development and testing time considerably. Together with the first production quality machine learning enabled interactive research compiler (MILEPOST GCC) this infrastructure opens up many research opportunities to study and develop future realistic self-tuning and self-organizing adaptive intelligent computing systems based on systematic statistical performance evaluation and benchmarking. Finally, using common optimization repository is intended to improve the quality and reproducibility of the research on architecture and code optimization.Comment: GCC Developers' Summit'09, 14 June 2009, Montreal, Canad

arXiv.org e-Print Archive

High Level Hardware/Software Embedded System Design with Redsharc

Author: French Matthew
Schmidt Andrew G.
Skalicky Sam
Publication venue
Publication date: 20/08/2014
Field of study

As tools for designing multiple processor systems-on-chips (MPSoCs) continue to evolve to meet the demands of developers, there exist systematic gaps that must be bridged to provide a more cohesive hardware/software development environment. We present Redsharc to address these problems and enable: system generation, software/hardware compilation and synthesis, run-time control and execution of MPSoCs. The efforts presented in this paper extend our previous work to provide a rich API, build infrastructure, and runtime enabling developers to design a system of simultaneously executing kernels in software or hardware, that communicate seamlessly. In this work we take Redsharc further to support a broader class of applications across a larger number of devices requiring a more unified system development environment and build infrastructure. To accomplish this we leverage existing tools and extend Redsharc with build and control infrastructure to relieve the burden of system development allowing software programmers to focus their efforts on application and kernel development.Comment: Presented at First International Workshop on FPGAs for Software Programmers (FSP 2014) (arXiv:1408.4423

arXiv.org e-Print Archive

High-level Synthesis

Author: Damaj Issam
Publication venue: 'Wiley'
Publication date: 03/05/2019
Field of study

Hardware synthesis is a general term used to refer to the processes involved in automatically generating a hardware design from its specification. High-level synthesis (HLS) could be defined as the translation from a behavioral description of the intended hardware circuit into a structural description similar to the compilation of programming languages (such as C and Pascal into assembly language. The chained synthesis tasks at each level of the design process include system synthesis, register-transfer synthesis, logic synthesis, and circuit synthesis. The development of hardware solutions for complex applications is no more a complicated task with the emergence of various HLS tools. Many areas of application have benefited from the modern advances in hardware design, such as automotive and aerospace industries, computer graphics, signal and image processing, security, complex simulations like molecular modeling, and DND matching. The field of HLS is continuing its rapid growth to facilitate the creation of hardware and to blur more and more the border separating the processes of designing hardware and software.Comment: 19 Pages, 16 Figures. arXiv admin note: text overlap with arXiv:1905.02075, arXiv:1905.0207

arXiv.org e-Print Archive

High Performance Reconfigurable Computing Systems

Author: Damaj Issam
Publication venue
Publication date: 09/04/2019
Field of study

The rapid progress and advancement in electronic chips technology provide a variety of new implementation options for system engineers. The choice varies between the flexible programs running on a general-purpose processor (GPP) and the fixed hardware implementation using an application specific integrated circuit (ASIC). Many other implementation options present, for instance, a system with a RISC processor and a DSP core. Other options include graphics processors and microcontrollers. Specialist processors certainly improve performance over general-purpose ones, but this comes as a quid pro quo for flexibility. Combining the flexibility of GPPs and the high performance of ASICs leads to the introduction of reconfigurable computing (RC) as a new implementation option with a balance between versatility and speed. The focus of this chapter is on introducing reconfigurable computers as modern super computing architectures. The chapter also investigates the main reasons behind the current advancement in the development of RC-systems. Furthermore, a technical survey of various RC-systems is included laying common grounds for comparisons. In addition, this chapter mainly presents case studies implemented under the MorphoSys RC-system. The selected case studies belong to different areas of application, such as, computer graphics and information coding. Parallel versions of the studied algorithms are developed to match the topologies supported by the MorphoSys. Performance evaluation and results analyses are included for implementations with different characteristics.Comment: 53 pages, 14 tables, 15 figure

arXiv.org e-Print Archive

Criteria and Approaches for Virtualization on Modern FPGAs

Author: Le Duc-Canh
Youn Chan-Hyun
Publication venue
Publication date: 08/04/2019
Field of study

Modern field programmable gate arrays (FPGAs) can produce high performance in a wide range of applications, and their computational capacity is becoming abundant in personal computers. Regardless of this fact, FPGA virtualization is an emerging research field. Nowadays, challenges of the research area come from not only technical difficulties but also from the ambiguous standards of virtualization. In this paper, we introduce novel criteria of FPGA virtualization and discuss several approaches to accomplish those criteria. In addition, we present and describe in detail the specific FPGA virtualization architecture that we developed on Intel Arria 10 FPGA. We evaluate our solution with a combination of applications and microbenchmarks. The result shows that our virtualization solution can provide a full abstraction of FPGA device in both user and developer perspective while maintaining a reasonable performance compared to native FPGA

arXiv.org e-Print Archive

Self-Partial and Dynamic Reconfiguration Implementation for AES using FPGA

Author: Alaoui Ismaili Zine El Abidine
Moussa Ahmed
Publication venue: International Journal of Computer Science Issues, IJCSI
Publication date: 01/08/2009
Field of study

This paper addresses efficient hardware/software implementation approaches for the AES (Advanced Encryption Standard) algorithm and describes the design and performance testing algorithm for embedded system. Also, with the spread of reconfigurable hardware such as FPGAs (Field Programmable Gate Array) embedded cryptographic hardware became cost-effective. Nevertheless, it is worthy to note that nowadays, even hardwired cryptographic algorithms are not so safe. From another side, the self-reconfiguring platform is reported that enables an FPGA to dynamically reconfigure itself under the control of an embedded microprocessor. Hardware acceleration significantly increases the performance of embedded systems built on programmable logic. Allowing a FPGA-based MicroBlaze processor to self-select the coprocessors uses can help reduce area requirements and increase a system's versatility. The architecture proposed in this paper is an optimal hardware implementation algorithm and takes dynamic partially reconfigurable of FPGA. This implementation is good solution to preserve confidentiality and accessibility to the information in the numeric communication

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive

Performance Improvement by Changing Modulation Methods for Software Defined Radios

Author: Aldar Dilip S.
Godbole Bhalchandra B.
Publication venue
Publication date: 01/12/2012
Field of study

This paper describes an automatic switching of modulation method to reconfigure transceivers of Software Defined Radio (SDR) based wireless communication system. The programmable architecture of Software Radio promotes a flexible implementation of modulation methods. This flexibility also translates into adaptively, which is used here to optimize the throughput of a wireless network, operating under varying channel conditions. It is robust and efficient with processing time overhead that still allows the SDR to maintain its real-time operating objectives. This technique is studied for digital wireless communication systems. Tests and simulations using an AWGN channel show that the SNR threshold is 5dB for the case study.Comment: IJACS

arXiv.org e-Print Archive