2,707 research outputs found

FASTCUDA: Open Source FPGA Accelerator & Hardware-Software Codesign Toolset for CUDA Kernels

Author: de la Torre E.
Lavagno L.
Lazarescu M.
Mavroidis I.
Papaefstathiou I.
Papaefstathiou Ioannis (http://users.isc.tuc.gr/~ipapaefstathiou)
Schafer F.
Παπαευσταθιου Ιωαννης (http://users.isc.tuc.gr/~ipapaefstathiou)
Publication venue: IEEE / Institute of Electrical and Electronics Engineers Incorporated
Publication date: 01/01/2012

Using FPGAs as hardware accelerators that communicate with a central CPU is becoming a common practice in the embedded design world, but there is no standard methodology and toolset to facilitate this path yet. On the other hand, languages such as CUDA and OpenCL provide standard development environments for Graphics Processing Unit (GPU) programming. FASTCUDA is a platform that provides the necessary software toolset, hardware architecture, and design methodology to efficiently adapt the CUDA approach into a new FPGA design flow. With FASTCUDA, the CUDA kernels of a CUDA-based application are partitioned into two groups with minimal user intervention: those that are compiled and executed in parallel software, and those that are synthesized and implemented in hardware. A modern low-power FPGA can provide the processing power (via numerous embedded micro-CPUs) and the logic capacity for both the software and hardware implementations of the CUDA kernels. This paper describes the system requirements and the architectural decisions behind the FASTCUDA approach.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Institutional Repository of the Technical University of Crete

Reconfigurable Architectures: From Physical Implementation to Dynamic Behavior Modelling

Author: Wu Kehuai
Publication date: 01/01/2008

Online Research Database In Technology

Transparent Dynamic Reconfiguration for CORBA

Author: Almeida João Paulo A.
Nieuwenhuis Lambert
Sinderen Marten van
Wegdam Maarten
Publication venue: IEEE
Publication date: 01/01/2001

Distributed systems with high availability requirements have to support some form of dynamic reconfiguration. This means that they must provide the ability to be maintained or upgraded without being taken off-line. Building a distributed system that allows dynamic reconfiguration is very intrusive to the overall design of the system, and generally requires special skills from both the client- and server-side application developers. There is an opportunity to provide support for dynamic reconfiguration at the object middleware level of distributed systems, and create a dynamic reconfiguration transparency to application developers. We propose a Dynamic Reconfiguration Service for CORBA that allows the reconfiguration of a running system with maximum transparency for both client- and server-side developers. We describe the architecture, a prototype implementation, and some preliminary test results.

University of Twente Research Information

High-Rate Space Coding for Reconfigurable 2x2 Millimeter-Wave MIMO Systems

Author: Hua Yingbo
Jafarkhani Hamid
Mehrpouyan Hani
Vakilian Vida
Publication date: 24/05/2015

Millimeter-wave links are of a line-of-sight nature. Hence, multiple-input multiple-output (MIMO) systems operating in the millimeter-wave band may not achieve full spatial diversity or multiplexing. In this paper, we utilize reconfigurable antennas and the high antenna directivity in the millimeter-wave band to propose a rate-two space coding design for 2x2 MIMO systems. The proposed scheme can be decoded with a low-complexity maximum-likelihood detector at the receiver, and yet it can enhance the bit-error-rate performance of millimeter-wave systems compared to traditional spatial multiplexing schemes, such as the Vertical Bell Laboratories Layered Space-Time Architecture (VBLAST). Using numerical simulations, we demonstrate the efficiency of the proposed code and show its superiority compared to existing rate-two space-time block codes.

arXiv.org e-Print Archive

eScholarship - University of California

Cycle-accurate evaluation of reconfigurable photonic networks-on-chip

Author: Artundo Iñigo
Debaes Christof
Heirman Wim
Thienpont Hugo
Van Campenhout Jan
Publication venue: SPIE-Intl Soc Optical Eng
Publication date: 01/01/2010

There is little doubt that the most important limiting factors of the performance of next-generation Chip Multiprocessors (CMPs) will be the power efficiency and the available communication speed between cores. Photonic Networks-on-Chip (NoCs) have been suggested as a viable route to relieve the off- and on-chip interconnection bottleneck. Low-loss integrated optical waveguides can transport very high-speed data signals over longer distances than on-chip electrical signaling. In addition, with the development of silicon microrings, photonic switches can be integrated to route signals in a data-transparent way. Although several photonic NoC proposals exist, their use is often limited to the communication of large data messages due to a relatively long set-up time of the photonic channels. In this work, we evaluate a reconfigurable photonic NoC in which the topology is adapted automatically (on a microsecond scale) to the evolving traffic situation by use of silicon microrings. To evaluate this system's performance, the proposed architecture has been implemented in a detailed full-system cycle-accurate simulator which is capable of generating realistic workloads and traffic patterns. In addition, a model was developed to estimate the power consumption of the full interconnection network, which was compared with other photonic and electrical NoC solutions. We find that our proposed network architecture significantly lowers the average memory access latency (35% reduction) while only generating a modest increase in power consumption (20%), compared to a conventional concentrated mesh electrical signaling approach. When comparing our solution to high-speed circuit-switched photonic NoCs, long photonic channel set-up times can be tolerated, which makes our approach directly applicable to current shared-memory CMPs.

Crossref

Ghent University Academic Bibliography