Search CORE

16 research outputs found

BORPH: operating system support on the NetFPGA platform

Author: Hamilton BK
So HKH
Publication venue
Publication date: 01/01/2010
Field of study

This paper introduces the concepts behind BORPH, an operating system for reconfigurable computers. The porting and implementation of this operating system for the NetFPGA platform, as well as the tool flow integration are described.postprintThe 2nd North American NetFPGA Developers Workshop 2010, Stanford, CA., 12-13 August 2010

HKU Scholars Hub

Automated gateware discovery using open firmware

Author: Rajan Shanly
Publication venue: Department of Electrical Engineering
Publication date: 01/01/2013
Field of study

Includes abstract.Includes bibliographical references.This dissertation describes the design and implementation of a mechanism that automates gateware device detection for reconfigurable hardware. The research facilitates the process of identifying and operating on gateware images by extending the existing infrastructure of probing devices in traditional software by using the chosen technology

Cape Town University OpenUCT

ROACH accelerated BLAST

Author: Macleod David
Publication venue: Department of Electrical Engineering
Publication date: 01/01/2011
Field of study

Includes abstract.Includes bibliographical references (p. 115-118).Reconfigurable computing, in recent years, has been taking great strides in becoming part of mainstream computing largely due to the rapid growth in the size of FPGAs and their ability to adapt to certain complex applications efficiently. This dissertation investigates the reuse of application specific hardware developed for radio astronomy in accelerating a popular bioinformatics algorithm

Cape Town University OpenUCT

Accelerating Genomic Sequence Alignment using High Performance Reconfigurable Computers

Author: McMahon Mr Peter
Publication venue
Publication date: 01/01/2008
Field of study

Recongurable computing technology has progressed to a stage where it is now possible to achieve orders of magnitude performance and power eciency gains over conventional computer architectures for a subset of high performance computing applications. In this thesis, we investigate the potential of recongurable computers to accelerate genomic sequence alignment specically for genome sequencing applications. We present a highly optimized implementation of a parallel sequence alignment algorithm for the Berkeley Emulation Engine (BEE2) recongurable computer, allowing a single BEE2 to align simultaneously hundreds of sequences. For each recongurable processor (FPGA), we demonstrate a 61X speedup versus a state-of-the-art implementation on a modern conventional CPU core, and a 56X improvement in performance-per-Watt. We also show that our implementation is highly scalable and we provide performance results from a cluster implementation using 32 FPGAs. We conclude that reconfigurable computers provide an excellent platform on which to run sequence alignment, and that clusters of recongurable computers will be able to cope far more easily with the vast quantities of data produced by new ultra-high-throughput sequencers

CiteSeerX

Cape Town University OpenUCT

UCT Computer Science Research Document Archive

Accelerating Genomic Sequence Alignment using High Performance Reconfigurable Computers

Author: McMahon Peter
Publication venue
Publication date: 01/01/2008
Field of study

Recongurable computing technology has progressed to a stage where it is now possible to achieve orders of magnitude performance and power eciency gains over conventional computer architectures for a subset of high performance computing applications. In this thesis, we investigate the potential of recongurable computers to accelerate genomic sequence alignment specically for genome sequencing applications. We present a highly optimized implementation of a parallel sequence alignment algorithm for the Berkeley Emulation Engine (BEE2) recongurable computer, allowing a single BEE2 to align simultaneously hundreds of sequences. For each recongurable processor (FPGA), we demonstrate a 61X speedup versus a state-of-the-art implementation on a modern conventional CPU core, and a 56X improvement in performance-per-Watt. We also show that our implementation is highly scalable and we provide performance results from a cluster implementation using 32 FPGAs. We conclude that recongurable computers provide an excellent platform on which to run sequence alignment, and that clusters of recongurable computers will be able to cope far more easily with the vast quantities of data produced by new ultra-high-throughput sequencers

CiteSeerX

Cape Town University OpenUCT

UCT Computer Science Research Document Archive

RHINO ARM cluster control management system

Author: Chiriseri Valerie Edith
Publication venue: Department of Electrical Engineering
Publication date: 12/12/2016
Field of study

Cape Town University OpenUCT

HW/SW Codesign for the Xilinx Zynq Platform

Author: Viktorin Jan
Publication venue: Vysoké učení technické v Brně. Fakulta informačních technologií
Publication date: 01/01/2013
Field of study

Tato práce se zabývá možnostmi pro HW/SW codesign na platformě Xilinx Zynq. Na základě studia rozhraní mezi částmi Processing System (ARM Cortex-A9 MPCore) a Programmable Logic (FPGA) je navržen abstraktní a univerzální přístup k vývoji aplikací, které jsou akcelerovány v programovatelném hardwaru na tomto čipu a běží nad operačním systémem Linux. V praktické části je pro tyto účely navržen framework určený pro Zynq, ale také pro jiné obdobné platformy. Žádný takový framework není v současné době k dispozici.This work describes a novel approach of HW/SW codesign on the Xilinx Zynq and similar platforms. It deals with interconnections between the Processing System (ARM Cortex-A9 MPCore) and the Programmable Logic (FPGA) to find an abstract and universal way to develop applications that are partially offloaded into the programmable hardware and that run in the Linux operating system. For that purpose a framework for HW/SW codesign on the Zynq and similar platforms is designed. No such framework is currently available.

Digital library of Brno University of Technology

National Repository of Grey Literature

A New System Architecture for Heterogeneous Compute Units

Author: Asmussen Nils
Publication venue
Publication date: 09/08/2019
Field of study

The ongoing trend to more heterogeneous systems forces us to rethink the design of systems. In this work, I study a new system design that considers heterogeneous compute units (general-purpose cores with different instruction sets, DSPs, FPGAs, fixed-function accelerators, etc.) from the beginning instead of as an afterthought. The goal is to treat all compute units (CUs) as first-class citizens, enabling (1) isolation and secure communication between all types of CUs, (2) a direct interaction of all CUs, removing the conventional CPU from the critical path, and (3) access to operating system (OS) services such as file systems and network stacks for all CUs. To study this system design, I am using a hardware/software co-design based on two key ideas: 1) introduce a new hardware component next to each CU used by the OS as the CUs' common interface and 2) let the OS kernel control applications remotely from a different CU. The hardware component is called data transfer unit (DTU) and offers the minimal set of features to reach the stated goals: secure message passing and memory access. The OS is called M³ and runs its kernel on a dedicated CU and runs the OS services and applications on the remaining CUs. The kernel is responsible for establishing DTU-based communication channels between services and applications. After a channel has been set up, services and applications communicate directly without involving the kernel. This approach allows to support arbitrary CUs as aforementioned first-class citizens, ranging from fixed-function accelerators to complex general-purpose cores

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

Efficient and predictable high-speed storage access for real-time embedded systems

Author: Joyce Russell
Publication venue: University of York
Publication date: 01/09/2018
Field of study

As the speed, size, reliability and power efficiency of non-volatile storage media increases, and the data demands of many application domains grow, operating systems are being put under escalating pressure to provide high-speed access to storage. Traditional models of storage access assume devices to be slow, expecting plenty of slack time in which to process data between requests being serviced, and that all significant variations in timing will be down to the storage device itself. Modern high-speed storage devices break this assumption, causing storage applications to become processor-bound, rather than I/O-bound, in an increasing number of situations. This is especially an issue in real-time embedded systems, where limited processing resources and strict timing and predictability requirements amplify any issues caused by the complexity of the software storage stack. This thesis explores the issues related to accessing high-speed storage from real-time embedded systems, providing a thorough analysis of storage operations based on metrics relevant to the area. From this analysis, a number of alternative storage architectures are proposed and explored, showing that a simpler, more direct path from applications to storage can have a positive impact on efficiency and predictability in such systems

White Rose E-theses Online

Proceedings of the First International Workshop on HyperTransport Research and Applications (WHTRA2009)(revised 08/2009)

Author
Publication venue
Publication date: 01/01/2009
Field of study

Proceedings of the First International Workshop on HyperTransport Research and Applications (WHTRA2009) which was held Feb. 12th 2009 in Mannheim, Germany. The 1st International Workshop for Research on HyperTransport is an international high quality forum for scientists, researches and developers working in the area of HyperTransport. This includes not only developments and research in HyperTransport itself, but also work which is based on or enabled by HyperTransport. HyperTransport (HT) is an interconnection technology which is typically used as system interconnect in modern computer systems, connecting the CPUs among each other and with the I/O bridges. Primarily designed as interconnect between high performance CPUs it provides an extremely low latency, high bandwidth and excellent scalability. The definition of the HTX connector allows the use of HT even for add-in cards. In opposition to other peripheral interconnect technologies like PCI-Express no protocol conversion or intermediate bridging is necessary. HT is a direct connection between device and CPU with minimal latency. Another advantage is the possibility of cache coherent devices. Because of these properties HT is of high interest for high performance I/O like networking and storage, but also for co-processing and acceleration based on ASIC or FPGA technologies. In particular acceleration sees a resurgence of interest today. One reason is the possibility to reduce power consumption by the use of accelerators. In the area of parallel computing the low latency communication allows for fine grain communication schemes and is perfectly suited for scalable systems. Summing up, HT technology offers key advantages and great performance to any research aspect related to or based on interconnects. For more information please consult the workshop website (http://whtra.uni-hd.de)

Heidelberger Dokumentenserver