16 research outputs found

    BORPH: operating system support on the NetFPGA platform

    This paper introduces the concepts behind BORPH, an operating system for reconfigurable computers. The porting and implementation of this operating system for the NetFPGA platform, as well as the tool flow integration, are described. (The 2nd North American NetFPGA Developers Workshop 2010, Stanford, CA, 12-13 August 2010)

    Automated gateware discovery using open firmware

    This dissertation describes the design and implementation of a mechanism that automates gateware device detection for reconfigurable hardware. The research facilitates identifying and operating on gateware images by extending the existing infrastructure for probing devices in traditional software, using the chosen technology.

    ROACH accelerated BLAST

    In recent years, reconfigurable computing has taken great strides towards becoming part of mainstream computing, largely due to the rapid growth in the size of FPGAs and their ability to adapt efficiently to certain complex applications. This dissertation investigates the reuse of application-specific hardware developed for radio astronomy in accelerating a popular bioinformatics algorithm.

    Accelerating Genomic Sequence Alignment using High Performance Reconfigurable Computers

    Reconfigurable computing technology has progressed to a stage where it is now possible to achieve orders-of-magnitude gains in performance and power efficiency over conventional computer architectures for a subset of high performance computing applications. In this thesis, we investigate the potential of reconfigurable computers to accelerate genomic sequence alignment, specifically for genome sequencing applications. We present a highly optimized implementation of a parallel sequence alignment algorithm for the Berkeley Emulation Engine (BEE2) reconfigurable computer, allowing a single BEE2 to align hundreds of sequences simultaneously. For each reconfigurable processor (FPGA), we demonstrate a 61X speedup versus a state-of-the-art implementation on a modern conventional CPU core, and a 56X improvement in performance-per-Watt. We also show that our implementation is highly scalable, and we provide performance results from a cluster implementation using 32 FPGAs. We conclude that reconfigurable computers provide an excellent platform on which to run sequence alignment, and that clusters of reconfigurable computers will be able to cope far more easily with the vast quantities of data produced by new ultra-high-throughput sequencers.

    RHINO ARM cluster control management system

    HW/SW Codesign for the Xilinx Zynq Platform

    This work describes a novel approach to HW/SW codesign on the Xilinx Zynq and similar platforms. It examines the interconnections between the Processing System (ARM Cortex-A9 MPCore) and the Programmable Logic (FPGA) to find an abstract and universal way to develop applications that are partially offloaded into the programmable hardware and that run on the Linux operating system. For that purpose, a framework for HW/SW codesign on the Zynq and similar platforms is designed. No such framework is currently available.

    A New System Architecture for Heterogeneous Compute Units

    The ongoing trend towards more heterogeneous systems forces us to rethink the design of systems. In this work, I study a new system design that considers heterogeneous compute units (general-purpose cores with different instruction sets, DSPs, FPGAs, fixed-function accelerators, etc.) from the beginning instead of as an afterthought. The goal is to treat all compute units (CUs) as first-class citizens, enabling (1) isolation and secure communication between all types of CUs, (2) direct interaction between all CUs, removing the conventional CPU from the critical path, and (3) access to operating system (OS) services such as file systems and network stacks for all CUs. To study this system design, I am using a hardware/software co-design based on two key ideas: (1) introduce a new hardware component next to each CU, used by the OS as the CUs' common interface, and (2) let the OS kernel control applications remotely from a different CU. The hardware component is called the data transfer unit (DTU) and offers the minimal set of features needed to reach the stated goals: secure message passing and memory access. The OS is called M³; it runs its kernel on a dedicated CU and runs the OS services and applications on the remaining CUs. The kernel is responsible for establishing DTU-based communication channels between services and applications. After a channel has been set up, services and applications communicate directly without involving the kernel. This approach makes it possible to support arbitrary CUs as first-class citizens, ranging from fixed-function accelerators to complex general-purpose cores.

    Efficient and predictable high-speed storage access for real-time embedded systems

    As the speed, size, reliability and power efficiency of non-volatile storage media increase, and the data demands of many application domains grow, operating systems are being put under escalating pressure to provide high-speed access to storage. Traditional models of storage access assume that devices are slow, leaving plenty of slack time in which to process data between requests being serviced, and that all significant variations in timing are due to the storage device itself. Modern high-speed storage devices break these assumptions, causing storage applications to become processor-bound, rather than I/O-bound, in an increasing number of situations. This is especially an issue in real-time embedded systems, where limited processing resources and strict timing and predictability requirements amplify any issues caused by the complexity of the software storage stack. This thesis explores the issues related to accessing high-speed storage from real-time embedded systems, providing a thorough analysis of storage operations based on metrics relevant to the area. From this analysis, a number of alternative storage architectures are proposed and explored, showing that a simpler, more direct path from applications to storage can have a positive impact on efficiency and predictability in such systems.

    Proceedings of the First International Workshop on HyperTransport Research and Applications (WHTRA2009) (revised 08/2009)

    Proceedings of the First International Workshop on HyperTransport Research and Applications (WHTRA2009), which was held on Feb. 12th 2009 in Mannheim, Germany. The 1st International Workshop for Research on HyperTransport is an international high-quality forum for scientists, researchers and developers working in the area of HyperTransport. This includes not only developments and research in HyperTransport itself, but also work which is based on or enabled by HyperTransport. HyperTransport (HT) is an interconnection technology which is typically used as the system interconnect in modern computer systems, connecting the CPUs with each other and with the I/O bridges. Primarily designed as an interconnect between high performance CPUs, it provides extremely low latency, high bandwidth and excellent scalability. The definition of the HTX connector allows the use of HT even for add-in cards. In contrast to other peripheral interconnect technologies like PCI-Express, no protocol conversion or intermediate bridging is necessary. HT is a direct connection between device and CPU with minimal latency. Another advantage is the possibility of cache-coherent devices. Because of these properties, HT is of high interest for high performance I/O like networking and storage, but also for co-processing and acceleration based on ASIC or FPGA technologies. Acceleration in particular is seeing a resurgence of interest today. One reason is the possibility of reducing power consumption through the use of accelerators. In the area of parallel computing, the low latency communication allows for fine-grained communication schemes and is well suited to scalable systems. In summary, HT technology offers key advantages and great performance to any research aspect related to or based on interconnects. For more information please consult the workshop website (http://whtra.uni-hd.de).