606 research outputs found

    A Framework for the Design and Analysis of High-Performance Applications on FPGAs using Partial Reconfiguration

    The field-programmable gate array (FPGA) is a dynamically reconfigurable digital logic chip used to implement custom hardware. The large densities of modern FPGAs and the capability for on-the-fly reconfiguration have made the FPGA a viable alternative to fixed-logic hardware chips such as the ASIC. In high-performance computing, FPGAs are used as co-processors to speed up computationally intensive processes or as autonomous systems that realize a complete hardware application. However, because FPGA logic resources are limited, denser FPGAs must be purchased if more logic resources are required to realize all the functions of a complex application. Alternatively, partial reconfiguration (PR) can be used to swap, on demand, idle components of the application with active components. This research uses PR to swap components to improve the performance of the application given the limited logic resources available with smaller but economical FPGAs. The swap is called “resource sharing PR”. In a pipelined design of multiple hardware modules (pipeline stages), resource sharing PR is a technique that uses PR to improve the performance of pipeline bottlenecks. This is done by reconfiguring other pipeline stages, typically those that are idle waiting for data from a bottleneck, into an additional parallel bottleneck module. The target pipeline of this research is a two-stage “slow-to-fast” pipeline where the flow of data traversing the pipeline transitions from a relatively slow, bottleneck stage to a fast stage. A two-stage pipeline that combines FPGA-based hardware implementations of well-known bioinformatics search algorithms, the X! Tandem algorithm and the Smith-Waterman algorithm, is implemented for this research; the implemented pipeline demonstrates the characteristics of these algorithms.
The experimental results show that, in a database of unknown peptide spectra, when matching spectra with 388 peaks or more, performing resource sharing PR to instantiate a parallel X! Tandem module is worth the cost of PR. In addition, from timings gathered during experiments, a general formula was derived for determining the value of performing PR upon a fast module.
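
The cost/benefit trade-off of resource sharing PR can be illustrated with a simple break-even model. This is a hypothetical sketch, not the formula derived in the thesis: it assumes a fixed per-item time `t_slow` for the bottleneck stage, a fixed reconfiguration cost `t_reconfig`, and ideal speed-up from `n_parallel` bottleneck modules. All names and numbers are illustrative assumptions.

```python
import math


def pr_is_worthwhile(n_items, t_slow, t_reconfig, n_parallel=2):
    """Return True if adding parallel bottleneck modules via PR saves time.

    Assumed model (not from the source):
      without PR: n_items * t_slow
      with PR:    t_reconfig + n_items * t_slow / n_parallel
    """
    baseline = n_items * t_slow
    with_pr = t_reconfig + n_items * t_slow / n_parallel
    return with_pr < baseline


def break_even_items(t_slow, t_reconfig, n_parallel=2):
    """Smallest workload size at which PR pays off under the same model."""
    saving_per_item = t_slow * (1 - 1 / n_parallel)
    # PR wins once n_items * saving_per_item exceeds t_reconfig.
    return math.floor(t_reconfig / saving_per_item) + 1
```

Under this model, the workload must be large enough for the per-item saving to amortize the reconfiguration cost, which matches the abstract's observation that PR only pays off above a certain spectrum size.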

    Towards composition of verified hardware devices

    Computers are being used where no affordable level of testing is adequate. Safety- and life-critical systems must find a replacement for exhaustive testing to guarantee their correctness, such as mathematical proof. Hardware verification research has focused on device verification and has largely ignored system composition verification. To address these deficiencies, we examine how the current hardware verification methodology can be extended to verify complete systems.

    Applications of Vibration Sensing Methods Based on Techniques (Aplicações De Métodos De Sensoriamento De Vibração Baseados Em Técnicas)

    Advisors: Fabiano Fruett, Claudio Floridia. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
Abstract: Distributed optical fiber sensors have been increasingly employed for monitoring several parameters, such as temperature, vibration, strain, magnetic field and current. Compared with conventional techniques, these sensors offer small dimensions, light weight, immunity to electromagnetic interference, high adaptability, robustness in hazardous environments, less complex data multiplexing, the feasibility of being embedded into structures with minimal invasion, and the capability to extract high-resolution data from long perimeters using a single optical fiber and to detect multiple events along the fiber. In particular, distributed acoustic sensors (DAS) based on optical time domain reflectometry (OTDR) are of high interest because they can be used in applications such as structural health monitoring (SHM) and perimeter surveillance. Through the frequency analysis of a structure, for instance an aircraft, a bridge, a building or even machines in a workshop, it is possible to evaluate its condition and detect damage and failures at an early stage. OTDR-based solutions for vibration monitoring can also be adapted, with minimal setup modifications, to detect intrusion in a perimeter, which is useful for surveillance of military facilities, laboratories and power plants and for homeland security.
The same OTDR technique can be used as a non-destructive diagnostic tool to evaluate vibrations and disturbances both on small structures, with a resolution of some tens of centimeters, and on large structures or perimeters, with a spatial resolution of some meters and a reach of hundreds of kilometers. Another useful feature of this optical-fiber-based solution is that it can be combined with high-performance digital signal processing techniques, enabling fast disturbance detection and location, real-time intrusion pattern recognition and fast data multiplexing of structure surfaces for SHM applications. The main goal of this thesis is to make use of these features to employ DAS techniques as key enabling technology solutions for several applications. In this work, phase-OTDR techniques were studied, and the main contributions of the thesis focus on innovative solutions and validations for SHM and surveillance applications. This PhD included a sandwich period at RISE Acreo AB, Stockholm, Sweden, where experimental tests were performed, and it was part of the 42nd CISB/Saab/CNPq Call.
Degree: Doctorate. Area: Electronics, Microelectronics and Optoelectronics. Title: Doctor of Electrical Engineering. 202816/2015-0; CAPES; CNP

    The "MIND" Scalable PIM Architecture

    MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real time execution, and active power management. This paper describes the major elements and operational methods of the MIND architecture

    A Memory Controller for FPGA Applications

    As designers and researchers strive to achieve higher performance, field-programmable gate arrays (FPGAs) become an increasingly attractive solution. As coprocessors, FPGAs can provide application-specific acceleration that cannot be matched by modern processors. Most of these applications will make use of large data sets, so achieving acceleration will require a capable interface to this data. The research in this thesis describes the design of a memory controller that is both efficient and flexible for FPGA applications requiring floating-point operations. In particular, the benefits of certain design choices are explored, including scalability, memory caching, and configurable precision. Results are given to prove the controller's effectiveness and to compare various design trade-offs.

    Semantic-Preserving Transformations for Stream Program Orchestration on Multicore Architectures

    Because the demand for high performance in big data processing and distributed computing is increasing, the stream programming paradigm has been revisited for its abundance of parallelism by virtue of independent actors that communicate via data channels. The synchronous data-flow (SDF) programming model is frequently adopted in stream programming languages because it makes it convenient to express stream programs as a set of nodes connected by data channels. The static data rates of the SDF programming model enable program transformations that greatly improve the performance of SDF programs on multicore architectures. The major application domains for SDF programs are digital signal processing, audio, video, graphics kernels, networking, and security. This thesis makes the following three contributions that improve the performance of SDF programs. First, a new intermediate representation (IR) called LaminarIR is introduced. LaminarIR replaces FIFO queues with direct memory accesses to reduce the data communication overhead and makes data dependencies between producer and consumer nodes explicit. We provide transformations and their formal semantics to convert conventional, FIFO-queue-based program representations to LaminarIR. Second, a compiler framework is presented that performs sound and semantics-preserving program transformations from FIFO semantics to LaminarIR. We employ static program analysis to resolve token positions in FIFO queues and replace them with direct memory accesses. Third, a communication-cost-aware program orchestration method is proposed to establish a foundation for LaminarIR parallelization on multicore architectures. The LaminarIR framework, which consists of the aforementioned contributions together with the benchmarks used in the experimental evaluation, has been open-sourced to encourage further research on improving the performance of stream programming languages.
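
The idea of replacing FIFO queues with direct memory accesses can be illustrated on a single SDF channel. The following is a simplified sketch, not LaminarIR itself: the function names, the two-actor graph and the flat `slots` buffer are assumptions made for illustration. Because the push and pop rates are static, the position of every token within one steady-state iteration is known in advance, so each firing can read or write fixed slots instead of manipulating a queue at run time.

```python
from collections import deque
from math import gcd


def repetitions(push, pop):
    """Solve the balance equation reps_prod * push == reps_cons * pop."""
    g = gcd(push, pop)
    return pop // g, push // g


def run_fifo(push, pop, produce, consume):
    """One steady-state iteration using a run-time FIFO queue."""
    r_prod, r_cons = repetitions(push, pop)
    queue = deque()
    for i in range(r_prod):
        queue.extend(produce(i))  # each firing pushes `push` tokens
    return [consume([queue.popleft() for _ in range(pop)])
            for _ in range(r_cons)]


def run_direct(push, pop, produce, consume):
    """Same iteration, with token positions resolved ahead of time."""
    r_prod, r_cons = repetitions(push, pop)
    slots = [None] * (r_prod * push)
    for i in range(r_prod):
        # Producer firing i always writes slots [i*push, (i+1)*push).
        slots[i * push:(i + 1) * push] = produce(i)
    # Consumer firing j always reads slots [j*pop, (j+1)*pop).
    return [consume(slots[j * pop:(j + 1) * pop]) for j in range(r_cons)]
```

Both functions compute the same result; the second only makes the producer-to-consumer data dependencies explicit as fixed slot indices, which is the property a compiler can exploit.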

    A Modular Approach to Adaptive Reactive Streaming Systems

    The latest generations of FPGA devices offer large resource counts that provide the headroom to implement large-scale and complex systems. However, there are increasing challenges for the designer, not just because of pure size and complexity, but also in harnessing effectively the flexibility and programmability of the FPGA. A central issue is the need to integrate modules from diverse sources to promote modular design and reuse. Further, the capability to perform dynamic partial reconfiguration (DPR) of FPGA devices means that implemented systems can be made reconfigurable, allowing components to be changed during operation. However, use of DPR typically requires low-level planning of the system implementation, adding to the design challenge. This dissertation presents ReShape: a high-level approach for designing systems by interconnecting modules, which gives a ‘plug and play’ look and feel to the designer, is supported by tools that carry out implementation and verification functions, and is carried through to support system reconfiguration during operation. The emphasis is on the inter-module connections and abstracting the communication patterns that are typical between modules – for example, the streaming of data that is common in many FPGA-based systems, or the reading and writing of data to and from memory modules. ShapeUp is also presented as the static precursor to ReShape. In both, the details of wiring and signaling are hidden from view, via metadata associated with individual modules. ReShape allows system reconfiguration at the module level, by supporting type checking of replacement modules and by managing the overall system implementation, via metadata associated with its FPGA floorplan. The methodology and tools have been implemented in a prototype for a broad domain-specific setting – networking systems – and have been validated on real telecommunications design projects
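
Type checking of replacement modules via metadata can be pictured as an interface comparison. The sketch below is hypothetical and does not reflect ReShape's actual metadata format or tools: the `Port` fields and the rule that a replacement must expose exactly the same ports are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical per-port metadata for a module interface."""
    name: str
    direction: str  # "in" or "out"
    protocol: str   # e.g. "stream" or "memory"
    width: int      # data width in bits


def can_replace(current_ports, candidate_ports):
    """Accept a candidate only if its port set matches the current module's."""
    return set(current_ports) == set(candidate_ports)
```

A stricter or looser compatibility rule (for example, allowing a wider input port) would be a design decision for the real tool; the sketch uses exact equality only because it is the simplest rule to state.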

    DRAM Bender: An Extensible and Versatile FPGA-based Infrastructure to Easily Test State-of-the-art DRAM Chips

    To understand and improve DRAM performance, reliability, security and energy efficiency, prior works study characteristics of commodity DRAM chips. Unfortunately, state-of-the-art open source infrastructures capable of conducting such studies are obsolete, poorly supported, or difficult to use, or their inflexibility limits the types of studies they can conduct. We propose DRAM Bender, a new FPGA-based infrastructure that enables experimental studies on state-of-the-art DRAM chips. DRAM Bender offers three key features at the same time. First, DRAM Bender enables directly interfacing with a DRAM chip through its low-level interface. This allows users to issue DRAM commands in arbitrary order and with finer-grained time intervals compared to other open source infrastructures. Second, DRAM Bender exposes easy-to-use C++ and Python programming interfaces, allowing users to quickly and easily develop different types of DRAM experiments. Third, DRAM Bender is easily extensible. The modular design of DRAM Bender allows extending it to (i) support existing and emerging DRAM interfaces, and (ii) run on new commercial or custom FPGA boards with little effort. To demonstrate that DRAM Bender is a versatile infrastructure, we conduct three case studies, two of which lead to new observations about the DRAM RowHammer vulnerability. In particular, we show that data patterns supported by DRAM Bender uncover a larger set of bit-flips on a victim row compared to the data patterns commonly used by prior work. We demonstrate the extensibility of DRAM Bender by implementing it on five different FPGAs with DDR4 and DDR3 support. DRAM Bender is freely and openly available at https://github.com/CMU-SAFARI/DRAM-Bender. Comment: To appear in TCAD 202