44 research outputs found

    Hardware Accelerated Molecular Docking: A Survey

    Characterization and Acceleration of High Performance Compute Workloads

    High Performance Computing via High Level Synthesis

    As increasingly powerful integrated circuits reach the market, a growing range of applications, with very different requirements and workloads, make use of the available computing power. This thesis is devoted to High Performance Computing (HPC) applications, where those trends are carried to the extreme. In this domain, the primary aspects to consider are (1) performance (by definition) and (2) energy consumption (since operational costs dominate procurement costs). These requirements are more easily satisfied by deploying heterogeneous platforms, which include CPUs, GPUs and FPGAs and so provide a broad range of performance and energy-per-operation choices. In particular, as we will see, FPGAs clearly dominate both CPUs and GPUs in terms of energy, and can provide comparable performance. An important aspect of this trend is design technology, because these applications were traditionally programmed in high-level languages, while FPGAs required low-level RTL design. OpenCL (Open Computing Language), developed by the Khronos Group, enables developers to program CPUs, GPUs and, more recently, FPGAs using source code that is functionally portable (but sadly not performance portable), which creates new possibilities and challenges for both research and industry. FPGAs have long been used for mid-size designs and ASIC prototyping thanks to their energy-efficient and flexible hardware architecture, but their use requires hardware design knowledge and laborious design cycles. Several approaches have been developed and deployed to address this issue and narrow the gap between software and hardware in the FPGA design flow, so that FPGAs can capture a larger portion of the hardware acceleration market in data centers.
Moreover, FPGA usage in data centers is already growing, independently of and in addition to their use as computational accelerators, because they can serve as high-performance, low-power and secure switches inside data centers. High-Level Synthesis (HLS) is the methodology that enables designers to map their applications onto FPGAs (and ASICs). It synthesizes parallel hardware from a model originally written in a C-based programming language, e.g. C/C++, SystemC or OpenCL. The variety of implementations obtainable from this C model can be explored (design space exploration) through a wide range of optimization techniques and directives, e.g. to pipeline loops and partition memories into multiple banks, which guide RTL generation toward application-dependent hardware and let designers benefit from the flexible parallel architecture of FPGAs. Model Based Design (MBD) is a high-level, visual process used to generate implementations that solve mathematical problems through a varied set of IP blocks. MBD enables developers with different expertise, e.g. control theory, embedded software development and hardware design, to share a common design framework and contribute to a shared design using the same tool. Simulink, developed by MathWorks, is a model-based design tool for the simulation and development of complex dynamical systems. Moreover, Simulink's embedded code generators can produce verified C/C++ and HDL code from the graphical model. This code can be used to program microcontrollers and FPGAs. This PhD thesis presents a study that uses the automatic code generator of Simulink to target Xilinx FPGAs with both HDL and C/C++ code, to demonstrate the capabilities and challenges of the high-level synthesis process. To do so, first, the digital signal processing unit of a real-time radar application is developed using Simulink blocks.
Second, the generated C-based model is used for the high-level synthesis process and, finally, the implementation cost of HLS is compared with that of traditional HDL synthesis using the Xilinx tool chain. As an alternative to the model-based design approach, this work also presents an analysis of FPGA programming via high-level synthesis techniques for computationally intensive algorithms, and demonstrates the importance of HLS by comparing the performance-per-watt of GPUs (NVIDIA) and FPGAs (Xilinx) manufactured in the same node running standard OpenCL benchmarks. We conclude that generating high-quality RTL from an OpenCL model requires a stronger hardware background than the MBD approach; however, the availability of fast and broad design space exploration and the portability of OpenCL code, e.g. to CPUs and GPUs, motivate FPGA industry leaders to provide users with an OpenCL software development environment that promises FPGA programming in a CPU/GPU-like fashion. Our experiments, through extensive design space exploration (DSE), suggest that FPGAs achieve higher performance-per-watt than two high-end GPUs manufactured in the same technology (28 nm). Moreover, FPGAs with more available resources and a more modern process (20 nm) can outperform the tested GPUs while consuming much less power, at the cost of more expensive devices.
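
    The loop pipelining and memory partitioning directives mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the kernel (a dot product), the names `dot_product`, `N`, `buf_a` and `buf_b`, and the factor of 4 are all assumptions chosen for illustration. The pragmas follow the Xilinx Vivado HLS style; a standard C++ compiler ignores them, so the function also runs as plain software.

```cpp
#include <cstddef>

// Hypothetical vector length, chosen only for illustration.
constexpr std::size_t N = 64;

// Dot product written in the style of an HLS kernel.
// The pragmas are hints for an HLS tool (Vivado HLS syntax assumed);
// an ordinary C++ compiler skips unknown pragmas.
int dot_product(const int a[N], const int b[N]) {
    // Local copies that an HLS tool would typically map to on-chip memory.
    int buf_a[N];
    int buf_b[N];
    // Split each buffer cyclically over 4 banks so that 4 consecutive
    // elements can be read in the same clock cycle.
#pragma HLS ARRAY_PARTITION variable=buf_a cyclic factor=4 dim=1
#pragma HLS ARRAY_PARTITION variable=buf_b cyclic factor=4 dim=1

    for (std::size_t i = 0; i < N; ++i) {
        // Request a new loop iteration every clock cycle.
#pragma HLS PIPELINE II=1
        buf_a[i] = a[i];
        buf_b[i] = b[i];
    }

    int acc = 0;
    for (std::size_t i = 0; i < N; ++i) {
        // Pipeline the loop and process 4 elements per iteration,
        // matching the 4 memory banks requested above.
#pragma HLS PIPELINE II=1
#pragma HLS UNROLL factor=4
        acc += buf_a[i] * buf_b[i];
    }
    return acc;
}
```

    Changing the partition and unroll factors, without touching the C++ logic, is one simple form of the design space exploration the abstract describes: each setting trades FPGA resources for throughput.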

    Design methodology addressing static/reconfigurable partitioning optimizing software defined radio (SDR) implementation through FPGA dynamic partial reconfiguration and rapid prototyping tools

    The characteristics people demand of communication devices become more demanding every day, not only in communication speed but also in aspects as varied as compatibility with different communication standards, battery life, device size and price. Moreover, when this communication need is addressed by the industrial world, new requirements such as reliability, robustness and time-to-market appear. In this context, Software Defined Radios (SDR) and evolutions such as Cognitive Radios or Intelligent Radios seem to be the technological answer that will satisfy all these requirements in the short and medium term. Consequently, this PhD dissertation deals with the implementation of this type of communication system. Given that there is no limitation on either the implementation architecture or the target device, a novel framework for SDR implementation is proposed. This framework uses FPGAs with dynamic partial reconfiguration as the target device and rapid prototyping tools as the design tools. Despite the benefits of this framework, it also has certain drawbacks that need to be analyzed and minimized as far as possible. To this end, an SDR design methodology has been designed and tested. This methodology addresses the static/reconfigurable partitioning of SDRs in order to optimize their implementation in the aforementioned framework. To verify the feasibility of both the design framework and the design methodology, several implementations have been carried out using them: a multi-standard modulator implementing WiFi, WiMAX and UMTS; a small-form-factor cognitive video transmission system; and several data coding functions implemented on R3TOS, a hardware operating system developed by the University of Edinburgh.

    High-Precision Automotive Radar Target Simulation

    Radar target simulators (RTSs) deceive a radar under test (RuT) by creating an artificial environment consisting of virtual radar targets. In this work, new techniques are presented that overcome the rasterization deficiency of current RTS systems and enable the generation of virtual targets at arbitrary, high-precision positions. This allows for continuous movement of the targets and thus a more credible simulation environment.