883 research outputs found

    FASTCUDA: Open Source FPGA Accelerator & Hardware-Software Codesign Toolset for CUDA Kernels

    Using FPGAs as hardware accelerators that communicate with a central CPU is becoming common practice in the embedded design world, but there is not yet a standard methodology and toolset to facilitate this path. On the other hand, languages such as CUDA and OpenCL provide standard development environments for Graphics Processing Unit (GPU) programming. FASTCUDA is a platform that provides the necessary software toolset, hardware architecture, and design methodology to efficiently adapt the CUDA approach into a new FPGA design flow. With FASTCUDA, the CUDA kernels of a CUDA-based application are partitioned into two groups with minimal user intervention: those that are compiled and executed in parallel software, and those that are synthesized and implemented in hardware. A modern low-power FPGA can provide the processing power (via numerous embedded micro-CPUs) and the logic capacity for both the software and hardware implementations of the CUDA kernels. This paper describes the system requirements and the architectural decisions behind the FASTCUDA approach.

    Unifying mesh- and tree-based programmable interconnect

    We examine the traditional, symmetric, Manhattan mesh design for field-programmable gate-array (FPGA) routing along with tree-of-meshes (ToM) and mesh-of-trees (MoT) based designs. All three networks can provide general routing for limited bisection designs (Rent's rule with p<1) and allow locality exploitation. They differ in their detailed topology and use of hierarchy. We show that all three have the same asymptotic wiring requirements. We bound this tightly by providing constructive mappings between routes in one network and routes in another. For example, we show that a (c,p) MoT design can be mapped to a (2c,p) linear population ToM and introduce a corner turn scheme which will make it possible to perform the reverse mapping from any (c,p) linear population ToM to a (2c,p) MoT augmented with a particular set of corner turn switches. One consequence of this latter mapping is a multilayer layout strategy for N-node, linear population ToM designs that requires only Θ(N) two-dimensional area for any p when given sufficient wiring layers. We further show upper and lower bounds for global mesh routes based on recursive bisection width and show these are within a constant factor of each other and within a constant factor of MoT and ToM layout area. In the process we identify the parameters and characteristics which make the networks different, making it clear there is a unified design continuum in which these networks are simply particular regions.

    MARTE based design flow for Partially Reconfigurable Systems-on-Chips

    Systems-on-Chip (SoCs) are considered an integral solution for designing embedded systems that target complex, intensive parallel computation applications. As advances in SoC technology permit integration of an increasing number of hardware resources on a single chip, targeted application domains such as software-defined radio are becoming increasingly sophisticated. The fallout of this complexity is that system design, particularly software design, does not evolve at the same pace as hardware, leading to a significant productivity gap. Adaptivity and reconfigurability are also critical issues for SoCs, which must be able to cope with end-user environments and requirements.

    Multilevel Runtime Verification for Safety and Security Critical Cyber Physical Systems from a Model Based Engineering Perspective

    Advanced embedded system technology is one of the key driving forces behind the rapid growth of Cyber-Physical System (CPS) applications. A CPS consists of multiple coordinating and cooperating components, which are often software-intensive and interact with each other to achieve unprecedented tasks. Such highly integrated CPSs have complex interaction failures, attack surfaces, and attack vectors that we have to protect and secure against. This dissertation advances the state of the art by developing a multilevel runtime monitoring approach for safety- and security-critical CPSs, with monitors at each level of processing and integration. Given that computation and data processing vulnerabilities may exist at multiple levels in an embedded CPS, it follows that solutions placed at the levels where the faults or vulnerabilities originate are beneficial for timely detection of anomalies. Further, the increasing functional and architectural complexity of critical CPSs has significant safety and security operational implications. These challenges are leading to a need for new methods that provide a continuum between design-time assurance and runtime or operational assurance. Towards this end, this dissertation explores Model Based Engineering methods by which design assurance can be carried forward to the runtime domain, creating a shared responsibility for reducing the overall risk associated with the system in operation. Therefore, a synergistic combination of Verification & Validation at design time and runtime monitoring at multiple levels is beneficial in assuring the safety and security of critical CPSs. Furthermore, we realize our multilevel runtime monitor framework on hardware using a stream-based runtime verification language.

    Development of test and diagnosis techniques for hierarchical mesh-based FPGAs (Développement des techniques de test et de diagnostic pour les FPGA hiérarchique de type mesh)

    The trend of shrinking feature size and increasing complexity in modern electronics is being slowed by physical limits that generate numerous imperfections and defects during fabrication or over the projected lifetime of the chip. Field Programmable Gate Arrays (FPGAs) are used in complex digital systems mainly because of their reconfigurability and shorter time-to-market. To maintain high reliability in such systems, FPGAs should be tested thoroughly for defects. FPGA architecture optimization for area saving and better signal routability is an ongoing process that directly impacts overall FPGA testability and hence reliability. This thesis presents a complete strategy for test and diagnosis of manufacturing defects in mesh-based FPGAs containing a novel multilevel interconnect topology which promises to provide better area and routability. The efficiency of the proposed test schemes is analyzed in terms of test cost, respective fault coverage, and diagnostic resolution.

    Locally Adaptive Resolution (LAR) codec

    The JPEG committee has initiated a study of potential technologies dedicated to future-generation image compression systems. The idea is to design a new image compression standard, named JPEG AIC (Advanced Image Coding), together with advanced evaluation methodologies that closely match the characteristics of the human visual system. JPEG AIC thus aims at defining a complete coding system able to address advanced functionalities such as lossy-to-lossless compression, scalability (spatial, temporal, depth, quality, complexity, component, granularity...), robustness, embeddability, and content description for image handling at object level. The chosen compression method would have to fit perceptual metrics defined by the JPEG community within the JPEG AIC project. In this context, we propose the Locally Adaptive Resolution (LAR) codec as a contribution to the related call for technologies, aiming to cover all of the previous functionalities. This method is a coding solution that simultaneously provides a relevant representation of the image. This property is exploited through various complementary coding schemes in order to design a highly scalable encoder. The LAR method was initially introduced for lossy image coding. This efficient image compression solution relies on a content-based system driven by a specific quadtree representation, based on the assumption that an image can be represented as layers of basic information and local texture. Multiresolution versions of this codec have shown their efficiency, from low bit rates up to lossless compressed images. An original hierarchical self-extracting region representation has also been elaborated: a segmentation process is performed at both coder and decoder, leading to a free segmentation map. The latter can be further exploited for color region encoding and image handling at region level. Moreover, the inherent structure of the LAR codec can be used for advanced functionalities such as content security. In particular, dedicated Unequal Error Protection systems have been produced and tested for transmission over the Internet or wireless channels. Hierarchical selective encryption techniques have been adapted to our coding scheme. A data hiding system based on the LAR multiresolution description allows efficient content protection. Thanks to the modularity of our coding scheme, complexity can be adjusted to address various embedded systems. For example, a basic version of the LAR coder has been implemented on an FPGA platform while respecting real-time constraints. The pyramidal LAR solution and the hierarchical segmentation process have also been prototyped on heterogeneous DSP architectures. This chapter first introduces the JPEG AIC scope and details the associated requirements. Then we develop the technical features of the LAR system and show the originality of the proposed scheme, both in terms of functionalities and services. In particular, we show that the LAR coder remains efficient for natural images, medical images, and art images.
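
    The quadtree representation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the LAR codec itself: the splitting rule (split a block when the spread between its brightest and darkest pixel exceeds a threshold), the function name `quadtree_blocks`, and the sample image are all assumptions made for illustration. The idea shown is only that uniform areas end up as large blocks while textured areas or edges are refined into small ones.

```python
def quadtree_blocks(img, x, y, size, threshold, min_size=1):
    """Return leaf blocks (x, y, size, mean) covering a square region.

    Hypothetical splitting rule: a block is kept whole when its local
    activity (max pixel - min pixel) is at or below `threshold`.
    """
    pixels = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size <= min_size or max(pixels) - min(pixels) <= threshold:
        # Uniform enough (or cannot split further): one leaf with its mean value.
        return [(x, y, size, sum(pixels) / len(pixels))]
    half = size // 2
    leaves = []
    # Recurse into the four quadrants: top-left, top-right, bottom-left, bottom-right.
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_blocks(img, x + dx, y + dy, half, threshold, min_size)
    return leaves


# Illustrative 4x4 grayscale image: flat everywhere except the bottom-right corner.
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 50, 90],
    [10, 10, 20, 60],
]
blocks = quadtree_blocks(image, 0, 0, 4, threshold=5)
for block in blocks:
    print(block)
```

    With this input the three flat quadrants stay as 2x2 blocks and only the textured quadrant is split down to single pixels, so block size follows local image content.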

    Design, implementation, and verification of an FPGA-based control system for a permanent-magnet motor drive built upon a three-phase four-level active-clamped inverter

    Currently, a DE0 board from Terasic/Altera, which includes a Cyclone III Field Programmable Gate Array (FPGA), is used to control a three-phase four-level active-clamped inverter which drives a permanent-magnet motor. The project consists of designing a new FPGA-based control system to replace the current control system based on the DE0 board. The new control system will consist of a single board containing a new FPGA more suitable for the specific application, the analog-to-digital converters, and all the necessary auxiliary circuitry. The FPGA content wi[...]
    This work summarizes the work and knowledge acquired by the author during the preparation of a Master's thesis in the Research Group in Power Electronics (GREP) of the Universitat Politècnica de Catalunya. The development is based on the Multilevel Active-Clamped (MAC) power converter prototype, which was initially developed by GREP. Experimental testing of the prototype served as an introduction to the state of the art in multilevel converters and demonstrated the need for a custom FPGA-based control board to drive a PMSM. The board is then designed following the requirements established by the research group and the results obtained from the initial tests. Issues such as power decoupling, signal conditioning, and grounding strategies are discussed in the chapters of the thesis.