2,302 research outputs found

    Supporting task creation inside FPGA devices

    The most common model for using co-processors/accelerators is the master-slave model, in which the slaves (co-processors/accelerators) are driven by a general-purpose CPU. This simplifies the management of the accelerators because they cannot actively interact with the runtime: they are passive slaves that operate on memory on demand. However, the master-slave model limits what the system can do and introduces synchronization overheads that could be avoided. To overcome those limitations and extend the capabilities of accelerators, we propose extending task-based programming models (like OpenMP [1] or OmpSs) to support some runtime APIs inside the FPGA co-processor. As a proof of concept, we implemented our proposal on top of the OmpSs@FPGA environment [2], adding the needed infrastructure to the FPGA bitstream and modifying the existing tools to support the creation of child tasks inside a task offloaded to an FPGA accelerator. In addition, we added support for synchronizing the child tasks created by an FPGA task, regardless of whether they are executed by an SMP host thread or target another FPGA accelerator in the same co-processor.

    Two micromechanical models of deformation with damage in composite materials (original title: Dos modelos micromecánicos de deformación con daño en materiales compuestos)

    The presence of damage in composite materials and its evolution during deformation has a major influence on mechanical strength, ductility and fracture toughness. This thesis develops two micromechanical models of deformation with damage for this type of material. The first model can analyze the mechanical behavior of a two-phase material when the volume fraction of each phase varies with deformation. The phases behave as isotropic elastoplastic solids with strain hardening that obey the incremental theory of plasticity and the Von Mises yield criterion, and the effective behavior of the composite was obtained from mean-field theories. These theories assume that the material is statistically homogeneous and therefore do not account for the effects of the spatial localization of damage. The model formulation is general and gives the mechanical response under multiaxial loading. The model is used to simulate the uniaxial tensile deformation of two Al alloys reinforced with SiC particles. The dominant damage mechanism in these materials is brittle fracture of the reinforcement, and the material is analyzed as a homogeneous mixture of two phases that represent the undamaged and the damaged material. The volume fraction of both phases changes during deformation as a result of the progressive increase in the number of broken ceramic particles. The second model predicts the response of a fiber-reinforced composite deformed in uniaxial tension along the fiber direction. The dominant damage mechanism is brittle fracture of the fibers. The fracture of one fiber affects its neighbors and triggers premature localization of damage in a specific section of the specimen. The ductility and tensile strength of the material are computed from a probabilistic model based on the fracture of two and three adjacent fibers. The stress distribution around a broken fiber was determined accurately by means of a finite element model that included the effects of matrix plasticity and of the relative sliding between the broken fiber and the matrix. The model is validated experimentally against results obtained for a Ti-6Al-4V alloy reinforced with SiC fibers, in which fracture occurs when damage localizes in a particular section of the specimen as a consequence of the fracture of the first fibers. To this end, the main microstructural parameters that govern the tensile behavior of the composite were measured, and the experimental techniques needed to do so were developed. These parameters include the elastic modulus of the fibers and the Weibull parameters of their strength, the residual stresses that arise from the hot consolidation process, and the frictional stress between the matrix and the fibers at the interface.
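
    The abstract above refers to the Weibull parameters of the fiber strength. As a minimal sketch of what those parameters describe, the following uses the standard two-parameter Weibull form for the failure probability of a single fiber. It is not the thesis's probabilistic model of adjacent fiber breaks; the function name and all numeric values (`sigma0`, `m`, lengths) are hypothetical and chosen only for illustration.

```python
import math


def weibull_failure_probability(stress, sigma0, m, length=1.0, ref_length=1.0):
    """Probability that a fiber of the given length has failed at `stress`.

    Standard two-parameter Weibull form:
        P = 1 - exp(-(length / ref_length) * (stress / sigma0) ** m)
    sigma0 is the characteristic strength at ref_length, m the Weibull modulus.
    """
    if stress <= 0:
        return 0.0
    return 1.0 - math.exp(-(length / ref_length) * (stress / sigma0) ** m)


# Hypothetical parameters, NOT taken from the thesis.
sigma0 = 4000.0  # characteristic strength in MPa (assumed)
m = 10.0         # Weibull modulus (assumed)

for stress in (2000.0, 3000.0, 4000.0):
    p = weibull_failure_probability(stress, sigma0, m)
    print(f"stress = {stress:.0f} MPa -> failure probability = {p:.4f}")
```

    A larger Weibull modulus `m` means less scatter in strength, and a longer fiber is more likely to contain a critical flaw, which is why the length ratio multiplies the exponent.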

    A level set approach for the analysis of flow and compaction during resin infusion in composite materials

    Fluid flow and fabric compaction during vacuum assisted resin infusion (VARI) of composite materials were simulated using a level set-based approach. Fluid infusion through the fiber preform was modeled using Darcy’s equations for fluid flow through a porous medium. The stress partition between the fluid and the fiber bed was included by means of Terzaghi’s effective stress theory. The fluid front was tracked during infusion by means of the level set method. The resulting partial differential equations for the fluid infusion and the evolution of the flow front were discretized and solved approximately using the finite difference method on a uniform grid over the spatial domain. The model results were validated against uniaxial VARI experiments through an [0]₈ E-glass plain woven preform. The physical parameters of the model were also independently measured. The model results (in terms of the fabric thickness, pressure and fluid front evolution during filling) were in good agreement with the numerical simulations, showing the potential of the level set method to simulate resin infusion.
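
    For orientation, the three ingredients named in the abstract above can be written in their usual textbook forms. This is a sketch under common conventions, not the paper's exact formulation; the symbols ($\mathbf{K}$ permeability tensor, $\mu$ resin viscosity, $p$ fluid pressure, $\sigma'$ effective stress carried by the fiber bed, $\phi$ level set function, $\mathbf{v}_f$ front velocity) are assumptions introduced here, and sign conventions vary between authors.

```latex
\begin{align}
  \mathbf{v} &= -\frac{\mathbf{K}}{\mu}\,\nabla p
    && \text{(Darcy's law: flow through the porous preform)} \\
  \sigma &= \sigma' + p
    && \text{(Terzaghi: total stress shared by fiber bed and fluid)} \\
  \frac{\partial \phi}{\partial t} + \mathbf{v}_f \cdot \nabla \phi &= 0
    && \text{(level set transport; the front is the set } \phi = 0\text{)}
\end{align}
```

    In vacuum infusion the total stress is set by the pressure applied through the bag, so a rise in fluid pressure behind the front lowers the stress carried by the fabric and lets it relax in thickness.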

    Performance analysis of a hardware accelerator of dependence management for task-based dataflow programming models

    With the growing popularity of multicore and manycore processors, task-based dataflow programming models have attracted great attention for their ability to extract high parallelism from applications without exposing the complexity to programmers. One of the pioneers is OpenMP Superscalar (OmpSs). By implementing dynamic task dependence analysis, dataflow scheduling and out-of-order execution at runtime, OmpSs achieves high performance using coarse- and medium-granularity tasks. In theory, for the same application, the more parallel tasks that can be exposed, the higher the achievable speedup. Yet this gain is limited by task granularity, up to a point where the runtime overhead outweighs the performance increase and slows down the application. To overcome this handicap, Picos was proposed to support task-based dataflow programming models like OmpSs as a fast hardware accelerator for fine-grained task and dependence management, and a simulator was developed to perform design space exploration. This paper presents the very first functional hardware prototype inspired by Picos. An embedded system based on a Zynq 7000 All-Programmable SoC is developed to study its capabilities and possible bottlenecks. Initial scalability and hardware consumption studies of different Picos designs are performed to find the one with the highest performance and lowest hardware cost. A further thorough performance study is carried out on both the prototype with the most balanced configuration and the OmpSs software-only alternative. Results show that our OmpSs runtime hardware support significantly outperforms the software-only implementation currently available in the runtime system for fine-grained tasks. This work is supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through the TIN2015-65316-P project, by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272) and by the European Research Council RoMoL Grant Agreement number 321253. We also thank the Xilinx University Program for its hardware and software donations. Peer reviewed. Postprint (published version).
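
    To make "dynamic task dependence analysis" concrete, here is a minimal software sketch of the idea: each task declares the addresses it reads and writes, and the runtime derives which earlier tasks it must wait for. This is an illustration only, not the Picos design or the OmpSs runtime; the function name `build_dependences` and the task tuples are hypothetical.

```python
def build_dependences(tasks):
    """Derive task dependences from declared inputs and outputs.

    tasks: list of (name, inputs, outputs) in creation order, where inputs
    and outputs are lists of addresses (any hashable value).
    Returns {name: set of earlier task names it must wait for}.
    """
    last_writer = {}  # address -> task that last wrote it
    readers = {}      # address -> tasks that read it since the last write
    deps = {}
    for name, inputs, outputs in tasks:
        wait_for = set()
        for addr in inputs:
            # Read-after-write: wait for the producer of the data.
            if addr in last_writer:
                wait_for.add(last_writer[addr])
        for addr in outputs:
            # Write-after-write: wait for the previous writer.
            if addr in last_writer:
                wait_for.add(last_writer[addr])
            # Write-after-read: wait for readers of the old value.
            wait_for.update(readers.get(addr, ()))
        deps[name] = wait_for
        for addr in inputs:
            readers.setdefault(addr, set()).add(name)
        for addr in outputs:
            last_writer[addr] = name
            readers[addr] = set()
    return deps


# Hypothetical task graph: B and C can run in parallel once A finishes.
example = [
    ("A", [], ["x"]),
    ("B", ["x"], ["y"]),
    ("C", ["x"], ["z"]),
    ("D", ["y", "z"], ["x"]),
]
for task, wait_for in build_dependences(example).items():
    print(task, "waits for", sorted(wait_for))
```

    This bookkeeping runs for every task created, which is why its cost becomes significant as tasks get finer-grained, the overhead the abstract says a hardware accelerator is meant to reduce.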

    Fast evaluation methodology for automatic custom hardware prototyping

    Hardware customization for scientific applications has shown great potential for reducing power consumption and increasing performance. In particular, the automatic generation of ISA extensions for General-Purpose Processors (GPPs) to accelerate domain-specific applications is an active field of research. Those domain-specific accelerated processors are mostly evaluated in simulation environments because of technical and programmability issues that arise when using real hardware. There is no automatic mechanism to test those custom units in a real hardware environment. In this paper we present a toolchain that can automatically identify candidate parts of the code suitable for reconfigurable hardware acceleration. We validate our toolchain using ClustalW. Postprint (published version).

    Methodology for the automatic generation and evaluation of custom hardware (original title: Metodología para la generación y evaluación automática de hardware específico)

    In the field of bioinformatics there are applications that challenge the design of new processor architectures in terms of performance, since their characteristics differ from those of general-purpose applications. We therefore propose a new architecture with reconfigurable functional units for a specific application domain. The first step in defining the new architecture is to create the processor's new ISA, which consists of extensions to the original ISA. To achieve this goal, we present a methodology for automatically identifying instruction patterns and generating prototypes of the functional units that execute them. We implemented the methodology experimentally with the support of the Trimaran infrastructure for identifying ISA extensions, the DWARV tool for generating VHDL code, and the MOLEN platform for evaluating the automatically generated custom hardware prototypes. In initial evaluations of the prototypes generated for a case-study application, ClustalW, we obtained a speed-up of up to 8.54x for a single accelerator, while the speed-up of the whole application is above 2x. Postprint (published version).
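
    The two speed-up figures in the abstract above (up to 8.54x for a single accelerator, above 2x for the whole application) can be related through Amdahl's law. The sketch below is a back-of-the-envelope reading that assumes a single accelerated kernel and no other overheads; the paper does not state this, and the function names are hypothetical.

```python
def overall_speedup(fraction, kernel_speedup):
    """Amdahl's law: whole-application speed-up when `fraction` of the
    original run time is accelerated by `kernel_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


def implied_fraction(overall, kernel_speedup):
    """Invert Amdahl's law: fraction of run time that must be accelerated
    to reach `overall` speed-up with the given kernel speed-up."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / kernel_speedup)


# Figures quoted in the abstract; the single-kernel reading is an assumption.
kernel = 8.54
overall = 2.0

f = implied_fraction(overall, kernel)
print(f"fraction of run time accelerated: {f:.3f}")
print(f"check, overall speed-up: {overall_speedup(f, kernel):.2f}x")
```

    Under these assumptions, a little over half of the original run time would have to sit in the accelerated code for the whole application to reach 2x.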

    Preliminary work on a mechanism for testing a customized architecture

    Hardware customization for scientific applications has shown great potential for reducing power consumption and increasing performance. In particular, the automatic generation of ISA extensions for General-Purpose Processors (GPPs) to accelerate domain-specific applications is an active field of research. Those domain-specific customized processors are mostly evaluated in simulation environments because of technical and programmability issues that arise when using real hardware. There is no automatic mechanism to test ISA extensions in a real hardware environment. In this paper we present a toolchain that can automatically identify candidate parts of the code suitable for acceleration and test them in reconfigurable hardware. We validate our toolchain using a bioinformatics application, ClustalW, obtaining an overall speed-up of over 2x. Postprint (published version).

    Mechanical behavior and failure micromechanisms of hybrid 3D woven composites in tension

    The deformation and failure micromechanisms of a hybrid 3D woven composite were studied in tension. Plain and open-hole composite coupons were tested in tension until failure in the fill and warp directions, as were fiber tows extracted from the dry fabric and impregnated with the matrix. The macroscopic evolution of damage in the composite coupons was assessed by means of periodic unloading–reloading (to obtain the elastic modulus and the residual strain), whereas the microscopic mechanisms were established by means of X-ray computed microtomography. To this end, specimens were periodically removed from the mechanical testing machine and infiltrated with ZnI-containing liquid to assess the main damage modes as a function of the applied strain. The experimental observations and the predictions of an isostrain model were used to understand the key factors controlling the elastic modulus, strength and notch sensitivity of hybrid 3D woven composites in tension. It was found that the full contribution of the glass fibers to the composite strength was not exploited, because of the premature fracture of the carbon fibers, but their presence increased the fracture strain and the energy dissipated during fracture. Thus, hybridization of the 3D woven composite led to notch-insensitive behavior, as demonstrated by open-hole tests.
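
    As a rough illustration of what an isostrain model predicts, the sketch below assumes that all phases see the same strain, respond linearly, and carry no load once their failure strain is exceeded. This is a simplification for illustration, not the model used in the paper, and every volume fraction, modulus and failure strain in it is hypothetical.

```python
def isostrain_modulus(phases):
    """Rule of mixtures under equal strain: E = sum(V_i * E_i).

    phases: list of (volume_fraction, modulus, failure_strain).
    """
    return sum(v * e for v, e, _ in phases)


def isostrain_stress(strain, phases):
    """Composite stress at a given strain. Each phase is linear elastic
    and is assumed to carry no load beyond its failure strain."""
    return sum(v * e * strain for v, e, eps_f in phases if strain <= eps_f)


# Hypothetical values (moduli in GPa), NOT taken from the paper.
phases = [
    (0.30, 230.0, 0.015),  # carbon fibers: stiff, low failure strain
    (0.20, 70.0, 0.030),   # glass fibers: more compliant, higher failure strain
    (0.50, 3.0, 0.050),    # matrix
]

print("initial modulus (GPa):", isostrain_modulus(phases))
for strain in (0.010, 0.020):
    print(f"strain {strain:.3f} -> stress {isostrain_stress(strain, phases):.3f} GPa")
```

    With these made-up numbers the stress drops once the carbon fibers fail while the glass fibers keep carrying load to a larger strain, which mirrors the qualitative trend the abstract reports.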

    Numerical modeling of the mechanical behavior of a polypropylene felt (original title: Modelización numérica del comportamiento mecánico de un fieltro de polipropileno)

    This work presents a continuum model for a dense felt. Equating the power density of a continuum element to the mechanical power density acting on the set of fibers yields an expression for the stress tensor in the reference configuration. The model is completed by including a damage model that represents phenomenologically the mechanisms of fiber pull-out and fiber fracture. The model was implemented as a user material subroutine for a finite element code (ABAQUS/Explicit), formulated for large deformations. The results were compared with experiments on a commercial felt (geotextile) of polypropylene fibers and show that the model can reproduce the behavior of the material up to damage localization and loss of its load-bearing capacity.