1,452 research outputs found

    Automated design of domain-specific custom instructions

    Racing to hardware-validated simulation

    Processor simulators rely on detailed timing models of the processor pipeline to evaluate performance. The diversity of real-world processor designs mandates flexible simulators that expose parts of the underlying model to the user as configurable parameters. Consequently, the accuracy of modeling a real processor depends both on the accuracy of the pipeline model itself and on how accurately the configuration parameters are set for the modeled processor. Unfortunately, processor vendors publicly disclose only a subset of their design decisions, which raises the probability of specification inaccuracies when modeling these processors. Inaccurately tuned model parameters cause the simulated processor to deviate from the actual one. In the worst case, improper parameters may lead to imbalanced pipeline models that compromise the simulation output. Therefore, simulation models should be hardware-validated before they are used for performance evaluation. As processors increase in complexity and diversity, validating a simulator model against real hardware becomes increasingly challenging and time-consuming. In this work, we propose a methodology for validating simulation models against real hardware. We create a framework that relies on micro-benchmarks to collect performance statistics on real hardware, and on machine learning-based algorithms to fine-tune the unknown parameters based on the accumulated statistics. We overhaul the Sniper simulator to support the ARM AArch64 instruction-set architecture (ISA), and introduce two new timing models for ARM-based in-order and out-of-order cores. Using our proposed simulator validation framework, we tune the in-order and out-of-order models to match the performance of real-world implementations of the Cortex-A53 and Cortex-A72 cores with an average error of 7% and 15%, respectively, across a set of SPEC CPU2017 benchmarks.
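The tuning idea in this abstract (fit unknown simulator parameters so that simulated results match micro-benchmark measurements from real hardware) can be sketched as follows. This is only an illustration under loud assumptions: `simulate_cycles` is a hypothetical two-parameter toy model, not Sniper; the benchmark mixes and "measured" cycle counts are made up; and a plain exhaustive grid search stands in for the machine learning-based algorithms the abstract mentions.

```python
import itertools

# Hypothetical toy "simulator": predicted cycles for one micro-benchmark,
# given two unknown latency parameters. Not the Sniper timing model.
def simulate_cycles(params, bench):
    loads, mults = bench
    return loads * params["load_latency"] + mults * params["mul_latency"]

def mean_abs_pct_error(params, benches, measured):
    # Average relative error between simulated and measured cycle counts.
    errors = [abs(simulate_cycles(params, b) - m) / m
              for b, m in zip(benches, measured)]
    return sum(errors) / len(errors)

def tune(benches, measured, latency_range=range(1, 11)):
    # Exhaustive grid search (a stand-in for a learned or racing-based tuner):
    # try every parameter combination and keep the one with the lowest error.
    candidates = (
        {"load_latency": l, "mul_latency": m}
        for l, m in itertools.product(latency_range, repeat=2)
    )
    best = min(candidates,
               key=lambda p: mean_abs_pct_error(p, benches, measured))
    return best, mean_abs_pct_error(best, benches, measured)

# Made-up micro-benchmarks as (load count, multiply count) pairs.
benches = [(100, 0), (0, 100), (50, 50)]
# Made-up "hardware" measurements, generated from hidden latencies 4 and 3.
hidden = {"load_latency": 4, "mul_latency": 3}
measured = [simulate_cycles(hidden, b) for b in benches]

best, err = tune(benches, measured)
print(best, err)
```

Each micro-benchmark here isolates one parameter (only loads, or only multiplies), which is what makes the hidden values recoverable; a real pipeline model has many interacting parameters, so an exhaustive search like this one would not scale.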

    Automatic generation and testing of application specific hardware accelerators on a new reconfigurable OpenSPARC platform

    Hardware customization for scientific applications has shown great potential to address the current holy grail of computer architecture: reducing power consumption while increasing performance. In particular, the automatic generation of domain-specific accelerators for General-Purpose Processors (GPPs) is an active field of research, to the point that leading hardware design companies (e.g. Intel, ARM) are announcing commercial platforms that integrate GPPs and FPGAs. In this paper we present a new framework with a holistic approach to the challenge of design exploration for application-specific accelerators. Our work focuses on a target platform consisting of a GPP with a reconfigurable functional unit. The framework includes a reconfigurable 1-core, 1-thread OpenSPARC with a new programmable specific purpose unit (SPU) inside the OpenSPARC core. To program the SPU we have developed an automatic toolchain that profiles an application and discovers its main computing bottlenecks. With that information, our toolchain can both design application-specific hardware accelerators that are automatically mapped onto the SPU and generate the binary code needed to run the application using those accelerators. The OpenSPARC with the new application-specific accelerators, defined in a Hardware Description Language, can then be executed and measured. Although it still awaits further development, our framework is currently a proof of concept showing that this kind of system can be developed and programmed as easily as a GPP. In the near future it could become a source of valuable information about the capabilities and drawbacks of such mixed GPP-FPGA systems.

    Metodología para la generación y evaluación automática de hardware específico (Methodology for the automatic generation and evaluation of application-specific hardware)

    Bioinformatics includes applications that challenge the design of new processor architectures in terms of performance, because their characteristics differ from those of general-purpose applications. We therefore propose a new architecture with reconfigurable functional units for a specific application domain. The first step in defining the new architecture is to create the processor's new ISA, which will consist of extensions to the original ISA. To that end, we present a methodology for automatically identifying instruction patterns and generating prototypes of the functional units that execute them. We have implemented the methodology experimentally, using the Trimaran infrastructure to identify ISA extensions, the DWARV tool to generate VHDL code, and the MOLEN platform to evaluate the automatically generated application-specific hardware prototypes. In initial evaluations of the prototypes generated for a case-study application, ClustalW, we obtained a speed-up of up to 8.54x for a single accelerator, while the speed-up of the whole application is above 2x.
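The gap between the two speed-ups reported above (8.54x for one accelerator, just over 2x for the whole application) is what Amdahl's law predicts when only part of the run time is accelerated. The sketch below is a worked illustration, not a result from the paper: it assumes a single accelerated region and asks what fraction of the original run time that region would need to cover. The abstract does not state that fraction, and the real application may use several accelerators.

```python
def overall_speedup(fraction, local_speedup):
    # Amdahl's law: `fraction` of the original run time is sped up by
    # `local_speedup`; the remaining (1 - fraction) is unchanged.
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

def fraction_for(overall, local_speedup):
    # Amdahl's law solved for the accelerated fraction.
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / local_speedup)

# Illustrative only: fraction of run time a single 8.54x accelerator
# would have to cover to give exactly 2x overall.
f = fraction_for(2.0, 8.54)
print(f"accelerated fraction: {f:.3f}")
print(f"check, overall speed-up: {overall_speedup(f, 8.54):.2f}")
```

Under that single-region assumption, a bit more than half of the original run time would have to sit inside the accelerated code.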

    Fast evaluation methodology for automatic custom hardware prototyping

    Hardware customization for scientific applications has shown great potential for reducing power consumption and increasing performance. In particular, the automatic generation of ISA extensions for General-Purpose Processors (GPPs) to accelerate domain-specific applications is an active field of research. These domain-specific accelerated processors are mostly evaluated in simulation environments because of the technical and programmability issues that arise when using real hardware. There is no automatic mechanism for testing these custom units in a real hardware environment. In this paper we present a toolchain that can automatically identify candidate parts of the code suitable for reconfigurable hardware acceleration. We validate our toolchain using ClustalW.

    Preliminary work on a mechanism for testing a customized architecture

    Hardware customization for scientific applications has shown great potential for reducing power consumption and increasing performance. In particular, the automatic generation of ISA extensions for General-Purpose Processors (GPPs) to accelerate domain-specific applications is an active field of research. These domain-specific customized processors are mostly evaluated in simulation environments because of the technical and programmability issues that arise when using real hardware. There is no automatic mechanism for testing ISA extensions in a real hardware environment. In this paper we present a toolchain that can automatically identify candidate parts of the code suitable for acceleration and test them on reconfigurable hardware. We validate our toolchain using a bioinformatics application, ClustalW, obtaining an overall speed-up of over 2x.

    Automatic design of domain-specific instructions for low-power processors

    This paper explores hardware specialization of low-power processors to improve performance and energy efficiency. Our main contribution is an automated framework that analyzes the instruction sequences of applications within a domain at the loop-body level and identifies exactly and partially matching sequences across applications that can become custom instructions. Our framework transforms sequences into a new code abstraction, a Merging Diagram, that improves similarity identification; clusters similar groups of potential custom instructions to effectively reduce the search space; and selects merged custom instructions to efficiently exploit the available customizable area. For a set of 11 media applications, our fast framework generates instructions that significantly improve the energy-delay product and speed-up, achieving more than double the savings of a technique that analyzes sequences within basic blocks. This paper shows that partially matched custom instructions, which do not significantly increase design time, are crucial to achieving higher energy efficiency in limited hardware areas.
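The distinction between exactly and partially matching instruction sequences can be illustrated with a small sketch. It is not the Merging Diagram abstraction from the paper: Python's standard `difflib.SequenceMatcher` is swapped in as a generic similarity measure, and the opcode names, the loop bodies, and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

def similarity(seq_a, seq_b):
    # Ratio in [0, 1]: 2 * matched elements / total elements in both sequences.
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()

def classify(seq_a, seq_b, threshold=0.6):
    # "exact" for identical sequences, "partial" for similar ones,
    # "none" otherwise. The threshold is an arbitrary illustrative choice.
    if seq_a == seq_b:
        return "exact"
    return "partial" if similarity(seq_a, seq_b) >= threshold else "none"

# Hypothetical loop bodies from three applications, as opcode sequences.
loop_a = ["load", "mul", "add", "store"]
loop_b = ["load", "mul", "sub", "store"]
loop_c = ["div", "shift", "branch"]

print(classify(loop_a, loop_a))  # identical sequences
print(classify(loop_a, loop_b))  # differ in one opcode
print(classify(loop_a, loop_c))  # nothing in common
```

In this toy setting, `loop_a` and `loop_b` share three of four opcodes in order, so one merged custom instruction with a selectable add/sub stage could plausibly serve both; covering more candidates with less hardware is the benefit the abstract attributes to partial matching.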

    Mitochondrial Sulfide Detoxification Requires a Functional Isoform O-Acetylserine(thiol)lyase C in Arabidopsis thaliana

    In non-cyanogenic species, the main sources of cyanide are ethylene and camalexin biosynthesis. In mitochondria, cyanide is a potent inhibitor of cytochrome c oxidase and is metabolised by the β-cyanoalanine synthase CYS-C1, which catalyses the conversion of cysteine and cyanide to hydrogen sulfide and β-cyanoalanine. The hydrogen sulfide released also inhibits cytochrome c oxidase and needs to be detoxified by the mitochondrial O-acetylserine(thiol)lyase isoform, OAS-C, which catalyses the incorporation of sulfide into O-acetylserine to produce cysteine, thus generating a cyclic pathway in the mitochondria. The loss of functional OAS-C isoforms causes phenotypic characteristics very similar to those caused by the loss of the CYS-C1 enzyme, showing defects in root hair formation. Genetic complementation with the OAS-C gene rescues the impairment of root hair elongation, restoring the wild-type phenotype. In the oas-c null mutant, mitochondria lose their capacity to properly detoxify cyanide and the resulting sulfide because the latter cannot be re-assimilated into cysteine. Consequently, we observe an accumulation of sulfide and cyanide and of the alternative oxidase, which is unable to prevent the production of reactive oxygen species, probably due to the accumulation of both toxic molecules. Our results suggest that the significance of OAS-C lies in its role in proper sulfide and cyanide detoxification in mitochondria. Funding: Ministerio de Ciencia e Innovación BIO2010-15201; Junta de Andalucía BIO–27

    Extreme Environmental Heat Stress in Costa Rica

    Conference paper, Universidad de Costa Rica, Escuela de Educación Física y Deportes, 1989. Presented at the 32nd ICHPER Anniversary World Congress, held at the State University, Frostburg, Maryland, USA, 16-21 July 1989. Five years of meteorological records were manually tabulated and analyzed to estimate the maximum WBGT (wet-bulb globe temperature, an environmental heat stress index) values that could be expected at different times of the day in three regions of Costa Rica. The results were presented at an international conference in the U.S.A. in 1989. The original paper presented at the conference is included, together with improved graphs reproduced from the undergraduate thesis by Mario Fco. Calderón Navarro and Cecilia González Álvarez. This information should help with the planning of long-distance running and triathlon events in Costa Rica. Comparison with recent five-year records is encouraged; current, on-site WBGT readings should always be used at endurance events.