Search CORE

1,128 research outputs found

Automatic generation and testing of application specific hardware accelerators on a new reconfigurable OpenSPARC platform

Author: Fernández Mikel
González Álvarez Cecilia
Jiménez González Daniel
Martorell Bofill Xavier
Álvarez Martínez Carlos
Publication venue
Publication date: 01/01/2011
Field of study

Specific hardware customization for scientific applications has shown a big potential to address the current holy grail in computer architecture: reducing power consumption while increasing performance. In particular, the automatic generation of domain-specific accelerators for General- Purpose Processors (GPPs) is an active field of research to the point that different leading hardware design companies (e.g. Intel, ARM) are announcing commercial platforms that integrate GPPs and FPGAs. In this paper we present a new framework with a holistic approach that addresses the challenge of design exploration of specific application accelerators. Our work focuses on a target platform consisting of a GPP with a reconfigurable functional unit. The framework includes a reconfigurable 1-core 1-thread OpenSPARC with a new programmable specific purpose unit (SPU) inside the OpenSPARC core. In order to program the SPU we have developed an automatic toolchain that profiles an application and discovers its main computing bottlenecks. With that information our toolchain is able to both design hardware specific accelerators that can be automatically mapped in the aforementioned SPU, and generate the binary code necessary to run the application using those accelerators. The OpenSPARC with the new specific application accelerators, defined in a Hardware Description Language, can then be executed and measured. Still awaiting further development, nowadays our framework is a proof-of-concept that shows that this kind of systems can be developed and programmed as easily as a GPP. In a near future it would be the source of very interesting information about the capabilities and drawbacks of those mixed GPP-FPGA systems.Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

Metodología para la generación y evaluación automática de hardware específico

Author: Gaydadjiev Georgi
González Cecilia
Jiménez González Daniel
Martorell Bofill Xavier
Álvarez Martínez Carlos
Publication venue
Publication date: 01/09/2009
Field of study

En el área de la bioinformática podemos encontrar aplicaciones que suponen un reto para el diseño de nuevas arquitecturas de procesadores en términos de rendimiento, ya que sus características difieren de las de las aplicaciones de propósito general. Por ello proponemos una nueva arquitectura con unidades funcionales reconfigurables para un dominio específico de aplicaciones. Así, el primer paso para definir la nueva arquitectura será la creación de la nueva ISA del procesador, que se compondrá de extensiones de la ISA original. Para conseguir dicho objetivo, presentamos una metodología para identificar automáticamente patrones de instrucciones y generar prototipos de las unidades funcionales que las ejecutan. Hemos implementado la metodología de manera experimental con el soporte de la infraestructura Trimaran para la identificación de extensiones de la ISA, la herramienta DWARV para la generación de código VHDL, y la plataforma MOLEN para la evaluación de los prototipos hardware específicos generados automáticamente. En las evaluaciones iniciales de los prototipos generados para una aplicación de estudio, ClustalW, se ha obtenido hasta un 8.54x de speed-up para un único acelerador, mientras que el speed-up de toda la aplicación está por encima de 2x.Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

MPSoCBench : um framework para avaliação de ferramentas e metodologias para sistemas multiprocessados em chip

Author: Garanhani Liana Dessandre Duenha, 1977-
Publication venue: [s.n.]
Publication date: 30/08/2018
Field of study

Orientador: Rodolfo Jardim de AzevedoTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Recentes metodologias e ferramentas de projetos de sistemas multiprocessados em chip (MPSoC) aumentam a produtividade por meio da utilização de plataformas baseadas em simuladores, antes de definir os últimos detalhes da arquitetura. No entanto, a simulação só é eficiente quando utiliza ferramentas de modelagem que suportem a descrição do comportamento do sistema em um elevado nível de abstração. A escassez de plataformas virtuais de MPSoCs que integrem hardware e software escaláveis nos motivou a desenvolver o MPSoCBench, que consiste de um conjunto escalável de MPSoCs incluindo quatro modelos de processadores (PowerPC, MIPS, SPARC e ARM), organizado em plataformas com 1, 2, 4, 8, 16, 32 e 64 núcleos, cross-compiladores, IPs, interconexões, 17 aplicações paralelas e estimativa de consumo de energia para os principais componentes (processadores, roteadores, memória principal e caches). Uma importante demanda em projetos MPSoC é atender às restrições de consumo de energia o mais cedo possível. Considerando que o desempenho do processador está diretamente relacionado ao consumo, há um crescente interesse em explorar o trade-off entre consumo de energia e desempenho, tendo em conta o domínio da aplicação alvo. Técnicas de escalabilidade dinâmica de freqüência e voltagem fundamentam-se em gerenciar o nível de tensão e frequência da CPU, permitindo que o sistema alcance apenas o desempenho suficiente para processar a carga de trabalho, reduzindo, consequentemente, o consumo de energia. Para explorar a eficiência energética e desempenho, foram adicionados recursos ao MPSoCBench, visando explorar escalabilidade dinâmica de voltaegem e frequência (DVFS) e foram validados três mecanismos com base na estimativa dinâmica de energia e taxa de uso de CPUAbstract: Recent design methodologies and tools aim at enhancing the design productivity by providing a software development platform before the definition of the final Multiprocessor System on Chip (MPSoC) architecture details. However, simulation can only be efficiently performed when using a modeling and simulation engine that supports system behavior description at a high abstraction level. The lack of MPSoC virtual platform prototyping integrating both scalable hardware and software in order to create and evaluate new methodologies and tools motivated us to develop the MPSoCBench, a scalable set of MPSoCs including four different ISAs (PowerPC, MIPS, SPARC, and ARM) organized in platforms with 1, 2, 4, 8, 16, 32, and 64 cores, cross-compilers, IPs, interconnections, 17 parallel version of software from well-known benchmarks, and power consumption estimation for main components (processors, routers, memory, and caches). An important demand in MPSoC designs is the addressing of energy consumption constraints as early as possible. Whereas processor performance comes with a high power cost, there is an increasing interest in exploring the trade-off between power and performance, taking into account the target application domain. Dynamic Voltage and Frequency Scaling techniques adaptively scale the voltage and frequency levels of the CPU allowing it to reach just enough performance to process the system workload while meeting throughput constraints, and thereby, reducing the energy consumption. To explore this wide design space for energy efficiency and performance, both for hardware and software components, we provided MPSoCBench features to explore dynamic voltage and frequency scalability (DVFS) and evaluated three mechanisms based on energy estimation and CPU usage rateDoutoradoCiência da ComputaçãoDoutora em Ciência da Computaçã

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio da Producao Cientifica e Intelectual da Unicamp

Compiler-Driven Leakage Energy Reduction

Author: David Atienza
Diederik Verkest
Francky Catthoor
Giovanni De Micheli
José L Ayala
Marisa Lopez-Vallejo
Praveen Raghavan
Publication venue
Publication date: 01/01/2006
Field of study

Abstract. Tomorrow's embedded devices need to run high-resolution multimedia applications which need an enormous computational complexity with a very low energy consumption constraint. In this context, the register file is one of the key sources of power consumption and its inappropriate design and management can severely affect the performance of the system. In this paper, we present a new approach to reduce the energy of the shared register file in upcoming embedded VLIW architectures with several processing units. Energy savings up to a 60% can be obtained in the register file without any performance penalty. It is based on a set of hardware extensions and a compiler-based energy-aware register assignment algorithm that enable the de/activation of parts of the register file (i.e. sub-banks) in an independent way at run-time, which can be easily included in these embedded architectures

CiteSeerX

Compiler-Driven Leakage Energy Reduction in Banked Register Files

Author: Atienza David
Ayala José L.
Catthoor Francky
De Micheli Giovanni
Lopez-Vallejo Marisa
Raghavan Praveen
Verkest Diederik
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006
Field of study

Tomorrow’s embedded devices need to run high-resolution multimedia applications which need an enormous computational complexity with a very low energy consumption constraint. In this context, the register file is one of the key sources of power consumption and its inappropriate design and management can severely affect the performance of the system. In this paper, we present a new approach to reduce the energy of the shared register file in upcoming embedded VLIW architectures with several processing units. Energy savings up to a 60% can be obtained in the register file without any performance penalty. It is based on a set of hardware extensions and a compiler-based energy-aware reg- ister assignment algorithm that enable the de/activation of parts of the register file (i.e. sub-banks) in an independent way at run-time, which can be easily included in these embedded architectures

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Tracing Hardware Monitors in the GR712RC Multicore Platform: Challenges and Lessons Learnt from a Space Case Study

Author: Abella Jaume
Cazorla Francisco J.
Fernandez Mikel
Girbal Sylvain
Mezzetti Enrico
Palomo Xavier
Rioux Laurent
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 32nd Euromicro Conference on Real-Time Systems (ECRTS 2020)
Publication date: 01/01/2020
Field of study

The demand for increased computing performance is driving industry in critical-embedded systems (CES) domains, e.g. space, towards the use of multicores processors. Multicores, however, pose several challenges that must be addressed before their safe adoption in critical embedded domains. One of the prominent challenges is software timing analysis, a fundamental step in the verification and validation process. Monitoring and profiling solutions, traditionally used for debugging and optimization, are increasingly exploited for software timing in multicores. In particular, hardware event monitors related to requests to shared hardware resources are building block to assess and restraining multicore interference. Modern timing analysis techniques build on event monitors to track and control the contention tasks can generate each other in a multicore platform. In this paper we look into the hardware profiling problem from an industrial perspective and address both methodological and practical problems when monitoring a multicore application. We assess pros and cons of several profiling and tracing solutions, showing that several aspects need to be taken into account while considering the appropriate mechanism to collect and extract the profiling information from a multicore COTS platform. We address the profiling problem on a representative COTS platform for the aerospace domain to find that the availability of directly-accessible hardware counters is not a given, and it may be necessary to the develop specific tools that capture the needs of both the user’s and the timing analysis technique requirements. We report challenges in developing an event monitor tracing tool that works for bare-metal and RTEMS configurations and show the accuracy of the developed tool-set in profiling a real aerospace application. We also show how the profiling tools can be exploited, together with handcrafted benchmarks, to characterize the application behavior in terms of multicore timing interference.This work has been partially supported by a collaboration agreement between Thales Research and the Barcelona Supercomputing Center, and the European Research Council (ERC) under the EU’s Horizon 2020 research and innovation programme (grant agreement No. 772773). MINECO partially supported Jaume Abella under Ramon y Cajal postdoctoral fellowship (RYC2013-14717).Peer ReviewedPostprint (published version

UPCommons. Portal del coneixement obert de la UPC

Dagstuhl Research Online Publication Server

Energy-Aware Compilation and Hardware Design for VLIW Embedded Systems

Author: Atienza David
Ayala Jose L.
Catthoor Francky
Lopez-Vallejo Marisa
Raghavan Praveen
Verkest Diederik
Publication venue: 'Inderscience Publishers'
Publication date: 03/01/2009
Field of study

Tomorrow's embedded devices need to run multimedia applications demanding high computational power with low energy consumption constraints. In this context, the register file is a key source of power consumption and its inappropriate design and management severely affects system power. In this paper, we present a new approach to reduce the energy of shared register files in forthcoming embedded VLIW processors running real-life applications up to 60% without performance penalty. This approach relies on limited hardware extensions and a compiler-based energy-aware register assignment algorithm to deactivate at run-time parts of the register file (i.e., sub-banks) in an independent way

Infoscience - École polytechnique fédérale de Lausanne