5,182 research outputs found

    Generic Pipelined Processor Modeling and High Performance Cycle-Accurate Simulator Generation

    Full text link
    Detailed modeling of processors and high performance cycle-accurate simulators are essential for today's hardware and software design. These problems are challenging enough by themselves and have seen many previous research efforts. Addressing both simultaneously is even more challenging, with many existing approaches focusing on one over another. In this paper, we propose the Reduced Colored Petri Net (RCPN) model that has two advantages: first, it offers a very simple and intuitive way of modeling pipelined processors; second, it can generate high performance cycle-accurate simulators. RCPN benefits from all the useful features of Colored Petri Nets without suffering from their exponential growth in complexity. RCPN processor models are very intuitive since they are a mirror image of the processor pipeline block diagram. Furthermore, in our experiments on the generated cycle-accurate simulators for XScale and StrongArm processor models, we achieved an order of magnitude (~15 times) speedup over the popular SimpleScalar ARM simulator.Comment: Submitted on behalf of EDAA (http://www.edaa.com/

    A Virtual Testbed for Embedded Systems

    Get PDF
    Hardware-In-the-Loop (HIL) Simulation is a simulation approach in which a hardware embedded processor is connected to the simulation computer that simulates the electrical/mechanical devices controlled by the embedded processor. By using a real-time simulation computer and special-purpose hardware for connecting to the embedded processor, this method of simulation can be very precise but is costly. We are proposing an alternative method, HIL simulation with a network link, in which the device under test (the embedded processor) communicates with the simulation computer over a network connection (in our case a serial line) instead of through special-purpose hardware. We present an abstraction layer that facilitates the simulation of external devices. An earlier prototype had been developed for a 16-bit TMS320LF2407A DSP from Texas Instruments. We generalized the approach to the more advanced 32-bit TMS320F28335 DSP. We have made the changes in the DSP abstraction layer to enable more features and provide more flexibility to the programmer. For example, we introduced a shadow interrupt vector to make the simulation layer more general. We developed various scenarios to measure the performance of the system. In particular, we measure round-trip time and through-put for the communication between the simulator and the DSP. Also we rewrote the serial line drivers on the DSP to incorporate different working scenarios and to invoke the timers on the DSP for measuring the execution time. Our work helps to judge the performance of the system and to identify the application domains for this approach

    Co-simulation techniques based on virtual platforms for SoC design and verification in power electronics applications

    Get PDF
    En las últimas décadas, la inversión en el ámbito energético ha aumentado considerablemente. Actualmente, existen numerosas empresas que están desarrollando equipos como convertidores de potencia o máquinas eléctricas con sistemas de control de última generación. La tendencia actual es usar System-on-chips y Field Programmable Gate Arrays para implementar todo el sistema de control. Estos dispositivos facilitan el uso de algoritmos de control más complejos y eficientes, mejorando la eficiencia de los equipos y habilitando la integración de los sistemas renovables en la red eléctrica. Sin embargo, la complejidad de los sistemas de control también ha aumentado considerablemente y con ello la dificultad de su verificación. Los sistemas Hardware-in-the-loop (HIL) se han presentado como una solución para la verificación no destructiva de los equipos energéticos, evitando accidentes y pruebas de alto coste en bancos de ensayo. Los sistemas HIL simulan en tiempo real el comportamiento de la planta de potencia y su interfaz para realizar las pruebas con la placa de control en un entorno seguro. Esta tesis se centra en mejorar el proceso de verificación de los sistemas de control en aplicaciones de electrónica potencia. La contribución general es proporcionar una alternativa a al uso de los HIL para la verificación del hardware/software de la tarjeta de control. La alternativa se basa en la técnica de Software-in-the-loop (SIL) y trata de superar o abordar las limitaciones encontradas hasta la fecha en el SIL. Para mejorar las cualidades de SIL se ha desarrollado una herramienta software denominada COSIL que permite co-simular la implementación e integración final del sistema de control, sea software (CPU), hardware (FPGA) o una mezcla de software y hardware, al mismo tiempo que su interacción con la planta de potencia. Dicha plataforma puede trabajar en múltiples niveles de abstracción e incluye soporte para realizar co-simulación mixtas en distintos lenguajes como C o VHDL. A lo largo de la tesis se hace hincapié en mejorar una de las limitaciones de SIL, su baja velocidad de simulación. Se proponen diferentes soluciones como el uso de emuladores software, distintos niveles de abstracción del software y hardware, o relojes locales en los módulos de la FPGA. En especial se aporta un mecanismo de sincronizaron externa para el emulador software QEMU habilitando su emulación multi-core. Esta aportación habilita el uso de QEMU en plataformas virtuales de co-simulacion como COSIL. Toda la plataforma COSIL, incluido el uso de QEMU, se ha analizado bajo diferentes tipos de aplicaciones y bajo un proyecto industrial real. Su uso ha sido crítico para desarrollar y verificar el software y hardware del sistema de control de un convertidor de 400 kVA

    Pre-validation of SoC via hardware and software co-simulation

    Get PDF
    Abstract. System-on-chips (SoCs) are complex entities consisting of multiple hardware and software components. This complexity presents challenges in their design, verification, and validation. Traditional verification processes often test hardware models in isolation until late in the development cycle. As a result, cooperation between hardware and software development is also limited, slowing down bug detection and fixing. This thesis aims to develop, implement, and evaluate a co-simulation-based pre-validation methodology to address these challenges. The approach allows for the early integration of hardware and software, serving as a natural intermediate step between traditional hardware model verification and full system validation. The co-simulation employs a QEMU CPU emulator linked to a register-transfer level (RTL) hardware model. This setup enables the execution of software components, such as device drivers, on the target instruction set architecture (ISA) alongside cycle-accurate RTL hardware models. The thesis focuses on two primary applications of co-simulation. Firstly, it allows software unit tests to be run in conjunction with hardware models, facilitating early communication between device drivers, low-level software, and hardware components. Secondly, it offers an environment for using software in functional hardware verification. A significant advantage of this approach is the early detection of integration errors. Software unit tests can be executed at the IP block level with actual hardware models, a task previously only possible with costly system-level prototypes. This enables earlier collaboration between software and hardware development teams and smoothens the transition to traditional system-level validation techniques.Järjestelmäpiirin esivalidointi laitteiston ja ohjelmiston yhteissimulaatiolla. Tiivistelmä. Järjestelmäpiirit (SoC) ovat monimutkaisia kokonaisuuksia, jotka koostuvat useista laitteisto- ja ohjelmistokomponenteista. Tämä monimutkaisuus asettaa haasteita niiden suunnittelulle, varmennukselle ja validoinnille. Perinteiset varmennusprosessit testaavat usein laitteistomalleja eristyksissä kehityssyklin loppuvaiheeseen saakka. Tämän myötä myös yhteistyö laitteisto- ja ohjelmistokehityksen välillä on vähäistä, mikä hidastaa virheiden tunnistamista ja korjausta. Tämän diplomityön tavoitteena on kehittää, toteuttaa ja arvioida laitteisto-ohjelmisto-yhteissimulointiin perustuva esivalidointimenetelmä näiden haasteiden ratkaisemiseksi. Menetelmä mahdollistaa laitteiston ja ohjelmiston varhaisen integroinnin, toimien luonnollisena välietappina perinteisen laitteistomallin varmennuksen ja koko järjestelmän validoinnin välillä. Yhteissimulointi käyttää QEMU suoritinemulaattoria, joka on yhdistetty rekisterinsiirtotason (RTL) laitteistomalliin. Tämä mahdollistaa ohjelmistokomponenttien, kuten laiteajureiden, suorittamisen kohdejärjestelmän käskysarja-arkkitehtuurilla (ISA) yhdessä kellosyklitarkkojen RTL laitteistomallien kanssa. Työ keskittyy kahteen yhteissimulaation pääsovellukseen. Ensinnäkin se mahdollistaa ohjelmiston yksikkötestien suorittamisen laitteistomallien kanssa, varmistaen kommunikaation laiteajurien, matalan tason ohjelmiston ja laitteistokomponenttien välillä. Toiseksi se tarjoaa ympäristön ohjelmiston käyttämiseen toiminnallisessa laitteiston varmennuksessa. Merkittävä etu tästä lähestymistavasta on integraatiovirheiden varhainen havaitseminen. Ohjelmiston yksikkötestejä voidaan suorittaa jo IP-lohkon tasolla oikeilla laitteistomalleilla, mikä on aiemmin ollut mahdollista vain kalliilla järjestelmätason prototyypeillä. Tämä mahdollistaa aikaisemman ohjelmisto- ja laitteistokehitystiimien välisen yhteistyön ja helpottaa siirtymistä perinteisiin järjestelmätason validointimenetelmiin

    MURAC: A unified machine model for heterogeneous computers

    Get PDF
    Includes bibliographical referencesHeterogeneous computing enables the performance and energy advantages of multiple distinct processing architectures to be efficiently exploited within a single machine. These systems are capable of delivering large performance increases by matching the applications to architectures that are most suited to them. The Multiple Runtime-reconfigurable Architecture Computer (MURAC) model has been proposed to tackle the problems commonly found in the design and usage of these machines. This model presents a system-level approach that creates a clear separation of concerns between the system implementer and the application developer. The three key concepts that make up the MURAC model are a unified machine model, a unified instruction stream and a unified memory space. A simple programming model built upon these abstractions provides a consistent interface for interacting with the underlying machine to the user application. This programming model simplifies application partitioning between hardware and software and allows the easy integration of different execution models within the single control ow of a mixed-architecture application. The theoretical and practical trade-offs of the proposed model have been explored through the design of several systems. An instruction-accurate system simulator has been developed that supports the simulated execution of mixed-architecture applications. An embedded System-on-Chip implementation has been used to measure the overhead in hardware resources required to support the model, which was found to be minimal. An implementation of the model within an operating system on a tightly-coupled reconfigurable processor platform has been created. This implementation is used to extend the software scheduler to allow for the full support of mixed-architecture applications in a multitasking environment. Different scheduling strategies have been tested using this scheduler for mixed-architecture applications. The design and implementation of these systems has shown that a unified abstraction model for heterogeneous computers provides important usability benefits to system and application designers. These benefits are achieved through a consistent view of the multiple different architectures to the operating system and user applications. This allows them to focus on achieving their performance and efficiency goals by gaining the benefits of different execution models during runtime without the complex implementation details of the system-level synchronisation and coordination

    MPSoCBench : um framework para avaliação de ferramentas e metodologias para sistemas multiprocessados em chip

    Get PDF
    Orientador: Rodolfo Jardim de AzevedoTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Recentes metodologias e ferramentas de projetos de sistemas multiprocessados em chip (MPSoC) aumentam a produtividade por meio da utilização de plataformas baseadas em simuladores, antes de definir os últimos detalhes da arquitetura. No entanto, a simulação só é eficiente quando utiliza ferramentas de modelagem que suportem a descrição do comportamento do sistema em um elevado nível de abstração. A escassez de plataformas virtuais de MPSoCs que integrem hardware e software escaláveis nos motivou a desenvolver o MPSoCBench, que consiste de um conjunto escalável de MPSoCs incluindo quatro modelos de processadores (PowerPC, MIPS, SPARC e ARM), organizado em plataformas com 1, 2, 4, 8, 16, 32 e 64 núcleos, cross-compiladores, IPs, interconexões, 17 aplicações paralelas e estimativa de consumo de energia para os principais componentes (processadores, roteadores, memória principal e caches). Uma importante demanda em projetos MPSoC é atender às restrições de consumo de energia o mais cedo possível. Considerando que o desempenho do processador está diretamente relacionado ao consumo, há um crescente interesse em explorar o trade-off entre consumo de energia e desempenho, tendo em conta o domínio da aplicação alvo. Técnicas de escalabilidade dinâmica de freqüência e voltagem fundamentam-se em gerenciar o nível de tensão e frequência da CPU, permitindo que o sistema alcance apenas o desempenho suficiente para processar a carga de trabalho, reduzindo, consequentemente, o consumo de energia. Para explorar a eficiência energética e desempenho, foram adicionados recursos ao MPSoCBench, visando explorar escalabilidade dinâmica de voltaegem e frequência (DVFS) e foram validados três mecanismos com base na estimativa dinâmica de energia e taxa de uso de CPUAbstract: Recent design methodologies and tools aim at enhancing the design productivity by providing a software development platform before the definition of the final Multiprocessor System on Chip (MPSoC) architecture details. However, simulation can only be efficiently performed when using a modeling and simulation engine that supports system behavior description at a high abstraction level. The lack of MPSoC virtual platform prototyping integrating both scalable hardware and software in order to create and evaluate new methodologies and tools motivated us to develop the MPSoCBench, a scalable set of MPSoCs including four different ISAs (PowerPC, MIPS, SPARC, and ARM) organized in platforms with 1, 2, 4, 8, 16, 32, and 64 cores, cross-compilers, IPs, interconnections, 17 parallel version of software from well-known benchmarks, and power consumption estimation for main components (processors, routers, memory, and caches). An important demand in MPSoC designs is the addressing of energy consumption constraints as early as possible. Whereas processor performance comes with a high power cost, there is an increasing interest in exploring the trade-off between power and performance, taking into account the target application domain. Dynamic Voltage and Frequency Scaling techniques adaptively scale the voltage and frequency levels of the CPU allowing it to reach just enough performance to process the system workload while meeting throughput constraints, and thereby, reducing the energy consumption. To explore this wide design space for energy efficiency and performance, both for hardware and software components, we provided MPSoCBench features to explore dynamic voltage and frequency scalability (DVFS) and evaluated three mechanisms based on energy estimation and CPU usage rateDoutoradoCiência da ComputaçãoDoutora em Ciência da Computaçã

    Analyzing the effects of transient faults into applications

    Get PDF
    As computer chips implementation technologies evolve to obtain more performance, those computer chips are using smaller components, with bigger density of transistors and working with lower power voltages. All these factors turn the computer chips less robust and increase the probability of a transient fault. Transient faults may occur once and never more happen the same way in a computer system lifetime. There are distinct consequences when a transient fault occurs: the operating system might abort the execution if the change produced by the fault is detected by bad behavior of the application, but the biggest risk is that the fault produces an undetected data corruption that modifies the application final result without warnings (for example a bit flip in some crucial data). With the objective of researching transient faults in computer system's processor registers and memory we have developed an extension of HP's and AMD joint full system simulation environment, named COTSon. This extension allows the injection of faults that change a single bit in processor registers and memory of the simulated computer. The developed fault injection system makes it possible to: evaluate the effects of single bit flip transient faults in an application, analyze an application robustness against single bit flip transient faults and validate fault detection mechanism and strategies.L'evolució dels processadors en cerca de millors prestacions fa que els xips duguin transistors més petits i incloguin major quantitat y densitat de transistors, a més d'operar amb un voltatge més baix. Tots aquests factors fan que els processadors siguin menys robusts i augmenten la probabilitat de fallades transitòries. Les fallades transitòries poden ocórrer una vegada i no tornar a passar de la mateixa forma en la vida útil d'un sistema. Quan ocorren poden passar diferents conseqüències: el sistema operatiu pot avortar l'execució quan el canvi produït per la fallada és detectat per mal comportament de l'aplicació, però el risc major és que, amb el canvi produït, ocasioni una corrupció de dades que no sigui detectada i canviï el resultat final de l'aplicació sense que ningú ho sàpiga. Per a investigar sobre els efectes que les fallades transitòries poden ocasionar en els registres d'un processador i en les memòries d'un computador, hem desenvolupat una extensió del simulador d'ordinadors complet de HP (COTSon). L'extensió realitzada permet la injecció de fallades que canvien un bit en registres i en les memòries del computador simulat. La injecció de fallades permet: avaluar els efectes de les fallades transitòries que ocasionen el canvi d'un bit en una aplicació, analitzar la robustesa d'una aplicació després de fallades transitòries de canvis del valor d'un bit i validar mecanismes i estratègies de detecció de fallades.La evolución de los procesadores en busca de prestaciones mejores hace que los circuitos lleven transistores más pequeños e incluyan mayor cantidad y densidad de transistores, además de operar con un voltaje menor. Todos estos factores hacen que los procesadores sean menos robustos y aumenta la probabilidad de fallos transitorios. Los fallos transitorios pueden ocurrir una vez y no volver a pasar, de la misma forma, en la vida útil de un sistema. Cuando ocurren, pueden pasar distintas consecuencias: el sistema operativo puede abortar la ejecución cuando el cambio producido por el fallo es detectado por mal comportamiento de la aplicación, pero el riesgo mayor es que, con el cambio producido, se produzca una corrupción de datos que no sea detectada y cambie el resultado final de la aplicación sin que sea detectado. Para investigar sobre los efectos que los fallos transitorios pueden ocasionar en los registros de un procesador y en las memorias de un computador, hemos desarrollado una extensión del simulador de ordenadores completo de HP (COTSon). La extensión realizada permite la inyección de fallos que cambian un bit en registros y en las memorias del computador simulado. La inyección de fallos permite: evaluar los efectos de los fallos transitorios que ocasionan cambio de un bit en una aplicación, analizar la robustez de una aplicación tras fallos transitorios de cambios del valor de un bit y validar mecanismos y estrategias de detección de fallos
    corecore