2,671 research outputs found
Fast approximately timed simulation
In this paper we present a technique for fast, approximately timed simulation of software within a virtual prototyping framework. Our method performs a static analysis of the program control flow graph to construct annotations of the simulated program, combined with dynamic performance information. The static analysis estimates execution time based on a target architecture model. The delays introduced by instruction fetch and data cache misses are evaluated dynamically. At the end of each block, static and dynamic information is combined with branch target prediction to compute the total execution time of the block. As a result, we can provide approximate performance estimates at a simulation speed that remains high enough to be usable by software developers.
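As a rough illustration of the per-block combination described in this abstract, the sketch below adds a statically estimated cycle count to dynamically observed miss and misprediction penalties. The names (`BlockAnnotation`, `block_cycles`, `simulate`) and all penalty values are hypothetical and are not taken from the paper; a real target model would supply its own figures.

```python
from dataclasses import dataclass


@dataclass
class BlockAnnotation:
    """Static annotation attached to one basic block (hypothetical layout)."""
    static_cycles: int  # cycles estimated offline from the target architecture model


# Assumed penalties in cycles; real values depend on the modeled target.
ICACHE_MISS_PENALTY = 10
DCACHE_MISS_PENALTY = 20
BRANCH_MISPREDICT_PENALTY = 3


def block_cycles(annotation, icache_misses, dcache_misses, mispredicted):
    """Combine static and dynamic information at the end of a basic block."""
    dynamic = (icache_misses * ICACHE_MISS_PENALTY
               + dcache_misses * DCACHE_MISS_PENALTY)
    branch = BRANCH_MISPREDICT_PENALTY if mispredicted else 0
    return annotation.static_cycles + dynamic + branch


def simulate(trace):
    """Sum the estimated cycles over a trace of executed blocks.

    Each trace entry is (annotation, icache_misses, dcache_misses, mispredicted).
    """
    return sum(block_cycles(*entry) for entry in trace)


if __name__ == "__main__":
    trace = [
        (BlockAnnotation(5), 0, 0, False),
        (BlockAnnotation(12), 1, 2, True),
    ]
    print(simulate(trace))
```

Keeping the static part as a precomputed constant per block is what makes this style of simulation fast: only the cache and branch events need to be tracked at run time.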
Accelerating host-compiled simulation by modifying IR code: industrial application in the spatial domain
Space applications rely on long and complex design processes, as they must deal with strict non-functional requirements such as criticality, timeliness, reliability and safety. The large number of analyses and evaluations performed requires powerful simulation technologies that combine high simulation speed and accuracy. Host-compiled simulation is a powerful approach to achieve fast, timed simulation of software running in complex embedded systems. However, in general, the speed and accuracy of these solutions still need to improve, and there is a lack of host-compiled approaches oriented to space applications. To address the first point, this paper presents an alternative to the standard solution of adding a model of the cross-compiled control flow on the host computer: it modifies the compiler's intermediate representation instead. That way, the host binary naturally follows the cross-compiled binary flow, avoiding a separate model and improving simulation speed while maintaining accuracy. Additionally, the paper focuses on the LEON processor, commonly used by the European Space Agency (ESA). This work has been funded by FEDER/Ministerio de Ciencia, Innovación y Universidades - Agencia Estatal de Investigación/ TEC2017-86722-C4-3-R and the EC through the FP7-JTI 621429 EMC2 project.
Learning-based system-level power modeling of hardware IPs
Accurate power models for hardware components at high levels of abstraction are critical to enabling system-level power analysis and optimization. Virtual platform prototypes are widely used to support early system-level design space exploration. There is, however, a lack of accurate and fast power models of hardware components at such high levels of abstraction.
In this dissertation, we present novel learning-based approaches for extending fast functional simulation models of white-, gray-, and black-box custom hardware intellectual property components (IPs) with accurate power estimates. Depending on the observability, we extend high-level functional models with the capability to capture data-dependent resource, block, or I/O activity without a significant loss in simulation speed. We further leverage state-of-the-art machine learning techniques to synthesize abstract power models that can predict cycle-, block-, and invocation-level power from low-level hardware implementations, where we introduce novel structural decomposition techniques to reduce model complexities and increase estimation accuracy.
Our white-box approach integrates with existing high-level synthesis (HLS) tools to automatically extract resource mapping information, which is used to trace data-dependent resource-level activity and drive a cycle-accurate online power-performance model during functional simulation. Our gray-box approach supports power estimation at coarser basic block granularity. It uses only limited information about block inputs and outputs to extract lightweight block-level activity from a functional simulation and drive a basic block-level power model that uses a control flow decomposition to improve accuracy and speed. It is faster than cycle-level models while providing a finer granularity than invocation-level models, which allows further navigation of accuracy and speed trade-offs. We finally propose a novel approach for extending behavioral models of black-box hardware IPs with an invocation-level power estimate. Our black-box model uses only input and output history to track data-dependent pipeline behavior, where we introduce a specialized ensemble learning approach composed of individually selected cycle-by-cycle models with reduced complexity and increased accuracy. The proposed approaches are fully automated by integrating with existing commercial HLS tools for custom hardware synthesized by HLS. Results of applying our approaches to various industrial-strength design examples show that our power models can predict cycle-, basic block-, and invocation-level power consumption to within 10%, 9%, and 3% of a commercial gate-level power estimation tool, respectively, all while running several orders of magnitude faster, at speeds of 1-10 Mcycles/sec.
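To make the idea of a learned, activity-driven power model concrete, the following sketch fits a one-variable linear model that maps an activity count to power. This is an assumption-laden simplification: the dissertation describes more elaborate machine learning models with structural decomposition, and ordinary least squares is swapped in here purely for illustration. The function names and the sample data are hypothetical.

```python
def fit_linear_power_model(activity, power):
    """Fit power ~ slope * activity + intercept by ordinary least squares.

    `activity` could be, for example, per-cycle toggle counts captured during
    functional simulation; `power` the matching reference power samples.
    """
    n = len(activity)
    mean_x = sum(activity) / n
    mean_y = sum(power) / n
    sxx = sum((x - mean_x) ** 2 for x in activity)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(activity, power))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_power(model, activity_count):
    """Evaluate the fitted model for a new activity count."""
    slope, intercept = model
    return slope * activity_count + intercept


if __name__ == "__main__":
    # Made-up training data: activity counts and reference power in watts.
    model = fit_linear_power_model([0, 10, 20, 30], [1.0, 2.0, 3.0, 4.0])
    print(predict_power(model, 15))
```

The intercept plays the role of activity-independent power and the slope the cost per unit of activity; a model of this shape is cheap enough to evaluate inside a fast functional simulation.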
Native Simulation of Multiprocessor Systems-on-Chip Using Hardware-Assisted Virtualization
Integration of multiple heterogeneous processors into a single System-on-Chip (SoC) is a clear trend in embedded systems. Designing and verifying these systems requires high-speed and easy-to-build simulation platforms. Among the software simulation approaches, native simulation is a good candidate since the embedded software is executed natively on the host machine, resulting in high-speed simulations without requiring the effort of developing an instruction set simulator. However, existing native simulation techniques execute the simulated software in a memory space shared between the modeled hardware and the host operating system. This results in many problems, including address space conflicts and overlaps as well as the use of host machine addresses instead of those of the target hardware platform. This makes it practically impossible to natively simulate legacy code running on the target platform. To overcome these issues, we propose the addition of a transparent address space translation layer to separate the target address space from that of the host simulator. We exploit Hardware-Assisted Virtualization (HAV) technology for this purpose, which is now readily available on almost all general-purpose processors. Experiments show that this solution does not degrade the native simulation speed, while keeping the ability to perform software performance evaluation. The proposed solution is scalable as well as flexible, and we provide the necessary evidence to support our claims with multiprocessor and hybrid simulation solutions.
We also address the simulation of cross-compiled Very Long Instruction Word (VLIW) executables, using a Static Binary Translation (SBT) technique to generate native code that does not require run-time translation or interpretation support. This approach is interesting in situations where either the source code is not available or the target platform is not supported by any retargetable compilation framework, which is usually the case for VLIW processors. The generated simulators execute on top of our HAV-based platform and model the Texas Instruments (TI) C6x series processors. Simulation results for VLIW binaries show a speed-up of around two orders of magnitude compared to cycle-accurate simulators.
MPSoCBench: a framework for evaluating tools and methodologies for multiprocessor systems-on-chip
Advisor: Rodolfo Jardim de Azevedo. Doctoral thesis (Computer Science), Universidade Estadual de Campinas, Instituto de Computação.
Abstract: Recent design methodologies and tools aim at enhancing design productivity by providing a software development platform before the final details of the Multiprocessor System on Chip (MPSoC) architecture are defined. However, simulation can only be performed efficiently when using a modeling and simulation engine that supports system behavior description at a high abstraction level. The lack of MPSoC virtual platform prototypes integrating both scalable hardware and software, with which to create and evaluate new methodologies and tools, motivated us to develop MPSoCBench, a scalable set of MPSoCs including four different ISAs (PowerPC, MIPS, SPARC, and ARM) organized in platforms with 1, 2, 4, 8, 16, 32, and 64 cores, cross-compilers, IPs, interconnections, 17 parallel versions of software from well-known benchmarks, and power consumption estimation for the main components (processors, routers, memory, and caches). An important demand in MPSoC designs is to address energy consumption constraints as early as possible. Since processor performance comes with a high power cost, there is increasing interest in exploring the trade-off between power and performance, taking into account the target application domain. Dynamic Voltage and Frequency Scaling techniques adaptively scale the voltage and frequency levels of the CPU, allowing it to reach just enough performance to process the system workload while meeting throughput constraints, thereby reducing energy consumption.
To explore this wide design space for energy efficiency and performance, for both hardware and software components, we added features to MPSoCBench to explore dynamic voltage and frequency scaling (DVFS) and evaluated three mechanisms based on energy estimation and CPU usage rate.
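The trade-off behind DVFS can be sketched with the standard dynamic power relation P = C·V²·f. The example below picks the slowest operating point that still meets a deadline and compares the resulting energy. It is an illustrative sketch only: the operating points, the capacitance value, and the function names are hypothetical, and this is not one of the three mechanisms evaluated in the thesis.

```python
# Hypothetical (frequency in Hz, voltage in V) pairs, sorted by ascending frequency.
OPERATING_POINTS = [(200e6, 0.9), (400e6, 1.0), (800e6, 1.2)]
C_EFF = 1e-9  # assumed effective switched capacitance, in farads


def dynamic_power(freq_hz, volt):
    """Standard dynamic power relation: P = C * V^2 * f."""
    return C_EFF * volt ** 2 * freq_hz


def pick_operating_point(cycles_needed, deadline_s):
    """Return the slowest point that finishes `cycles_needed` within the deadline.

    Falls back to the fastest point when no point meets the deadline.
    """
    for freq_hz, volt in OPERATING_POINTS:
        if cycles_needed / freq_hz <= deadline_s:
            return freq_hz, volt
    return OPERATING_POINTS[-1]


def energy(cycles, freq_hz, volt):
    """Energy in joules to execute `cycles` at the given operating point."""
    return dynamic_power(freq_hz, volt) * (cycles / freq_hz)


if __name__ == "__main__":
    freq_hz, volt = pick_operating_point(3e6, 0.01)
    print(freq_hz, volt, energy(3e6, freq_hz, volt))
```

Because the frequency cancels out in the energy expression, the saving in this simplified model comes entirely from the lower voltage that a lower frequency permits; static power is ignored here.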
Modeling Power Consumption and Temperature in TLM Models
Many techniques and tools exist to estimate the power consumption and the temperature map of a chip. These tools help hardware designers develop power-efficient chips in the presence of temperature constraints. For this task, the application can be ignored or at least abstracted by some high-level scenarios; at this stage, the actual embedded software is generally not available yet. However, after the hardware is defined, the embedded software can still have a significant influence on the power consumption; i.e., two implementations of the same application can consume more or less power. Moreover, the actual software powe
Quality-aware model-driven service engineering
Service engineering and service-oriented architecture as an integration and platform technology is a recent approach to software systems integration. Quality aspects ranging from interoperability to maintainability to performance are of central importance for the integration of heterogeneous, distributed service-based systems. Architecture models can substantially influence quality attributes of the implemented software systems. Besides the benefits of explicit architectures on maintainability and reuse, architectural constraints such as styles, reference architectures and architectural patterns can influence observable software properties such as performance. Empirical performance evaluation is a process of measuring and evaluating the performance of implemented software. We present an approach for addressing the quality of services and service-based systems at the model-level in the context of model-driven service engineering. The focus on architecture-level models is a consequence of the black-box character of services.
CONTREX: Design of embedded mixed-criticality CONTRol systems under consideration of EXtra-functional properties
The increasing processing power of today's HW/SW platforms leads to the integration of more and more functions in a single device. Additional design challenges arise when these functions share computing resources and belong to different criticality levels. The paper presents the CONTREX European project and its preliminary results. CONTREX complements current activities in the area of predictable computing platforms and segregation mechanisms with techniques to consider the extra-functional properties, i.e., timing constraints, power, and temperature. CONTREX enables energy-efficient and cost-aware design through analysis and optimization of these properties with regard to application demands at different criticality levels.
Metamodels and Transformations for Software and Data Integration
Metamodels define a foundation for describing software system interfaces which can be used during software or data integration processes. The report is part of the BIZYCLE project, which examines the applicability of model-based methods, technologies and tools to large-scale industrial software and data integration scenarios. The developed metamodels are thus part of the overall BIZYCLE process, which comprises semantic, structural, communication, behavior and property analysis and aims at facilitating and improving standard integration practice. Therefore, the project framework is briefly introduced first, followed by the detailed metamodel and transformation description as well as motivation and illustration scenarios.