6 research outputs found

    Evolutionary system for prediction and optimization of hardware architecture performance

    Get PDF
    The design of computer architectures is a very complex problem: the many parameters involved make the number of possible combinations extremely high. Many researchers have relied on simulation, but it is a slow solution, since evaluating a single point of the search space can take hours. In this work we propose using an evolutionary multilayer perceptron (MLP) to estimate the performance of a given set of architecture parameter settings. Instead of exploring the search space by simulating many configurations, our method randomly selects a few architecture configurations; these are simulated to obtain their performance, and an artificial neural network is then trained to predict the performance of the remaining configurations. The results show high estimation accuracy with a simple method for selecting which configurations to simulate in order to optimize the MLP. To explore the search space, we have designed a genetic algorithm that uses the MLP as its fitness function to find the niche where the best architecture configurations (those with the highest performance) are located. Our models need only a small fraction of the design space, yield small errors, and reduce the required simulation effort by two orders of magnitude.
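
    A minimal sketch of the surrogate-based flow this abstract describes, assuming NumPy and scikit-learn are available; it is not the authors' code. The fake_simulator function, the parameter space, and the GA settings are invented placeholders standing in for a real cycle-accurate simulator and design space.

# Sketch: simulate a small random sample of configurations, train an MLP on them,
# then let a simple GA search the full space using the MLP as its fitness function.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Hypothetical design space: number of options per architecture parameter
# (e.g., cache size, associativity, issue width, ...).
PARAM_LEVELS = [8, 4, 4, 6, 5]

def fake_simulator(config):
    """Stand-in for an hours-long simulation; returns a performance score."""
    x = np.asarray(config, dtype=float)
    return float(np.sin(x).sum() + 0.1 * x.prod() ** 0.5)

# 1) Randomly sample a small fraction of the design space and simulate it.
sample = np.array([[rng.integers(n) for n in PARAM_LEVELS] for _ in range(200)])
scores = np.array([fake_simulator(c) for c in sample])

# 2) Train the MLP surrogate on the simulated points.
mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
mlp.fit(sample, scores)

# 3) Simple GA that uses the MLP prediction (not the simulator) as fitness.
def ga_search(pop_size=60, generations=40, mutation_rate=0.2):
    pop = np.array([[rng.integers(n) for n in PARAM_LEVELS] for _ in range(pop_size)])
    for _ in range(generations):
        fitness = mlp.predict(pop)
        parents = pop[np.argsort(fitness)[-pop_size // 2:]]        # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, len(PARAM_LEVELS))
            child = np.concatenate([a[:cut], b[cut:]])             # one-point crossover
            for i, n in enumerate(PARAM_LEVELS):                   # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.integers(n)
            children.append(child)
        pop = np.vstack([parents, children])
    best = pop[np.argmax(mlp.predict(pop))]
    return best, fake_simulator(best)   # validate the final candidate with one real simulation

best_cfg, real_score = ga_search()
print("best predicted configuration:", best_cfg, "simulated score:", real_score)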

    Parallel and Distributed Optimization of Dynamic Data Structures for Multimedia Embedded Systems

    Get PDF
    Energy-efficient design of multimedia embedded systems demands optimizations in both hardware and software. Software optimization has not received much attention, although modern multimedia applications exhibit high resource utilization. In order to run this kind of application efficiently on embedded systems, the dynamic memory subsystem needs to be optimized. A key role in this optimization is played by the Dynamic Data Types (DDTs) that appear in every real-life application. It would be desirable to organize this set of DDTs to achieve the best performance on the target embedded system. The problem is NP-complete, so the design space cannot be fully explored. In such cases parallel processing is very useful, because it allows not only exploring more solutions in the same amount of time, but also implementing new algorithms. In this work, we propose a method that uses parallel processing and evolutionary computation to explore DDTs in the design of embedded applications. We propose a parallel Multi-Objective Evolutionary Algorithm (MOEA) that combines NSGA-II and SPEA2. We use the Discrete Event Systems Specification (DEVS) to implement this parallel evolutionary algorithm over a Service Oriented Architecture (SOA). Parallelism improves the solutions found by the corresponding sequential algorithms and allows system designers to reach better solutions than previous approximations.
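
    A toy illustration (not the authors' implementation) of the DDT-exploration problem: each dynamic variable is assigned one DDT implementation and every assignment is scored on two competing objectives. The cost table, the variables, and the objectives below are invented; in the paper the search is carried out by a parallel NSGA-II/SPEA2 MOEA rather than the exhaustive loop used here.

# Sketch: enumerate DDT assignments, evaluate two objectives, keep the Pareto front.
import itertools

DDT_IMPLEMENTATIONS = ["array", "single_linked_list", "double_linked_list", "tree"]

# Hypothetical per-DDT cost factors: (access cost, per-element footprint overhead).
DDT_COST = {
    "array": (1.0, 1.0),
    "single_linked_list": (3.0, 1.5),
    "double_linked_list": (3.5, 2.0),
    "tree": (2.0, 2.5),
}

# Hypothetical dynamic variables of the application: (name, #accesses, #elements).
VARIABLES = [("frame_buffer", 1_000_000, 4096),
             ("motion_vectors", 200_000, 512),
             ("huffman_nodes", 50_000, 1024)]

def evaluate(assignment):
    """Return (total access cost, total footprint) for one DDT assignment."""
    access = sum(DDT_COST[d][0] * n_acc for d, (_, n_acc, _) in zip(assignment, VARIABLES))
    footprint = sum(DDT_COST[d][1] * n_el for d, (_, _, n_el) in zip(assignment, VARIABLES))
    return access, footprint

def pareto_front(points):
    """Keep the non-dominated solutions (both objectives minimized)."""
    front = []
    for p, obj in points:
        dominated = any(o2[0] <= obj[0] and o2[1] <= obj[1] and o2 != obj for _, o2 in points)
        if not dominated:
            front.append((p, obj))
    return front

# Exhaustive here because the toy space is tiny; an MOEA replaces this loop in practice.
candidates = [(a, evaluate(a))
              for a in itertools.product(DDT_IMPLEMENTATIONS, repeat=len(VARIABLES))]
for assignment, (access, footprint) in pareto_front(candidates):
    print(assignment, f"accesses={access:.0f}", f"footprint={footprint:.0f}")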

    Optimization of Multimedia Embedded Applications using Parallel Genetic Algorithms

    Get PDF
    Energy-efficient design of multimedia embedded systems demands optimizations in both hardware and software. Software optimization has not received much attention, although modern multimedia applications exhibit high resource utilization. In order to run this kind of application efficiently on embedded systems, the dynamic memory subsystem needs to be optimized. A key role in this optimization is played by the Dynamic Data Types (DDTs) that appear in every real-life application. It would be desirable to organize this set of DDTs to achieve the best performance on the target embedded system. The problem is NP-complete, so the design space cannot be fully explored. In such cases parallel processing is very useful, because it allows not only exploring more solutions in the same amount of time, but also implementing new algorithms. In this work, we propose a method that uses parallel processing and evolutionary computation to explore DDTs in the design of embedded applications. We propose a parallel Multi-Objective Evolutionary Algorithm (MOEA) that combines NSGA-II and SPEA2. We use the Discrete Event Systems Specification (DEVS) to implement this parallel evolutionary algorithm over a Service Oriented Architecture (SOA). Parallelism improves the solutions found by the corresponding sequential algorithms and allows system designers to reach better solutions than previous approximations.
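
    Since this entry shares its abstract with the previous one, the sketch below illustrates only the parallel aspect emphasized in the title: an island-style split in which each worker process evolves its own sub-population and the champions are compared afterwards. It is a hypothetical stand-in, not the paper's DEVS/SOA implementation, and the fitness function is a placeholder.

# Sketch: one GA "island" per process via multiprocessing, then merge the champions.
from multiprocessing import Pool
import random

N_ISLANDS = 4
GENOME_LEN = 8          # e.g., one gene per dynamic data type choice
N_OPTIONS = 4           # number of DDT implementations per gene

def fitness(genome):
    """Placeholder fitness; a real flow would estimate energy/performance here."""
    return -sum((g - 1) ** 2 for g in genome)   # maximized

def evolve_island(seed, pop_size=40, generations=50):
    rnd = random.Random(seed)
    pop = [[rnd.randrange(N_OPTIONS) for _ in range(GENOME_LEN)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rnd.sample(survivors, 2)
            cut = rnd.randrange(1, GENOME_LEN)
            child = a[:cut] + b[cut:]                    # one-point crossover
            if rnd.random() < 0.3:                       # mutation
                child[rnd.randrange(GENOME_LEN)] = rnd.randrange(N_OPTIONS)
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

if __name__ == "__main__":
    with Pool(N_ISLANDS) as pool:
        champions = pool.map(evolve_island, range(N_ISLANDS))   # one island per process
    best = max(champions, key=fitness)
    print("best genome across islands:", best, "fitness:", fitness(best))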

    Native Simulation of Multiprocessor Systems-on-Chip Using Hardware-Assisted Virtualization

    Get PDF
    Integration of multiple heterogeneous processors into a single System-on-Chip (SoC) is a clear trend in embedded systems. Designing and verifying these systems requires high-speed, easy-to-build simulation platforms. Among software simulation approaches, native simulation is a good candidate, since the embedded software is executed natively on the host machine, resulting in high-speed simulation without the effort of developing an instruction set simulator. However, existing native simulation techniques execute the simulated software in a memory space shared between the modeled hardware and the host operating system. This results in many problems, including address space conflicts and overlaps, as well as the use of host machine addresses instead of those of the target hardware platform. This makes it practically impossible to natively simulate legacy code running on the target platform. To overcome these issues, we propose adding a transparent address space translation layer that separates the target address space from that of the host simulator. We exploit Hardware-Assisted Virtualization (HAV) technology for this purpose, which is now readily available on almost all general-purpose processors. Experiments show that this solution does not degrade native simulation speed, while retaining the ability to perform software performance evaluation. The proposed solution is scalable as well as flexible, and we provide the necessary evidence to support these claims with multiprocessor and hybrid simulation solutions. We also address the simulation of cross-compiled Very Long Instruction Word (VLIW) executables, using a Static Binary Translation (SBT) technique to generate native code that requires no run-time translation or interpretation support. This approach is interesting in situations where either the source code is not available or the target platform is not supported by any retargetable compilation framework, which is usually the case for VLIW processors. The generated simulators execute on top of our HAV-based platform and model the Texas Instruments (TI) C6x series processors. Simulation results for VLIW binaries show a speed-up of around two orders of magnitude compared to cycle-accurate simulators.
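
    As a conceptual illustration only, the snippet below mimics in Python what the proposed translation layer achieves: target addresses used by the simulated software are kept separate from host addresses and remapped through a page-granular table. In the thesis this remapping is performed by the MMU virtualization that HAV provides, not by software lookups; the page size and the addresses used here are arbitrary.

# Sketch: page-granular translation from target (guest) addresses to host addresses.
PAGE_SHIFT = 12                    # 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_MASK = PAGE_SIZE - 1

class AddressTranslator:
    def __init__(self):
        self._table = {}           # target page number -> host page base address

    def map_page(self, target_page_base, host_page_base):
        """Declare that one target page is backed by one host page."""
        self._table[target_page_base >> PAGE_SHIFT] = host_page_base

    def to_host(self, target_addr):
        """Translate a target address into the host simulator's address space."""
        page = target_addr >> PAGE_SHIFT
        if page not in self._table:
            raise KeyError(f"target page fault at {hex(target_addr)}")
        return self._table[page] | (target_addr & PAGE_MASK)

# The simulated firmware keeps using its own (target) addresses, e.g. a device
# register region at 0x8000_0000, while the simulator's data lives elsewhere.
mmu = AddressTranslator()
mmu.map_page(0x8000_0000, 0x7F12_3456_0000)
print(hex(mmu.to_host(0x8000_0010)))   # -> 0x7f1234560010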

    Automated Energy/Performance Macromodeling of Embedded Software

    No full text
    Efficient energy and performance estimation of embedded software is a critical part of any system-level design flow. Macromodeling-based estimation is an attempt to speed up estimation by exploiting the reuse that is inherent in the design process. Macromodeling involves pre-characterizing reusable software components to construct high-level models, which express the execution time or energy consumption of a sub-program as a function of suitable parameters. During simulation, macromodels can be used instead of detailed hardware models, resulting in orders-of-magnitude simulation speedup. However, in order to realize this potential, significant challenges need to be overcome in both the generation and the use of macromodels, including how to identify the parameters to be used in the macromodel, how to define the template function to which the macromodel is fitted, etc. This paper presents an automatic methodology to perform characterization-based high-level software macromodeling, which addresses the aforementioned issues. Given a sub-program to be macromodeled for execution time and/or energy consumption, the proposed methodology automates the steps of parameter identification, data collection through detailed simulation, macromodel template selection, and fitting. We propose a novel technique to identify potential macromodel parameters and perform data collection, which draws from the concept of data structure serialization used in distributed programming. We utilize symbolic regression techniques to concurrently filter out irrelevant macromodel parameters, construct a macromodel function, and derive the optimal coefficient values to minimize fitting error. Experiments with several realistic benchmarks suggest that the proposed methodology improves estimation accuracy and en…
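
    A minimal sketch of the fitting step described above, assuming NumPy and scikit-learn; symbolic regression is replaced here by a simpler sparse (Lasso) fit over hand-built candidate terms, so it only approximates the spirit of the method. The parameters and the "measured" energy values are synthetic.

# Sketch: build candidate template terms, fit them sparsely, use the result as a macromodel.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)

# Hypothetical macromodel parameters for a sub-program: input length n and element size.
n_samples = 300
params = rng.integers(1, 100, size=(n_samples, 2)).astype(float)
n, elem_size = params[:, 0], params[:, 1]

# Synthetic "measured" energy: depends on n*log2(n) and n, but not on element size.
energy = 4.2 * n * np.log2(n + 1) + 17.0 * n + rng.normal(0.0, 5.0, n_samples)

def to_features(p):
    """Candidate template terms: all degree-2 polynomial terms plus an n*log2(n) term."""
    p = np.atleast_2d(p).astype(float)
    poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(p)
    nlogn = (p[:, 0] * np.log2(p[:, 0] + 1)).reshape(-1, 1)
    return np.hstack([poly, nlogn])

# Sparse fit: coefficients of irrelevant candidate terms tend to shrink to zero.
model = Lasso(alpha=1.0, max_iter=50_000).fit(to_features(params), energy)
print("selected template terms (nonzero coefficients):", np.flatnonzero(model.coef_))
print("macromodel estimate for n=50, elem_size=8:", model.predict(to_features([50, 8]))[0])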

    Automated Energy/Performance Macromodeling of Embedded Software

    No full text