Impact of Code Compression on Estimated Worst-Case Execution Times
Code compression techniques can help meet code size constraints in embedded systems. In the average case, the impact of code compression on performance is double-edged: on one side, the number of accesses to the memory hierarchy is reduced because several instructions are coded in a single word, which is likely to reduce the execution time; on the other side, the decompression penalty increases the processing time of compressed instructions. Nevertheless, experimental results show that code compression can lower the execution time. In this paper, our goal is to analyze the impact of code compression on the estimated Worst-Case Execution Time (WCET) of critical tasks that must meet both code size constraints and timing deadlines. Changes in the access patterns to the instruction cache are indeed likely to alter the accuracy of the cache analysis performed when determining the WCET. Experimental results show that, besides reducing the code size, our code compression scheme also improves the WCET estimates in most cases.
A framework to experiment optimizations for real-time and embedded software
Typical constraints on embedded systems include code size limits, upper
bounds on energy consumption and hard or soft deadlines. To meet these
requirements, it may be necessary to improve the software by applying various
kinds of transformations like compiler optimizations, specific mapping of code
and data in the available memories, code compression, etc. However, a
transformation that aims at improving the software with respect to a given
criterion might engender side effects on other criteria and these effects must
be carefully analyzed. For this purpose, we have developed a common framework
that makes it possible to experiment with various code transformations and to
evaluate their impact on various criteria. This work has been carried out
within the French ANR MORE project. Comment: International Conference on Embedded Real Time Software and Systems
(ERTS2), Toulouse, France (2010)
Study and evaluation of compact instruction sets
Advisor: Rodolfo Jardim de Azevedo. Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação.
Abstract: Modern embedded devices are composed of heterogeneous SoC systems ranging from low-end to high-end processor chips. Although RISC has been the traditional processor for these devices, the situation has changed recently: manufacturers are building embedded systems using both RISC (ARM and MIPS) and CISC (x86) processors. New functionalities in embedded software require more memory space, an expensive and scarce resource in SoCs. Hence, executable code size is critical, since performance is directly affected by instruction cache misses. CISC processors used to have a higher code density than RISC, since variable-length encoding benefits the most used instructions, yielding smaller programs. However, with the addition of new extensions and longer instructions, CISC density in recent applications has become similar to RISC. In this thesis, we investigate the compressibility of RISC and CISC processors, namely SPARC and x86. We propose a 16-bit extension to the SPARC processor, the SPARC16. Additionally, we provide the first methodology for generating 16-bit ISAs and evaluate compression among different 16-bit extensions. SPARC16 programs can achieve better compression ratios than other ISAs, attaining results as low as 67%. SPARC16 also reduces cache miss rates by up to 9%, requiring smaller caches than SPARC processors to achieve the same performance; the cache size reduction can reach a factor of 16.
Furthermore, we study how new extensions constantly introduce new functionalities to x86, leading to ISA bloat at the cost of a complex microprocessor front-end design, area, and energy consumption: the x86 ISA reached over 1300 different instructions in 2013. Moreover, an analysis of x86 code from 5 Windows versions and 7 Linux distributions released between 1995 and 2012 shows that up to 57 instructions fall out of use over time. To solve this problem, we propose a mechanism to recycle instruction opcodes through legacy instruction emulation without breaking backward software compatibility. We present a case study in which AVX x86 SIMD instructions are re-encoded using shorter encodings taken from unused instructions, yielding up to 14% code size reduction and 53% instruction cache miss reduction in SPEC CPU2006 floating-point programs. Finally, our results show that up to 40% of the x86 instructions can be removed with less than 5% overhead through our technique without breaking any legacy code. Doctorate. Computer Science. Doctor in Computer Science
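The "as low as 67%" figure reads most naturally as compressed code size divided by original code size, so lower is better. A minimal sketch of that arithmetic follows, under that assumption; the function names and the byte counts are hypothetical and are not taken from the thesis.

```python
def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Compressed size as a fraction of the original size (lower is better)."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return compressed_bytes / original_bytes


def size_reduction(original_bytes: int, compressed_bytes: int) -> float:
    """Fraction of the original code size that compression removes."""
    return 1.0 - compression_ratio(original_bytes, compressed_bytes)


if __name__ == "__main__":
    # Hypothetical text-section sizes, chosen only to illustrate the formula.
    original, compressed = 100_000, 67_000
    print(f"ratio     = {compression_ratio(original, compressed):.0%}")
    print(f"reduction = {size_reduction(original, compressed):.0%}")
```

Under this reading, a 67% ratio corresponds to a 33% reduction in code size, which is a different quantity from the 14% code size reduction reported for the x86 case study.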
Summarizing multiprocessor program execution with versatile, microarchitecture-independent snapshots
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 131-137). Computer architects rely heavily on software simulation to evaluate, refine, and validate new designs before they are implemented. However, simulation time continues to increase as computers become more complex and multicore designs become more common. This thesis investigates software structures and algorithms for quickly simulating modern cache-coherent multiprocessors by amortizing the time spent to simulate the memory system and branch predictors. The Memory Timestamp Record (MTR) summarizes the directory and cache state of a multiprocessor system in a compact data structure. A single MTR snapshot is versatile enough to reconstruct the microarchitectural state resulting from various coherence protocols and cache organizations. The MTR may be quickly updated by each simulated processor during a fast-forwarding phase and optionally stored off-line for reuse. To fill large branch prediction tables, we introduce Branch Predictor-based Compression (BPC), which compactly stores a branch trace so that it may be used to fill in any branch predictor structure. An entire BPC trace requires less space than single discrete predictor snapshots, and it may be decompressed 3-6x faster than performing functional simulation. By Kenneth C. Barr. Ph.D.
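One way to picture a microarchitecture-independent snapshot of this kind is a table of per-block access timestamps from which the contents of a particular cache are derived only when needed. The sketch below is an illustration under several assumptions, not the thesis's actual data structure: the class and method names are hypothetical, replacement is assumed to be LRU, and a write by another processor is assumed to invalidate older copies.

```python
from dataclasses import dataclass, field


@dataclass
class BlockRecord:
    last_read: dict = field(default_factory=dict)  # cpu id -> time of last read
    last_write_time: int = -1                      # time of most recent write
    last_writer: int = -1                          # cpu that performed it


class TimestampRecord:
    """Hypothetical timestamp table updated during fast-forwarding."""

    def __init__(self, block_bytes: int = 64):
        self.block_bytes = block_bytes
        self.blocks = {}   # block number -> BlockRecord
        self.clock = 0     # global logical time, one tick per access

    def access(self, cpu: int, addr: int, is_write: bool) -> None:
        self.clock += 1
        rec = self.blocks.setdefault(addr // self.block_bytes, BlockRecord())
        if is_write:
            rec.last_write_time, rec.last_writer = self.clock, cpu
        else:
            rec.last_read[cpu] = self.clock

    def reconstruct_cache(self, cpu: int, num_sets: int, ways: int) -> dict:
        """Return {set index: [block numbers, most recent first]} for one cpu."""
        per_set = {}
        for block, rec in self.blocks.items():
            t = rec.last_read.get(cpu, -1)
            if rec.last_writer == cpu:
                t = max(t, rec.last_write_time)
            if t < 0:
                continue  # this cpu never touched the block
            if rec.last_writer not in (-1, cpu) and rec.last_write_time > t:
                continue  # assumed invalidated by another cpu's later write
            per_set.setdefault(block % num_sets, []).append((t, block))
        # Under LRU, each set holds its `ways` most recently used blocks.
        return {s: [b for _, b in sorted(e, reverse=True)[:ways]]
                for s, e in per_set.items()}


if __name__ == "__main__":
    rec = TimestampRecord()
    rec.access(0, 0, False)    # cpu 0 reads block 0
    rec.access(0, 128, False)  # cpu 0 reads block 2
    rec.access(0, 64, False)   # cpu 0 reads block 1
    rec.access(1, 64, True)    # cpu 1 writes block 1
    print(rec.reconstruct_cache(cpu=0, num_sets=2, ways=1))
    print(rec.reconstruct_cache(cpu=0, num_sets=2, ways=2))
```

The same recorded table is queried with two different cache shapes in the usage lines, which is the property the abstract describes: the snapshot is taken once, and the cache organization is chosen afterward.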
Coupling the Thermodynamics, Kinetics and Geodynamics of Multiphase Reactive Transport in Earth’s Interior
Multiscale multiphase reactive transport is a central phenomenon governing geologic processes in Earth's interior. In the upper mantle, melts, produced by partial melting of the peridotitic mantle, and volatile-rich fluids, derived from dehydration of subducting plates, buoyantly ascend through the mantle's porous network. Reaction between these melts and fluids and the surrounding solid matrix controls the composition of magmas that reach Earth's crust. Melt-rock reaction is strongly coupled to the dynamics of melt transport: not only do the transport pathways modulate the extent of chemical interaction between melts and the solid matrix, but the melt-rock reactions also feed back into the transport dynamics through reactive changes to bulk physical properties including permeability, density, and viscosity. These feedbacks can result in the emergence of self-organized transport networks, such as the network of high-porosity dunite channels beneath mid-ocean ridges. Understanding the various feedbacks between reaction and melt transport requires consistent coupling of multicomponent multiphase thermodynamics and geodynamics. However, the high dimensionality of such coupled problems presents a major theoretical and computational challenge. Existing models of reactive multiphase flow have therefore tended to focus separately on the geochemistry of melt-rock interaction, or on the dynamics of melt transport, with simplified thermo-chemical couplings.
In this dissertation, I present a new thermodynamically consistent and tractable framework for integrating multicomponent thermodynamics and multiphase geodynamics. I use a non-equilibrium thermodynamic formulation to describe reaction as a time-dependent irreversible process alongside heat and mass transport. This theory is implemented using new thermodynamic software developed through the ENKI project. The main benefits of this approach are two-fold. Firstly, it extends the reach of existing multiphase computational thermodynamics to model macroscopic disequilibrium reaction paths; this is the first step towards being able to model a host of metastable reaction phenomena in igneous and metamorphic systems. I model disequilibrium batch reaction for a simple system in chapter 2. Secondly, it allows self-consistent integration of multiphase thermodynamics in two-phase flow models, to better explore coupling between reaction and transport. This is demonstrated in chapter 5.
Chapter 1 gives a broad introduction to multiphase reactive flow and further discusses the motivation for this work. I outline past work and discuss the scope of problems in which coupling between reaction and transport plays a critical role in geodynamic and geochemical evolution.
In chapter 2 I present a general theory for integrating computational thermodynamics and geodynamics. This approach is based on the standard conservation equations for porous melt transport within a deformable solid matrix, but extends the governing equations to include multiple solid phases. The multiphase reactive coupling is described using a kinetic framework that includes explicit stoichiometric reactions between minerals, melts, and fluids. Using the theory of non-equilibrium thermodynamics, the macroscopic reaction rates are controlled by the reaction affinities, providing closed-form expressions for the net reactive mass transfers. This formulation of disequilibrium reaction is the principal contribution of this dissertation. Coupled with the conservation equations it can describe both equilibrium and disequilibrium reaction paths and is applicable to a range of geological conditions. I outline approaches for modeling melt-mediated, fluid-mediated, and subsolidus grain-boundary-mediated reaction. In extension to previous theories of two-phase flow, this framework permits modeling of more realistic melting and crystallization reactions, including eutectic and peritectic melting. The theoretical framework is supported by software developed as part of the ENKI project. I briefly summarize the software infrastructure in this chapter.
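The abstract does not give the rate expressions themselves. As an illustration of what an affinity-controlled rate can look like, the textbook linear (near-equilibrium) form from non-equilibrium thermodynamics is sketched below; the symbols and the kinetic coefficient r_j are assumptions introduced here, and the dissertation's closed-form expressions may differ.

```latex
% Affinity of reaction j, where \nu_{ij} is the stoichiometric coefficient of
% species i in reaction j (positive for products) and \mu_i is its chemical
% potential:
A_j = -\sum_i \nu_{ij}\,\mu_i

% Assumed linear rate law, with r_j > 0 a kinetic coefficient, R the gas
% constant and T the temperature:
\Gamma_j = r_j\,\frac{A_j}{R\,T}

% The entropy production due to reaction is then non-negative, and the rate
% vanishes at equilibrium (A_j = 0):
T\,\sigma_{\mathrm{rxn}} = \sum_j \Gamma_j\,A_j
                         = \sum_j \frac{r_j\,A_j^{2}}{R\,T} \;\ge\; 0
```

In this form the reaction proceeds in the direction that lowers the free energy, and the equilibrium limit is recovered as the affinities go to zero.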
In the remaining chapters I step through the workflow for implementing this approach for a series of model problems in the Mg2SiO4-SiO2 binary system. The Mg2SiO4-SiO2 subsystem is an important bounding binary for understanding mantle melting and represents the simplest subsystem for exploring coupled reactive transport dynamics. Widely used thermodynamic models of silicate melting (e.g. MELTS) do not extend to the binary, and existing binary melting models involve complex treatments of melt speciation to account for significant non-ideality at high silica contents. Here, I am concerned mostly with reaction for mafic compositions relevant to mantle magmatism. Therefore, in chapter 3 I present a simple thermodynamic model for melting in the Mg2SiO4-SiO2 system. I use a numerically efficient asymmetric binary mixing model to describe solution in the melt, which is calibrated using a compilation of phase equilibrium experimental data. This chapter is not a self-contained study in and of itself, but rather sets up the thermodynamic model that I will use in the remaining chapters.
Chapter 4 applies the theoretical framework to a series of simple model problems for disequilibrium reaction and reactive melt transport in the Mg2SiO4-SiO2 system. Disequilibrium reaction paths can be non-intuitive, and I start by modeling reaction in uniform batch systems. All of the calculations are consistent with the phase diagram in the equilibrium limit. More general conservation equations for disequilibrium reaction in open-system batch reactors are derived in Appendix C. I then integrate irreversible reaction with the dynamics of diffusion and advection of heat and mass to model the formation of reactive fronts around fusible heterogeneities, and a eutectic/peritectic disequilibrium steady-state melting column. This is the first self-consistent inclusion of eutectic/peritectic melting into magma dynamics.
Finally, in chapter 5 I apply this framework to explore the formation of dunite channels by incongruent open-system melting. I develop a series of 1-D and 2-D models to investigate the formation of dunite channels in a harzburgitic mantle within the Mg2SiO4-SiO2 binary system. The models predict that influx of deep silica-poor melts promotes a reactive channeling instability that organizes melt into high-porosity dunite channels. During decompression melting in the absence of a basal melt flux, no channelization is observed. This implies that an additional flux of melt is required, either from melting of deep fusible heterogeneities, or from large-scale melt focusing toward the ridge axis at depth. Alternatively, flux melting of additional melt components could help drive reactive channelization in natural peridotite systems.
Neighbor Discovery Proxy-Gateway for 6LoWPAN-based Wireless Sensor Networks
The purpose of this work is to study methods for interconnecting low-power wireless personal area networks with traditional computer networks. In particular, the project analyzes the network protocols involved and the possible forms of interoperability between them, with the goal of integrating 6LoWPAN-based IEEE 802.15.4 wireless sensor networks (6LoWPAN being an adaptation layer that makes it possible to carry IPv6 packets over IEEE 802.15.4) into existing Ethernet networks without changes to the network infrastructure. Such integration would allow user applications to be developed and extended using the traditional TCP/IP protocol stack on systems composed of low-cost, low-power embedded devices. To test the feasibility of the methods developed, an embedded system that carries out the integration tasks described is designed, implemented, and evaluated.
Aerospace medicine and biology: A continuing bibliography with indexes (supplement 300)
This bibliography lists 232 reports, articles and other documents introduced into the NASA scientific and technical information system in July 1987.
Robust and secure monitoring and attribution of malicious behaviors
Worldwide computer systems continue to execute malicious software that degrades the systems' performance and consumes network capacity by generating high volumes of unwanted traffic. Network-based detectors can effectively identify machines participating in the ongoing attacks by monitoring the traffic to and from the systems. But network detection alone is not enough; it does not improve the operation of the Internet or the health of other machines connected to the network. We must identify malicious code running on infected systems, participating in global attack networks.
This dissertation describes a robust and secure approach that identifies malware present on infected systems based on its undesirable use of the network. Our approach, using virtualization, attributes malicious traffic to the host-level processes responsible for the traffic. The attribution identifies on-host processes, but malware instances often exhibit parasitic behaviors to subvert the execution of benign processes.
We then augment the attribution software with a host-level monitor that detects parasitic behaviors occurring at the user and kernel levels. User-level parasitic attack detection happens via the system-call interface because it is a non-bypassable interface for user-level processes. Because no such interface is available inside the kernel for drivers, we create a new driver monitoring interface inside the kernel to detect parasitic attacks occurring through this interface.
Our attribution software relies on a guest kernel's data to identify on-host processes. To allow secure attribution, we prevent illegal modifications of critical kernel data by kernel-level malware. Together, our contributions produce a unified research outcome: an improved malicious code identification system for user- and kernel-level malware. Ph.D. Committee Chair: Giffin, Jonathon; Committee Member: Ahamad, Mustaque; Committee Member: Blough, Douglas; Committee Member: Lee, Wenke; Committee Member: Traynor, Patric