Search CORE

10 research outputs found

DIGITAL FX!32 running 32-Bit x86 applications on Alpha NT

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1997
Field of study

Accelerating shared library execution in a DBT

Author: Franke Björn
Spink Tom
Publication venue: ACM
Publication date: 20/06/2024
Field of study

User-mode Dynamic Binary Translation (DBT) has recently received renewed interest, not least due to Apple's transition towards the Arm ISA, supported by a DBT compatibility layer for x86 legacy applications. While receiving praise for its performance, execution of legacy applications through Apple's Rosetta 2 technology still incurs a performance penalty when compared to direct host execution. A particular limitation of Rosetta 2 is that code is either executed exclusively as native Arm code, or as translated Arm code. In particular, mixed mode execution of native Arm code and translated code is not possible. This is a missed opportunity, especially in the case of shared libraries where both optimized x86 and Arm versions of the same library are available. In this paper, we develop mixed mode execution capabilities for shared libraries in a DBT system, eliminating the need to translate code where a highly optimised native version already exists. Our novel execution model intercepts calls to shared library functions in the DBT system and automatically redirects them to their faster host counterparts, making better use of the underlying host ISA. To ease the burden for the developer, we make use of an Interface Description Language (IDL) to capture library function signatures, from which relevant stubs and data marshalling code are generated automatically. We have implemented our novel mixed mode execution approach in the open-source QEMU DBT system, and demonstrate both ease of use and performance benefits for three popular libraries (standard C Math library, SQLite, and OpenSSL). Our evaluation confirms that with minimal developer effort, accelerated host execution of shared library functionality results in speedups between 2.7x and 6.3x on average, and up to 28x for x86 legacy applications on an Arm host system

University of St. Andrews - Pure

St Andrews Research Repository

ARM-on-ARM

Author: On Calvin
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2007
Field of study

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.Includes bibliographical references (p. 61-62).This thesis proposes and implements ANA, a new method for the simulation of ARM programs on the ARM platform. ANA is a lightweight ARM instruction interpreter that uses the hardware to do a lot of the work for the read-decode-execute piece of simulation. We compare this method to the two existing methods of full simulation and direct execution that have been traditionally used to achieve this. We demonstrate that despite some setbacks caused by the prefetching and caching behaviors of the ARM, ANA continues to be a very useful tool for prototyping and for increasing simulator performance. Finally, we identify the important role that ANA can play in our current efforts to virtualize the ARM.by Calvin On.M.Eng

DSpace@MIT

Versatile Object-oriented Real-Time Operating System

Author: Lee Rusty (Rusty Shawn), 1978-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2001
Field of study

Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001.Includes bibliographical references (p. 79-80).This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.As computer software has become more complex in response to increasing demands and greater levels of abstraction, so have computer operating systems. In order to achieve the desired level of functionality, operating systems have become less flexible and overly complex. The additional complexity and abstraction introduced often leads to less efficient use of hardware and increased hardware requirements. In embedded systems with limited hardware resources, efficient resource use is extremely important to the functionality of the resources. Therefore, operating system functionality not useful for the embedded system's applications is detrimental to the system. Component-based software provides a way to achieve both the efficient application-specific functionality required in embedded systems and the ability to extend this functionality to other applications. This thesis presents a component-based operating system, VORTOS, the Versatile Object-oriented Real-Time Operating System. VORTOS uses a virtual machine to abstract the hardware, eliminating the need for further portability abstractions within the operating system and application level components. The simple modular component architecture allows both the operating system and user applications to be extremely flexible by allowing them to utilize the particular components required, without sacrificing performance.by Rusty Lee.M.Eng.and S.B

DSpace@MIT

Análise de canais laterais de tempo em tradutores dinâmicos de binários

Author: Napoli Otávio Oliveira, 1994-
Publication venue: [s.n.]
Publication date: 16/08/2019
Field of study

Orientadores: Edson Borin, Diego de Freitas AranhaDissertação (mestrado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Ataques de canais laterais são um importante problema para os algoritmos criptográficos. Se o tempo de execução de uma implementação depende de uma informação secreta, um adversário pode recuperar a mesma através da medição de seu tempo. Diferentes abordagens surgiram recentemente para explorar o vazamento de informações em implementações criptográficas e para protegê-las contra esses ataques. Para tanto, a criptografia em tempo constante é uma pratica amplamente adotada visando descorrelacionar a dependencia entre um dado secreto e suas amostras de tempo. Apesar das contra-medidas serem eficazes para garantir execução dos algoritmos em um sistema evitando canais laterais de tempo, emuladores podem modificar e reintroduzir pontos de vazamento durante sua execução. Trabalhos recentes discutem os impactos dos compiladores Just-In-Time (JIT) de linguagens de alto nível no vazamento de informações a partir do tempo de execução. Entretanto, pouco foi dito sobre a emulação entre ISAs e seu impacto em vazamentos de tempo. Neste trabalho, nós investigamos o impacto de emuladores (tradutores dinâmicos de binários) entre ISAs na propriedade de tempo constante de implementações criptográficas. Utilizando métodos estatísticos e rotinas criptográficas validas, nós afirmamos a viabilidade de vazamentos de tempo em códigos gerados por tradutores dinâmicos de binários, usando diferentes técnicas de formação de regiões. Nós mostramos que a emulação pode ter um impacto significante, inserindo construções de tempo não constante durante sua tradução, levando a vazamentos de tempo significantes. Esses vazamentos podem ser observados em tradutores dinâmicos como o QEMU e o HQEMU durante a emulação de rotinas de bibliotecas criptográficas conhecidas, como a mbedTLS e podem ser rapidamente verificados. Por fim, para garantir a propriedade de tempo constante nós propusemos um modelo de mitigação para tradutores dinâmicos de binários baseado em transformações de compiladores, mitigando os canais laterais inseridosAbstract: Timing side-channel attacks are an important issue for cryptographic algorithms. If the execution time of an implementation depends on secret information, an adversary may recover the latter through measuring the former. Different approaches have recently emerged to exploit information leakage on cryptographic implementations and to protect them against these attacks. Therefore, implementation of constant-time cryptography is a widely adopted practice aiming to decorrelate the dependency between a secret data and its timing samples. Despite the countermeasures are effective to guarantee the execution of algorithms in a system by avoiding timing side-channels, emulators can modify and reintroduce leakage points during their execution. Recent works discusses the impact of high level language Just-In-Time (JIT) compilers in leakages through execution time. However, little has been said about Cross-ISA emulation through DBT and its impact on timing leakages. In this work, we investigate the impact of emulators (dynamic binary translators) on constant-time property of cryptographic implementations. By using statistical methods and cryptographic routines we asserted the feasibility of timing leaks in codes generated by a dynamic binary translator, even using different Region Formation Techniques. We show that the emulation may have a significant impact by inserting non constant-time constructions during its translations, leading to a significant timing leakage. This leakage is observed in dynamic binary translation systems such as QEMU and HQEMU when emulating routines from known cryptographic libraries, such mbedTLS and can be quickly verified. Finally, to guarantee the constant-time property we implemented a compiler transformation based on the if-conversion transformation in the dynamic binary translators, mitigating the inserted timing side-channelsMestradoCiência da ComputaçãoMestre em Ciência da Computação2014/50704-7FAPES

Repositorio da Producao Cientifica e Intelectual da Unicamp

Geração automática de ferramentas de inspeção de código para processadores especificados em ADL

Author: Schultz Max Ruben de Oliveira
Publication venue: Florianópolis, SC
Publication date: 01/01/2007
Field of study

Dissertação (mestrado) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-Graduação em Ciência da Computação.Um sistema embarcado pode ter todos os seus componentes eletrônicos implementados em um único circuito integrado, dando origem ao assim chamado System-on-a-Chip (SoC). Um SoC é composto de uma ou mais CPUs e por componentes não programáveis, tais como memória(s), barramento(s) e periférico(s). A CPU escolhida pode ser um processador dedicado, denominado Application-Specific Instruction-Set Processor (ASIP). O projeto de SoCs requer ferramentas para a inspeção de código, a fim de se explorar a corretude do software embarcado a ser executado em cada CPU. Isto pode ser feito através da geração automática de ferramentas a partir de um modelo formal de CPU, cujas características podem ser descritas através do uso de Linguagens de Descrição de Arquiteturas (Architecture Description Language - ADLs). Como o redirecionamento manual das ferramentas para cada CPU explorada seria inviável devido à pressão do time-to-market, o redirecionamento automático é mandatório. Esta dissertação contribui com a expansão do módulo de geração de ferramentas de manipulação de código binário associado à ADL ArchC, através da geração automática de desmontadores e depuradores de código. As ferramentas de desmontagem e depuração de código foram validadas por meio de comparação com ferramentas nativas congêneres para modelos de arquiteturas RISC e CISC (i8051, MIPS, SPARC e PowerPC). Para fins de experimentação, foram usados os benchmarks MiBench e Dalton, evidenciando a corretude e a robustez das ferramentas. Além disso, mostra-se a integração do gerador de desmontadores no âmbito de um tradutor binário, proposto como resultado de trabalho cooperativo (também reportado em outras duas dissertações correlatas)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositório Institucional da UFSC

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Mining a Small Medical Data Set by Integrating the Decision Tree and t-test

Author: Chang Ming-Yang
Publication venue: 'Academy Publisher'
Publication date
Field of study

[[abstract]]Although several researchers have used statistical methods to prove that aspiration followed by the injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that could generate different recovery rates for the patients. Therefore, this study adopts the statistical method and decision tree techniques together to analyze the postoperative status of ovarian endometriosis patients under different conditions. Since our collected data set is small, containing only 212 records, we use all of these data as the training data. Therefore, instead of using a resultant tree to generate rules directly, we use the value of each node as a cut point to generate all possible rules from the tree first. Then, using t-test, we verify the rules to discover some useful description rules after all possible rules from the tree have been generated. Experimental results show that our approach can find some new interesting knowledge about recurrent ovarian endometriomas under different conditions.[[journaltype]]國外[[incitationindex]]EI[[booktype]]紙本[[countrycodes]]FI

Tamkang University Institutional Repository

SimuBoost: Scalable Parallelization of Functional System Simulation

Author: Rittinghaus Marc
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 21/08/2019
Field of study

Für das Sammeln detaillierter Laufzeitinformationen, wie Speicherzugriffsmustern, wird in der Betriebssystem- und Sicherheitsforschung häufig auf die funktionale Systemsimulation zurückgegriffen. Der Simulator führt dabei die zu untersuchende Arbeitslast in einer virtuellen Maschine (VM) aus, indem er schrittweise Instruktionen interpretiert oder derart übersetzt, sodass diese auf dem Zustand der VM arbeiten. Dieser Prozess ermöglicht es, eine umfangreiche Instrumentierung durchzuführen und so an Informationen zum Laufzeitverhalten zu gelangen, die auf einer physischen Maschine nicht zugänglich sind. Obwohl die funktionale Systemsimulation als mächtiges Werkzeug gilt, stellt die durch die Interpretation oder Übersetzung resultierende immense Ausführungsverlangsamung eine substanzielle Einschränkung des Verfahrens dar. Im Vergleich zu einer nativen Ausführung messen wir für QEMU eine 30-fache Verlangsamung, wobei die Aufzeichnung von Speicherzugriffen diesen Faktor verdoppelt. Mit Simulatoren, die umfangreichere Instrumentierungsmöglichkeiten mitbringen als QEMU, kann die Verlangsamung um eine Größenordnung höher ausfallen. Dies macht die funktionale Simulation für lang laufende, vernetzte oder interaktive Arbeitslasten uninteressant. Darüber hinaus erzeugt die Verlangsamung ein unrealistisches Zeitverhalten, sobald Aktivitäten außerhalb der VM (z. B. Ein-/Ausgabe) involviert sind. In dieser Arbeit stellen wir SimuBoost vor, eine Methode zur drastischen Beschleunigung funktionaler Systemsimulation. SimuBoost führt die zu untersuchende Arbeitslast zunächst in einer schnellen hardwaregestützten virtuellen Maschine aus. Dies ermöglicht volle Interaktivität mit Benutzern und Netzwerkgeräten. Während der Ausführung erstellt SimuBoost periodisch Abbilder der VM (engl. Checkpoints). Diese dienen als Ausgangspunkt für eine parallele Simulation, bei der jedes Intervall unabhängig simuliert und analysiert wird. Eine heterogene deterministische Wiederholung (engl. heterogeneous deterministic Replay) garantiert, dass in dieser Phase die vorherige hardwaregestützte Ausführung jedes Intervalls exakt reproduziert wird, einschließlich Interaktionen und realistischem Zeitverhalten. Unser Prototyp ist in der Lage, die Laufzeit einer funktionalen Systemsimulation deutlich zu reduzieren. Während mit herkömmlichen Verfahren für die Simulation des Bauprozesses eines modernen Linux über 5 Stunden benötigt werden, schließt SimuBoost die Simulation in nur 15 Minuten ab. Dies sind lediglich 16% mehr Zeit, als der Bau in einer schnellen hardwaregestützten VM in Anspruch nimmt. SimuBoost ist imstande, diese Geschwindigkeit auch bei voller Instrumentierung zur Aufzeichnung von Speicherzugriffen beizubehalten. Die vorliegende Arbeit ist das erste Projekt, welches das Konzept der Partitionierung und Parallelisierung der Ausführungszeit auf die interaktive Systemvirtualisierung in einer Weise anwendet, die eine sofortige parallele funktionale Simulation gestattet. Wir ergänzen die praktische Umsetzung mit einem mathematischen Modell zur formalen Beschreibung der Beschleunigungseigenschaften. Dies erlaubt es, für ein gegebenes Szenario die voraussichtliche parallele Simulationszeit zu prognostizieren und gibt eine Orientierung zur Wahl der optimalen Intervalllänge. Im Gegensatz zu bisherigen Arbeiten legt SimuBoost einen starken Fokus auf die Skalierbarkeit über die Grenzen eines einzelnen physischen Systems hinaus. Ein zentraler Schlüssel hierzu ist der Einsatz moderner Checkpointing-Technologien. Im Rahmen dieser Arbeit präsentieren wir zwei neuartige Methoden zur effizienten und effektiven Kompression von periodischen Systemabbildern

KITopen

Trace Execution Automata In Dynamic Binary Translation

Author: Araujo G.
Borin E.
Porto J.
Wu Y.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/11/2015
Field of study

Program performance can be dynamically improved by optimizing its frequent execution traces. Once traces are collected, they can be analyzed and optimized based on the dynamic information derived from the program's previous runs. The ability to record traces is thus central to any dynamic binary translation system. Recording traces, as well as loading them for use in different runs, requires code replication to represent the trace. This paper presents a novel technique which records execution traces by using an automaton called TEA (Trace Execution Automata). Contrary to other approaches, TEA stores traces implicitly, without the need to replicate execution code. TEA can also be used to simulate the trace execution in a separate environment, to store profile information about the generated traces, as well to instrument optimized versions of the traces. In our experiments, we showed that TEA decreases memory needs to represent the traces (nearly 80% savings). © 2011 Springer-Verlag.6161 LNCS99116Bala, V., Duesterwald, E., Banerjia, S., Dynamo: A transparent dynamic optimization system (2000) SIGPLAN Not., 35 (5), pp. 1-12Baraz, L., Devor, T., Etzion, O., Goldenberg, S., Skaletsky, A., Wang, Y., Zemach, Y., Ia-32 execution layer: A two-phase dynamic translator designed to support ia-32 applications on itanium®-based systems (2003) MICRO 36: Proceedings of the 36th Annual IEEE/ACM International Symposium on Microarchitecture, p. 191Cifuentes, C., Emmerik, M.V., Uqbt: Adaptable binary translation at low cost (2000) Computer, 33 (3), pp. 60-66Dehnert, J.C., Grant, B.K., Banning, J.P., Johnson, R., Kistler, T., Klaiber, A., Mattson, J., The transmeta code morphing™software: Using speculation, recovery, and adaptive retranslation to address real-life challenges (2003) CGO 2003: Proceedings of the International Symposium on Code Generation and Optimization, pp. 15-24Duesterwald, E., Bala, V., Software profiling for hot path prediction: Less is more (2000) ASPLOS-IX:Proceedings of TheNinth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 202-211Ebcioǧlu, K., Altman, E.R., Daisy: Dynamic compilation for 100% architectural compatibility (1997) ISCA 1997: Proceedings of the 24th Annual International Symposium on Computer Architecture, pp. 26-37Ebcioǧlu, K., Altman, E.R., Gschwind, M., Sathaye, S., Optimizations and oracle parallelism with dynamic translation (1999) MICRO 32: Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture, pp. 284-295Friendly, D.H., Patel, S.J., Patt, Y.N., Putting the fill unit to work: Dynamic optimizations for trace cache microprocessors (1998) MICRO 31: Proceedings of the 31st Annual ACM/IEEE International Symposium on Microarchitecture, pp. 173-181Gal, A., Franz, M., (2006) Incremental Dynamic Code Generation with Trace Trees, , Tech. Rep. 06-16, Donald Bren School of Information and Computer Science, University of California, Irvine NovemberGal, A., Eich, B., Shaver, M., Anderson, D., Mandelin, D., Haghighat, M.R., Kaplan, B., Franz, M., Trace-based just-in-time type specialization for dynamic languages (2009) PLDI 2009: Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 465-478Gschwind, M., Ebcioǧlu, K., Altman, E., Sathaye, S., Binary translation and architecture convergence issues for ibm system/390 (2000) ICS 2000: Proceedings of the 14th International Conference on Supercomputing, pp. 336-347Hookway, R., Digital fx!32: Running 32-bit x86 applications on alpha nt (1997) COMPCON 1997: Proceedings of the 42nd IEEE International Computer Conference, p. 37Klaiber, A., (2000) The Technology behind Crusoe™ Processors, , Tansmeta Corporation JanuaryLuk, C.K., Cohn, R., Muth, R., Patil, H., Klauser, A., Lowney, G., Wallace, S., Hazelwood, K., Pin: Building customized program analysis tools with dynamic instrumentation (2005) PLDI 2005: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 190-200Porto, J.P., Araujo, G., Wu, Y., Borin, E., Wang, C., Compact trace trees in dynamic binary translators (2009) 2nd Workshop on Architectural and Micro- Architectural Support for Binary Translation, AMAS-BT 2009Rotenberg, E., Bennett, S., Smith, J.E., Trace cache: A low latency approach to high bandwidth instruction fetching (1996) MICRO 29: Proceedings of the 29th Annual ACM/IEEE International Symposium on Microarchitecture, pp. 24-35Sprangle, E., Carmean, D., Increasing processor performance by implementing deeper pipelines (2002) SIGARCH Comput. Archit. News, 30 (2), pp. 25-34Suganuma, T., Yasue, T., Nakatani, T., A region-based compilation technique for dynamic compilers (2006) ACM Trans. Program. Lang. Syst., 28 (1), pp. 134-174Wang, C., Hu, S., Kim, H., Nair, S.R., Breternitz, M., Ying, Z., Wu, Y., Stardbt: An efficient multi-platform dynamic binary translation system (2007) Asia-Pacific Computer Systems Architecture Conference, pp. 4-15Wimmer, C., Cintra, M.S., Bebenita, M., Chang, M., Gal, A., Franz, M., Phase detection using trace compilation (2009) PPPJ 2009: Proceedings of the 7th International Conference on Principles and Practice of Programming in Java, pp. 172-181Zaleski, M., Brown, A.D., Stoodley, K., Yeti: A gradually extensible trace interpreter (2007) VEE 2007: Proceedings of the 3rd International Conference on Virtual Execution Environments, pp. 83-9

Repositorio da Producao Cientifica e Intelectual da Unicamp