Search CORE

13 research outputs found

Balancing Performance and Productivity for the Development of Dynamic Binary Instrumentation Tools - A Case Study on Arm Systems

Author: Callaghan Guillermo
Gorgovan Cosmin
Luján Mikel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/02/2020
Field of study

Crossref

The University of Manchester - Institutional Repository

Navigating the Landscape for Real-time Localisation and Mapping for Robotics, Virtual and Augmented Reality

Author: Bodin Bruno
Clarkson James
Davison Andrew J
Debrunner Thomas
Franke Bjoern
Furber Steve
Gonzalez-de-Aledo Pablo
Gorgovan Cosmin
Kaszyk Kuba
Kelly Paul H. J.
Kotselidis Christos
Luján Mikel
Mawer John
Melot Nicolas
Nardi Luigi
Nisbet Andy
O'Boyle Michael
Palomar Oscar
Riley Graham
Rodchenko Andrey
Saeedi Sajad
Spink Tom
Tomusk Erik-Arne
Vespa Emanuele
Wagstaff Harry
Webb Andrew
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/06/2018
Field of study

Visual understanding of 3D environments in real-time, at low power, is a huge computational challenge. Often referred to as SLAM (Simultaneous Localisation and Mapping), it is central to applications spanning domestic and industrial robotics, autonomous vehicles, virtual and augmented reality. This paper describes the results of a major research effort to assemble the algorithms, architectures, tools, and systems software needed to enable delivery of SLAM, by supporting applications specialists in selecting and configuring the appropriate algorithm and the appropriate hardware, and compilation pathway, to meet their performance, accuracy, and energy consumption goals. The major contributions we present are (1) tools and methodology for systematic quantitative evaluation of SLAM algorithms, (2) automated, machine-learning-guided exploration of the algorithmic and implementation design space with respect to multiple objectives, (3) end-to-end simulation tools to enable optimisation of heterogeneous, accelerated architectures for the specific algorithmic requirements of the various SLAM algorithmic approaches, and (4) tools for delivering, where appropriate, accelerated, adaptive SLAM solutions in a managed, JIT-compiled, adaptive runtime context.Comment: Proceedings of the IEEE 201

arXiv.org e-Print Archive

Edinburgh Research Explorer

Spiral - Imperial College Digital Repository

The University of Manchester - Institutional Repository

University of St. Andrews - Pure

St Andrews Research Repository

Optimising Dynamic Binary Modification Across ARM Microarchitectures

Author: Gorgovan Cosmin
Publication venue
Publication date: 01/08/2017
Field of study

The University of Manchester - Institutional Repository

Optimising Dynamic Binary Modification Across ARM Microarchitectures

Author: d'Antras Amanieu
Gorgovan Cosmin
Luján Mikel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018
Field of study

Crossref

The University of Manchester - Institutional Repository

MAMBO: A low overhead dynamic binary modification tool for ARM

Author: D'Antras Amanieu
Gorgovan Cosmin
Lujan Mikel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2016
Field of study

As the ARM architecture expands beyond its traditional embedded domain, there is a growing interest in dynamic binary modification (DBM) tools for general-purpose multicore processors that are part of the ARM family. Existing DBM tools for ARM suffer from introducing large overheads in the execution of applications. The specific questions that this article addresses are (i) how to develop such DBM tools for the ARM architecture and (ii) whether new optimisations are plausible and needed. We describe the general design of MAMBO, a new DBM tool for ARM, which we release together with this publication, and introduce novel optimisations to handle indirect branches. In addition, we explore scenarios in which it may be possible to relax the transparency offered by DBM tools to allow extra optimisations to be applied. These scenarios arise from analysing the most typical usages: for example, application binaries without handcrafted assembly. The performance evaluation shows that MAMBO introduces small overheads for SPEC CPU2006 and PARSEC 3.0 when comparing with the execution times of the unmodified programs: a geometric mean overhead of 28% on a Cortex-A9 and of 34% on a Cortex-A15 for CPU2006, and between 27% and 32%, depending on the number of threads, for PARSEC on a Cortex-A15.</jats:p

Crossref

The University of Manchester - Institutional Repository

Evaluating the Impact of Optimizations for Dynamic Binary Modification on 64-bit RISC-V

Author: Callaghan Guillermo
Gorgovan Cosmin
Kressel John Alistair
Luján Mikel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/04/2023
Field of study

The University of Manchester - Institutional Repository

Optimizing Indirect Branches in Dynamic Binary Translators

Author: d'Antras Amanieu
Garside Jim
Gorgovan Cosmin
Luján Mikel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2016
Field of study

Dynamic binary translation is a technology for transparently translating and modifying a program at the machine code level as it is running. A significant factor in the performance of a dynamic binary translator is its handling of indirect branches. Unlike direct branches, which have a known target at translation time, an indirect branch requires translating a source program counter address to a translated program counter address every time the branch is executed. This translation can impose a serious runtime penalty if it is not handled efficiently. MAMBO-X64, a dynamic binary translator that translates 32-bit ARM (AArch32) code to 64-bit ARM (AArch64) code, uses three novel techniques to improve the performance of indirect branch translation. Together, these techniques allow MAMBO-X64 to achieve a very low performance overhead of only 10% on average compared to native execution of 32-bit programs. Hardware-assisted function returns use a software return address stack to predict the targets of function returns, making use of several novel optimizations while also exploiting hardware return address prediction. This technique has a significant impact on most benchmarks, reducing binary translation overhead compared to native execution by 40% on average and by 90% on some benchmarks. Branch table inference , an algorithm for detecting and translating branch tables, can reduce the overhead of translated code by up to 40% on some SPEC CPU2006 benchmarks. The remaining indirect branches are handled using a fast atomic hash table , which is optimized to work with multiple threads. This last technique translates indirect branches using a single shared hash table while avoiding expensive synchronization in performance-critical lookup code. This allows the performance to be on par with thread-private hash tables while having superior memory scalability. </jats:p

Crossref

The University of Manchester - Institutional Repository

HyperMAMBO-X64: Using Virtualization to Support High-performance Transparent Binary Translation

Author: d'Antras Amanieu
Garside Jim
Goodacre John
Gorgovan Cosmin
Luján Mikel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/04/2017
Field of study

Crossref

The University of Manchester - Institutional Repository

The Potential of Dynamic Binary Modification and CPU-FPGA SoCs for Simulation

Author: Andy Nisbet
Cosmin Gorgovan
John Mawer
Mikel Lujan
Oscar Palomar
Will Toms
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2017
Field of study

Crossref

The University of Manchester - Institutional Repository