Search CORE

679 research outputs found

Performance Debugging and Tuning using an Instruction-Set Simulator

Author: Magnusson Peter S.
Montelius Johan
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/1997
Field of study

Instruction-set simulators allow programmers a detailed level of insight into, and control over, the execution of a program, including parallel programs and operating systems. In principle, instruction set simulation can model any target computer and gather any statistic. Furthermore, such simulators are usually portable, independent of compiler tools, and deterministic-allowing bugs to be recreated or measurements repeated. Though often viewed as being too slow for use as a general programming tool, in the last several years their performance has improved considerably. We describe SIMICS, an instruction set simulator of SPARC-based multiprocessors developed at SICS, in its rôle as a general programming tool. We discuss some of the benefits of using a tool such as SIMICS to support various tasks in software engineering, including debugging, testing, analysis, and performance tuning. We present in some detail two test cases, where we've used SimICS to support analysis and performance tuning of two applications, Penny and EQNTOTT. This work resulted in improved parallelism in, and understanding of, Penny, as well as a performance improvement for EQNTOTT of over a magnitude. We also present some early work on analyzing SPARC/Linux, demonstrating the ability of tools like SimICS to analyze operating systems

CiteSeerX

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Instruction Set Simulator for Transport Triggered Architectures

Author: Jääskeläinen Pekka
Publication venue
Publication date: 01/09/2005
Field of study

Due to speciﬁc requirements of some of embedded system applications, general purpose processors are usually not the most optimal ones for the task at hand. Thus, there is a need for application-speciﬁc processors, which are tailored for the application and requirements at hand. However, processor design is a demanding task. Therefore, the processor design ﬂow needs to be automated as completely as possible. TTA Codesign Environment (TCE) is a toolset that provides a semi-automated processor design ﬂow, which includes "design space exploration", which is a process that helps to ﬁnd an optimal processor architecture for the given application semiautomatically. The processor paradigm utilized in TCE design ﬂow is called transport triggered architecture (TTA). TTA is a relatively simple and highly modularized processor architecture which allows easy customization. One of the leading ideas of TTA is to move complexity from the processor hardware to the compiler. Consequently, the most complicated tool in TCE is the compiler. Instruction set simulation is mainly needed in verifying the compiler output and in design space exploration. The project completed for this thesis consisted of design, implementation, and veriﬁcation of an instruction set simulator for TCE. The thesis describes the main requirements and most important software design decisions of the TCE instruction set simulator. In addition, the veriﬁcation of simulation correctness is described and performance benchmarks are presented. Finally, several improvement ideas and brief plans for implementing them are presented. /Kir1

Trepo - Institutional Repository of Tampere University

Processor Generator v1.3 (PG13)

Author: Crowley Patrick
Kotysh Eduard V.
Publication venue: Washington University Open Scholarship
Publication date: 30/05/2005
Field of study

This project presents a novel automated framework for microprocessor instruction set exploration that allows users to extend a basic MIPS ISA with new multimedia instructions (including custom vector instructions, a la AltiVec and MMX/SSE). The infrastructure provides users with an extension language that automatically incorporates extensions into a synthesizable processor pipeline model and an executable instruction set simulator. We implement popular AltiVec and MMX extensions using this framework and present experimental results that show significant performance gains of customized microprocessor

Washington University St. Louis: Open Scholarship

Holistic debugging - enabling instruction set simulation for software quality assurance

Author: Albertsson Lars
Publication venue
Publication date: 01/01/2006
Field of study

We present holistic debugging, a novel method for observing execution of complex and distributed software. It builds on an instruction set simulator, which provides reproducible experiments and non-intrusive probing of state in a distributed system. Instruction set simulators, however, only provide low-level information, so a holistic debugger contains a translation framework that maps this information to higher abstraction level observation tools, such as source code debuggers. We have created Nornir, a proof-of-concept holistic debugger, built on the simulator Simics. For each observed process in the simulated system, Nornir creates an abstraction translation stack, with virtual machine translators that map machine-level storage contents (e.g. physical memory, registers) provided by Simics, to application-level data (e.g. virtual memory contents) by parsing the data structures of operating systems and virtual machines. Nornir includes a modified version of the GNU debugger (GDB), which supports non-intrusive symbolic debugging of distributed applications. Nornir's main interface is a debugger shepherd, a programmable interface that controls multiple debuggers, and allows users to coherently inspect the entire state of heterogeneous, distributed applications. It provides a robust observation platform for construction of new observation tools

CiteSeerX

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Modeling and visualizing networked multi-core embedded software energy consumption

Author: Eder Kerstin
Kerrison Steve
Publication venue
Publication date: 09/09/2015
Field of study

In this report we present a network-level multi-core energy model and a software development process workflow that allows software developers to estimate the energy consumption of multi-core embedded programs. This work focuses on a high performance, cache-less and timing predictable embedded processor architecture, XS1. Prior modelling work is improved to increase accuracy, then extended to be parametric with respect to voltage and frequency scaling (VFS) and then integrated into a larger scale model of a network of interconnected cores. The modelling is supported by enhancements to an open source instruction set simulator to provide the first network timing aware simulations of the target architecture. Simulation based modelling techniques are combined with methods of results presentation to demonstrate how such work can be integrated into a software developer's workflow, enabling the developer to make informed, energy aware coding decisions. A set of single-, multi-threaded and multi-core benchmarks are used to exercise and evaluate the models and provide use case examples for how results can be presented and interpreted. The models all yield accuracy within an average +/-5 % error margin

arXiv.org e-Print Archive

Explore Bristol Research

Efficient Asynchronous Interrupt Handling in a Full-System Instruction Set Simulator

Author: Franke Bjoern
Spink Tom
Wagstaff Harry
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/06/2016
Field of study

Instruction set simulators (ISS) have many uses in embedded software and hardware development and are typically based on dynamic binary translation (DBT), where frequently executed regions of guest instructions are compiled into host instructions using a just-in-time (JIT) compiler. Full-system simulation, which necessitates handling of asynchronous interrupts from e.g. timers and I/O devices, complicates matters as control flow is interrupted unpredictably and diverted from the current region of code. In this paper we present a novel scheme for handling of asynchronous interrupts, which integrates seamlessly into a region-based dynamic binary translator. We first show that our scheme is correct, i.e. interrupt handling is not deferred indefinitely, even in the presence of code regions comprising control flow loops. We demonstrate that our new interrupt handling scheme is efficient as we minimise the number of inserted checks. Interrupt handlers are also presented to the JIT compiler and compiled to native code, further enhancing the performance of our system. We have evaluated our scheme in an ARM simulator using a region-based JIT compilation strategy. We demonstrate that our solution reduces the number of dynamic interrupt checks by 73%, reduces interrupt service latency by 26% and improves throughput of an I/O bound workload by 7%, over traditional per-block schemes.Postprin

Crossref

Edinburgh Research Explorer

University of St. Andrews - Pure

St Andrews Research Repository

Design of an Architectural Model for the Coffee Processor Using ArchC

Author: Gual González Daniel
Publication venue
Publication date: 08/09/2010
Field of study

The present work is aimed to provide the clearest description possible of the COFFEE RISC core model written through the ArchC software and simulate its behaviour. In this sense, we explore the software applications used for instruction set simulation focusing on the ArchC tools and their features. According to the guidelines of this software, a cycle-accurate description of the COFFEE core architecture is developed, which is used to synthesize a timed instruction set simulator and an assembler. Our work also contains some elements of analysis concerning the ArchC tools and the resulting instruction set simulator in order to evaluate their characteristics and capabilities for hardware architecture modeling purposes. We did not emphasize only on the features of the ArchC tools at the current status of development but also the projection of this software for future implementations. /Kir1

Trepo - Institutional Repository of Tampere University

Designing a CPU model: from a pseudo-formal document to fast code

Author: Blanqui Frédéric
Helmstetter Claude
Joloboff Vania
Monin Jean-François
Shi Xiaomu
Publication venue
Publication date: 22/01/2011
Field of study

For validating low level embedded software, engineers use simulators that take the real binary as input. Like the real hardware, these full-system simulators are organized as a set of components. The main component is the CPU simulator (ISS), because it is the usual bottleneck for the simulation speed, and its development is a long and repetitive task. Previous work showed that an ISS can be generated from an Architecture Description Language (ADL). In the work reported in this paper, we generate a CPU simulator directly from the pseudo-formal descriptions of the reference manual. For each instruction, we extract the information describing its behavior, its binary encoding, and its assembly syntax. Next, after automatically applying many optimizations on the extracted information, we generate a SystemC/TLM ISS. We also generate tests for the decoder and a formal specification in Coq. Experiments show that the generated ISS is as fast and stable as our previous hand-written ISS.Comment: 3rd Workshop on: Rapid Simulation and Performance Evaluation: Methods and Tools (2011

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-CIRAD