88 research outputs found

    A general technique for deterministic model-cycle-level debugging

    Efficient use of FPGA resources requires FPGA-based performance models of complex hardware to implement one model cycle, i.e., one time-step of the original synchronous system, in several implementation cycles. Generally, implementation cycles have no simple relationship with model cycles, and it is tricky to reconstruct the state of the synchronous system at the model-cycle boundaries if only implementation-cycle-level control and information is provided. A good debugging facility needs to provide: complete control over the functioning of the target design being simulated; fast and easy access to all the significant target design state for both monitoring and modification; and some means of accomplishing deterministic execution when the target design is a multicore processor running a parallel application. Moreover, these features need to be provided in a manner which does not incur substantial resource and performance penalties. In this paper, we present a debugging technique based on the LI-BDN theory. We show how the technique facilitates deterministic model-cycle-level debugging. We used it to build the debugging infrastructure for Arete, which is an FPGA-based cycle-accurate multicore simulator. The resource and performance penalties of our debugging technique are minimal; in Arete the debugging infrastructure has area and performance overheads of 5% and 6%, respectively. IBM Researc
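
    The gap between model cycles and implementation cycles described in this abstract can be illustrated with a small software sketch. It is not the LI-BDN technique from the paper; the `Module` class, the per-module cycle counts, and `step_model_cycles` are hypothetical names chosen for illustration. The sketch only shows the idea of stalling each module at a model-cycle boundary, so that a debugger always observes a consistent state no matter how many implementation cycles each module needed.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Hypothetical module needing `impl_per_model` implementation cycles per model cycle."""
    name: str
    impl_per_model: int
    model_cycle: int = 0  # completed model cycles
    progress: int = 0     # implementation cycles spent inside the current model cycle

    def tick(self, target: int) -> None:
        # Stall once this module has reached the requested model-cycle boundary.
        if self.model_cycle >= target:
            return
        self.progress += 1
        if self.progress == self.impl_per_model:
            self.progress = 0
            self.model_cycle += 1


def step_model_cycles(modules, n):
    """Advance all modules by n model cycles; return the implementation cycles used.

    Assumes the modules start aligned on the same model-cycle boundary.
    """
    target = min(m.model_cycle for m in modules) + n
    impl_cycles = 0
    while any(m.model_cycle < target for m in modules):
        for m in modules:
            m.tick(target)
        impl_cycles += 1
    return impl_cycles


modules = [Module("core", 3), Module("cache", 5)]
used = step_model_cycles(modules, 2)
print(used, [(m.name, m.model_cycle, m.progress) for m in modules])
```

    In this toy setup the faster module simply waits at the boundary, so stepping by model cycles gives the same observable state on every run, which is the property a model-cycle-level debugger relies on.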

    Cycle-accurate multicore performance models on FPGAs

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 159-165). The goal of this project is to improve computer architecture by accelerating cycle-accurate performance modeling of multicore processors using FPGAs. Contributions include a distributed technique controlling simulation on a highly-parallel substrate, hardware design techniques to reduce development effort, and a specific framework for modeling shared-memory multicore processors paired with realistic On-Chip Networks. by Michael Pellauer. Ph.D

    Architectural support for secure and survivable embedded software

    Attacks against vulnerable software have become a serious problem for industry and users alike. There have been many techniques proposed to combat these attacks which range from compiler modifications to additional architectural features. Most of these techniques focus on attack detection, while ignoring the problem of how to gracefully recover from such attacks. In this thesis we propose an architectural approach to attack detection and recovery which we call rollback and huddle. In our approach, a lightweight attack-detection module monitors a program's execution as its state is continuously checkpointed. In the case of an attack, the program state is rolled back to a time before the attack occurred and an additional HW/SW module is loaded to gain extra insight into the attack and possibly repair the original vulnerability. Our approach is based on the observation that the vast majority of a program's execution can be trusted. Therefore, we aim to minimize the performance overhead during normal execution. Once an attack has been detected, the system is put into a high alert mode where a larger performance overhead is tolerated to make use of more complex techniques and avoid system down-time. We introduce simple hardware modules that work alongside a standard computer architecture, and aid in attack detection, checkpoint creation, and attack recovery. Our experimental results show that this approach can be achieved with minimal run-time overhead and resource utilization
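
    The checkpoint, detect, and roll back sequence described in this abstract can be sketched in software. The thesis proposes hardware modules; the code below is only a software analogy under stated assumptions. The `CheckpointedProgram` class, the dictionary state layout, and the `looks_like_attack` return-address check are all hypothetical and are not taken from the thesis.

```python
import copy


class CheckpointedProgram:
    """Hypothetical program wrapper that snapshots state before every step."""

    def __init__(self, state):
        self.state = state
        self.checkpoints = []
        self.high_alert = False

    def checkpoint(self):
        # Deep copy so later mutations cannot alter the saved snapshot.
        self.checkpoints.append(copy.deepcopy(self.state))

    def rollback(self):
        # Restore the most recent snapshot and switch to high alert mode.
        self.state = self.checkpoints.pop()
        self.high_alert = True


def looks_like_attack(state):
    # Hypothetical detector: flag a corrupted return address.
    return state["return_addr"] != state["expected_return_addr"]


def run(program, steps):
    for step in steps:
        program.checkpoint()
        step(program.state)
        if looks_like_attack(program.state):
            program.rollback()
    return program


def benign(state):
    state["counter"] += 1


def attack(state):
    state["return_addr"] = 0xDEAD  # simulated overwrite of the return address


initial = {"counter": 0, "return_addr": 0x400, "expected_return_addr": 0x400}
program = run(CheckpointedProgram(initial), [benign, attack, benign])
print(program.state["counter"], hex(program.state["return_addr"]), program.high_alert)
```

    Here the attack step is undone by restoring the snapshot taken just before it, and execution continues in high alert mode, where a real system could afford heavier checks.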

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

    ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability

    Cycle-accurate modeling of multicore processors on FPGAs

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 169-176). We present a novel modeling methodology which enables the generation of a high-performance, cycle-accurate simulator from a cycle-level specification of the target design. We describe Arete, a full-system multicore processor simulator, developed using our modeling methodology. We provide details on Arete's resource-efficient and high-performance implementation on multiple FPGA platforms, and the architectural experiments performed using it. We present clear evidence that the use of simplified models in architectural studies can lead to wrong conclusions. Through two experiments performed using both cycle-accurate and simplified models, we show that on one hand there are substantial quantitative and qualitative differences in results, and on the other, the results match quite well. by Asif Imtiaz Khan. Ph.D

    An Inter-Processor Communication (IPC) Data Sharing Architecture in Heterogeneous MPSoC for OFDMA

    Multiprocessor system-on-chip (MPSoC) promises better data management for parallel processing than conventional SoC. This feature is very suitable for wireless communication systems. Better data processing management can reduce resource utilization and can potentially reduce power consumption as well. Hence, this research aimed to minimize the orthogonal frequency-division multiple access (OFDMA) processing hardware by proposing a new data sharing architecture on a heterogeneous MPSoC platform that incorporates inter-processor communication (IPC), multi-processor, multi-bus, multi-frequency and parallel processing design of the medium access controller (MAC) layer. This MPSoC was designed based on a RISC processor with an AMBA multi-bus system. To achieve high throughput, the proposed MPSoC runs at two different frequencies, 40 MHz and 80 MHz. The proposed system was implemented and verified using FPGA. The verification results showed that the proposed system can work in real-time with a maximum throughput of 11 MBps using a 40 MHz system clock. The proposed MPSoC is a promising solution to perform OFDMA processing on 4G and 5G technologies

    Graphite: A Distributed Parallel Simulator for Multicores

    This paper introduces the open-source Graphite distributed parallel multicore simulator infrastructure. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, or even thousands of cores. It provides high performance for fast design space exploration and software development for future processors. Several techniques are used to achieve this performance including: direct execution, multi-machine distribution, analytical modeling, and lax synchronization. Graphite is capable of accelerating simulations by leveraging several machines. It can distribute simulation of an off-the-shelf threaded application across a cluster of commodity Linux machines with no modification to the source code. It does this by providing a single, shared address space and consistent single-process image across machines. Graphite is designed to be a simulation framework, allowing different component models to be easily replaced to either model different architectures or trade off accuracy for performance. We evaluate Graphite from a number of perspectives and demonstrate that it can simulate target architectures containing over 1000 cores on ten 8-core servers. Performance scales well as more machines are added, with near-linear speedup in many cases. Simulation slowdown is as low as 41x versus native execution for some applications. The Graphite infrastructure and existing models will be released as open-source software to allow the community to simulate their own architectures and extend and improve the framework

    MURAC: A unified machine model for heterogeneous computers

    Includes bibliographical references. Heterogeneous computing enables the performance and energy advantages of multiple distinct processing architectures to be efficiently exploited within a single machine. These systems are capable of delivering large performance increases by matching the applications to architectures that are most suited to them. The Multiple Runtime-reconfigurable Architecture Computer (MURAC) model has been proposed to tackle the problems commonly found in the design and usage of these machines. This model presents a system-level approach that creates a clear separation of concerns between the system implementer and the application developer. The three key concepts that make up the MURAC model are a unified machine model, a unified instruction stream and a unified memory space. A simple programming model built upon these abstractions provides a consistent interface for interacting with the underlying machine to the user application. This programming model simplifies application partitioning between hardware and software and allows the easy integration of different execution models within the single control flow of a mixed-architecture application. The theoretical and practical trade-offs of the proposed model have been explored through the design of several systems. An instruction-accurate system simulator has been developed that supports the simulated execution of mixed-architecture applications. An embedded System-on-Chip implementation has been used to measure the overhead in hardware resources required to support the model, which was found to be minimal. An implementation of the model within an operating system on a tightly-coupled reconfigurable processor platform has been created. This implementation is used to extend the software scheduler to allow for the full support of mixed-architecture applications in a multitasking environment. Different scheduling strategies have been tested using this scheduler for mixed-architecture applications.
    The design and implementation of these systems has shown that a unified abstraction model for heterogeneous computers provides important usability benefits to system and application designers. These benefits are achieved through a consistent view of the multiple different architectures to the operating system and user applications. This allows them to focus on achieving their performance and efficiency goals by gaining the benefits of different execution models during runtime without the complex implementation details of the system-level synchronisation and coordination