Search CORE

1,084 research outputs found

Automatic Application-Specific Customization of Softcore Processor Microarchitecture, Masters Thesis, May 2006

Author: Padmanabhan Shobana
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2006
Field of study

Applications for constrained embedded systems are subject to strict runtime and resource utilization bounds. With soft core processors, application developers can customize the processor for their application, constrained by available hardware resources but aimed at high application performance. The more reconfigurable the processor is, the more options the application developers will have for customization and hence increased potential for improving application performance. However, such customization entails developing in-depth familiarity with all the parameters, in order to configure them effectively. This is typically infeasible, given the tight time-to-market pressure on the developers. Alternatively, developers could explore all possible configurations, but being exponential, this is infeasible even given only tens of parameters. This thesis presents an approach based on an assumption of parameter independence, for automatic microarchitecture customization. This approach is linear with the number of parameter values and hence, feasible and scalable. For the dimensions that we customize, namely application runtime and hardware resources, we formulate their costs as a constrained binary integer nonlinear optimization program. Though the results are not guaranteed to be optimal, we find they are near-optimal in practice. Our technique itself is general and can be applied to other design-space exploration problems

Washington University St. Louis: Open Scholarship

Architectural performance analysis of FPGA synthesized LEON processors

Author: Damman Corentin
Edison Gregory
Guet Fabrice
Hugues Jérôme
Noulard Eric
Santinelli Luca
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

Current processors have gone through multiple internal opti- mization to speed-up the average execution time e.g. pipelines, branch prediction. Besides, internal communication mechanisms and shared resources like caches or buses have a sig- nificant impact on Worst-Case Execution Times (WCETs). Having an accurate estimate of a WCET is now a challenge. Probabilistic approaches provide a viable alternative to single WCET estimation. They consider WCET as a probabilistic distribution associated to uncertainty or risk. In this paper, we present synthetic benchmarks and associated analysis for several LEON3 configurations on FPGA targets. Benchmarking exposes key parameters to execution time variability allowing for accurate probabilistic modeling of system dynamics. We analyze the impact of architecture- level configurations on average and worst-case behaviors

Open Archive Toulouse Archive Ouverte

An Open Core System-on-chip Platform

Author: Srivastava Rishi R.
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/08/2004
Field of study

The design cycle required to produce a System-on-Chip can be reduced by providing pre-designed built-in features and functions such as configurable I/O, power and ground grids, block RAMs, timing generators and other embedded intellectual property (IP) blocks. A basic combination of such built-in features is known as a platform. The major objective of this thesis was to design and implement one such System-on-Chip platform using open IP cores targeting the TSMC-0.18 CMOS process. The integrated System-on-Chip platform, which contains approximately four million transistors, was synthesized using Synopsys - Design Compiler and placed and routed using Cadence - First Encounter, Silicon Ensemble. Design verification was done at the pre-synthesis, post-synthesis and post-layout levels using Mentor Graphics - ModelSim. Final layout was imported into Cadence - Virtuoso to perform design rule check. A tutorial was written to enable others to create derivative designs of this platform quickly

University of Tennessee, Knoxville: Trace

A Reconfigurable Processor for Heterogeneous Multi-Core Architectures

Author: Grudnitsky Artjom
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2015
Field of study

A reconfigurable processor is a general-purpose processor coupled with an FPGA-like reconfigurable fabric. By deploying application-specific accelerators, performance for a wide range of applications can be improved with such a system. In this work concepts are designed for the use of reconfigurable processors in multi-tasking scenarios and as part of multi-core systems

KITopen

Integration and validation of embedded flight software on space-qualified multicore architectures

Author: Moya Riera Joan
Publication venue: Universitat Politècnica de Catalunya
Publication date: 07/10/2022
Field of study

In the recent decades, the importance of software on space missions has notably increased, reflecting the need to integrate advanced on-board functionalities. With multicore processors being lately introduced to host critical high-performance applications, the complexity to validate software has significantly raised with respect to single core architectures. While there has been a big step forward in avionics after the publication of the CAST-32A paper, the ECSS-E-ST-40C software engineering standard used by the European Space Agency (ESA) is still not providing validation support for multicore processors. Hence, it is expected that standardising guidelines to develop software on such platforms will become a recurring topic in the industry to match the demands of future space exploration missions

UPCommons. Portal del coneixement obert de la UPC

Microprocessor fault-tolerance via on-the-fly partial reconfiguration

Author: Di Carlo Stefano
Miele Andrea
Prinetto Paolo Ernesto
Trapanese Antonio
Publication venue: IEEE Computer Society
Publication date: 01/01/2010
Field of study

This paper presents a novel approach to exploit FPGA dynamic partial reconfiguration to improve the fault tolerance of complex microprocessor-based systems, with no need to statically reserve area to host redundant components. The proposed method not only improves the survivability of the system by allowing the online replacement of defective key parts of the processor, but also provides performance graceful degradation by executing in software the tasks that were executed in hardware before a fault and the subsequent reconfiguration happened. The advantage of the proposed approach is that thanks to a hardware hypervisor, the CPU is totally unaware of the reconfiguration happening in real-time, and there's no dependency on the CPU to perform it. As proof of concept a design using this idea has been developed, using the LEON3 open-source processor, synthesized on a Virtex 4 FPG

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Energy analysis and optimisation techniques for automatically synthesised coprocessors

Author: Morgan Paul
Publication venue
Publication date: 01/01/2008
Field of study

The primary outcome of this research project is the development of a methodology enabling fast automated early-stage power and energy analysis of configurable processors for system-on-chip platforms. Such capability is essential to the process of selecting energy efficient processors during design-space exploration, when potential savings are highest. This has been achieved by developing dynamic and static energy consumption models for the constituent blocks within the processors. Several optimisations have been identified, specifically targeting the most significant blocks in terms of energy consumption. Instruction encoding mechanism reduces both the energy and area requirements of the instruction cache; modifications to the multiplier unit reduce energy consumption during inactive cycles. Both techniques are demonstrated to offer substantial energy savings. The aforementioned techniques have undergone detailed evaluation and, based on the positive outcomes obtained, have been incorporated into Cascade, a system-on-chip coprocessor synthesis tool developed by Critical Blue, to provide automated analysis and optimisation of processor energy requirements. This thesis details the process of identifying and examining each method, along with the results obtained. Finally, a case study demonstrates the benefits of the developed functionality, from the perspective of someone using Cascade to automate the creation of an energy-efficient configurable processor for system-on-chip platforms

Glasgow Theses Service

OpenGrey Repository

Dusty Caches to Save Memory Traffic

Author: Friedman Scott J.
Publication venue: Washington University Open Scholarship
Publication date: 27/04/2005
Field of study

Reference counting is a garbage-collection technique that maintains a per-object count of the number of pointers to that object. When the count reaches zero, the object must be dead and can be collected. Although it is not an exact method, it is well suited for real-time systems and is widely implemented, sometimes in conjunction with other methods to increase the overall precision. A disadvantage of reference counting is the extra storage trac that is introduced. In this paper, we describe a new cache write-back policy that can substantially decrease the reference-counting traffic to RAM. We propose a new cache design that remembers the first-fetched value of a cache subblock, so that the subblock need not be written back to RAM unless a different value is present. We present results from experiments that show the effectiveness of this approach, particularly in mitigating the storage traffic due to reference counting

Washington University St. Louis: Open Scholarship

Design and Implementation of a Time Predictable Processor: Evaluation With a Space Case Study

Author: Abella Jaume
Andersson Jan
Bardizbanyan Alen
Cazorla Francisco J.
Cros Fabrice
Wartel Franck
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 29th Euromicro Conference on Real-Time Systems (ECRTS 2017)
Publication date: 01/01/2017
Field of study

Embedded real-time systems like those found in automotive, rail and aerospace, steadily require higher levels of guaranteed computing performance (and hence time predictability) motivated by the increasing number of functionalities provided by software. However, high-performance processor design is driven by the average-performance needs of mainstream market. To make things worse, changing those designs is hard since the embedded real-time market is comparatively a small market. A path to address this mismatch is designing low-complexity hardware features that favor time predictability and can be enabled/disabled not to affect average performance when performance guarantees are not required. In this line, we present the lessons learned designing and implementing LEOPARD, a four-core processor facilitating measurement-based timing analysis (widely used in most domains). LEOPARD has been designed adding low-overhead hardware mechanisms to a LEON3 processor baseline that allow capturing the impact of jittery resources (i.e. with variable latency) in the measurements performed at analysis time. In particular, at core level we handle the jitter of caches, TLBs and variable-latency floating point units; and at the chip level, we deal with contention so that time-composable timing guarantees can be obtained. The result of our applied study with a Space application shows how per-resource jitter is controlled facilitating the computation of high-quality WCET estimates

Dagstuhl Research Online Publication Server