58 research outputs found

    Timing instructions for RISC-V based hard real time edge devices

    For real-time systems, the temporal behavior of software is as important as its logical behavior. Ensuring correct temporal behavior at runtime becomes challenging as the complexity of the system increases. A main reason for this is that the measurement and control of timing span all abstraction layers in computing, including programming languages, memory hierarchy, pipelining techniques, bus architectures, memory management and task scheduling. The majority of software solutions to temporal requirements rely on programmable timers/interrupts, which add overhead to the system. RISC-V provides an open and extendable ISA (Instruction Set Architecture), enabling a new era of innovation in processor customization and in performance and power optimization. This work presents a new RISC-V ISA extension to obtain high-precision, cycle-accurate temporal behavior of real-time systems with low overhead. The work proposes a programming model supported by a new custom instruction that measures and controls the execution time of real-time software. The proposed custom instruction-based timing extension is evaluated against a pure software solution with a traditional timer/interrupt solution with respect to the resulting instruction density vs. hardware area overhead.

    Multi-core devices for safety-critical systems: a survey

    Multi-core devices are envisioned to support the development of next-generation safety-critical systems, enabling the on-chip integration of functions of different criticality. This integration provides multiple potential system-level benefits such as cost, size, power, and weight reduction. However, safety certification becomes a challenge and several fundamental technical safety requirements must be addressed, such as temporal and spatial independence, reliability, and diagnostic coverage. This survey provides a categorization and overview at different device abstraction levels (nanoscale, component, and device) of selected key research contributions that support compliance with these fundamental safety requirements. This work has been partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2015-65316-P, Basque Government under grant KK-2019-00035 and the HiPEAC Network of Excellence. The Spanish Ministry of Economy and Competitiveness has also partially supported Jaume Abella under Ramon y Cajal postdoctoral fellowship (RYC-2013-14717). Peer Reviewed. Postprint (author's final draft).

    Hybrid Performance Prediction Models for Fully-Connected Neural Networks on MPSoC

    Predicting the performance of Artificial Neural Networks (ANNs) on embedded multi-core platforms is tedious. Concurrent accesses to shared resources are hard to model due to congestion effects on the shared communication medium, which affect the performance of the application. In this paper we present a hybrid modeling environment to enable fast yet accurate timing prediction for fully-connected ANNs deployed on multi-core platforms. The modeling flow is based on the integration of an analytical computation time model with a communication time model, both of which are calibrated through measurement inside a system-level simulation using SystemC. The proposed flow enables the prediction of the end-to-end latency for different mappings of several fully-connected ANNs with an average accuracy of more than 99 %.

    Setup of an Experimental Framework for Performance Modeling and Prediction of Embedded Multicore AI Architectures

    Evaluating the performance of complex applications such as Artificial Intelligence (AI) algorithms, and more specifically neural networks, on Multi-Processor Systems on a Chip (MPSoC) is tedious. Finding an optimized partitioning of the application while accurately predicting the latency induced by communication bus congestion is hard using traditional analysis methods. This document presents a performance prediction workflow based on SystemC simulation models for timing prediction of neural networks on MPSoC.

    FPGA based in-memory AI computing

    The advent of AI in vehicles of all kinds is creating the need for more, and often very large, computing capacity. Depending on the type of vehicle, this gives rise to various problems: while overall hardware and engineering costs dominate for airplanes, in fully electric cars the cost of computing hardware matters more. Common to both domains are tight requirements on the size, weight and space of the hardware, which are most challenging for drones and satellites. For airplanes and especially for satellites, an additional challenge is the radiation resistance of the usually very memory-intensive AI systems. We therefore propose an FPGA-based in-memory AI computation methodology, which is so far only applicable to small AI systems, but works exclusively with the local memory elements of FPGAs: lookup tables (LUTs) and registers. By not using external and thus slow, inefficient and radiation-sensitive DRAM, but only local SRAM, we can make AI systems faster, lighter and more efficient than is possible with conventional GPUs or AI accelerators. All known radiation hardening techniques for FPGAs also work for our systems.

    The Universal Safety Format in Action: Tool Integration and Practical Application

    Designing software that meets the stringent requirements of functional safety standards imposes a significant development effort compared to conventional software. A key aspect is the integration of safety mechanisms into the functional design to ensure a safe state during operation even in the event of hardware errors. These safety mechanisms can be applied at different levels of abstraction during the development process and are usually implemented and integrated manually into the design. This not only causes significant effort but also reduces the overall maintainability of the software. To mitigate this, we present the Universal Safety Format (USF), which enables the generation of safety mechanisms based on the separation of concerns principle in a model-driven approach. Safety mechanisms are described as generic patterns using a transformation language independent of the functional design or any particular programming language. The USF was designed to be easily integrated into existing tools and workflows that can support different programming languages. Tools supporting the USF can utilize the patterns in a functional design to generate and integrate specific safety mechanisms for different languages using the transformation rules contained within the patterns. This enables the reuse of safety patterns not only in different designs, but also across different programming languages. The approach is demonstrated with an automotive use case as well as different tools supporting the USF.

    CONTREX: Design of embedded mixed-criticality CONTRol systems under consideration of EXtra-functional properties

    The increasing processing power of today’s HW/SW platforms leads to the integration of more and more functions in a single device. Additional design challenges arise when these functions share computing resources and belong to different criticality levels. CONTREX complements current activities in the area of predictable computing platforms and segregation mechanisms with techniques to consider the extra-functional properties, i.e., timing constraints, power, and temperature. CONTREX enables energy-efficient and cost-aware design through analysis and optimization of these properties with regard to application demands at different criticality levels. This article presents an overview of the CONTREX European project, its main innovative technology (extension of a model-based design approach, functional and extra-functional analysis with executable models, and run-time management) and the final results of three industrial use cases from different domains (avionics, automotive and telecommunication). The work leading to these results has received funding from the European Community’s Seventh Framework Programme FP7/2007-2011 under grant agreement no. 611146.

    A RISC-V based platform supporting mixed timing-critical and high performance workloads

    Existing hardware platforms are typically optimized for either real-time or high-performance applications, which poses challenges when running a mix of both on the same platform. This work aims to address this issue by proposing a hybrid platform that can effectively execute both types of applications without compromising timing predictability or performance optimization. The proposed solution presents a hybrid HW/SW architecture template capable of dynamically switching between real-time and high-performance execution modes at runtime. The integration and implementation of this architecture template are described on an FPGA, utilizing an open-source RISC-V processor system and FreeRTOS as the software management layer. We have successfully applied the TACLe benchmark suite for the evaluation of our proposed approach. Through an integrated measurement infrastructure, the software functionality, execution timing, and switching times are analyzed on a single-core implementation of the proposed architecture template.

    A DSL based approach for supporting custom RISC-V instruction extensions in LLVM

    The RISC-V ISA allows the definition of custom instruction extensions to support application-specific hardware acceleration and optimization. The main challenge with instruction extensions is the time-consuming process of consistently integrating them within the processor design and the compiler support, and of provisioning a testing and evaluation framework for the software developer. Our work proposes an automatable customization of an LLVM compiler based on a DSL (Domain Specific Language) driven approach, which can already be used for the definition of the instruction extension, its integration into the RISC-V ISA, and the automatic synthesis of the processor core and an instruction set simulator. We demonstrate the whole generation flow using a customized MAC instruction as a simple example and discuss the identified challenges.

    Design and Analysis of an Online Update Approach for Embedded Microprocessors

    Software updates are already used in many systems for fixing bugs and for improving or extending their functionality. For many embedded systems with strong availability requirements, software updates are still not used because an update cycle usually causes system downtime. For servers in data centers with high availability requirements, so-called live patching solutions have existed for many years. Live patching allows updating the software without affecting the availability of the system (i.e., no restart is required). In this work, we propose the application of live patching on small embedded microprocessors. We present a proof-of-concept implementation on a Xilinx MicroBlaze processor and compare the properties of our implementation, w.r.t. the amount of transmitted update data, memory requirements and update cycle duration, against a state-of-the-art full-memory update.