124 research outputs found

    Run-time resource allocation for embedded Multiprocessor System-on-Chip using tree-based design space exploration

    Get PDF
    The dynamic nature of application workloads in modern MPSoC-based embedded systems is growing. To cope with the dynamism of application workloads at run time and to improve the efficiency of the underlying system architecture, this paper presents a novel run-time resource allocation algorithm for multimedia applications with the objective of minimizing energy consumption for predefined deadlines. This algorithm is based on a novel tree-based design space exploration (DSE) method, which is performed in two phases: design-time and run-time. During design time, application clustering is combined with the tree-based DSE, and after that, feature extraction and application classification is performed during run-time based on well-known machine learning techniques. We evaluated our algorithm using a heterogeneous MPSoC system with several applications that have different communication and computation behaviors. Our experimental results revealed that during runtime, more than 91% of the applications were classified correctly by our proposed algorithm to select the best resources for allocation. Therefore the results clearly confirm that our algorithm is effective

    High-Level Synthesis Hardware Design for FPGA-Based Accelerators: Models, Methodologies, and Frameworks

    Get PDF
    Hardware accelerators based on field programmable gate array (FPGA) and system on chip (SoC) devices have gained attention in recent years. One of the main reasons is that these devices contain reconfigurable logic, which makes them feasible for boosting the performance of applications. High-level synthesis (HLS) tools facilitate the creation of FPGA code from a high level of abstraction using different directives to obtain an optimized hardware design based on performance metrics. However, the complexity of the design space depends on different factors such as the number of directives used in the source code, the available resources in the device, and the clock frequency. Design space exploration (DSE) techniques comprise the evaluation of multiple implementations with different combinations of directives to obtain a design with a good compromise between different metrics. This paper presents a survey of models, methodologies, and frameworks proposed for metric estimation, FPGA-based DSE, and power consumption estimation on FPGA/SoC. The main features, limitations, and trade-offs of these approaches are described. We also present the integration of existing models and frameworks in diverse research areas and identify the different challenges to be addressed

    Mixed-Criticality Systems on Commercial-Off-the-Shelf Multi-Processor Systems-on-Chip

    Get PDF
    Avionics and space industries are struggling with the adoption of technologies like multi-processor system-on-chips (MPSoCs) due to strict safety requirements. This thesis propose a new reference architecture for MPSoC-based mixed-criticality systems (MCS) - i.e., systems integrating applications with different level of criticality - which are a common use case for aforementioned industries. This thesis proposes a system architecture capable of granting partitioning - which is, for short, the property of fault containment. It is based on the detection of spatial and temporal interference, and has been named the online detection of interference (ODIn) architecture. Spatial partitioning requires that an application is not able to corrupt resources used by a different application. In the architecture proposed in this thesis, spatial partitioning is implemented using type-1 hypervisors, which allow definition of resource partitions. An application running in a partition can only access resources granted to that partition, therefore it cannot corrupt resources used by applications running in other partitions. Temporal partitioning requires that an application is not able to unexpectedly change the execution time of other applications. In the proposed architecture, temporal partitioning has been solved using a bounded interference approach, composed of an offline analysis phase and an online safety net. The offline phase is based on a statistical profiling of a metric sensitive to temporal interference’s, performed in nominal conditions, which allows definition of a set of three thresholds: 1. the detection threshold TD; 2. the warning threshold TW ; 3. the α threshold. Two rules of detection are defined using such thresholds: Alarm rule When the value of the metric is above TD. Warning rule When the value of the metric is in the warning region [TW ;TD] for more than α consecutive times. ODIn’s online safety-net exploits performance counters, available in many MPSoC architectures; such counters are configured at bootstrap to monitor the selected metric(s), and to raise an interrupt request (IRQ) in case the metric value goes above TD, implementing the alarm rule. The warning rule is implemented in a software detection module, which reads the value of performance counters when the monitored task yields control to the scheduler and reset them if there is no detection. ODIn also uses two additional detection mechanisms: 1. a control flow check technique, based on compile-time defined block signatures, is implemented through a set of watchdog processors, each monitoring one partition. 2. a timeout is implemented through a system watchdog timer (SWDT), which is able to send an external signal when the timeout is violated. The recovery actions implemented in ODIn are: • graceful degradation, to react to IRQs of WDPs monitoring non-critical applications or to warning rule violations; it temporarily stops non-critical applications to grant resources to the critical application; • hard recovery, to react to the SWDT, to the WDP of the critical application, or to alarm rule violations; it causes a switch to a hot stand-by spare computer. Experimental validation of ODIn was performed on two hardware platforms: the ZedBoard - dual-core - and the Inventami board - quad-core. A space benchmark and an avionic benchmark were implemented on both platforms, composed by different modules as showed in Table 1 Each version of the final application was evaluated through fault injection (FI) campaigns, performed using a specifically designed FI system. There were three types of FI campaigns: 1. HW FI, to emulate single event effects; 2. SW FI, to emulate bugs in non-critical applications; 3. artificial bug FI, to emulate a bug in non-critical applications introducing unexpected interference on the critical application. Experimental results show that ODIn is resilient to all considered types of faul

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC\u2710 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

    Get PDF
    ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow\u27s challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability

    Design Space Exploration and Resource Management of Multi/Many-Core Systems

    Get PDF
    The increasing demand of processing a higher number of applications and related data on computing platforms has resulted in reliance on multi-/many-core chips as they facilitate parallel processing. However, there is a desire for these platforms to be energy-efficient and reliable, and they need to perform secure computations for the interest of the whole community. This book provides perspectives on the aforementioned aspects from leading researchers in terms of state-of-the-art contributions and upcoming trends

    Design Space Exploration for MPSoC Architectures

    Get PDF
    Multiprocessor system-on-chip (MPSoC) designs utilize the available technology and communication architectures to meet the requirements of the upcoming applications. In MPSoC, the communication platform is both the key enabler, as well as the key differentiator for realizing efficient MPSoCs. It provides product differentiation to meet a diverse, multi-dimensional set of design constraints, including performance, power, energy, reconfigurability, scalability, cost, reliability and time-to-market. The communication resources of a single interconnection platform cannot be fully utilized by all kind of applications, such as the availability of higher communication bandwidth for computation but not data intensive applications is often unfeasible in the practical implementation. This thesis aims to perform the architecture-level design space exploration towards efficient and scalable resource utilization for MPSoC communication architecture. In order to meet the performance requirements within the design constraints, careful selection of MPSoC communication platform, resource aware partitioning and mapping of the application play important role. To enhance the utilization of communication resources, variety of techniques such as resource sharing, multicast to avoid re-transmission of identical data, and adaptive routing can be used. For implementation, these techniques should be customized according to the platform architecture. To address the resource utilization of MPSoC communication platforms, variety of architectures with different design parameters and performance levels, namely Segmented bus (SegBus), Network-on-Chip (NoC) and Three-Dimensional NoC (3D-NoC), are selected. Average packet latency and power consumption are the evaluation parameters for the proposed techniques. In conventional computing architectures, fault on a component makes the connected fault-free components inoperative. Resource sharing approach can utilize the fault-free components to retain the system performance by reducing the impact of faults. Design space exploration also guides to narrow down the selection of MPSoC architecture, which can meet the performance requirements with design constraints.Siirretty Doriast

    Design and Programming Methods for Reconfigurable Multi-Core Architectures using a Network-on-Chip-Centric Approach

    Get PDF
    A current trend in the semiconductor industry is the use of Multi-Processor Systems-on-Chip (MPSoCs) for a wide variety of applications such as image processing, automotive, multimedia, and robotic systems. Most applications gain performance advantages by executing parallel tasks on multiple processors due to the inherent parallelism. Moreover, heterogeneous structures provide high performance/energy efficiency, since application-specific processing elements (PEs) can be exploited. The increasing number of heterogeneous PEs leads to challenging communication requirements. To overcome this challenge, Networks-on-Chip (NoCs) have emerged as scalable on-chip interconnect. Nevertheless, NoCs have to deal with many design parameters such as virtual channels, routing algorithms and buffering techniques to fulfill the system requirements. This thesis highly contributes to the state-of-the-art of FPGA-based MPSoCs and NoCs. In the following, the three major contributions are introduced. As a first major contribution, a novel router concept is presented that efficiently utilizes communication times by performing sequences of arithmetic operations on the data that is transferred. The internal input buffers of the routers are exchanged with processing units that are capable of executing operations. Two different architectures of such processing units are presented. The first architecture provides multiply and accumulate operations which are often used in signal processing applications. The second architecture introduced as Application-Specific Instruction Set Routers (ASIRs) contains a processing unit capable of executing any operation and hence, it is not limited to multiply and accumulate operations. An internal processing core located in ASIRs can be developed in C/C++ using high-level synthesis. The second major contribution comprises application and performance explorations of the novel router concept. Models that approximate the achievable speedup and the end-to-end latency of ASIRs are derived and discussed to show the benefits in terms of performance. Furthermore, two applications using an ASIR-based MPSoC are implemented and evaluated on a Xilinx Zynq SoC. The first application is an image processing algorithm consisting of a Sobel filter, an RGB-to-Grayscale conversion, and a threshold operation. The second application is a system that helps visually impaired people by navigating them through unknown indoor environments. A Light Detection and Ranging (LIDAR) sensor scans the environment, while Inertial Measurement Units (IMUs) measure the orientation of the user to generate an audio signal that makes the distance as well as the orientation of obstacles audible. This application consists of multiple parallel tasks that are mapped to an ASIR-based MPSoC. Both applications show the performance advantages of ASIRs compared to a conventional NoC-based MPSoC. Furthermore, dynamic partial reconfiguration in terms of relocation and security aspects are investigated. The third major contribution refers to development and programming methodologies of NoC-based MPSoCs. A software-defined approach is presented that combines the design and programming of heterogeneous MPSoCs. In addition, a Kahn-Process-Network (KPN) –based model is designed to describe parallel applications for MPSoCs using ASIRs. The KPN-based model is extended to support not only the mapping of tasks to NoC-based MPSoCs but also the mapping to ASIR-based MPSoCs. A static mapping methodology is presented that assigns tasks to ASIRs and processors for a given KPN-model. The impact of external hardware components such as sensors, actuators and accelerators connected to the processors is also discussed which makes the approach of high interest for embedded systems

    Real-time multi-domain optimization controller for multi-motor electric vehicles using automotive-suitable methods and heterogeneous embedded platforms

    Get PDF
    Los capítulos 2,3 y 7 están sujetos a confidencialidad por el autor. 145 p.In this Thesis, an elaborate control solution combining Machine Learning and Soft Computing techniques has been developed, targeting a chal lenging vehicle dynamics application aiming to optimize the torque distribution across the wheels with four independent electric motors.The technological context that has motivated this research brings together potential -and challenges- from multiple dom ains: new automotive powertrain topologies with increased degrees of freedom and controllability, which can be approached with innovative Machine Learning algorithm concepts, being implementable by exploiting the computational capacity of modern heterogeneous embedded platforms and automated toolchains. The complex relations among these three domains that enable the potential for great enhancements, do contrast with the fourth domain in this context: challenging constraints brought by industrial aspects and safe ty regulations. The innovative control architecture that has been conce ived combines Neural Networks as Virtual Sensor for unmeasurable forces , with a multi-objective optimization function driven by Fuzzy Logic , which defines priorities basing on the real -time driving situation. The fundamental principle is to enhance vehicle dynamics by implementing a Torque Vectoring controller that prevents wheel slip using the inputs provided by the Neural Network. Complementary optimization objectives are effici ency, thermal stress and smoothness. Safety -critical concerns are addressed through architectural and functional measures.Two main phases can be identified across the activities and milestones achieved in this work. In a first phase, a baseline Torque Vectoring controller was implemented on an embedded platform and -benefiting from a seamless transition using Hardware-in -the -Loop - it was integrated into a real Motor -in -Wheel vehicle for race track tests. Having validated the concept, framework, methodology and models, a second simulation-based phase proceeds to develop the more sophisticated controller, targeting a more capable vehicle, leading to the final solution of this work. Besides, this concept was further evolved to support a joint research work which lead to outstanding FPGA and GPU based embedded implementations of Neural Networks. Ultimately, the different building blocks that compose this work have shown results that have met or exceeded the expectations, both on technical and conceptual level. The highly non-linear multi-variable (and multi-objective) control problem was tackled. Neural Network estimations are accurate, performance metrics in general -and vehicle dynamics and efficiency in particular- are clearly improved, Fuzzy Logic and optimization behave as expected, and efficient embedded implementation is shown to be viable. Consequently, the proposed control concept -and the surrounding solutions and enablers- have proven their qualities in what respects to functionality, performance, implementability and industry suitability.The most relevant contributions to be highlighted are firstly each of the algorithms and functions that are implemented in the controller solutions and , ultimately, the whole control concept itself with the architectural approaches it involves. Besides multiple enablers which are exploitable for future work have been provided, as well as an illustrative insight into the intricacies of a vivid technological context, showcasing how they can be harmonized. Furthermore, multiple international activities in both academic and professional contexts -which have provided enrichment as well as acknowledgement, for this work-, have led to several publications, two high-impact journal papers and collateral work products of diverse nature

    Real-time multi-domain optimization controller for multi-motor electric vehicles using automotive-suitable methods and heterogeneous embedded platforms

    Get PDF
    Los capítulos 2,3 y 7 están sujetos a confidencialidad por el autor. 145 p.In this Thesis, an elaborate control solution combining Machine Learning and Soft Computing techniques has been developed, targeting a chal lenging vehicle dynamics application aiming to optimize the torque distribution across the wheels with four independent electric motors.The technological context that has motivated this research brings together potential -and challenges- from multiple dom ains: new automotive powertrain topologies with increased degrees of freedom and controllability, which can be approached with innovative Machine Learning algorithm concepts, being implementable by exploiting the computational capacity of modern heterogeneous embedded platforms and automated toolchains. The complex relations among these three domains that enable the potential for great enhancements, do contrast with the fourth domain in this context: challenging constraints brought by industrial aspects and safe ty regulations. The innovative control architecture that has been conce ived combines Neural Networks as Virtual Sensor for unmeasurable forces , with a multi-objective optimization function driven by Fuzzy Logic , which defines priorities basing on the real -time driving situation. The fundamental principle is to enhance vehicle dynamics by implementing a Torque Vectoring controller that prevents wheel slip using the inputs provided by the Neural Network. Complementary optimization objectives are effici ency, thermal stress and smoothness. Safety -critical concerns are addressed through architectural and functional measures.Two main phases can be identified across the activities and milestones achieved in this work. In a first phase, a baseline Torque Vectoring controller was implemented on an embedded platform and -benefiting from a seamless transition using Hardware-in -the -Loop - it was integrated into a real Motor -in -Wheel vehicle for race track tests. Having validated the concept, framework, methodology and models, a second simulation-based phase proceeds to develop the more sophisticated controller, targeting a more capable vehicle, leading to the final solution of this work. Besides, this concept was further evolved to support a joint research work which lead to outstanding FPGA and GPU based embedded implementations of Neural Networks. Ultimately, the different building blocks that compose this work have shown results that have met or exceeded the expectations, both on technical and conceptual level. The highly non-linear multi-variable (and multi-objective) control problem was tackled. Neural Network estimations are accurate, performance metrics in general -and vehicle dynamics and efficiency in particular- are clearly improved, Fuzzy Logic and optimization behave as expected, and efficient embedded implementation is shown to be viable. Consequently, the proposed control concept -and the surrounding solutions and enablers- have proven their qualities in what respects to functionality, performance, implementability and industry suitability.The most relevant contributions to be highlighted are firstly each of the algorithms and functions that are implemented in the controller solutions and , ultimately, the whole control concept itself with the architectural approaches it involves. Besides multiple enablers which are exploitable for future work have been provided, as well as an illustrative insight into the intricacies of a vivid technological context, showcasing how they can be harmonized. Furthermore, multiple international activities in both academic and professional contexts -which have provided enrichment as well as acknowledgement, for this work-, have led to several publications, two high-impact journal papers and collateral work products of diverse nature

    Parallel Architectures for Many-Core Systems-On-Chip in Deep Sub-Micron Technology

    Get PDF
    Despite the several issues faced in the past, the evolutionary trend of silicon has kept its constant pace. Today an ever increasing number of cores is integrated onto the same die. Unfortunately, the extraordinary performance achievable by the many-core paradigm is limited by several factors. Memory bandwidth limitation, combined with inefficient synchronization mechanisms, can severely overcome the potential computation capabilities. Moreover, the huge HW/SW design space requires accurate and flexible tools to perform architectural explorations and validation of design choices. In this thesis we focus on the aforementioned aspects: a flexible and accurate Virtual Platform has been developed, targeting a reference many-core architecture. Such tool has been used to perform architectural explorations, focusing on instruction caching architecture and hybrid HW/SW synchronization mechanism. Beside architectural implications, another issue of embedded systems is considered: energy efficiency. Near Threshold Computing is a key research area in the Ultra-Low-Power domain, as it promises a tenfold improvement in energy efficiency compared to super-threshold operation and it mitigates thermal bottlenecks. The physical implications of modern deep sub-micron technology are severely limiting performance and reliability of modern designs. Reliability becomes a major obstacle when operating in NTC, especially memory operation becomes unreliable and can compromise system correctness. In the present work a novel hybrid memory architecture is devised to overcome reliability issues and at the same time improve energy efficiency by means of aggressive voltage scaling when allowed by workload requirements. Variability is another great drawback of near-threshold operation. The greatly increased sensitivity to threshold voltage variations in today a major concern for electronic devices. We introduce a variation-tolerant extension of the baseline many-core architecture. By means of micro-architectural knobs and a lightweight runtime control unit, the baseline architecture becomes dynamically tolerant to variations
    • …
    corecore