136 research outputs found

    221202

    Get PDF
    This PhD dissertation has resulted in six publications in which one of the papers received Best Paper Award at ICESS 2021. All the papers were published in reputed venues for real-time systems research, i.e., RTSS 2020, RTNS 2021, ICESS 2021, RTCSA 2022, RTSS 2022, Elsevier’s Journal of System Architecture. Two more papers are expected to be published soon.Multicore platforms share the hardware resources such as caches, interconnects, and main memory among all the cores. Due to such sharing, tasks running on different cores compete to access these shared resources which can potentially result in shared resource contention. This shared resource contention can increase the execution times of tasks in a non-deterministic manner. Consequently, the shared resource contention is problematic for hard real-time systems, i.e., systems that run tasks with stringent timing requirements. To address this issue, this PhD dissertation builds novel solutions to model and analyze the shared resource contention that can be suffered by tasks executing on a multicore system. The shared resource contention aware schedulability analysis is then derived by integrating the maximum shared resource contention that can be suffered by the tasks.This work was supported by the CISTER Research Unit (UIDP/UIDB/04234/2020), financed by National Funds through FCT/MCTES (Portuguese Foundation for Science and Technology); by project ADACORSA (ECSEL/0010/2019 - JU grant nr. 876019) financed through National Funds from FCT and European funds through the EU ECSEL JU. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and Austria, Sweden, Spain, Italy, France, Portugal, Ireland, Finland, Slovenia, Poland, Netherlands, Turkey - Disclaimer: This document reflects only the author’s view and the Commission is not responsible for any use that may be made of the information it contains. This work is also a result of the work developed under project Aero.Next Portugal (nº C645727867-00000066) and FLY-PT (grant nº 46079, POCI-01-0247-FEDER-046079), also funded by FCT under PhD grant 2020.09532.BD.info:eu-repo/semantics/publishedVersio

    A survey of techniques for reducing interference in real-time applications on multicore platforms

    Get PDF
    This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorizes proposals from the perspective of the shared resource. It covers techniques for reducing contentions in main memory, cache memory, a memory bus, and the integration of interference effects into schedulability analysis. Every section contains an overview of each proposal and an assessment of its advantages and disadvantages.This work was supported in part by the Comunidad de Madrid Government "Nuevas TĂ©cnicas de Desarrollo de Software de Tiempo Real Embarcado Para Plataformas. MPSoC de PrĂłxima GeneraciĂłn" under Grant IND2019/TIC-17261

    Scalability of broadcast performance in wireless network-on-chip

    Get PDF
    Networks-on-Chip (NoCs) are currently the paradigm of choice to interconnect the cores of a chip multiprocessor. However, conventional NoCs may not suffice to fulfill the on-chip communication requirements of processors with hundreds or thousands of cores. The main reason is that the performance of such networks drops as the number of cores grows, especially in the presence of multicast and broadcast traffic. This not only limits the scalability of current multiprocessor architectures, but also sets a performance wall that prevents the development of architectures that generate moderate-to-high levels of multicast. In this paper, a Wireless Network-on-Chip (WNoC) where all cores share a single broadband channel is presented. Such design is conceived to provide low latency and ordered delivery for multicast/broadcast traffic, in an attempt to complement a wireline NoC that will transport the rest of communication flows. To assess the feasibility of this approach, the network performance of WNoC is analyzed as a function of the system size and the channel capacity, and then compared to that of wireline NoCs with embedded multicast support. Based on this evaluation, preliminary results on the potential performance of the proposed hybrid scheme are provided, together with guidelines for the design of MAC protocols for WNoC.Peer ReviewedPostprint (published version

    211004

    Get PDF
    Today multicore processors are used in most modern systems that require computational logic. However, their applicability in systems with stringent timing requirements is still an ongoing research. This is due to the difficulty of ensuring the timing correctness of tasks executing on a multicore platform that comprises a number of shared hardware resources, e.g., caches, memory bus and the main memory. Concurrent accesses to any of these shared resources can generate uncontrolled interference, which complicates the estimations of tasks' worst-case execution time (WCET) and the worst-case response time (WCRT). The use of the 3-phase task execution model helps in upper bounding the contention due to the sharing of bus/main memory in multicore systems. It divides the execution of tasks into distinct memory and execution phases, where tasks can only access the bus/main memory during their memory phases. This makes bus/memory access patterns of tasks more predictable, enabling a preciser computation of bus/memory contention. In this work, we show how the bus contention can be computed for the 3-phase task model considering a work-conserving, i.e., round-robin (RR) based, arbitration policy at the memory bus. This is different from existing works that analyze the time-division multiple access (TDMA) and first-come-first-serve (FCFS) based bus arbitration policies. First, we present a solution to model the bus contention that can be suffered/caused by tasks executing on the same/remote cores of a multicore system under an RR-based bus arbitration scheme. We then evaluate the impact of resulting bus contention on taskset schedulability. Experimental results show that our proposed RR-based bus contention analysis can improve taskset schedulability by up to 100 percentage points than the TDMA-based analysis and up to 40 percentage points than the FCFS-based bus contention analysis.This work was partially supported by National Funds through FCT/MCTES (Portuguese Foundation for Science and Technology), within the CISTER Research Unit (UIDB-UIDP/04234/2020); also by the Operational Competitiveness Programme and Internationalization (COMPETE 2020) under the PT2020 Partnership Agreement, through the European Regional Development Fund (ERDF), and by national funds through the FCT, within project POCI-01-0145-FEDER-029119 (PREFECT); also by the European Union’s Horizon 2020 - The EU Framework Programme for Research and Innovation 2014-2020, under grant agreement No. 732505. Project “TEC4Growth - Pervasive Intelligence, Enhancers and Proofs of Concept with Industrial Impact/NORTE-01-0145-FEDER000020” financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement; also by FCT, under PhD grant 2020.09532.BD.info:eu-repo/semantics/publishedVersio

    220801

    Get PDF
    Multicore platforms are being increasingly adopted in Cyber-Physical Systems (CPS) due to their advantages over single-core processors, such as raw computing power and energy efficiency. Typically, multicore platforms use a shared memory bus that connects the cores to the off-chip main memory. This sharing of memory bus may cause tasks running on different cores to compete for access to the main memory whenever data/instructions are need to be read/written from/to the main memory. Such competition is problematic, as it may cause variations in the execution time of tasks in a non-deterministic way. To reduce the complexity of analysing this problem, the 3-phase task model was proposed that divides tasks' executions into distinct memory and execution phases. The distinctive memory phases are then scheduled to eliminate/minimize main memory contention between concurrently executing tasks. However, 3-phase tasks running on different cores may still compete to access the shared memory bus/main memory in order to execute memory phases. This paper presents a partitioned scheduling-based approach that allows one to derive memory bus contention-aware worst-case response time of tasks that follow the 3-phase task model. In particular, the bus-contention analysis is derived by considering two memory access models, i.e., (i) dedicated memory access model, where a core having allowed to access the main memory via memory bus is permitted to execute more than one memory phase, and (ii) fair memory access model, that restrict each core to execute only one memory phase in its allocated bus access. Both these models represent different system and application requirements, and the resulting bus contention of tasks may vary depending on the considered model. To evaluate the effectiveness of the proposed bus contention analysis, we compare its performance against an existing analysis in the state-of-the-art by performing (i) case-study experiments, using benchmarks from the Mälardalen Benchmark suite, and (ii) empirical evaluation using synthetic task sets. Results show that our proposed analysis can improve task set schedulability of 3-phase tasks by up to 88 percentage points.This work was partially supported by European Union’s Horizon 2020 -The EU Framework Programme for Research and Innovation 2014-2020, under grant agreement No. 732505. Project “TEC4Growth - Pervasive Intelligence, Enhancers and Proofs of Concept with Industrial Impact/NORTE-01-0145-FEDER000020” 845 financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement; also by National Funds through FCT/MCTES (Portuguese Foundation for Science and Technology), within the CISTER Research Unit (UIDP/UIDB/04234/2020); by FCT and the Portuguese National Innovation Agency (ANI), under the CMU Portugal partnership, through the European Regional Development Fund (ERDF) of the Operational Competitiveness Programme and Internationaliza850 tion (COMPETE 2020), under the PT2020 Partnership Agreement, within project FLOYD (POCI-01-0247- FEDER-045912), also by FCT under PhD grant 2020.09532.BD.info:eu-repo/semantics/publishedVersio

    A Survey on Cache Management Mechanisms for Real-Time Embedded Systems

    Get PDF
    © ACM, 2015. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Computing Surveys, {48, 2, (November 2015)} http://doi.acm.org/10.1145/2830555Multicore processors are being extensively used by real-time systems, mainly because of their demand for increased computing power. However, multicore processors have shared resources that affect the predictability of real-time systems, which is the key to correctly estimate the worst-case execution time of tasks. One of the main factors for unpredictability in a multicore processor is the cache memory hierarchy. Recently, many research works have proposed different techniques to deal with caches in multicore processors in the context of real-time systems. Nevertheless, a review and categorization of these techniques is still an open topic and would be very useful for the real-time community. In this article, we present a survey of cache management techniques for real-time embedded systems, from the first studies of the field in 1990 up to the latest research published in 2014. We categorize the main research works and provide a detailed comparison in terms of similarities and differences. We also identify key challenges and discuss future research directions.King Saud University NSER

    Timing Predictability in Future Multi-Core Avionics Systems

    Full text link

    SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures

    Get PDF
    Near-Data-Processing (NDP) architectures present a promising way to alleviate data movement costs and can provide significant performance and energy benefits to parallel applications. Typically, NDP architectures support several NDP units, each including multiple simple cores placed close to memory. To fully leverage the benefits of NDP and achieve high performance for parallel workloads, efficient synchronization among the NDP cores of a system is necessary. However, supporting synchronization in many NDP systems is challenging because they lack shared caches and hardware cache coherence support, which are commonly used for synchronization in multicore systems, and communication across different NDP units can be expensive. This paper comprehensively examines the synchronization problem in NDP systems, and proposes SynCron, an end-to-end synchronization solution for NDP systems. SynCron adds low-cost hardware support near memory for synchronization acceleration, and avoids the need for hardware cache coherence support. SynCron has three components: 1) a specialized cache memory structure to avoid memory accesses for synchronization and minimize latency overheads, 2) a hierarchical message-passing communication protocol to minimize expensive communication across NDP units of the system, and 3) a hardware-only overflow management scheme to avoid performance degradation when hardware resources for synchronization tracking are exceeded. We evaluate SynCron using a variety of parallel workloads, covering various contention scenarios. SynCron improves performance by 1.27Ă—\times on average (up to 1.78Ă—\times) under high-contention scenarios, and by 1.35Ă—\times on average (up to 2.29Ă—\times) under low-contention real applications, compared to state-of-the-art approaches. SynCron reduces system energy consumption by 2.08Ă—\times on average (up to 4.25Ă—\times).Comment: To appear in the 27th IEEE International Symposium on High-Performance Computer Architecture (HPCA-27
    • …
    corecore