711 research outputs found

    Buffer-aware Worst Case Timing Analysis of Wormhole Network On Chip

    Get PDF
    A buffer-aware worst-case timing analysis of wormhole NoC is proposed in this paper to integrate the impact of buffer size on the different dependencies relationship between flows, i.e. direct and indirect blocking flows, and consequently the timing performance. First, more accurate definitions of direct and indirect blocking flows sets have been introduced to take into account the buffer size impact. Then, the modeling and worst-case timing analysis of wormhole NoC have been detailed, based on Network Calculus formalism and the newly defined blocking flows sets. This introduced approach has been illustrated in the case of a realistic NoC case study to show the trade off between latency and buffer size. The comparative analysis of our proposed Buffer-aware timing analysis with conventional approaches is conducted and noticeable enhancements in terms of maximum latency have been proved

    Comparaison de strategies de calcul de bornes sur NoC

    Get PDF
    The Kalray MPPA2-256 processor integrates 256 processing cores and 32 management cores on a chip. Theses cores are grouped into clusters, and clusters are connected by a high-performance network on chip (NoC). This NoC provides some hardware mechanisms (egress traffic limiters) that can be configured to offer bounded latencies. This paper presents how network calculus can be used to bound these latencies while computing the routes of data flows, using linear programming. Then, its shows how other approaches can also be used and adapted to analyze this NoC. Their performances are then compared on three case studies: two small coming from previous studies, and one realistic with 128 or 256 flows. On theses cases studies, it shows that modeling the shaping introduced by links is of major importance to get accurate bounds. And when packets are of constant size, the Total Flow Analysis gives, on average, bounds 20%-25% smaller than all other methods

    Least Upper Delay Bound for VBR Flows in Networks-on- Chip with Virtual Channels

    Get PDF
    Real-time applications such as multimedia and gaming require stringent performance guarantees, usually enforced by a tight upper bound on the maximum end-to-end delay. For FIFO multiplexed on-chip packet switched networks we consider worst-case delay bounds for Variable Bit-Rate (VBR) flows with aggregate scheduling, which schedules multiple flows as an aggregate flow. VBR Flows are characterized by a maximum transfer size, peak rate, burstiness, and average sustainable rate. Based on network calculus, we present and prove theorems to derive per-flow end-to-end Equivalent Service Curves (ESC) which are in turn used for computing Least Upper Delay Bounds (LUDBs) of individual flows. In a realistic case study we find that the end-to-end delay bound is up to 46.9% more accurate than the case without considering the traffic peak behavior. Likewise, results also show similar improvements for synthetic traffic patterns. The proposed methodology is implemented in C++ and has low run-time complexity, enabling quick evaluation for large and complex SoCs

    Buffer-Aware Worst-Case Timing Analysis of Wormhole NoCs Using Network Calculus

    Get PDF
    Abstract—Conducting worst-case timing analyses for wormhole Networks-on-chip (NoCs) is a fundamental aspect to guarantee real-time requirements, but it is known to be a challenging issue due to complex congestion patterns that can occur. In that respect, we introduce in this paper a new buffer-aware timing analysis of wormhole NoCs based on Network Calculus. Our main idea consists in considering the flows serialization phenomena along the path of a flow of interest (f.o.i), by paying the bursts of interfering flows only at the first convergence point, and refining the interference patterns for the f.o.i accounting for the limited buffer size. Moreover, we aim to handle such an issue for wormhole NoCs, implementing a fixed priority-preemptive arbitration of Virtual Channels (VCs), that can be assigned to an arbitrary number of traffic classes with different priority levels, i.e. VC sharing, and each traffic class may contain an arbitrary number of flows, i.e. priority sharing. It is worth noting that such characteristics cover a large panel of wormhole NoCs. The derived delay bounds are analyzed and compared to available results of existing approaches, based on Scheduling Theory as well as Compositional Performance Analysis (CPA). In doing this, we highlight a noticeable enhancement of the delay bounds tightness in comparison to CPA approach, and the inherent safe bounds of our proposal in comparison to Scheduling Theory approaches. Finally, we perform experiments on a manycore platform, to confront our timing analysis predictions to experimental data and assess its tightness

    Beyond the Accuracy-Complexity Tradeoffs of Compositional Analyses using Network Calculus for Complex Networks

    Get PDF
    Achieving the accuracy-complexity tradeoffs for compositional timing analyses using Network Calculus is still a hot research topic. In this specific area, we propose in this paper an improved version of the Total Flow Analysis (TFA) algorithm, called TFA++, when taking into account the impact of the finite transmission capacity of the network links on the input and output traffic models at each network node. First, we review the existing analysis algorithms by identifying their main limitations in terms of accuracy and complexity, through a simple but representative network example. Afterwards, we define the TFA++ algorithm and we detail the main steps of the followed methodology to compute the delay upper bounds. Moreover, we conduct comparative analyses of the derived delay bounds and analysis times with the different algorithms, with respect to the network size and load. In doing this, we highlight noticeable enhancements of both metrics under TFA++, in comparison to the existing algorithms; thus the high accuracy and low complexity of TFA++. Finally, this statement has been asserted through a representative avionics case

    Performance Evaluation and Design Tradeoffs of On-Chip Interconnect Architectures

    Get PDF
    Network-on-Chip (NoC) has been proposed as an alternative to bus-based schemes to achieve high performance and scalability in System-on-Chip (SoC) design. Performance analysis and evaluation of on-chip interconnect architectures are widely based on simulations, which become computationally expensive, especially for large-scale NoCs. In this paper, a Network Calculusbased methodology is presented to analyze and evaluate the performance and cost metrics, such as latency and energy consumption. The 2D Mesh, Spidergong and WK-recursive on-chip interconnect architectures are analyzed using this methodology and results are compared with those produced using simulations. The values obtained by simulations and by analysis show similar trends in the same order of magnitude. Furthermore, WK outperforms the other on-chip interconnects in all considered metric

    Analytical approaches for performance evaluation of networks-on-chip

    Get PDF
    This tutorial reviews four popular mathematical formalisms – dataflow analysis, schedulability analysis, network calculus, and queueing theory – and how they have been applied to the analysis of Network-on-Chip (NoC) performance. We review the basic concepts and results of each formalism and provide examples of how they have been used in on-chip communication performance analysis. The tutorial also discusses the respective strengths and weaknesses of each formalism, their suitability for a specific purpose, and the attempts that have been made to bridge these analytical approaches. Finally, we conclude the tutorial by discussing open research issues
    • …
    corecore