
    Predictable and composable system-on-chip memory controllers

    Contemporary Systems-on-Chip (SoCs) become increasingly complex, as growing integration results in a larger number of concurrently executing applications. These applications consist of tasks that are mapped on heterogeneous multi-processor platforms with distributed memory hierarchies, where SRAMs and SDRAMs are shared using a variety of arbiters. Some applications have real-time requirements, meaning that they must perform a particular computation before a deadline to guarantee functional correctness, or to prevent quality degradation. Mapping the applications on the platform such that all real-time requirements are satisfied is very challenging. The number of possible mappings of tasks to processing elements and of data structures to memories may be large, and appropriate configuration settings must be determined once the mapping is chosen. Verifying that a particular mapping satisfies all application requirements is typically done by system-level simulation. However, resource sharing causes interference between applications, making their temporal behaviors inter-dependent. All concurrently executing applications must hence be verified together, causing the verification complexity of the system to increase exponentially with the number of applications. Together, these factors make the integration and verification process a dominant part of SoC development, both in terms of time and money.

    Predictable and composable systems are proposed to manage the increasing verification complexity. Predictable systems provide lower bounds on application performance, while applications in composable systems are completely isolated and cannot affect each other's temporal behavior by even a single clock cycle. Predictable systems enable formal verification that covers all possible interactions with the platform. However, this assumes that the behavior of an application is captured in a performance model, which is not the case for many applications. Composability offers a complementary verification approach by letting these applications be verified independently by simulation with linear verification complexity.

    A limitation of current predictable and composable systems is that no memory controllers support these concepts in a general way. Current SRAM controllers can be shared in a predictable way with a variety of arbiters, but are only composable if statically scheduled or shared using time-division multiplexing. Existing SDRAM controllers are not composable, and are either unpredictable or limited to applications that are statically scheduled.

    This thesis addresses the limitations of current predictable and composable systems by proposing a general predictable and composable memory controller, thereby addressing the mapping and verification problem in embedded systems. The proposed memory controller is divided into a front-end and a back-end. The back-end is specific to DDR2/DDR3 SDRAM and makes the memory behave in a predictable manner using precomputed memory patterns that are dynamically combined at run time. The front-end contains buffering and an arbiter in the class of Latency-Rate (LR) servers, a class that contains many well-known predictable arbiters. We extend this class with a Credit-Controlled Static-Priority (CCSP) arbiter, developed specifically for shared resources with latency-critical requestors and high loads, such as memories. Three key features of CCSP are: 1) it accommodates latency-critical requestors with low bandwidth requirements without wasting bandwidth; 2) over-allocated bandwidth can be made negligible at an increased area cost, without affecting latency; 3) it has a small implementation that runs fast enough to keep up with most DDR2/DDR3 memories. The proposed front-end is general and can be used with other predictable resources, such as SRAM controllers. The proposed memory controller hence supports multiple arbiter and memory types, addressing the diversity in modern SoCs.

    The combination of front-end and predictable memory behaves like an LR server, which is the shared-resource abstraction used in this work. In essence, an LR server guarantees a requestor a minimum bandwidth and a maximum latency, enabling formal verification of real-time requirements. The LR server model is compatible with several commonly used formal analysis frameworks, such as network calculus and data-flow analysis. Our memory controller hence allows any combination of predictable memory and LR arbiter to be used transparently for formal verification of applications with any of these frameworks.

    Sharing a predictable memory at run time results in interference between requestors, making the memory controller non-composable. This is addressed by adding a Delay Block to the front-end that delays all signals sent from the front-end to a requestor to always emulate worst-case interference. This makes requestors unable to affect each other's temporal behavior, which is sufficient to guarantee composability at the level of applications. Our predictable memory controller hence offers composable service with a variety of memory and arbiter types, which widely extends the scope of composable platforms. Another benefit of this approach is that composable service can be dynamically enabled and disabled, allowing requestors that do not require composable service to use slack bandwidth to improve performance.

    The predictable and composable memory controller is supported by a configuration flow that automatically computes memory patterns and arbiter settings to satisfy given bandwidth and latency requirements. The flow uses abstraction to separate the configuration of the memory and the arbiter, enabling settings to be computed in a streamlined fashion for all supported memories and arbiters.
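
    The LR guarantee can be stated in one line. As a hedged illustration (notation follows the general latency-rate literature, not necessarily the thesis itself), an LR server configured with allocated rate \rho and service latency \Theta guarantees a requestor that has been continuously backlogged for t cycles at least

        \check{w}(t) \;\ge\; \max\bigl(0,\ \rho \cdot (t - \Theta)\bigr)

    units of service, which is precisely the minimum-bandwidth/maximum-latency abstraction used for formal verification above. The Delay Block can be pictured along the same lines: it releases every response at its worst-case time under this guarantee, so actual interference from other requestors becomes unobservable. The C++ sketch below is illustrative only; the class and member names are invented here, and the finishing-time recurrence is the standard LR bound rather than the exact hardware logic.

        #include <algorithm>
        #include <cstdint>
        #include <queue>

        // Illustrative composable Delay Block: responses leave the front-end at
        // their worst-case time under the LR guarantee (rate rho = num/den words
        // per cycle, service latency theta cycles), hiding actual interference.
        class DelayBlock {
        public:
          DelayBlock(uint64_t rate_num, uint64_t rate_den, uint64_t theta)
              : num_(rate_num), den_(rate_den), theta_(theta) {}

          // Request of `size` words arriving at cycle `now`: stamp its response
          // with l(k) = max(a(k) + theta, l(k-1)) + ceil(size / rho).
          void on_request(uint64_t now, uint64_t size) {
            uint64_t start = std::max(now + theta_, last_finish_);
            last_finish_ = start + (size * den_ + num_ - 1) / num_;
            release_at_.push(last_finish_);
          }

          // A response computed by the memory may be forwarded to the requestor
          // only once its precomputed worst-case release time has been reached.
          bool may_release(uint64_t now) {
            if (release_at_.empty() || release_at_.front() > now) return false;
            release_at_.pop();
            return true;
          }

        private:
          uint64_t num_, den_, theta_;
          uint64_t last_finish_ = 0;         // worst-case finish of previous request
          std::queue<uint64_t> release_at_;  // FIFO of worst-case release times
        };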

    Classification and analysis of predictable memory patterns

    The verification complexity of real-time requirements in embedded systems grows exponentially with the number of applications, as resource sharing prevents independent verification using simulation-based approaches. Formal verification is a promising alternative, although its applicability is limited to systems with predictable hardware and software. SDRAM memories are common examples of essential hardware components with unpredictable timing behavior, typically preventing the use of formal approaches. A predictable SDRAM controller has been proposed that provides guarantees on bandwidth and latency by dynamically scheduling memory patterns, which are statically computed sequences of SDRAM commands. However, the proposed patterns become increasingly inefficient as memories become faster, making them unsuitable for DDR3 SDRAM. This paper extends the memory pattern concept in two ways. Firstly, we introduce a burst count parameter that enables patterns to have multiple SDRAM bursts per bank, which is required for DDR3 memories to be used efficiently. Secondly, we present a classification of memory pattern sets into four categories, based on the combination of patterns that causes worst-case bandwidth and latency to be provided. Bounds on bandwidth and latency are derived that apply to all pattern types and burst counts, as opposed to the single case covered by earlier work. Experimental results show that these extensions are required to support the most efficient pattern sets for many use-cases. We also demonstrate that the burst count parameter increases efficiency in the presence of large requests and enables a wider range of real-time requirements to be satisfied.
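
    To make the efficiency argument concrete, the gross efficiency of an access pattern can be estimated directly from its geometry. The following C++ fragment is a simplification under assumed parameters; it ignores refresh and read/write switching overhead, which the paper's full analysis does account for.

        // Fraction of a pattern's cycles that transfer data. On a double-data-rate
        // bus, a burst of length BL occupies BL/2 clock cycles.
        double gross_efficiency(int banks, int burst_count, int burst_length,
                                int pattern_cycles) {
          int data_cycles = banks * burst_count * burst_length / 2;
          return static_cast<double>(data_cycles) / pattern_cycles;
        }

    For instance, assuming 8 banks, BL = 8 and one burst per bank (BC = 1), a hypothetical 60-cycle read pattern transfers data during 32 cycles, roughly 53% gross efficiency; raising the burst count to BC = 2 doubles the data cycles while lengthening the pattern by much less, which is the intuition behind the burst count parameter for fast DDR3 devices.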

    A reconfigurable real-time SDRAM controller for mixed time-criticality systems

    Verifying real-time requirements of applications is increasingly complex on modern Systems-on-Chips (SoCs). More applications are integrated into one system due to power, area and cost constraints. Resource sharing makes their timing behavior interdependent, and as a result the verification complexity increases exponentially with the number of applications. Predictable and composable virtual platforms solve this problem by enabling verification in isolation, but designing SoC resources suitable to host such platforms is challenging. This paper focuses on a reconfigurable SDRAM controller for predictable and composable virtual platforms. The main contributions are: 1) a run-time reconfigurable SDRAM controller architecture, which allows trade-offs between guaranteed bandwidth, response time and power; 2) a methodology for offering composable service to memory clients, by means of composable memory patterns; 3) a reconfigurable Time-Division Multiplexing (TDM) arbiter and an associated reconfiguration protocol. The TDM slot allocations can be changed at run time, while the predictable and composable performance guarantees offered to active memory clients are unaffected by the reconfiguration. The SDRAM controller has been implemented as a TLM-level SystemC model, and in synthesizable VHDL for use on an FPGA.
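
    One way to realize such a reconfigurable TDM arbiter is to double-buffer the slot table: updates are staged in a shadow copy that is swapped in atomically at a frame boundary, so slots granted under the old allocation are never cut short. The sketch below illustrates this idea; it is not the paper's exact reconfiguration protocol, and all names are invented here.

        #include <array>
        #include <cstddef>

        // Illustrative run-time reconfigurable TDM arbiter with a double-buffered
        // slot table. Client id -1 marks an unallocated (idle) slot.
        template <std::size_t kSlots>
        class TdmArbiter {
        public:
          TdmArbiter() { active_.fill(-1); shadow_.fill(-1); }

          // Stage a new slot allocation; it takes effect at the next frame start.
          void stage_allocation(const std::array<int, kSlots>& table) {
            shadow_ = table;
            swap_pending_ = true;
          }

          // Called once per slot; returns the client that owns the current slot.
          int next_grant() {
            if (slot_ == 0 && swap_pending_) {  // frame boundary: safe swap point
              active_ = shadow_;
              swap_pending_ = false;
            }
            int grant = active_[slot_];
            slot_ = (slot_ + 1) % kSlots;
            return grant;
          }

        private:
          std::array<int, kSlots> active_{};
          std::array<int, kSlots> shadow_{};
          bool swap_pending_ = false;
          std::size_t slot_ = 0;
        };

    Because allocations only ever change at frame granularity, a client whose slots are unchanged observes exactly the same grant times before, during and after reconfiguration, which is the property the composability guarantee requires.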

    Composability and Predictability for Independent Application Development, Verification and Execution

    System-on-Chip (SoC) design gets increasingly complex, as a growing number of applications are integrated into modern systems. Some of these applications have real-time requirements, such as a minimum throughput or a maximum latency. To reduce cost, system resources are shared between applications, making their timing behavior inter-dependent. Real-time requirements must hence be verified for all possible combinations of concurrently executing applications, which is not feasible with commonly used simulation-based techniques. This chapter addresses this problem using two complexity-reducing concepts: composability and predictability. Applications in a composable system are completely isolated and cannot affect each other's behaviors, enabling them to be verified independently. Predictable systems, on the other hand, provide lower bounds on performance, allowing applications to be verified using formal performance analysis. Five techniques to achieve composability and/or predictability in SoC resources are presented, and we explain their implementation for processors, interconnect, and memories in our platform.

    Dynamic Command Scheduling for Real-Time Memory Controllers

    Memory controller design is challenging, as real-time embedded systems feature an increasing diversity of real-time and non-real-time applications with variable transaction sizes. To satisfy the requirements of the applications, tight bounds on the worst-case execution time (WCET) of memory transactions must be provided to real-time applications, while the lowest possible average execution time must be given to the rest. Existing real-time memory controllers cannot efficiently achieve this goal, as they either bound the WCET by sacrificing the average execution time, or are not scalable to directly support variable transaction sizes, or both. In this paper, we propose to use dynamic command scheduling, which is capable of efficiently dealing with transactions of variable sizes. The three main contributions of this paper are: 1) a back-end architecture for a real-time memory controller with a dynamic command scheduling algorithm, 2) a formalization of the timings of the memory transactions for the proposed architecture and algorithm, and 3) two techniques to bound the WCET of transactions with fixed and variable sizes, respectively. We experimentally evaluate the proposed memory controller and compare both the worst-case and average-case execution times of transactions to a state-of-the-art semi-static approach. The results demonstrate that dynamic command scheduling outperforms the semi-static approach by 33.4% in the average case and performs at least equally well in the worst case. We also show that the WCET bounds are tight for transactions with both fixed and variable sizes.
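
    The heart of dynamic command scheduling is a per-cycle legality check: a command may be issued only when all JEDEC timing constraints relative to previously issued commands have elapsed. The fragment below sketches this check for a read command; the timing names follow JEDEC conventions, but the structure and the chosen constraint subset are illustrative assumptions rather than the paper's exact algorithm.

        #include <algorithm>
        #include <cstdint>

        struct BankState {
          uint64_t last_act = 0;  // cycle of the most recent ACT to this bank
          bool row_open = false;  // whether a row is currently open
        };

        struct Timing {           // a small subset of JEDEC timings, in cycles
          uint64_t tRCD;          // ACT-to-RD delay, same bank
          uint64_t tCCD;          // RD-to-RD delay, any bank
        };

        // Earliest cycle >= now at which a RD to `bank` may legally be issued,
        // given the last ACT to that bank and the last RD to any bank.
        uint64_t earliest_read(const BankState& bank, uint64_t last_rd_any_bank,
                               uint64_t now, const Timing& t) {
          return std::max({now, bank.last_act + t.tRCD,
                           last_rd_any_bank + t.tCCD});
        }

    A dynamic scheduler issues each transaction's ACT and RD/WR commands as soon as such checks pass instead of at precomputed slots, which is intuitively where the reported average-case gain over the semi-static approach comes from; the paper's WCET analysis then bounds how late these checks can resolve under worst-case interleaving.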

    Combination of searches for Higgs boson pairs in pp collisions at √s = 13 TeV with the ATLAS detector

    This letter presents a combination of searches for Higgs boson pair production using up to 36.1 fb⁻¹ of proton-proton collision data at a centre-of-mass energy √s = 13 TeV recorded with the ATLAS detector at the LHC. The combination is performed using six analyses searching for Higgs boson pairs decaying into the bb̄bb̄, bb̄W⁺W⁻, bb̄τ⁺τ⁻, W⁺W⁻W⁺W⁻, bb̄γγ and W⁺W⁻γγ final states. Results are presented for non-resonant and resonant Higgs boson pair production modes. No statistically significant excess in data above the Standard Model predictions is found. The combined observed (expected) limit at 95% confidence level on the non-resonant Higgs boson pair production cross-section is 6.9 (10) times the predicted Standard Model cross-section. Limits are also set on the ratio (κ_λ) of the Higgs boson self-coupling to its Standard Model value. This ratio is constrained at 95% confidence level in observation (expectation) to −5.0 < κ_λ < 12.0 (−5.8 < κ_λ < 12.0). In addition, limits are set on the production of narrow scalar resonances and spin-2 Kaluza-Klein Randall-Sundrum gravitons. Exclusion regions are also provided in the parameter space of the hMSSM (habemus Minimal Supersymmetric Standard Model) and the Electroweak Singlet Model. For the complete list of authors see http://dx.doi.org/10.1016/j.physletb.2019.135103.

    Prompt and non-prompt J/ψ elliptic flow in Pb+Pb collisions at √s_NN = 5.02 TeV with the ATLAS detector

    The elliptic flow of prompt and non-prompt J/ψ was measured in the dimuon decay channel in Pb+Pb collisions at √s_NN = 5.02 TeV with an integrated luminosity of 0.42 nb⁻¹ with the ATLAS detector at the LHC. The prompt and non-prompt signals are separated using a two-dimensional simultaneous fit of the invariant mass and pseudo-proper decay time of the dimuon system from the J/ψ decay. The measurement is performed in the kinematic range of dimuon transverse momentum and rapidity 9 < pT < 30 GeV, |y| < 2, and 0–60% collision centrality. The elliptic flow coefficient, v₂, is evaluated relative to the event plane, and the results are presented as a function of transverse momentum, rapidity and centrality. It is found that prompt and non-prompt J/ψ mesons have non-zero elliptic flow. Prompt J/ψ v₂ decreases as a function of pT, while for non-prompt J/ψ it is, with limited statistical significance, consistent with a flat behaviour over the studied kinematic region. There is no observed dependence on rapidity or centrality.

    A measurement of material in the ATLAS tracker using secondary hadronic interactions in 7 TeV pp collisions

    Knowledge of the material in the ATLAS inner tracking detector is crucial to understanding the reconstruction of charged-particle tracks and the performance of algorithms that identify jets containing b-hadrons, and is also essential to reduce background in searches for exotic particles that can decay within the inner detector volume. Interactions of primary hadrons produced in pp collisions with the material in the inner detector are used to map the location and amount of this material. The hadronic interactions of primary particles may result in secondary vertices, which in this analysis are reconstructed by an inclusive vertex-finding algorithm. Data were collected using minimum-bias triggers by the ATLAS detector operating at the LHC during 2010 at a centre-of-mass energy √s = 7 TeV, and correspond to an integrated luminosity of 19 nb⁻¹. Kinematic properties of these secondary vertices are used to study the validity of the modelling of hadronic interactions in simulation. Secondary-vertex yields are compared between data and simulation over a volume of about 0.7 m³ around the interaction point, and agreement is found within overall uncertainties.

    Search for squarks and gluinos in final states with hadronically decaying τ-leptons, jets, and missing transverse momentum using pp collisions at √s = 13 TeV with the ATLAS detector

    A search for supersymmetry in events with large missing transverse momentum, jets, and at least one hadronically decaying τ-lepton is presented. Two exclusive final states with either exactly one or at least two τ-leptons are considered. The analysis is based on proton-proton collisions at √s = 13 TeV corresponding to an integrated luminosity of 36.1 fb⁻¹ delivered by the Large Hadron Collider and recorded by the ATLAS detector in 2015 and 2016. No significant excess is observed over the Standard Model expectation. At 95% confidence level, model-independent upper limits on the cross section are set and exclusion limits are provided for two signal scenarios: a simplified model of gluino pair production with τ-rich cascade decays, and a model with gauge-mediated supersymmetry breaking (GMSB). In the simplified model, gluino masses up to 2000 GeV are excluded for low values of the mass of the lightest supersymmetric particle (LSP), while LSP masses up to 1000 GeV are excluded for gluino masses around 1400 GeV. In the GMSB model, values of the supersymmetry-breaking scale are excluded below 110 TeV for all values of tanβ in the range 2 ≤ tanβ ≤ 60, and below 120 TeV for tanβ > 30.