7 research outputs found

    A FLEXIBLE HIGH-BANDWIDTH LOW-LATENCY MULTI-PORT MEMORY CONTROLLER

    Get PDF
    Multi-port memory controllers (MPMCs) have become increasingly important in many modern applications due to the tremendous growth in bandwidth requirement. Many approaches so far have focused on improving either the memory access latency or the bandwidth utilization for specific applications. Moreover, the application systems are likely to require certain adjustments to connect with an MPMC, since the MPMC interface is limited to a single-clock and single-data-width domain. In this paper, we propose efficient techniques to improve the flexibility, latency, and bandwidth of an MPMC. Firstly, MPMC interfaces employ a pair of dual-clock dualport FIFOs at each port, so any multi-clock multi-data-width application system can connect to an MPMC without requiring extra resources. Secondly, memory access latency is significantly reduced because parallel FIFOs temporarily keep the data transfer between the application system and memory. Lastly, a proposed arbitration scheme, namely window-based first-come-first-serve, considerably enhances the bandwidth utilization. Depending on the applications, MPMC can be properly configured by updating several internal configuration registers. The experimental results in an Altera Cyclone V FPGA prove that MPMC is fully operational at 150 MHz and supports up to 32 concurrent connections at various clocks and data widths. More significantly, achieved bandwidth utilization is approximately 93.2% of the theoretical bandwidth, and the access latency is minimized as compared to previous designs

    Power/Performance Trade-Offs in Real-Time SDRAM Command Scheduling

    Full text link

    Memory-map selection for firm real-time SDRAM controllers

    No full text
    A modern real-time embedded system must support multiple concurrently running applications. To reduce costs, critical SoC components like SDRAM memories are often shared between applications with a variety of firm real-time requirements. To guarantee that the system works as intended, the memory controller must be configured such that all the realtime requirements of all sharing applications are satisfied. The attainable worst-case bandwidth, latency, and power of the memory depend largely on memory map configuration. Sharing SDRAM amongst multiple applications is challenging, since their requirements might call for different memory maps. This paper presents an exploration of the memory-map design space. Two contributions improve the memory-map selection procedure. The first contribution reduces the minimum access granularity by interleaving requests over a configurable number of banks instead of all banks. This technique is beneficial for worst-case performance in terms of bandwidth, latency and power. As a second contribution, we present a methodology to derive a memory-map configuration, i.e. the access granularity and number of interleaved banks, from a specification of the real-time application requirements and an overall memory power budget

    Memory-map selection for firm real-time SDRAM controllers

    No full text
    A modern real-time embedded system must support multiple concurrently running applications. To reduce costs, critical SoC components like SDRAM memories are often shared between applications with a variety of firm real-time requirements. To guarantee that the system works as intended, the memory controller must be configured such that all the realtime requirements of all sharing applications are satisfied. The attainable worst-case bandwidth, latency, and power of the memory depend largely on memory map configuration. Sharing SDRAM amongst multiple applications is challenging, since their requirements might call for different memory maps. This paper presents an exploration of the memory-map design space. Two contributions improve the memory-map selection procedure. The first contribution reduces the minimum access granularity by interleaving requests over a configurable number of banks instead of all banks. This technique is beneficial for worst-case performance in terms of bandwidth, latency and power. As a second contribution, we present a methodology to derive a memory-map configuration, i.e. the access granularity and number of interleaved banks, from a specification of the real-time application requirements and an overall memory power budget

    Architektur- und Leistungsanalyse eines Mehgenerationen-SDRAM-Controllers für gemischte Kritikalitätssysteme

    Get PDF
    Due to their high-density and low-cost, DDR SDRAM are the prevailing choice for implementing the main memory of a computer system. Nevertheless, the aforementioned benefits come at the cost of a complex two-stage access protocol, which ultimately means that the time required to serve a memory request depends on the history of previous requests. Otherly stated, DDR SDRAMs are a stateful resource. The main goal of this dissertation is to design a controller that leverages the state of DDR SDRAMs in a mixed criticality environment. More specifically, the controller should provide good average performance for best-effort requestors without compromising timing guarantees for critical requestors. With that regard, this dissertation firstly identifies two challenges of growing relevance for the design of memory controllers for the mixed criticality domain. The first challenge is the data bus turnaround time. The second challenge is the rank-to-rank switching time and only affects multi-rank modules. After pinpointing the two aforementioned challenges, this dissertation proposes a SDRAM controller to tackle them. The proposed controller bundles read and write operations in their corresponding ranks, thus minimizing the number of data bus turnarounds and rank switching events. As a consequence, the average performance of the controller is improved. However, the bundling is carefully designed so that real-time guarantees for critical requestors can be extracted. Moreover, as it will become clear, both the operation of the controller and the corresponding analysis of the temporal properties are described in terms of a generation-independent notation. This is a desirable feature because different SDRAM generations have different architectural features and possibly, timing constraints. Finally, an extensive comparison with the related work is performed. Furthermore, trends in worst-case latency over DDR SDRAM from different speed bins and generations are presented and thoroughly discussed.Aufgrund ihrer hohen Dichte und geringen Kosten sind DDR SDRAM die vorherrschende Wahl für die Implementierung des Hauptspeichers eines Computersystems. Die oben genannten Vorteile gehen jedoch zu Lasten eines komplexen zweistufigen Zugriffsprotokolls, was letztendlich bedeutet, dass die Zeit, die benötigt wird, um eine Speicheranforderung zu bedienen, von der Historie früherer Zugriffe abhängt. Anders ausgedrückt, DDR SDRAM sind eine zustandsabhängige Ressource, was die Umsetzung gemischter Kritikalitäten weiter erschwert, da unterschiedliche Ebenen der Kritikalität widersprüchliche Bedürfnisse haben. Das Hauptziel dieser Dissertation ist es, einen Controller zu entwickeln, der den Zustand der DDR-SDRAMs in einer gemischten Kritikalitätsumgebung nutzt. Genauer gesagt, der Controller soll eine gute durchschnittliche Leistung für best-effort Zugriffe ermöglichen, ohne die Garantien für kritische Zugriffe zu gefährden. In diesem Zusammenhang identifiziert diese Dissertation zunächst zwei Herausforderungen von wachsender Relevanz für das Design von Speichercontrollern für Systeme gemischter Kritikalität. Die erste Herausforderung ist die notwendige Zeit zur Richtungsänderung des Datenbusses. Die zweite Herausforderung ist die Rang-zu-Rang-Schaltzeit und betrifft nur Module mit mehreren Rängen. Nach dem Aufzeigen der beiden oben genannten Herausforderungen, schlägt diese Dissertation einen SDRAM Controller vor, um sie anzugehen. Der vorgeschlagene Controller bündelt Lese und Schreib Operationen in ihren entsprechenden Rängen, wodurch die Anzahl der Richtungsänderungen des Datenbusses und die Anzahl der Rangwechsel minimiert wird. Dadurch wird die durchschnittliche Leistung des Controllers verbessert. Die Bündelung ist so konzipiert, dass Echtzeit-Garantien für kritische Zugriffe abgeleitet werden können. Darüber hinaus werden, wie sich zeigen wird, sowohl das Verhalten des Controllers als auch die entsprechende Analyse der zeitlichen Eigenschaften in Form einer generationsunabhängigen Notation beschrieben. Dies ist ein wünschenswertes Merkmal, da verschiedene SDRAM Generationen unterschiedliche architektonische Merkmale und zeitliche Beschränkungen haben. Abschließend wird ein ausführlicher Vergleich mit inhaltlich verwandten Arbeiten durchgeführt. Außerdem werden Trends in der Worst-Case-Latenz von DDR SDRAM aus verschiedenen Geschwindigkeitsklassen und Generationen vorgestellt und ausführlich diskutiert

    Predictable Shared Memory Resources for Multi-Core Real-Time Systems

    Get PDF
    A major challenge in multi-core real-time systems is the interference problem on the shared hardware components amongst cores. Examples of these shared components include buses, on-chip caches, and off-chip dynamic random access memories (DRAMs). The problem arises because different cores in the system interfere with each other, while competing to access the shared hardware components. It is a challenging problem for real-time systems because operations of one core affect the temporal behaviour of other cores, which complicates the timing analysis of the system. We address this problem by making the following contributions. 1) For shared buses, we propose CArb, a predictable and criticality-aware arbiter, which provides guaranteed and differential service to tasks based on their requirements. In addition, we utilize CArb to mitigate overheads resulting from system switching among different modes. 2) For the cache hierarchy, we address the problem of maintaining cache coherence in multi-core real-time systems by modifying current coherence protocols such that data sharing is viable for real-time systems in a manner amenable for timing analysis. The proposed solution provides performance improvements, does not impose any scheduling restrictions, and does not require any source-code modifications. 3) At the shared DRAM level, we propose PMC, a programmable memory controller that provides latency guarantees for running tasks upon accessing the off-chip DRAM, while assigning differential memory services to tasks based on their bandwidth and latency requirements. In addition to PMC, we conduct a latency-based analysis on DRAM memory controllers (MCs). Our analysis provides both best-case and worst-case bounds on the latency that any request suffers upon accessing the DRAM. The analysis comprehensively covers all possible interactions of successive requests considering all possible DRAM states. Finally, we formally model request interrelations and DRAM command interactions. We use these models to develop an automated validation framework along with benchmark suites to validate and evaluate PMC and any other MC, which we release as an open-source tool
    corecore