137 research outputs found

    U-EDF: An Unfair But Optimal Multiprocessor Scheduling Algorithm for Sporadic Tasks

    Full text link
    A multiprocessor scheduling algorithm named U-EDF was presented in [1] for the scheduling of periodic tasks with implicit deadlines. It was claimed that U-EDF is optimal for periodic tasks (i.e., it can meet all deadlines of every schedulable task set), and extensive simulations showed a drastic reduction in the number of task preemptions and migrations in comparison to state-of-the-art optimal algorithms. However, there was no proof of its optimality, and U-EDF was not designed to schedule sporadic tasks. In this work, we propose a generalization of U-EDF for the scheduling of sporadic tasks with implicit deadlines, and we prove its optimality. Contrary to all other existing optimal multiprocessor scheduling algorithms for sporadic tasks, U-EDF is not based on the fairness property. Instead, it extends the main principles of EDF so that it achieves optimality while benefiting from a substantial reduction in the number of preemptions and migrations. © 2012 IEEE.
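    The abstract does not reproduce U-EDF itself, but the EDF principle it extends can be illustrated with a minimal global-EDF sketch in Python (hypothetical job representation; this is not the U-EDF algorithm):

        # Minimal global-EDF illustration (NOT the U-EDF algorithm of the paper):
        # at every scheduling event, the m active jobs with the earliest absolute
        # deadlines run on the m identical processors.
        from dataclasses import dataclass, field
        import heapq

        @dataclass(order=True)
        class Job:
            deadline: float                           # absolute deadline (the EDF priority key)
            remaining: float = field(compare=False)   # remaining execution time
            name: str = field(compare=False, default="")

        def pick_jobs_to_run(active_jobs, m):
            """Return the (at most) m active jobs with the earliest absolute deadlines."""
            return heapq.nsmallest(m, active_jobs)

        # Example: three active jobs on two processors -> J1 and J2 run, J3 waits.
        jobs = [Job(10.0, 4.0, "J1"), Job(12.0, 2.0, "J2"), Job(20.0, 6.0, "J3")]
        print([j.name for j in pick_jobs_to_run(jobs, 2)])   # ['J1', 'J2']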

    Power-Aware Real-Time Scheduling upon Identical Multiprocessor Platforms

    Get PDF
    In this paper, we address the power-aware scheduling of sporadic constrained-deadline hard real-time tasks using dynamic voltage scaling upon multiprocessor platforms. We propose two distinct algorithms. Our first algorithm is an off-line speed-determination mechanism which provides an identical speed for each processor; that speed guarantees that all deadlines are met if the jobs are scheduled using EDF. The second algorithm is an on-line, adaptive speed-adjustment mechanism which reduces the energy consumption while the system is running.
    Comment: The manuscript corresponds to the final version of the SUTC 2008 conference paper.
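    The off-line speed-determination algorithm itself is not given in the abstract; the sketch below only illustrates the classical uniprocessor special case (implicit deadlines, where EDF meets all deadlines at a constant speed equal to the total utilization), which the paper generalizes to constrained deadlines on multiprocessor platforms:

        # Illustrative only: the classical uniprocessor static-speed result,
        # not the multiprocessor constrained-deadline algorithm of the paper.
        def static_speed(tasks):
            """tasks: list of (wcet_at_full_speed, period) with implicit deadlines.

            Returns the lowest constant normalized speed (fraction of the maximum
            frequency) at which uniprocessor EDF still meets every deadline,
            i.e. the total utilization U = sum(C_i / T_i).
            """
            utilization = sum(c / t for c, t in tasks)
            if utilization > 1.0:
                raise ValueError("task set infeasible even at full speed")
            return utilization

        print(static_speed([(1.0, 4.0), (2.0, 8.0)]))  # 0.5 -> run at half speed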

    Impact of gate-level clustering on automated system partitioning of 3D-ICs

    Full text link
    When partitioning gate-level netlists using graphs, it is beneficial to cluster gates to reduce the order of the graph and preserve some characteristics of the circuit that the partitioning might degrade. Gate clustering is even more important for netlist partitioning targeting 3D system integration. In this paper, we argue that the choice of clustering method for 3D-IC partitioning is not trivial and deserves careful consideration. To support this claim, we implemented three clustering methods and applied them prior to partitioning two synthetic designs representing the two extremes of the circuits' medium/long-interconnect diversity spectrum. The automatically partitioned netlists are then placed and routed in 3D to compare the impact of the clustering methods on several metrics. Our experiments show that the clustering method indeed has a different impact depending on the design considered and that a circuit-blind, universal partitioning method is not the way to go, with wire-length savings of up to 31%, total-power savings of up to 22%, and effective-frequency gains of up to 15% compared to the other methods. Furthermore, we highlight that 3D-ICs open new opportunities to design systems with a denser interconnect, drastically reducing the design utilization of circuits that would not be considered viable in 2D.
    Comment: 8 pages, 6 figures
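    As a rough illustration of what clustering gates "to reduce the order of the graph" means, the sketch below coarsens a netlist graph by greedily merging the endpoints of its heaviest edges; this is a generic heavy-edge-matching heuristic, not one of the three clustering methods evaluated in the paper:

        # Generic heavy-edge-matching coarsening of a gate-level netlist graph.
        # It only illustrates how clustering shrinks the graph handed to the
        # partitioner; it is not one of the methods compared in the paper.
        def cluster_heavy_edges(num_gates, edges):
            """edges: list of (gate_u, gate_v, weight). Returns gate -> cluster id."""
            cluster = list(range(num_gates))          # each gate starts in its own cluster
            matched = [False] * num_gates
            for u, v, _w in sorted(edges, key=lambda e: -e[2]):   # heaviest edges first
                if not matched[u] and not matched[v] and u != v:
                    matched[u] = matched[v] = True
                    cluster[v] = cluster[u]           # merge v into u's cluster
            return cluster

        # Example: 4 gates; the weight-5 edge (0,1) and the weight-3 edge (2,3) are
        # merged, so the partitioner sees 2 clusters instead of 4 gates.
        print(cluster_heavy_edges(4, [(0, 1, 5), (1, 2, 1), (2, 3, 3)]))  # [0, 0, 2, 2]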

    Techniques Optimizing the Number of Processors to Schedule Multi-threaded Tasks

    Full text link
    In recent years, we have witnessed a dramatic increase in the number of cores available in computational platforms. Concurrently, a new coding paradigm dividing tasks into smaller execution instances called threads was developed to take advantage of the inherent parallelism of multiprocessor platforms. However, only a few methods have been proposed to efficiently schedule hard real-time multi-threaded tasks on multiprocessors. In this paper, we propose techniques optimizing the number of processors needed to schedule such sporadic parallel tasks with constrained deadlines. We first define an optimization problem determining, for each thread, an intermediate (artificial) deadline minimizing the number of processors needed to schedule the whole task set; the scheduling algorithm can then schedule the threads as if they were independent sequential sporadic tasks. The second contribution is an efficient and nevertheless optimal algorithm that can be executed online to determine the threads' deadlines. Hence, it can be used in dynamic systems where all tasks and their characteristics are not known a priori. We finally prove that our techniques achieve a resource augmentation bound of 2 when the threads are scheduled with algorithms such as U-EDF, PD2, LLREF, DP-Wrap, etc. © 2012 IEEE.
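    The optimization problem is not spelled out in the abstract; the sketch below is only a hypothetical illustration of why the choice of intermediate deadlines matters, using a simple density-style estimate of how many processors a set of independent constrained-deadline threads requires:

        # Hypothetical illustration (not the paper's formulation): under a
        # density-style sufficient test, independent constrained-deadline threads
        # fit on ceil(sum of densities) processors, so the intermediate
        # (artificial) deadlines directly drive the processor count.
        import math

        def processors_needed(threads):
            """threads: list of (wcet, relative_deadline). Density = C / D."""
            return math.ceil(sum(c / d for c, d in threads))

        # Two pieces of work (C = 2 each) whose intermediate deadlines must fit
        # within a task deadline of 10:
        print(processors_needed([(2, 5), (2, 5)]))   # even split  -> density 0.8  -> 1 processor
        print(processors_needed([(2, 2), (2, 8)]))   # skewed split -> density 1.25 -> 2 processors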

    MemPool-3D: Boosting Performance and Efficiency of Shared-L1 Memory Many-Core Clusters with 3D Integration

    Full text link
    Three-dimensional integrated circuits promise power, performance, and footprint gains compared to their 2D counterparts, thanks to drastic reductions in the interconnects' length through their smaller form factor. We can leverage the potential of 3D integration by enhancing MemPool, an open-source many-core design with 256 cores and a shared pool of L1 scratchpad memory connected with a low-latency interconnect. MemPool's baseline 2D design is severely limited by routing congestion and wire propagation delay, making the design ideal for 3D integration. In architectural terms, we increase MemPool's scratchpad memory capacity beyond the sweet spot for 2D designs, improving performance in a common digital signal processing kernel. We propose a 3D MemPool design that leverages a smart partitioning of the memory resources across two layers to balance the size and utilization of the stacked dies. In this paper, we explore the architectural and the technology parameter spaces by analyzing the power, performance, area, and energy efficiency of MemPool instances in 2D and 3D with 1 MiB, 2 MiB, 4 MiB, and 8 MiB of scratchpad memory in a commercial 28 nm technology node. We observe a performance gain of 9.1% when running a matrix multiplication on the MemPool-3D design with 4 MiB of scratchpad memory compared to the MemPool-2D counterpart. In terms of energy efficiency, we can implement the MemPool-3D instance with 4 MiB of L1 memory on an energy budget 15% smaller than its 2D counterpart, and even 3.7% smaller than the MemPool-2D instance with one-fourth of the L1 scratchpad memory capacity.
    Comment: Accepted for publication in DATE 2022 -- Design, Automation and Test in Europe Conference

    PathFinding - Design Methodology for 3D-Stacked Integrated Circuits: Tool Chain and Case Studies

    No full text

    Library-level characterization of sub-10nm processing nodes

    No full text

    Mathematical morphology and Parallel reconfigurable systems based on FPGAs

    No full text

    3-D Integration from System Design Perspective

    No full text
    SOC 2009

    3D-ICs: Technology, System design challenges, Solutions & Benefits

    No full text