
    Dynamic Energy and Thermal Management of Multi-Core Mobile Platforms: A Survey

    Multi-core mobile platforms are on the rise as they enable efficient parallel processing to meet ever-increasing performance requirements. However, since these platforms need to cater for increasingly dynamic workloads, efficient dynamic resource management is desired, mainly to enhance energy and thermal efficiency for a better user experience with increased operational time and lifetime of mobile devices. This article provides a survey of dynamic energy and thermal management approaches for multi-core mobile platforms. These approaches perform either proactive or reactive management. Upcoming trends and open challenges are also discussed.

    Heuristics for Routing and Spiral Run-time Task Mapping in NoC-based Heterogeneous MPSOCs

    This paper describes a new Spiral Dynamic Task Mapping heuristic for mapping applications onto NoC-based heterogeneous MPSoCs. The proposed heuristic attempts to map the most closely related tasks of an application in a spiral manner and to find the best possible path load that minimizes the communication overhead. We have built a simulation environment for experimental evaluation that maps applications with varying numbers of tasks onto an 8x8 NoC-based heterogeneous MPSoC platform. We demonstrate that the new mapping heuristic, together with the proposed modified Dijkstra routing algorithm, is capable of reducing the total execution time and energy consumption of applications when compared to state-of-the-art run-time mapping heuristics reported in the literature.
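    The abstract above does not spell out the spiral order it uses, so the following is only a minimal sketch of the general idea: enumerate the tiles of a mesh in a spiral and assign tasks to them in sequence. The clockwise direction, the top-left starting tile, and the names `spiral_order` and `map_tasks` are assumptions made for illustration, not the authors' heuristic, which also weighs task relatedness and path load.

```python
def spiral_order(rows, cols):
    """Return mesh tiles as (row, col) in a clockwise inward spiral.

    Assumption: the spiral starts at the top-left tile; the paper's
    actual starting point and direction are not given in the abstract.
    """
    top, bottom, left, right = 0, rows - 1, 0, cols - 1
    order = []
    while top <= bottom and left <= right:
        # Top edge, left to right.
        for c in range(left, right + 1):
            order.append((top, c))
        # Right edge, top to bottom (skipping the corner already visited).
        for r in range(top + 1, bottom + 1):
            order.append((r, right))
        # Bottom edge, right to left, only if there is a distinct bottom row.
        if top < bottom:
            for c in range(right - 1, left - 1, -1):
                order.append((bottom, c))
        # Left edge, bottom to top, only if there is a distinct left column.
        if left < right:
            for r in range(bottom - 1, top, -1):
                order.append((r, left))
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order


def map_tasks(tasks, rows=8, cols=8):
    """Assign tasks (assumed pre-sorted by relatedness) to tiles in spiral order."""
    tiles = spiral_order(rows, cols)
    if len(tasks) > len(tiles):
        raise ValueError("more tasks than tiles")
    return dict(zip(tasks, tiles))
```

    With tasks ordered so that heavily communicating tasks are adjacent in the list, consecutive tasks land on neighbouring tiles, which is the intuition behind keeping related tasks close on the mesh.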

    Static and dynamic task mapping in multiprocessor systems based on networks-on-chip (Mapeo estático y dinámico de tareas en sistemas multiprocesador, basados en redes en circuito integrado)

    Due to its scalability and flexibility, Network-on-Chip (NoC) is a growing and promising communication paradigm for Multiprocessor System-on-Chip (MPSoC) design. As manufacturing processes scale down into the deep submicron domain and system complexity increases, fault-tolerant design strategies are gaining relevance. This paper presents a Population-Based Incremental Learning (PBIL) algorithm aimed at finding the best mapping solutions at design time, as well as at finding optimal remapping solutions at run time in the presence of single-node failures on the NoC. The optimization objectives in both cases are the application completion time and the network's peak bandwidth. A deterministic XY routing algorithm was used to simulate the traffic conditions in the network, which has a 2D mesh topology. The results obtained are promising. The proposed algorithm outperforms other reported approaches as the problem size increases.
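    Deterministic XY routing on a 2D mesh, as mentioned in the abstract above, sends a packet along the X dimension until the destination column is reached and then along the Y dimension. The sketch below illustrates that rule and a simple peak-link-load count. The function names, the `(x, y)` coordinate convention, and the flow format `(src, dst, bandwidth)` are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict


def xy_route(src, dst):
    """Return the list of (x, y) tiles visited by XY routing from src to dst."""
    x, y = src
    path = [(x, y)]
    # First travel along X until the destination column is reached.
    step = 1 if dst[0] > x else -1
    while x != dst[0]:
        x += step
        path.append((x, y))
    # Then travel along Y until the destination row is reached.
    step = 1 if dst[1] > y else -1
    while y != dst[1]:
        y += step
        path.append((x, y))
    return path


def peak_link_load(flows):
    """Return the largest summed bandwidth on any directed link.

    flows: iterable of (src, dst, bandwidth) tuples (hypothetical format).
    """
    load = defaultdict(float)
    for src, dst, bandwidth in flows:
        path = xy_route(src, dst)
        # Each consecutive pair of tiles on the path is one directed link.
        for link in zip(path, path[1:]):
            load[link] += bandwidth
    return max(load.values(), default=0.0)
```

    Because the route depends only on the source and destination tiles, the peak load of a candidate mapping can be evaluated without simulating the network, which is convenient inside a search loop such as PBIL.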

    Exploring Task Mappings on Heterogeneous MPSoCs using a Bias-Elitist Genetic Algorithm

    Exploration of task mappings plays a crucial role in achieving high performance in heterogeneous multi-processor system-on-chip (MPSoC) platforms. The problem of optimally mapping a set of tasks onto a set of given heterogeneous processors for maximal throughput is known, in general, to be NP-complete. The problem is further exacerbated when multiple applications (i.e., bigger task sets) and the communication between tasks are also considered. Previous research has shown that Genetic Algorithms (GAs) are typically a good choice for solving this problem when the solution space is relatively small. However, when the size of the problem space increases, classic genetic algorithms still suffer from long evolution times. To address this problem, this paper proposes a novel bias-elitist genetic algorithm that is guided by domain-specific heuristics to speed up the evolution process. Experimental results reveal that the proposed algorithm is able to handle large-scale task mapping problems and produces high-quality mapping solutions in a short time.

    Predictable multi-processor system on chip design for multimedia applications

    The design of multimedia systems has become increasingly complex due to consumer requirements. Consumers demand from these systems the functionality offered by a large desktop. Many of these systems are mobile, so their power consumption and size should be small. These systems are increasingly becoming multi-processor based (MPSoCs) for reasons of power and performance. Applications execute on these systems in different combinations, also known as use-cases. Applications may have different performance requirements in each use-case. Currently, verification of all these use-cases takes the bulk of the design effort. There is a need for analysis-based techniques so that the platforms have predictable behaviour and in turn provide guarantees on performance without expending precious man hours on verification. In this dissertation, techniques and architectures have been developed to design and manage these multi-processor based systems efficiently. The dissertation presents predictable architectural components for MPSoCs, a predictable MPSoC design strategy, an automatic platform synthesis tool, a run-time system and an MPSoC simulation technique. The introduction of predictability helps in the rapid design of MPSoC platforms. Chapter 1 of the thesis studies the trends in modern multimedia applications and processor architectures. The chapter further highlights the problems in the design of MPSoC platforms and emphasizes the need for predictable design techniques. Predictable design techniques require predictable application and architectural components. The chapter further elaborates on Synchronous Data Flow Graphs, which are used to model the applications throughout this thesis. The chapter presents the architecture template used in this thesis and lists the contributions of the thesis. One of the contributions of this thesis is the design of a predictable component called the communication assist. Chapter 2 of the thesis describes the architecture of this communication assist. The communication assist presented in this thesis not only decouples communication from computation but also provides timing guarantees. Based on this communication assist, an MPSoC platform generation technique has been presented that can design MPSoC platforms capable of satisfying the throughput constraints of multiple applications in all use-cases. The technique is presented in Chapter 3. The design strategy uses three simple steps for platform design. In the first step it finds the required number of processors. The second step minimizes the communication interconnect between the processors, and the third step minimizes the communication memory requirement of the platform. Further, in Chapter 4, a tool has been developed to generate CA-based platforms for FPGAs. The output of this tool can be used to synthesize platforms on real hardware with the help of FPGA synthesis tools. The applications executing on these platforms often exhibit dynamism, e.g., variation in task execution times and changes in application throughput requirements. Further, new applications may often be added by consumers at run-time. Resource managers have been presented in the literature to handle such dynamic situations. However, the scalability of these resource managers becomes an issue as the number of processors and applications increases. Chapter 5 presents distributed run-time resource management techniques. Two versions of distributed resource managers have been presented, which scale with the number of applications and processors. MPSoC platforms for real-time applications are designed assuming worst-case task execution times. It is known that the difference between average-case and worst-case behaviour can be quite large. Therefore, knowing the average-case performance is also important for the system designer, and software simulation is often employed to estimate this. However, simulation in software is slow and does not scale with the number of applications and processing elements. In Chapter 6, a fast and scalable simulation methodology is introduced that can simulate the execution of multiple applications on an MPSoC platform. It is based on parallel execution of SDF (Synchronous Data Flow) models of applications. The simulation methodology uses Parallel Discrete Event Simulation (PDES) primitives and is termed "Smart Conservative PDES". The methodology generates a parallel simulator which is synthesizable on FPGAs. The framework can also be used to model dynamic arbitration policies, which are difficult to analyse using models. The generated platform is also useful in carrying out Design Space Exploration, as shown in the thesis. Finally, Chapter 7 summarizes the main findings and (practical) implications of the studies described in the previous chapters of this dissertation. Using the contributions mentioned in the thesis, a designer can design and implement predictable multiprocessor-based systems capable of satisfying the throughput constraints of multiple applications in a given set of use-cases, and employ resource management strategies to deal with dynamism in the applications. The chapter also describes the main limitations of this dissertation and makes suggestions for future research.

    Energy efficient run-time mapping and thread partitioning of concurrent OpenCL applications on CPU-GPU MPSoCs

    Heterogeneous Multi-Processor Systems-on-Chips (MPSoCs) containing CPU and GPU cores are typically required to execute applications concurrently. However, as will be shown in this paper, existing approaches are not well suited for concurrent applications, as they either consider only a single application or do not exploit both CPU and GPU cores at the same time. In this paper, we propose an energy-efficient run-time mapping and thread partitioning approach for executing concurrent OpenCL applications on both CPU and GPU cores while satisfying performance requirements. Depending upon the performance requirements, for each concurrently executing application, the mapping process finds the appropriate number of CPU cores and operating frequencies of CPU and GPU cores, and the partitioning process identifies an efficient partitioning of the application's threads between CPU and GPU cores. We validate the proposed approach experimentally on the Odroid-XU3 hardware platform with various mixes of applications from the Polybench benchmark suite. Additionally, a case study is performed with a real-world application, SLAMBench. Results show an average energy saving of 32% compared to existing approaches while still satisfying the performance requirements.

    A Hybrid Task Mapping Algorithm for Heterogeneous MPSoCs
