386 research outputs found

    Resource-aware scheduling for 2D/3D multi-/many-core processor-memory systems

    Get PDF
    This dissertation addresses the complexities of 2D/3D multi-/many-core processor-memory systems, focusing on two key areas: enhancing timing predictability in real-time multi-core processors and optimizing performance within thermal constraints. The integration of an increasing number of transistors into compact chip designs, while boosting computational capacity, presents challenges in resource contention and thermal management. The first part of the thesis improves timing predictability. We enhance shared cache interference analysis for set-associative caches, advancing the calculation of Worst-Case Execution Time (WCET). This development enables accurate assessment of cache interference and the effectiveness of partitioned schedulers in real-world scenarios. We introduce TCPS, a novel task and cache-aware partitioned scheduler that optimizes cache partitioning based on task-specific WCET sensitivity, leading to improved schedulability and predictability. Our research explores various cache and scheduling configurations, providing insights into their performance trade-offs. The second part focuses on thermal management in 2D/3D many-core systems. Recognizing the limitations of Dynamic Voltage and Frequency Scaling (DVFS) in S-NUCA many-core processors, we propose synchronous thread migrations as a thermal management strategy. This approach culminates in the HotPotato scheduler, which balances performance and thermal safety. We also introduce 3D-TTP, a transient temperature-aware power budgeting strategy for 3D-stacked systems, reducing the need for Dynamic Thermal Management (DTM) activation. Finally, we present 3QUTM, a novel method for 3D-stacked systems that combines core DVFS and memory bank Low Power Modes with a learning algorithm, optimizing response times within thermal limits. This research contributes significantly to enhancing performance and thermal management in advanced processor-memory systems

    Gestión de recursos energéticamente eficiente para aplicaciones paralelas basadas en tareas en entornos multi-aplicación

    Get PDF
    Tesis de la Universidad Complutense de Madrid, Facultad de Informática, leída el 28/01/2021The end of Dennard scaling, as well as the arrival of the post-Moore era, has meant a big change in the way performance and energy efficiency are achieved by modern processors. From a constant increase of the clock frequency as the main method to increase performance at the beginning of the 2000s, the increase in the number of cores inside processors running at relatively conservative frequencies has stabilised as the current trend to increase both performance and energy efficiency. The increase of the heterogeneity in the systems, both inside the processors comprising different types of cores (e.g., big LITTLE architectures) or adding specific compute units (like multimedia extensions), as well as in the platform by the addition of other specific compute units (like GPUs), offering different performance and energy-efficiency trade-offs. Together with the increase in the number of cores, the processor evolution has been accompanied by the addition of different techologies that allow processors to adapt dynamically to the changes in the environment and running aplications. Among others, techiniques like dynamic voltage and frequiency scaling, power capping or cache partitioning are widely used nowadays to increase the performance and/or energy-efficiency...El fin del escalado de Dennard, así como la llegada de la era post-Moore ha supuesto una gran revolución en la forma de obtener el rendimiento y eficiencia energética en los procesadores modernos. Desde un incremento constante en la frecuencia relativamente moderadas se ha impuesto como la tendencia actual para incrementar tanto el rendimiento como la eficiencia energética. El aumento del número de núcleos dentro del procesado ha venido acompañado en los últimos años por el aumento de la heterogeneidad en la plataforma, tanto dentro del procesador incorporando distintos tipos de núcleos en el mismo procesador (e.g., la arquitectura big.LITTLE) como añadiendo unidades de cómputo específicas (e.g., extensiones multimedia), como la incorporación de otros elementos de computo específicos, ofreciendo diferentes grados de rendimiento y eficiencia energética. La evolución de los procesadores no solo ha venido dictada por el aumento del número de núcleos, sino que ha venido acompañada por la incorporación de diferentes técnicas permitiendo la adaptación de las arquitecturas de forma dinámica al entorno así como a las aplicaciones en ejecución. Entre otras, técnicas como el escalado de frecuencia, la limitación de consumo o el particionado de la memoria caché son ampliamente utilizadas en la actualidad como métodos para incrementar el consumo y/o la eficiencia energética...Fac. de InformáticaTRUEunpu

    Towards Optimal Application Mapping for Energy-Efficient Many-Core Platforms

    Get PDF
    Siirretty Doriast

    Enhanced applicability of loop transformations

    Get PDF

    Real-Time Sensor Networks and Systems for the Industrial IoT

    Get PDF
    The Industrial Internet of Things (Industrial IoT—IIoT) has emerged as the core construct behind the various cyber-physical systems constituting a principal dimension of the fourth Industrial Revolution. While initially born as the concept behind specific industrial applications of generic IoT technologies, for the optimization of operational efficiency in automation and control, it quickly enabled the achievement of the total convergence of Operational (OT) and Information Technologies (IT). The IIoT has now surpassed the traditional borders of automation and control functions in the process and manufacturing industry, shifting towards a wider domain of functions and industries, embraced under the dominant global initiatives and architectural frameworks of Industry 4.0 (or Industrie 4.0) in Germany, Industrial Internet in the US, Society 5.0 in Japan, and Made-in-China 2025 in China. As real-time embedded systems are quickly achieving ubiquity in everyday life and in industrial environments, and many processes already depend on real-time cyber-physical systems and embedded sensors, the integration of IoT with cognitive computing and real-time data exchange is essential for real-time analytics and realization of digital twins in smart environments and services under the various frameworks’ provisions. In this context, real-time sensor networks and systems for the Industrial IoT encompass multiple technologies and raise significant design, optimization, integration and exploitation challenges. The ten articles in this Special Issue describe advances in real-time sensor networks and systems that are significant enablers of the Industrial IoT paradigm. In the relevant landscape, the domain of wireless networking technologies is centrally positioned, as expected

    Energy efficient hardware acceleration of multimedia processing tools

    Get PDF
    The world of mobile devices is experiencing an ongoing trend of feature enhancement and generalpurpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks Based on the survey that this thesis presents on modem video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at algorithmic level in order to design re-usable optimised hardware acceleration cores. To prove these conclusions, the work m this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high level techniques such as redundant computation elimination, parallelism and low switching computation structures. Both architectures compare favourably against the relevant pnor art in the literature. The SA-DCT/IDCT technologies are instances of a more general computation - namely, both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early exit mechanism that achieves large search space reductions .Results show an improvement on state of the art algorithms with future potential for even greater savings

    High-Performance Modelling and Simulation for Big Data Applications

    Get PDF
    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)“ project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction raises to have a better discernment of the domain at hand, their representation gets increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for their members and distinguished guests to openly discuss novel perspectives and topics of interests for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications
    corecore