328 research outputs found
Resource-aware scheduling for 2D/3D multi-/many-core processor-memory systems
This dissertation addresses the complexities of 2D/3D multi-/many-core processor-memory systems, focusing on two key areas: enhancing timing predictability in real-time multi-core processors and optimizing performance within thermal constraints. The integration of an increasing number of transistors into compact chip designs, while boosting computational capacity, presents challenges in resource contention and thermal management. The first part of the thesis improves timing predictability. We enhance shared cache interference analysis for set-associative caches, advancing the calculation of Worst-Case Execution Time (WCET). This development enables accurate assessment of cache interference and the effectiveness of partitioned schedulers in real-world scenarios. We introduce TCPS, a novel task and cache-aware partitioned scheduler that optimizes cache partitioning based on task-specific WCET sensitivity, leading to improved schedulability and predictability. Our research explores various cache and scheduling configurations, providing insights into their performance trade-offs. The second part focuses on thermal management in 2D/3D many-core systems. Recognizing the limitations of Dynamic Voltage and Frequency Scaling (DVFS) in S-NUCA many-core processors, we propose synchronous thread migrations as a thermal management strategy. This approach culminates in the HotPotato scheduler, which balances performance and thermal safety. We also introduce 3D-TTP, a transient temperature-aware power budgeting strategy for 3D-stacked systems, reducing the need for Dynamic Thermal Management (DTM) activation. Finally, we present 3QUTM, a novel method for 3D-stacked systems that combines core DVFS and memory bank Low Power Modes with a learning algorithm, optimizing response times within thermal limits. This research contributes significantly to enhancing performance and thermal management in advanced processor-memory systems.
Fundamentals
Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters.
SCHEDULING REAL-TIME GRAPH-BASED WORKLOADS
Developments in the semiconductor industry in the previous decades have made possible computing platforms with very large computing capacities that, in turn, have stimulated the rapid progress of computationally intensive computer vision (CV) algorithms with highly parallelizable structure (often represented as graphs). Applications using such algorithms are the foundation for the transformation of semi-autonomous systems (e.g., advanced driver-assist systems) to future fully-autonomous systems (e.g., self-driving cars). Enabling mass-produced safety-critical systems with full autonomy requires real-time execution guarantees as a part of system certification. Since multiple CV applications may need to share the same hardware platform due to size, weight, power, and cost constraints, system component isolation is necessary to avoid explosive interference growth that breaks all execution guarantees. Existing software certification processes achieve component isolation through time partitioning, but this isolation can be broken by accelerator usage, which is essential for high-efficacy CV algorithms. The goal of this dissertation is to make a first step towards providing real-time guarantees for safety-critical systems by analyzing the scheduling of highly parallel accelerator-using workloads isolated in system components. The specific contributions are threefold. First, a general method for graph-based workloads’ response-time-bound reduction through graph structure modifications is introduced, leading to significant response-time-bound reductions. Second, a generalized real-time task model is introduced that enables real-time response-time bounds for a wider range of graph-based workloads. A proposed response-time analysis for the introduced model accounts for potential accelerator usage within tasks. Third, a scheduling approach for graph-based workloads in a single system component is proposed that ensures the temporal isolation of system components.
A response-time analysis for workloads with accelerator usage is presented alongside a non-mandatory schedulability-improvement step. This approach can help to enable component-wise certification in the considered systems.
Turku Centre for Computer Science – Annual Report 2013
Due to a major reform of the organization and responsibilities of TUCS, its role, activities, and even structures have been under reconsideration in 2013. The traditional pillar of collaboration at TUCS, doctoral training, was reorganized due to changes at both universities according to the renewed national system for doctoral education. Computer Science and Engineering and Information Systems Science are now accompanied by Mathematics and Statistics in newly established doctoral programmes at both University of Turku and Åbo Akademi University. Moreover, both universities granted sufficient resources to their respective programmes for doctoral training in these fields, so that joint activities at TUCS can continue. The outcome of this reorganization has the potential to prove a success in terms of scientific profile as well as the quality and quantity of scientific and educational results.
International activities that have been characteristic of TUCS since its inception continue strong. TUCS’ participation in European collaboration through EIT ICT Labs Master’s and Doctoral School is now more active than ever. The new double degree programmes at MSc and PhD level between University of Turku and Fudan University in Shanghai, P.R. China were successfully set up and are now running for their first year. The joint students will add to the already international atmosphere of the ICT House.
The four new thematic research programmes set up according to the decision by the TUCS Board have now established themselves, and a number of events and other activities saw the light in 2013. The TUCS Distinguished Lecture Series managed to gather a large audience with its several prominent speakers. The development of these and other research centre activities continues, and new practices and structures will be initiated to support the tradition of close academic collaboration.
The TUCS slogan Where Academic Tradition Meets the Exciting Future has proven true throughout these changes. Despite the dark clouds in the national and European economic sky, science and higher education in the field have managed to retain all the key ingredients for success. Indeed, the future of ICT and Mathematics in Turku seems exciting.
Mixed Criticality Systems - A Review : (13th Edition, February 2022)
This review covers research on the topic of mixed criticality systems that has been published since Vestal’s 2007 paper. It covers the period up to the end of 2021. The review is organised into the following topics: introduction and motivation, models, single processor analysis (including job-based, hard and soft tasks, fixed priority and EDF scheduling, shared resources and static and synchronous scheduling), multiprocessor analysis, related topics, realistic models, formal treatments, systems issues, industrial practice and research beyond mixed-criticality. A list of PhDs awarded for research relating to mixed-criticality systems is also included.
A Survey of Fault-Tolerance Techniques for Embedded Systems from the Perspective of Power, Energy, and Thermal Issues
Relentless technology scaling has provided a significant increase in processor performance, but it has also had adverse impacts on system reliability. In particular, technology scaling increases the processor's susceptibility to radiation-induced transient faults. Moreover, technology scaling with the discontinuation of Dennard scaling increases power densities, and thereby temperatures, on the chip. High temperature, in turn, accelerates transistor aging mechanisms, which may ultimately lead to permanent faults on the chip. To assure reliable system operation despite these potential reliability concerns, fault-tolerance techniques have emerged. Specifically, fault-tolerance techniques employ some form of redundancy to satisfy specific reliability requirements. However, the integration of fault-tolerance techniques into real-time embedded systems complicates preserving timing constraints. As a remedy, many task mapping/scheduling policies have been proposed to consider the integration of fault-tolerance techniques and enforce both timing and reliability guarantees for real-time embedded systems. More advanced techniques aim additionally at minimizing power and energy while at the same time satisfying timing and reliability constraints. Recently, some scheduling techniques have started to tackle a new challenge, which is the temperature increase induced by employing fault-tolerance techniques. These emerging techniques aim at satisfying temperature constraints besides timing and reliability constraints. This paper provides an in-depth survey of the emerging research efforts that exploit fault-tolerance techniques while considering timing, power/energy, and temperature from the real-time embedded systems’ design perspective. In particular, the task mapping/scheduling policies for fault-tolerant real-time embedded systems are reviewed and classified according to their considered goals and constraints.
Moreover, the employed fault-tolerance techniques, application models, and hardware models are considered as additional dimensions of the presented classification. Lastly, this survey gives deep insights into the main achievements and shortcomings of the existing approaches and highlights the most promising ones.
FIFTY YEARS OF MICROPROCESSOR EVOLUTION: FROM SINGLE CPU TO MULTICORE AND MANYCORE SYSTEMS
Nowadays microprocessors are among the most complex electronic systems ever designed. One small silicon chip can contain the complete processor, large memory and the logic needed to connect it to the input-output devices. The performance of today's processors implemented on a single chip surpasses the performance of a room-sized supercomputer from just 50 years ago, which cost over $10 million [1]. Even the embedded processors found in everyday devices such as mobile phones are far more powerful than computer developers once imagined. The main components of a modern microprocessor are a number of general-purpose cores, a graphics processing unit, a shared cache, memory and input-output interfaces, and a network on a chip to interconnect all these components [2]. The speed of the microprocessor is determined by its clock frequency and cannot exceed a certain limit. Namely, as the frequency increases, the power dissipation increases too, and consequently the amount of heating becomes critical. So, silicon manufacturers decided to design a new processor architecture, called multicore processors [3]. With the aim of increasing performance and efficiency, these multiple cores execute multiple instructions simultaneously. In this way, the amount of parallel computing, or parallelism, is increased [4]. In spite of the mentioned advantages, numerous challenges must be addressed carefully when more cores and parallelism are used. This paper presents a review of microprocessor microarchitectures, discussing their generations over the past 50 years. Then, it describes the currently used implementations of the microarchitecture of modern microprocessors, pointing out the specifics of parallel computing in heterogeneous microprocessor systems. To use the potential of multi-core technology efficiently, software applications must be multithreaded. The program execution must be distributed among the cores of a multi-core processor so they can operate simultaneously.
To use multi-threading, it is imperative for the programmer to understand the basic principles of parallel computing and parallel hardware. Finally, the paper provides details on how to implement hardware parallelism in multicore systems.
High Performance Embedded Computing
Nowadays, computing systems are so ubiquitous in our lives that we live in a cyber-physical world dominated by computer systems, from pacemakers to cars and airplanes. These systems demand more computational performance to process large amounts of data from multiple data sources with guaranteed processing times. Acting outside the required timing bounds may cause the failure of the system, which is critical for systems such as planes, cars, business monitoring, and e-trading. High-Performance and Time-Predictable Embedded Computing presents recent advances in software architecture and tools to support such complex systems, enabling the design of embedded computing devices which are able to deliver high performance whilst guaranteeing the application's required timing bounds. Technical topics discussed in the book include: parallel embedded platforms; programming models; mapping and scheduling of parallel computations; timing and schedulability analysis; and runtimes and operating systems. The work reflected in this book was done in the scope of the European project P-SOCRATES, funded under the FP7 framework program of the European Commission. High-Performance and Time-Predictable Embedded Computing is ideal for personnel in computer/communication/embedded industries as well as academic staff and master/research students in computer science, embedded systems, cyber-physical systems and internet-of-things.