Modeling, Analysis, and Hard Real-time Scheduling of Adaptive Streaming Applications
In real-time systems, the application's behavior has to be predictable at
compile-time to guarantee timing constraints. However, modern streaming
applications that exhibit adaptive behavior due to mode switching at run-time
may degrade system predictability due to unknown behavior of the application
during mode transitions. Therefore, proper temporal analysis during mode
transitions is imperative to preserve system predictability. To this end, in
this paper, we first introduce Mode Aware Data Flow (MADF), our new
predictable Model of Computation (MoC) that efficiently captures the behavior of
adaptive streaming applications. Then, as an important part of the operational
semantics of MADF, we propose the Maximum-Overlap Offset (MOO), our
novel protocol for mode transitions. The main advantage of this transition
protocol is that, in contrast to self-timed transition protocols, it avoids
timing interference between modes upon mode transitions. As a result, any mode
transition can be analyzed independently from the mode transitions that
occurred in the past. Based on this transition protocol, we also propose a hard
real-time analysis to guarantee timing constraints by avoiding
processor overloading during mode transitions. Therefore, using this protocol,
we can derive a lower bound and an upper bound on the earliest starting time of
the tasks in the new mode during mode transitions in such a way that hard
real-time constraints are respected.
Comment: Accepted for presentation at EMSOFT 2018 and for publication in IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems
(TCAD) as part of the ESWEEK-TCAD special issue.
ALOHA: A Unified Platform-Aware Evaluation Method for CNNs Execution on Heterogeneous Systems at the Edge
CNN design and deployment on embedded edge-processing systems is an error-prone and effort-hungry process that calls for accurate and effective automated assisting tools. In such tools, pre-evaluating platform-aware CNN metrics such as latency, energy cost, and throughput is a key requirement for successfully reaching the implementation goals imposed by use-case constraints. Especially when more complex parallel and heterogeneous computing platforms are considered, currently used estimation methods are inaccurate or require many characterization experiments and considerable effort. In this paper, we propose an alternative method, designed to be flexible, easy to use, and accurate at the same time. Using a modular platform and execution model that adequately describes the details of the platform and the scheduling of different CNN operators on different platform processing elements, our method precisely captures operations and data transfers and their deployment on computing and communication resources, significantly improving the evaluation accuracy. We have tested our method on more than 2000 CNN layers, targeting an FPGA-based accelerator and a GPU platform as reference example architectures. Results show that our evaluation method increases the estimation precision by up to 5× for execution time, and by 2× for energy, compared to other widely used analytical methods. Moreover, we assessed the impact of the improved platform-awareness on a set of neural architecture search experiments, targeting both hardware platforms and enforcing 2 sets of latency constraints, performing 5 trials on each search space, for a total of 20 experiments. The predictability is improved by 4×, yielding selection results clearly more similar to those obtained with on-hardware measurements than those of the alternatives.
Developing an energy efficient real-time system
The increasing number of battery-operated devices creates a need for energy-efficient real-time operating systems for such devices. Designing a truly energy-efficient system is a multi-staged effort; this thesis consists of three main tasks that address different aspects of the energy efficiency of a real-time system (RTS).
The first chapter introduces an energy-efficient algorithm that varies processor frequency using DVFS to schedule tasks on cores. A speed profile is calculated for every task, giving information about how long the task would run and at what processor speed. We pair tasks with similar speed profiles to obtain a merged speed profile that can be efficiently scheduled on a cluster. Experiments carried out on an ODROID-XU3 compare the algorithm with a reference approach and show energy savings of up to 20%.
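The pairing idea in the first chapter can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: a speed profile is modeled as a list of normalized speeds over equal time slots, similarity is the sum of absolute per-slot differences, pairing is greedy, and the merged profile takes the per-slot maximum on the assumption that paired tasks run on cores sharing one cluster frequency. The names `profile_distance`, `merge_profiles`, and `pair_tasks` are hypothetical.

```python
from itertools import combinations


def profile_distance(a, b):
    """Sum of absolute per-slot speed differences (an assumed similarity metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def merge_profiles(a, b):
    """Per-slot maximum: assumes both tasks share one cluster-wide frequency."""
    return [max(x, y) for x, y in zip(a, b)]


def pair_tasks(profiles):
    """Greedily pair the tasks whose speed profiles are closest.

    profiles: dict mapping task name -> list of per-slot speeds (equal lengths).
    Returns a list of ((task_a, task_b), merged_profile) tuples.
    With an odd number of tasks, the leftover task is simply not returned.
    """
    # Rank every candidate pair by how similar the two profiles are.
    candidates = sorted(
        combinations(sorted(profiles), 2),
        key=lambda pair: profile_distance(profiles[pair[0]], profiles[pair[1]]),
    )
    paired, result = set(), []
    for a, b in candidates:
        if a in paired or b in paired:
            continue  # each task may appear in at most one pair
        paired.update((a, b))
        result.append(((a, b), merge_profiles(profiles[a], profiles[b])))
    return result


# Hypothetical task set: speeds are fractions of the maximum frequency.
profiles = {
    "t1": [0.4, 0.4, 1.0, 1.0],
    "t2": [0.5, 0.5, 1.0, 0.9],
    "t3": [1.0, 1.0, 0.3, 0.3],
    "t4": [0.9, 1.0, 0.4, 0.3],
}
pairs = pair_tasks(profiles)
```

In this toy input, tasks with matching high-speed and low-speed phases end up together, so the merged profile stays close to each task's own profile and little frequency headroom is wasted. A real implementation would need the thesis's actual profile definition and schedulability checks.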
The second chapter proposes power-aware techniques to segregate a task set over a heterogeneous platform such that the overall energy consumption is minimized. With the help of the calculated speed profiles, the second contribution of this work feasibly partitions a given task set into individual sets for a cluster-based homogeneous platform. Various heuristics are proposed and compared against a baseline approach using simulation results.
The final chapter of this thesis focuses on the importance of having an underlying energy-efficient operating system. We discuss an energy-efficient way of porting a real-time operating system (RTOS), QP, to the TMS320F28377S, along with modifications that make the operating system (OS) consume minimal energy for its operation. --Abstract, page iii