Abstract-Smartphones face a grand challenge in extending their battery life to sustain an increasing level of processing demand within a miniaturized form factor. Dynamic Voltage Scaling (DVS) has emerged as a critical power management technique that lowers the supply voltage and frequency of processors. In this paper, based on the DVS technique, we propose a novel Energy-aware Dynamic Task Scheduling (EDTS) algorithm to minimize total energy consumption for smartphones while satisfying stringent time constraints for applications. This algorithm utilizes the results from a static scheduling algorithm and aggressively reduces energy consumption on the fly. Experimental results indicate that the EDTS algorithm can significantly reduce energy consumption for smartphones, compared to the critical path scheduling method and the parallelism-based scheduling algorithm.
I. INTRODUCTION
With the unprecedented popularity of battery-powered smartphones in recent years, modern computation, communication, and entertainment are increasingly moving onto these devices. Meanwhile, the ever-increasing demands of rich interactive applications, such as digital cameras, multimedia, GPS navigation units, and web browsers, have severely aggravated the energy consumption problem for smartphones. Miniaturization is another vital feature of current smartphones. While deep sub-micron to nanometer fabrication technology is an enabler for reducing the size of smartphones, it also exacerbates the energy consumption problem.
In order to enhance energy efficiency and process various tasks with different performance requirements, high-end smartphones are designed as heterogeneous systems, which integrate multiple processors with distinct processing power, such as PowerVR SGX 5XT from Imagination [1]. Although multiprocessors can offer greater computation per unit of power, leading to longer battery life [2], it is critical to investigate tighter energy budget strategies to guarantee the functionality of smartphones.
* Meikang Qiu is the corresponding author. mqiu@engr.uky.edu
Most applications on smartphones are not delay-tolerant, and accelerating them often comes at the expense of higher energy consumption. In order to balance performance and power consumption for these applications, smartphones are usually designed with Dynamic Voltage Scaling (DVS) by integrating static CMOS logic into microprocessors [3]-[5]. DVS is a powerful technique for reducing energy consumption and is widely employed in various embedded systems. With the aid of this technology, different performance levels for applications can be achieved by adjusting the operating frequency of processors. By introducing mode-set instructions, the supply voltage can be switched between several voltage modes, which makes it possible to implement DVS in software. Typically, DVS can be implemented entirely within existing Real-Time Operating Systems (RTOSs).
In this paper, we focus on optimizing energy consumption for heterogeneous smartphones while satisfying the time constraints of applications. We propose an Energy-aware Dynamic Task Scheduling (EDTS) algorithm to examine task communications online and minimize total energy consumption based on DVS techniques. Figure 1 shows the basic architecture of a smartphone; we implement our techniques at the OS level. This algorithm utilizes the results from a static scheduling algorithm and attempts to aggressively reduce energy consumption. Experimental results show that compared to the critical path scheduling method and the parallelism-based scheduling approach, our online scheduling mechanism can reduce total energy consumption by 23.1% and 34.2% on average, respectively, while meeting the given timing constraints.
The major contributions of this paper are four-fold: (1) we propose a dynamic scheduling algorithm for dealing with the runtime variations; (2) we use a critical path based static scheduling algorithm Data Flow Graph Critical Path (DFGCP) to obtain near-optimal solutions; (3) we propose an optimal algorithm Critical Path Assignment (CPA) for the critical path with a dynamic programming approach; (4) we consider both overheads for the voltage level transfer and the communication cost between different tasks, based on DVS power models.
II. RELATED WORK
In the past few years, numerous methodologies for low-power smartphone system design have been proposed at the operating system level [2], [6] as well as the architecture level [7]. A scheduler was proposed in [2] to monitor system workloads and adaptively schedule real-time tasks while considering the worst-case CPU demands. Through modification of the real-time scheduler and task management services in operating systems, this scheduler can boost system performance and save power for heavy workloads and critical tasks. Targeting multimedia applications, the authors in [6] proposed a soft real-time CPU scheduler for mobile devices to reduce energy consumption. While these studies focus on independent tasks, we consider dependencies and real-time constraints between tasks.
A myriad of efforts have been made to tackle runtime variations with DVS. For example, Gruian [8] applied a stochastic DVS technique to hard real-time systems by taking into account task dependencies. Depending on the probability distribution of the execution conditions for tasks, Lorch and Smith [9] proposed an approach to modify scaling algorithms while maintaining their performance. However, these methods assume task priorities and estimate CPU requirements off-line. We propose a two-phase scheduling algorithm, EDTS, which schedules tasks online based on the results of an initial static scheduling.
Energy-aware static scheduling is usually based on average-case or worst-case estimates of task execution [10]. At runtime, the real execution time and energy consumption may exhibit high variations [11], [12], due to process variability, physical faults, and voltage/frequency changes. In our model, we assume that each core in the same processor can adjust its voltage and frequency independently.
In [13], the authors jointly presented a host of runtime and compilation techniques to conceal the heterogeneity of smartphones from developers. By investigating various features of HTC and Apple, Li and Ortiz, et al. [14] pointed out that the most significant challenge of reuse in smartphones is the design of software to accommodate heterogeneity of these devices. However, our work focuses on using a dynamic programming task scheduling technique to reduce energy consumption for smartphones with DVS enabled.
III. BASIC CONCEPTS AND MODELS
In this section, we introduce basic concepts that will be used in later sections.
A. Data Flow Graph (DFG)
In general, the tasks in smartphone applications are not stand-alone. A certain number of tasks will have precedence relationships due to different functionality of each task and communications between them. We use a Directed Acyclic Graph (DAG) to model the precedence constraints of smartphone applications.
A DFG $G = \langle V, E, T, EN, C \rangle$ is a node-weighted DAG, where $V = \langle v_1, \cdots, v_i, \cdots, v_N \rangle$ is a set of task nodes; $E \subseteq V \times V$ is an edge set that defines the precedence relations among nodes in $V$. For example, an edge $(u \to v)$ in the graph indicates that task $v$ cannot be executed until task $u$ completes. $T$ and $EN$ are sets of execution time and energy consumption for all nodes in $V$, respectively. $C$ is a set of communication costs between tasks.
The execution time of a task can be profiled by average case execution time (ACET) or worst case execution time (WCET) when the task is executed on a processor core. We assume that the WCET and ACET of a task are always measured at the highest voltage level (i.e., at the fastest speed). Our approach uses ACET for the static scheduling. An edge $e \in E$ is associated with a weight that represents the worst-case communication cost between two dependent tasks when they are scheduled on two different processors. Generally, the communication cost between two tasks is negligible when they are executed on the same processor. There is a timing constraint $TC$ for the whole task graph, which defines the time bound for finishing the execution of the entire task graph.
B. Energy Model
The dynamic power consumption ($P_{dynamic}$) of CMOS circuits integrated in smartphones is calculated by

$$P_{dynamic} = C_{eff} \cdot V_{dd}^{2} \cdot f \qquad (1)$$

where $V_{dd}$ is the supply voltage, $f$ is the operating frequency, and $C_{eff}$ is the effective switching capacitance. DVS reduces dynamic power consumption owing to its quadratic dependence on voltage.
The frequency $f$ is given by Equation (2):

$$f = k \cdot \frac{(V_{dd} - V_{th})^{\alpha}}{V_{dd}} \qquad (2)$$

where $V_{th}$ represents the threshold voltage, $k$ is a device-dependent constant, and $\alpha$ is a technology-dependent constant, which varies between 1 and 2.
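The trade-off behind DVS can be checked numerically. The following is a minimal sketch, assuming the standard forms $P = C_{eff} V_{dd}^2 f$ and $f = k (V_{dd} - V_{th})^{\alpha} / V_{dd}$; the function names and all parameter values (threshold voltage, $k$, $\alpha$, capacitance, cycle count) are illustrative assumptions, not values from this paper.

```python
def frequency(vdd, vth, k, alpha):
    """Operating frequency f = k * (Vdd - Vth)**alpha / Vdd."""
    if vdd <= vth:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return k * (vdd - vth) ** alpha / vdd


def dynamic_power(c_eff, vdd, f):
    """Dynamic power P = C_eff * Vdd**2 * f."""
    return c_eff * vdd ** 2 * f


def task_time_and_energy(cycles, vdd, vth=0.3, k=1.0e9, alpha=1.5, c_eff=1.0e-9):
    """Execution time and energy of a task needing `cycles` clock cycles.

    All default parameter values are hypothetical and chosen for illustration only.
    """
    f = frequency(vdd, vth, k, alpha)
    t = cycles / f                          # a lower frequency means a longer run
    e = dynamic_power(c_eff, vdd, f) * t    # energy = power * time
    return t, e


# Compare a high and a low voltage mode for the same task.
t_hi, e_hi = task_time_and_energy(1_000_000, vdd=1.2)
t_lo, e_lo = task_time_and_energy(1_000_000, vdd=0.9)
print(f"time ratio   (low/high): {t_lo / t_hi:.3f}")
print(f"energy ratio (low/high): {e_lo / e_hi:.3f}")
```

Under this model the frequency cancels out of the energy of a fixed number of cycles, so the energy ratio equals the squared voltage ratio, while the execution time grows as the voltage drops. This is the slack-for-energy exchange that the scheduling algorithms exploit.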
IV. A MOTIVATIONAL EXAMPLE
Figure 2(a) shows a simple application on a smartphone with 2 different voltage levels, namely $V_1$ and $V_2$. This application includes 3 tasks, and the execution time and energy consumption of each task under each voltage mode are shown in Figure 2(b). Our objective is to schedule all tasks in the graph with the minimum energy consumption while satisfying a given time constraint.
Based on Figure 2(a), the critical path (CP) of the task graph is $v_1 \to v_2 \to v_3$. Assume that the timing constraint ($TC$) of the smartphone application is 9 time units. Figure 2(c) illustrates the procedure to achieve the minimal energy consumption with our proposed static scheduling algorithm. The voltage level assignment for each task is recorded in a two-dimensional matrix $D_i[j]$ ($i$ represents a task, $j$ represents a time period, and $V_k$ represents the voltage mode assigned to task $v_i$). From Figure 2(c), we can see that the minimum energy consumption, 46, is achieved by assigning the voltage modes $V_2 \to v_1$, $V_1 \to v_2$, and $V_2 \to v_3$. From the result of $D_3[9]$, we can obtain the assignment as follows: (1) Starting from the minimum energy consumption at $D_3[9]$, we know that $V_2$ is assigned to $v_3$ and its execution time $t_3(2)$ is 4 (for $t_i(k)$, $i$ represents a node number and $k$ represents a voltage level), as shown in Figure 2(b). (2) We then look up the optimal combination of task modes before adding task $v_3$. We get the index for $D_2[j]$ by subtracting $t_3(2)$ from $TC$: $TC - t_3(2) = 9 - 4 = 5$. We arrive at the location $D_2[5]$, which means that the optimal energy consumption to execute all the tasks from the root to $v_2$ is 38. By checking the mode assignment, we can see that $V_1$ is allocated to $v_2$. Therefore, the execution time of task $v_2$ is $t_2(1) = 2$. (3) In a similar way, we can determine that $V_2$ is assigned to $v_1$ and its execution time is $t_1(2) = 3$. Thus, the total execution time from $v_1$ to $v_3$ is 3 + 2 + 4 = 9 (which does not exceed the timing constraint of 9) and the total energy consumed is 10 + 28 + 8 = 46.
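The table-filling and backtracking steps of this example can be reproduced with a short program. The following is a minimal sketch for a single chain of tasks that ignores voltage transition overhead and communication cost. Figure 2(b) is not reproduced here, so only the (time, energy) pairs quoted in the walk-through are taken from the text; the pairs for the unselected modes are hypothetical values chosen so that the result matches the example.

```python
import math

# (execution time, energy) per voltage mode for each task on the path v1 -> v2 -> v3.
MODES = [
    {1: (2, 20), 2: (3, 10)},  # v1: mode 2 is from the text, mode 1 is hypothetical
    {1: (2, 28), 2: (5, 14)},  # v2: mode 1 is from the text, mode 2 is hypothetical
    {1: (2, 16), 2: (4, 8)},   # v3: mode 2 is from the text, mode 1 is hypothetical
]


def min_energy_chain(modes, tc):
    """Return (min energy, mode per task, table) or None if tc cannot be met.

    table[i][j] holds (minimum energy to run tasks 0..i within j time units,
    mode chosen for task i).
    """
    n = len(modes)
    table = [[(math.inf, None)] * (tc + 1) for _ in range(n)]
    for i, task in enumerate(modes):
        for j in range(tc + 1):
            for mode, (t, e) in task.items():
                if t > j:
                    continue  # this mode does not fit in the time budget j
                prev = 0 if i == 0 else table[i - 1][j - t][0]
                if prev + e < table[i][j][0]:
                    table[i][j] = (prev + e, mode)
    if table[n - 1][tc][0] == math.inf:
        return None
    # Backtrack from the last task, shrinking the time index by each chosen time.
    assignment, j = [], tc
    for i in range(n - 1, -1, -1):
        mode = table[i][j][1]
        assignment.append(mode)
        j -= modes[i][mode][0]
    assignment.reverse()
    return table[n - 1][tc][0], assignment, table


energy, assignment, table = min_energy_chain(MODES, tc=9)
print(energy, assignment)   # 46 [2, 1, 2]
print(table[1][5][0])       # 38, the intermediate entry used in step (2)
```

With a timing constraint of 9, the sketch selects mode 2 for $v_1$, mode 1 for $v_2$, and mode 2 for $v_3$, and the intermediate entry for the first two tasks within 5 time units is 38, as in the walk-through.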
V. ALGORITHMS
In this section, an algorithm, Energy-aware Dynamic Task Scheduling (EDTS), is devised to minimize the total energy consumption while satisfying the timing constraint.
For real-time applications on smartphones, we use the following major steps to implement the energy-aware scheduling. First, we partition and map the tasks in a DAG onto the microprocessors of a smartphone platform. Then, an initial schedule of the DAG with the task execution order and communication links is obtained. Second, we identify the critical path (CP) by finding the path with the longest execution time. If there is more than one longest path in the graph, we select the one with the largest energy consumption in the DAG. Third, based on the ACETs for all tasks in the graph, we obtain a static schedule with our static scheduling algorithm. Finally, within each scanning period, the whole task graph is dynamically scheduled and the execution order of each task is determined by our dynamic scheduling algorithm.
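The critical path selection in the second step can be sketched as a longest-path search over the DAG in topological order, with ties on execution time broken by energy. The sketch below is an illustration under stated assumptions: the four-task graph and its weights are hypothetical, communication costs are ignored, and the function name `critical_path` is not from the paper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(preds, time, energy):
    """Return (path, total_time, total_energy) of the critical path.

    preds maps each node to the list of its predecessors. The critical path is
    the path with the longest total execution time; among equally long paths,
    the one with the largest total energy is chosen.
    """
    best = {}  # node -> (time, energy, path) of the heaviest path ending there
    for v in TopologicalSorter(preds).static_order():
        candidates = [best[p] for p in preds.get(v, [])]
        t0, e0, path0 = max(
            candidates, key=lambda c: (c[0], c[1]), default=(0, 0, [])
        )
        best[v] = (t0 + time[v], e0 + energy[v], path0 + [v])
    t, e, path = max(best.values(), key=lambda c: (c[0], c[1]))
    return path, t, e


# Hypothetical graph: v1 -> v2 -> v4 and v1 -> v3 -> v4, both 6 time units long.
preds = {"v1": [], "v2": ["v1"], "v3": ["v1"], "v4": ["v2", "v3"]}
time = {"v1": 2, "v2": 3, "v3": 3, "v4": 1}
energy = {"v1": 4, "v2": 5, "v3": 7, "v4": 2}
print(critical_path(preds, time, energy))  # (['v1', 'v3', 'v4'], 6, 13)
```

Both paths in this example take 6 time units, so the tie-break on energy selects the path through `v3`.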
During partitioning and mapping the tasks in a DAG, we consider related architectural constraints, heterogeneity, and resource capacities of smartphone platforms. The available energy of each processor may vary over time for different applications. Whenever the resource availability varies too much, the DAG needs to be repartitioned and re-mapped onto processors to maintain energy efficiency. We adopt the partitioning scheme, VPIS, proposed in [15] to schedule tasks onto microprocessors, with the consideration of various constraints and conditions. Our objective is to balance the load and minimize the total system energy consumption.
A. The Critical Path Assignment (CPA) Optimal Algorithm
We use a dynamic programming method to solve the energy-aware scheduling problem for smartphone systems. Given the timing constraint $TC$, a DAG $G$, and an assignment $A$, we give several definitions as follows:
Definition 5.1: Assignment $A$: An allocation scheme that assigns a specific voltage mode to each task in a DAG.
Definition 5.2: $G_i$: A subgraph of $G$, which starts from the root of the task graph and extends to the node $v_i$.
Definition 5.3: $E_A(G_i)$ and $T_A(G_i)$: The total energy consumption and the total execution time of $G_i$ under the assignment $A$.
In our algorithm, each step achieves the current minimum total energy consumption of $G_i$ while satisfying various timing constraints.
A table $D_{i,j}$ ($i$ represents a node number, and $j$ represents time) will be built, where each entry of this table stores the smallest energy that has been obtained.
In every step of our algorithm, we consider at least one task. When two tasks are added together, the total energy consumption is the sum of their energy consumptions.
For each entry, we only keep the smallest total energy consumption and the corresponding voltage level assignment. When there is more than one solution, we keep the one with the smallest total execution time. If the total execution times are also the same, all solutions will be kept. When a critical path is found, we will use the optimal algorithm, CPA, to get the optimal solution for the energy-aware scheduling problem. The algorithm is shown in Algorithm V.1.
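The table update described above can be written as a recurrence. The following is a minimal sketch for a single path, not the exact formulation of Algorithm V.1: it assumes that $t_i(k)$ and $e_i(k)$ denote the execution time and energy of node $v_i$ at voltage level $k$, that $M$ is the number of voltage levels, and that the table entry for node $v_i$ and time budget $j$ is written $D_{i,j}$. The voltage transition overhead and the communication cost are omitted for brevity.

```latex
D_{i,j} \;=\; \min_{\substack{1 \le k \le M \\ t_i(k) \le j}}
  \left\{ D_{i-1,\; j - t_i(k)} + e_i(k) \right\},
\qquad D_{0,j} = 0 \ \text{ for all } j \ge 0 .
```

An entry is taken to be $\infty$ when no voltage level fits within the budget $j$. Under the same assumptions, overheads could be folded in by adding the transition and communication delays to $t_i(k)$ and the corresponding energy to $e_i(k)$ whenever consecutive tasks change voltage level or run on different processors.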
In algorithm CPA, we first build a local table $B_{i,j}$ for each node. The table $B_{i,j}$ only stores the energy consumption of a node under different voltage levels. In the next step of the algorithm, when $i = 1$, there is only one node. We set the initial value and let $D_{1,j} = B_{1,j}$ (line 1). Then we build the [...] microprocessors. If the two tasks are implemented on the same core, the communication indicator is 0; otherwise, it is 1. Finally, we keep the smallest total energy and the corresponding voltage selection. The energy in $D_{i,j}$ is the minimum total energy for graph $G_i$ under the timing constraint $j$. For example, for the DFG shown in Figure 3(b), the initial parameters are shown in Figure 3(a). We compute the corresponding $B$ tables of nodes 1 and 2 as follows.
B. The DFGCP Algorithm
Algorithm V.2 The DFGCP Algorithm
[...]
3: Use our CPA to find the minimal total energy consumption and the corresponding voltage assignments;
4: When a solution is found, set Flag ← Yes;
5: end if
6: if Flag == Yes then
7: Output the assignment of G;
8: else
9: Output "No Solution"; exit;
10: end if
11: For the nodes on the non-critical paths (non-CP), use the CPA algorithm to find the minimal energy consumption and keep the corresponding voltage levels.
12: Add together the energy of the CP and the non-CP paths to get the minimal total energy consumption.
In the DFGCP algorithm, we first find a critical path (CP) of the DFG $G$. If the total execution time of the CP is larger than the timing constraint $TC$, we will use the CPA algorithm to find the minimal total energy consumption and the corresponding voltage selections. In each step, we consider the voltage level transition overheads when using DVS. For each node, if it is not on the same processor as its parent nodes, the communication cost with the parents is considered. Finally, if we find a solution for the CP within $TC$, the algorithm continues using CPA to find the optimal solution for the non-CP paths. At this point, we fix the assignments of the nodes shared by the CP and the non-CP paths.
Time Complexity: DFGCP is a polynomial time algorithm. The complexity of the CPA algorithm is $O(|V| \cdot TC \cdot M)$, where $|V|$ is the number of nodes, $TC$ is the given timing constraint, and $M$ is the maximum number of voltage levels. We use CPA to compute every path once. The total number of paths is bounded by $O(|V|^2)$. Hence, DFGCP is a polynomial time algorithm. For a sparse graph, the number of paths is very small; assuming a constant $M$, the complexity is approximately linear and the amount of computation time is very small.
C. The EDTS Dynamic Scheduling Algorithm

Algorithm V.3 The EDTS Algorithm
Require: $M$ different voltage levels, a DFG $G = \langle V, E, T, EN, C \rangle$, and a timing constraint $TC$.
Ensure: A dynamic schedule for the DFG.
1: Get the initial schedule with the DFGCP algorithm;
2: Topologically sort the nodes to get the node sequence $v_i \in V$;
3: for each node $v_i$, $1 \le i \le |V|$, not yet visited, taking the one with the earliest start time do
4:   if the required execution time is substantially different from the ACET then
5:     Mark it as visited;
6:     Run the DFGCP algorithm for the remaining nodes and find the new static schedule with minimal energy consumption while satisfying the new timing constraint ($TC - T_{used}$, where $T_{used}$ is the time used);
7:   else
8:     Continue;
9:   end if
10:  Finish node $v_i$, and update the system energy overhead and the information (such as the starting time) of the nodes that depend on $v_i$;
11:  if the current static schedule is not followed then
12:    Run the DFGCP algorithm for the remaining nodes and find the new static schedule with minimal energy consumption while satisfying the new timing constraint ($TC - T_{used}$, where $T_{used}$ is the time used);
13:  else
14:    Continue to the next node;
15:  end if
16: end for
The DFGCP static scheduling algorithm gives a solution by assuming all tasks run at their ACETs. However, in real-life scenarios, we do not know in advance the actual execution time of a task for smartphone applications. The information about these tasks can change greatly at runtime, so even an optimal static schedule can become invalid in the dynamic case. In this subsection, we present an aggressive dynamic programming based online scheduling algorithm, called Energy-aware Dynamic Task Scheduling (EDTS). The EDTS algorithm uses the results from the DFGCP static scheduling algorithm, which obtains a near-optimal schedule based on the knowledge of the ACET of each task.
The actual execution time of a task may be greater or less than its ACET. We first obtain a static schedule with DFGCP by assuming every task takes its ACET. However, if every task aggressively runs at this statically computed average case speed during runtime, some of them may miss their deadlines. Our EDTS algorithm uses the path information to track any changes of tasks in smartphone applications. When a task node is finished, EDTS checks whether the schedule has been followed. If not, the remaining task graph is recomputed with the DFGCP static scheduling algorithm. Also, during the execution of each node, whenever the variation of execution time exceeds a pre-specified threshold, DFGCP is used to recompute the schedule. For example, we set the threshold on the difference ratio between the real execution time and the ACET used previously in DFGCP to ±5%. The new computation only covers the remaining subgraph with the updated ACET values.
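The runtime check can be sketched as a loop that compares each observed execution time against its ACET and re-plans the remaining tasks with the leftover time budget. This is a simplified illustration, not the EDTS implementation: it only checks a task once it has finished, it walks a fixed task order, and `reschedule` is a hypothetical hook standing in for a call to DFGCP. The task names and times are illustrative.

```python
def deviates(actual, acet, ratio=0.05):
    """True when the observed time differs from the ACET by more than `ratio`."""
    return abs(actual - acet) > ratio * acet


def run_edts(order, acet, actual, tc, reschedule):
    """Walk tasks in execution order and re-plan the rest when a task deviates.

    reschedule(remaining_tasks, remaining_budget) is a hypothetical hook that
    stands in for DFGCP. Returns the (task, remaining_budget) pairs at which
    re-planning was triggered.
    """
    used = 0
    replans = []
    for pos, task in enumerate(order):
        used += actual[task]                # time consumed so far
        remaining = order[pos + 1:]
        if remaining and deviates(actual[task], acet[task]):
            budget = tc - used              # new timing constraint for the rest
            reschedule(remaining, budget)
            replans.append((task, budget))
    return replans


# Illustrative run: v2 takes 2.5 instead of its ACET of 2, so the rest is re-planned.
calls = []
order = ["v1", "v2", "v3"]
acet = {"v1": 3, "v2": 2, "v3": 4}
actual = {"v1": 3, "v2": 2.5, "v3": 4}
print(run_edts(order, acet, actual, tc=9,
               reschedule=lambda rem, budget: calls.append((tuple(rem), budget))))
```

In this run only `v2` exceeds the 5% threshold, so the hook is invoked once for the remaining task `v3` with a budget of 3.5 time units.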
Time Complexity: Our dynamic scheduling algorithm, EDTS, progressively improves performance based on the schedule obtained by the static scheduling algorithm, DFGCP. The EDTS algorithm is shown in Algorithm V.3. For a sparse graph, the complexity of this algorithm is $O(|V| \cdot (|V| \cdot TC \cdot M))$, where $|V|$ is the number of nodes, $TC$ is the given timing constraint, and $M$ is the maximum number of voltage levels. Hence, EDTS is a polynomial time algorithm. For general task graphs, since DFGCP is a polynomial time algorithm and EDTS calls DFGCP $O(|V|)$ times, EDTS is also polynomial.
VI. EXPERIMENTS
In this section, we conduct experiments with the EDTS algorithm on a set of benchmarks including Wave Digital Filter (WDF), Infinite Impulse Response filter (IIR), Differential Pulse-Code Modulation device (DPCM), Two-dimensional filter (2D), Floyd-Steinberg algorithm (Floyd), and All-pole filter. The number of tasks for these benchmarks has been augmented with the unfolding technique (the unfolding rate is 5). The proposed run-time system has been implemented, and a simulation framework has been built to evaluate its effectiveness. The dynamic processor loads are obtained through measurements on a 600MHz Crusoe processor. The execution time (ACET and WCET) and energy consumption are based on profiling. The execution time of each node follows a Gaussian distribution.
We conducted experiments using three different methods: Method 1: the dynamic version of the parallelism-based scheduling (PS) algorithm [16]; Method 2: Critical path dynamical scheduling (CPDS) [17]; Method 3: our EDTS algorithm. Method 1 uses a greedy technique to further reclaim the slack generated during runtime. Initially, all tasks are assigned a statically computed processing speed. All the slack made available by a task's early completion is given to the next expected task running on the same processor. The speed for the next expected task is adjusted based on its ready time [17].
The experiments are conducted based on the power model of a 70nm processor [18]. The energy consumption per cycle can then be calculated by using Equation (9) proposed in [18]. The power is derived from the formula $P = E / T$. In the experiments, we use $M$ different voltage levels with descending processing speed, $V_1, V_2, \cdots, V_M$. The time and energy overheads of a voltage transition among the above voltage levels are calculated based on Equations (15) and (20) in [19]. We compare our results with those from Method 1 and Method 2 on a PC with a P4 2.1G processor running Red Hat Linux 9.0.
The experimental results are shown in Figure 4(a) to Figure 4(c) for a timing constraint "TC" of 2000, 3000, and 4000, respectively. In these figures, M1, M2, and EDTS represent Method 1, Method 2, and our proposed dynamic scheduling algorithm, respectively.
As shown in these figures, our algorithm achieves significant energy reduction, compared to Method 1 and Method 2. For example, with 3 voltage levels, compared with Method 1 and Method 2, EDTS shows 30.1% and 20.2% reduction in energy consumption, respectively. This is mainly because our method uses the optimal algorithm CPA to implement the energy-aware static scheduling.
Hence, our EDTS algorithm can significantly improve the performance of smartphone systems. We can see that with more voltage levels to select from, the reduction in total energy consumption is more prominent. For example, with 3 voltage levels, compared to Method 1, EDTS shows an average 30.1% reduction in total energy consumption, while with 5 voltage levels, the reduction reaches up to 34.2%.
VII. CONCLUSION
Smartphones are power-hungry devices. This paper studied how to minimize total energy consumption while satisfying application timing constraints for smartphone systems. We proposed a highly efficient algorithm, Energy-aware Dynamic Task Scheduling (EDTS), which utilizes the results from a static scheduling algorithm and aggressively reduces energy consumption. Experimental results across a suite of benchmarks showed that our algorithm can achieve significantly higher energy efficiency for smartphones.
