
    Speed-scaling with no Preemptions

    We revisit the non-preemptive speed-scaling problem, in which a set of jobs has to be executed on a single processor or a set of parallel speed-scalable processors between their release dates and deadlines so that the energy consumption is minimized. We adopt the speed-scaling mechanism first introduced in [Yao et al., FOCS 1995], according to which the power dissipated is a convex function of the processor's speed. Intuitively, the higher the speed of a processor, the higher the energy consumption. For the single-processor case, we improve the best known approximation algorithm by providing a $(1+\epsilon)^{\alpha}\tilde{B}_{\alpha}$-approximation algorithm, where $\tilde{B}_{\alpha}$ is a generalization of the Bell number. For the multiprocessor case, we present an approximation algorithm of ratio $\tilde{B}_{\alpha}((1+\epsilon)(1+\frac{w_{\max}}{w_{\min}}))^{\alpha}$, improving the best known result by a factor of $(\frac{5}{2})^{\alpha-1}(\frac{w_{\max}}{w_{\min}})^{\alpha}$. Notice that our result holds for the fully heterogeneous environment, while the previously known result holds only in the more restricted case of parallel processors with identical power functions.
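
    The following is a minimal sketch of the convex speed-scaling energy model described above: running at speed s dissipates power s^alpha, so a job of work w executed at constant speed s takes w/s time and consumes w*s^(alpha-1) energy, and for a single job alone in its window, convexity makes the slowest feasible speed optimal. The value of alpha and the job parameters are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of the convex speed-scaling power model of
# [Yao et al., FOCS 1995]; ALPHA and the job parameters below are
# illustrative assumptions, not values from the paper.

ALPHA = 3.0  # power dissipated at speed s is P(s) = s**ALPHA, convex for ALPHA > 1

def energy(work: float, speed: float) -> float:
    """Energy to execute `work` units at constant `speed`:
    duration (work / speed) times power (speed ** ALPHA)."""
    return (work / speed) * speed ** ALPHA  # = work * speed**(ALPHA - 1)

def optimal_speed(work: float, release: float, deadline: float) -> float:
    """For a single job alone, convexity implies that running as slowly as
    the window allows is optimal: stretch the work over [release, deadline]."""
    return work / (deadline - release)

if __name__ == "__main__":
    w, r, d = 10.0, 0.0, 5.0
    s = optimal_speed(w, r, d)                   # 2.0
    print(f"speed={s}, energy={energy(w, s)}")   # 10 * 2**2 = 40.0
```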

    New Results on Online Resource Minimization

    We consider the online resource minimization problem, in which jobs with hard deadlines arrive online over time at their release dates. The task is to determine a feasible schedule on a minimum number of machines. We rigorously study this problem and derive algorithms with small constant competitive ratios for interesting restricted problem variants. As the most important special case, we consider scheduling jobs with agreeable deadlines. We provide the first constant-ratio competitive algorithm for the non-preemptive setting, which is of particular interest with regard to the known strong lower bound of n for the general problem. For the preemptive setting, we show that the natural algorithm LLF (Least Laxity First) achieves a constant ratio for agreeable jobs, while for general jobs it has a lower bound of Omega(n^(1/3)). We also give an O(log n)-competitive algorithm for the general preemptive problem, which improves upon the known O(p_max/p_min)-competitive algorithm. Our algorithm maintains a dynamic partition of the job set into loose and tight jobs and schedules each (temporal) subset individually on separate sets of machines. The key is a characterization of how the decrease in the relative laxity of jobs influences the optimum number of machines. To achieve this, we derive a compact expression for the optimum value, which might be of independent interest. We complement the general algorithmic result by showing lower bounds that rule out the possibility that other known algorithms may yield a similar performance guarantee.
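
    As a rough illustration of the LLF rule analyzed above, the sketch below computes a job's laxity (time to deadline minus remaining work) and picks the released job with least laxity; the job representation is an assumption made for this example.

```python
# Illustrative sketch of the Least Laxity First (LLF) rule; the job
# representation is an assumption made for this example.
from dataclasses import dataclass

@dataclass
class Job:
    release: float
    deadline: float
    remaining: float  # remaining processing time

def laxity(job: Job, now: float) -> float:
    """Slack left if the job ran without interruption from `now`:
    time until the deadline minus remaining work. Negative laxity
    means the deadline can no longer be met."""
    return (job.deadline - now) - job.remaining

def llf_pick(pending: list[Job], now: float) -> Job:
    """Among released, unfinished jobs, pick one with least laxity
    (assumes at least one such job exists)."""
    released = [j for j in pending if j.release <= now and j.remaining > 0]
    return min(released, key=lambda j: laxity(j, now))
```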

    The Thermal-Constrained Real-Time Systems Design on Multi-Core Platforms -- An Analytical Approach

    Over the past decades, shrinking transistor sizes have enabled more and more transistors to be integrated into an IC chip to achieve ever higher computing performance. However, the semiconductor industry is now reaching a saturation point of Moore's Law, largely due to soaring power consumption and heat dissipation, among other factors. High chip temperature not only significantly increases packaging/cooling cost and degrades system performance and reliability, but also increases energy consumption and can even damage the chip permanently. Although designing 2D and even 3D multi-core processors helps to lower the power/thermal barrier faced by single-core architectures by exploiting thread/process-level parallelism, the higher power density and longer heat removal path have made the thermal problem substantially more challenging, surpassing the heat dissipation capability of traditional cooling mechanisms such as cooling fans, heat sinks, and heat spreaders in the design of new generations of computing systems. As a result, dynamic thermal management (DTM), i.e., controlling the thermal behavior by dynamically varying computing performance and workload allocation on an IC chip, has been well recognized as an effective strategy to deal with these thermal challenges. Different from many existing DTM heuristics that are based on simple intuitions, we seek to address the thermal problems through a rigorous analytical approach, to achieve the high predictability required in real-time system design. In this regard, we have made a number of important contributions. First, we develop a series of lemmas and theorems that are general enough to uncover the fundamental principles and characteristics of the thermal model, peak temperature identification, and peak temperature reduction, which are key to thermal-constrained real-time computer system design. Second, we develop a design-time frequency and voltage oscillating approach on multi-core platforms, which can greatly enhance the system throughput and its service capacity. Third, different from the traditional workload balancing approach, we develop a thermal-balancing approach that can substantially improve energy efficiency and task partitioning feasibility, especially when the system utilization is high or the temperature constraint is tight. The significance of our research is that not only do our proposed algorithms for throughput maximization and energy conservation significantly outperform existing work, as demonstrated in our extensive experimental results, but the theoretical results of our research are also very general and can greatly benefit other thermal-related research.
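
    As a rough illustration of the kind of thermal analysis the dissertation formalizes, the sketch below uses a standard lumped RC thermal model, dT/dt = a*P - b*T, to simulate a temperature trace and compute the steady-state temperature a*P/b that a constant-power workload converges to; the constants are illustrative assumptions, not calibrated parameters from this work.

```python
# Sketch of a lumped RC thermal model often used in DTM analysis,
# dT/dt = a*P(t) - b*T(t); the constants a, b, t0 are illustrative
# assumptions, not calibrated parameters from this dissertation.

def simulate_temperature(power_trace, a=0.8, b=0.1, t0=45.0, dt=0.01):
    """Forward-Euler integration of dT/dt = a*P - b*T over the trace.
    Returns the temperature trace and its peak."""
    temps = [t0]
    for p in power_trace:
        t = temps[-1]
        temps.append(t + dt * (a * p - b * t))
    return temps, max(temps)

def steady_state_temperature(power, a=0.8, b=0.1):
    """Setting dT/dt = 0 gives T_ss = a*P/b: the temperature a
    constant-power workload converges to."""
    return a * power / b
```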

    QoS-aware fine-grained power management in networked computing systems

    Power is a major design concern of today's networked computing systems, from low-power battery-powered mobile and embedded systems to high-power enterprise servers. Embedded systems are required to be power efficient because most of them are powered by batteries with limited capacity. A similar concern about power expenditure arises in enterprise server environments due to cooling requirements, power delivery limits, electricity costs, and environmental pollution. The power consumption of networked computing systems includes that on the circuit board and that for communication. In the context of networked real-time systems, the power dissipated on wireless communication is more significant than that on the circuit board. We focus on packet scheduling for wireless real-time systems with renewable energy resources. In such a scenario, data with higher levels of importance must be transmitted periodically. We formulate this packet scheduling problem as an NP-hard reward maximization problem with time and energy constraints. An optimal solution with pseudo-polynomial time complexity is presented. In addition, we propose a sub-optimal solution with polynomial time complexity. Circuit board, especially processor, power consumption is still the major source of system power consumption. We provide a general-purpose, practical, and comprehensive power management middleware for networked computing systems to manage circuit board power consumption and thus affect system-level power consumption. It has the functionalities of power and performance monitoring, power management (PM) policy selection and PM control, as well as energy efficiency analysis. This middleware includes an extensible PM policy library. We implemented a prototype of this middleware on Base Band Units (BBUs) with three PM policies enclosed. These policies have been validated on different platforms, such as enterprise servers, virtual environments, and BBUs. In enterprise environments, the power dissipation on the circuit board dominates, and regulating the computing resources on board has a significant impact on power consumption. Dynamic Voltage and Frequency Scaling (DVFS) is an effective technique for conserving energy. We investigate system-level power management in order to avoid system failures due to power capacity overload or overheating. This management needs to control the power consumption in an accurate and responsive manner, which cannot be achieved by existing black-box feedback control. Thus, we present a model-predictive feedback controller that regulates processor frequency so that the power budget can be satisfied without significant loss of performance. In addition to providing power guarantees alone, performance with respect to service-level agreements (SLAs) must be guaranteed as well. The proliferation of virtualization technology imposes new challenges on power management due to resource sharing. It is hard to achieve optimization of both power and performance on shared infrastructures due to system dynamics. We propose vPnP, a feedback-control-based coordination approach providing guarantees on application-level performance and underlying physical host power consumption in virtualized environments. This system can adapt gracefully to workload changes. The preliminary results show its flexibility in achieving different levels of tradeoff between power and performance as well as its robustness over a variety of workloads. It is also desirable to improve the energy efficiency of systems, such as BBUs, hosting soft real-time applications. We propose a power management strategy for controlling delay and minimizing power consumption using DVFS. We use the Robbins-Monro (RM) stochastic approximation method to estimate the delay quantile, and we couple a fuzzy controller with the RM algorithm to scale the CPU frequency so that performance is maintained within the specified QoS.
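
    The sketch below illustrates the Robbins-Monro recursion for delay-quantile estimation mentioned above: the estimate is nudged up by a_n*q when a sample exceeds it and down by a_n*(1-q) otherwise, converging to the q-th quantile. The step-size schedule and target quantile are illustrative assumptions; the fuzzy frequency controller that consumes this estimate is not sketched.

```python
# Sketch of Robbins-Monro (RM) stochastic approximation for delay-quantile
# estimation; the step-size schedule and target quantile are assumptions.

def rm_quantile(delays, q=0.95, est=0.0):
    """theta_{n+1} = theta_n + a_n * (q - I[x_n <= theta_n]).
    The estimate rises while more than a (1 - q) fraction of samples
    exceed it, so it converges to the q-th quantile of the delays."""
    for n, x in enumerate(delays, start=1):
        a_n = 1.0 / n  # diminishing RM step size
        est += a_n * (q - (1.0 if x <= est else 0.0))
    return est
```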

    Multi-Scale Scheduling Techniques for Signal Processing Systems

    A variety of hardware platforms for signal processing has emerged, from distributed systems such as Wireless Sensor Networks (WSNs) to parallel systems such as Multicore Programmable Digital Signal Processors (PDSPs), Multicore General Purpose Processors (GPPs), and Graphics Processing Units (GPUs), to heterogeneous combinations of parallel and distributed devices. When a signal processing application is implemented on one of these platforms, the performance critically depends on the scheduling techniques, which in general allocate computation and communication resources for competing processing tasks in the application to optimize performance metrics such as power consumption, throughput, latency, and accuracy. Signal processing systems implemented on such platforms typically involve multiple levels of processing and communication hierarchy, such as network level, chip level, and processor level in a structural context, and application level, subsystem level, component level, and operation or instruction level in a behavioral context. In this thesis, we target scheduling issues that carefully address and integrate scheduling considerations at different levels of these structural and behavioral hierarchies. The core contributions of the thesis include the following. Considering both the network level and chip level, we proposed an adaptive scheduling algorithm for WSNs designed for event detection. Our algorithm exploits discrepancies among the detection accuracies of individual sensors, which are derived from a collaborative training process, to allow each sensor to operate in a more energy-efficient manner while the network satisfies given constraints on overall detection accuracy. Considering the chip level and processor level, we incorporated both temperature and process variations to develop new scheduling methods for throughput maximization on multicore processors. In particular, we studied how to process a large number of threads at high speed without violating a given maximum temperature constraint. We targeted our methods at multicore processors in which the cores may operate at different frequencies and different levels of leakage, and we developed speed selection and thread assignment schedulers based on the notion of a core's steady-state temperature. Considering the application level, component level, and operation level, we developed a new dataflow-based design flow within the targeted dataflow interchange format (TDIF) design tool. Our new multiprocessor system-on-chip (MPSoC)-oriented design flow, called TDIF-PPG, is geared towards analysis and mapping of embedded DSP applications on MPSoCs. An important feature of TDIF-PPG is its capability to integrate graph-level parallelism and actor-level parallelism into the application mapping process. Here, graph-level parallelism is exposed by the dataflow graph application representation in TDIF, and actor-level parallelism is modeled by a novel model for multiprocessor dataflow graph implementation that we call the Parallel Processing Group (PPG) model. Building on the contributions above, we formulated a new type of parallel task scheduling problem called Parallel Actor Scheduling (PAS) for chip-level MPSoC mapping of DSP systems that are represented as synchronous dataflow (SDF) graphs. In contrast to traditional SDF-based scheduling techniques, which focus on exploiting graph-level (inter-actor) parallelism, the PAS problem targets the integrated exploitation of both intra- and inter-actor parallelism for platforms in which individual actors can be parallelized across multiple processing units. We address a special case of the PAS problem in which all of the actors in the DSP application or subsystem being optimized can be parallelized. For this special case, we develop and experimentally evaluate a two-phase scheduling framework with three workflows: particle swarm optimization with a mixed integer programming formulation, particle swarm optimization with a simulated annealing engine, and particle swarm optimization with a fast heuristic based on list scheduling. Then, we extend our scheduling framework to support the general PAS problem, in which some actors cannot be parallelized.
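
    As a small illustration of steady-state-temperature-driven speed selection in the spirit described above, the sketch below picks the highest frequency whose steady-state temperature stays under a cap; the cubic power model P(f) = c*f^3 and the thermal constants are assumptions made for the example, not the thesis's models.

```python
# Sketch of steady-state-temperature-driven speed selection; the cubic
# power model P(f) = c*f**3 and thermal constants a, b are assumptions.

def max_safe_frequency(freqs, t_max, a=0.8, b=0.1, c=1.0):
    """Highest frequency whose steady-state temperature a*P(f)/b stays
    under the cap t_max; falls back to the slowest level otherwise."""
    safe = [f for f in freqs if a * c * f ** 3 / b <= t_max]
    return max(safe) if safe else min(freqs)
```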

    On the Design of Real-Time Systems on Multi-Core Platforms under Uncertainty

    Real-time systems are computing systems that demand assurance of not only the logical correctness of computational results but also the timing of these results. To ensure timing constraints, traditional real-time system designs usually adopt a worst-case-based deterministic approach. However, such an approach is becoming out of sync with the continuous evolution of IC technology and the increased complexity of real-time applications. As IC technology continues to evolve into the deep sub-micron domain, process variation causes processor performance to vary from die to die, chip to chip, and even core to core. The extensive resource sharing on multi-core platforms also significantly increases the uncertainty when executing real-time tasks. The traditional approach can only lead to extremely pessimistic, and thus impractical, designs of real-time systems. Our research seeks to address the uncertainty problem when designing real-time systems on multi-core platforms. We first attacked the uncertainty problem caused by process variation. We proposed a virtualization framework and developed techniques to optimize the system's performance under process variation. We further studied the problem of peak temperature minimization for real-time applications on multi-core platforms. Three heuristics were developed to reduce the peak temperature of real-time systems. Next, we sought to address the uncertainty in real-time task execution times by developing statistical real-time scheduling techniques. We studied the problem of fixed-priority real-time scheduling of implicit-deadline periodic tasks with probabilistic execution times on multi-core platforms. We further extended our research to tasks with explicit deadlines. We introduced the concept of harmonic periods to a more general task set, i.e., tasks with explicit deadlines, and developed new task partitioning techniques. Throughout our research, we have conducted extensive simulations to study the effectiveness and efficiency of the developed techniques. The increasing process variation and the ever-increasing scale and complexity of real-time systems both demand a paradigm shift in the design of real-time applications. Effectively dealing with uncertainty in the design of real-time applications is a challenging but critical problem. Our research is one effort in this endeavor, and we conclude this dissertation with a discussion of potential future work.
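
    The sketch below illustrates the classical harmonic property that the dissertation generalizes: a task set whose sorted periods each divide the next, a structure known to permit rate-monotonic scheduling up to full utilization. The check itself is a textbook illustration, not the dissertation's partitioning technique.

```python
# Textbook check of the harmonic property: sorted periods in which
# each period divides the next. Harmonic task sets are known to be
# rate-monotonic schedulable up to total utilization 1.

def is_harmonic(periods: list[int]) -> bool:
    """True if, after sorting, every period divides its successor."""
    ps = sorted(periods)
    return all(ps[i + 1] % ps[i] == 0 for i in range(len(ps) - 1))

assert is_harmonic([2, 4, 8])      # 2 | 4 and 4 | 8
assert not is_harmonic([2, 3, 6])  # 2 does not divide 3
```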

    A general framework for handling commitment in online throughput maximization

    We study a fundamental online job admission problem where jobs with deadlines arrive online over time at their release dates, and the task is to determine a preemptive single-server schedule which maximizes the number of jobs that complete on time. To circumvent known impossibility results, we make a standard slackness assumption by which the feasible time window for scheduling a job is at least $1+\varepsilon$ times its processing time, for some $\varepsilon>0$. We quantify the impact that different provider commitment requirements have on the performance of online algorithms. Our main contribution is one universal algorithmic framework for online job admission both with and without commitments. Without commitment, our algorithm with a competitive ratio of $O(1/\varepsilon)$ is the best possible (deterministic) for this problem. For commitment models, we give the first non-trivial performance bounds. If the commitment decisions must be made before a job's slack becomes less than a $\delta$-fraction of its size, we prove a competitive ratio of $O(\varepsilon/((\varepsilon-\delta)\delta^2))$, for $0<\delta<\varepsilon$. When a provider must commit upon starting a job, our bound is $O(1/\varepsilon^2)$. Finally, we observe that for scheduling with commitment the restriction to the 'unweighted' throughput model is essential; if jobs have individual weights, we rule out competitive deterministic algorithms.
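
    The two checks below sketch the slackness assumption and the delta-commitment deadline described above: a job is admissible only if its window is at least (1 + epsilon) times its processing time, and the provider must commit no later than the moment its remaining slack drops to a delta-fraction of its size. Parameter names are assumptions made for this example.

```python
# Sketch of the slackness assumption and the delta-commitment deadline;
# parameter names are assumptions made for this example.

def satisfies_slackness(release, deadline, proc_time, eps):
    """Standard slackness assumption: the feasible window is at least
    (1 + eps) times the processing time."""
    return deadline - release >= (1 + eps) * proc_time

def latest_commitment_time(deadline, proc_time, delta):
    """Slack at time t is (deadline - t - proc_time); it drops below
    delta * proc_time once t exceeds deadline - (1 + delta) * proc_time,
    so the commitment decision is due by that moment."""
    return deadline - (1 + delta) * proc_time
```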

    Energy-Efficient Transaction Scheduling in Data Systems

    Natural short-term fluctuations in the load of transactional data systems present an opportunity for power savings. For example, a system handling 1000 requests per second on average can expect more than 1000 requests in some seconds and fewer in others. By quickly adjusting processing capacity to match such fluctuations, power consumption can be reduced. Many systems do this already, using dynamic voltage and frequency scaling (DVFS) to reduce processor performance and power consumption when the load is low. DVFS is typically controlled by frequency governors in the operating system or by the processor itself. The work presented in this dissertation shows that transactional data systems can manage DVFS more effectively than the underlying operating system, because data systems have more information about the workload, and more control over that workload, than is available to the operating system. Our goal is to minimize power consumption while ensuring that transaction requests meet specified latency targets. We present energy-efficient scheduling algorithms and systems that manage CPU power consumption and performance within data systems. These algorithms are workload-aware and can accommodate concurrent workloads with different characteristics and latency budgets. The first technique we present, POLARIS, directly manages processor DVFS and controls database transaction scheduling. We show that POLARIS can simultaneously reduce power consumption and reduce missed latency targets, relative to operating-system-based DVFS governors. Second, we present PLASM, an energy-efficient scheduler that generalizes POLARIS to support multi-core, multi-processor systems. PLASM controls the distribution of requests to the processors, and it employs POLARIS to manage power consumption locally at each core. We show that PLASM can save power and reduce missed latency targets compared to generic routing techniques such as round-robin.
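
    A minimal sketch of the workload-aware DVFS idea behind POLARIS as summarized above: choose the lowest core frequency that still completes a transaction's remaining work within its latency budget. The cycle-based cost model is an assumption for the example, not POLARIS's actual estimator.

```python
# Minimal sketch of workload-aware frequency selection in the spirit of
# POLARIS; the cycle-based cost model is an assumption for the example.

def pick_frequency(freqs, work_cycles, time_left):
    """Lowest frequency (in Hz) that finishes `work_cycles` within
    `time_left` seconds; the highest frequency if none suffices."""
    feasible = [f for f in sorted(freqs) if work_cycles / f <= time_left]
    return feasible[0] if feasible else max(freqs)
```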

    Survey on dynamic resource allocation strategy in cloud computing environment

    Cloud computing has become quite popular among cloud users by offering a variety of resources. It is an on-demand service because it offers dynamic, flexible resource allocation and guaranteed services in a pay-as-you-use manner to the public. In this paper, we present several dynamic resource allocation techniques and their performance. This paper provides a detailed description of dynamic resource allocation techniques in the cloud for cloud users, and a comparative study provides clear details about the different techniques.