758 research outputs found

    Ocean-Atmosphere Application Scheduling within DIET

    In this report, we tackle the problem of scheduling an Ocean-Atmosphere application in a heterogeneous environment. The application is used for long-term climate forecasting. In this context, we analyze the execution of an experiment, which is composed of several identical simulations, each made up of parallel tasks. On homogeneous platforms, we propose a heuristic and its optimizations, all based on the same idea: we divide the processors into disjoint groups, each group executing parallel tasks. On heterogeneous platforms, the same algorithm is applied to subsets of simulations. The subsets are computed greedily, with the aim of minimizing the execution time by sending each subset to a cluster. We performed experiments on the French research grid Grid'5000, which exposed some technical difficulties. We also present modifications to the heuristics that minimize the impact of these difficulties. Our simulations are then validated by experiments.

    Intelligent Decision Support System for Energy Management in Demand Response Programs and Residential and Industrial Sectors of the Smart Grid

    This PhD thesis addresses the complexity of the energy efficiency control problem for residential and industrial customers of the smart electrical grid. It examines the main factors that affect energy demand and proposes an intelligent decision support system for demand response applications. A multi-criteria decision-making algorithm is combined with a combinatorial optimization technique to help energy managers decide whether to participate in demand response programs or to obtain energy from distributed energy resources.

    Demand Response Driven Load Scheduling in Formal Smart Grid Framework

    In this technical report, we present the current state of the research conducted during the first part of the PhD project “Demand Response Driven Load Scheduling in Formal Smart Grid Framework”. The project focuses on smart grids, which employ information and communication technologies to assist electricity production, distribution, and consumption. Designing smart grid applications is a novel and challenging task that requires modeling, integrating, and validating different grid aspects efficiently. The project tackles these challenges by proposing a framework to formally describe smart grid elements along with their interactions. To validate this framework, the report concentrates on managing household electricity consumption efficiently, which requires examining the different impacts of the demand response programs that run in the smart grid to engage consumers. We consider a demand response system that is connected to all households and uses their information to determine an effective load management strategy, taking into account the grid constraints imposed by distribution system operators. The main responsibility of the demand response system is to schedule the operation of the appliances of a large number of consumers in order to achieve network-wide optimized performance. Finally, the report presents the simulation results, publications, courses, and dissemination activities completed during this period, followed by future plans that will lead to completion of the PhD study.

    Task Packing: Efficient task scheduling in unbalanced parallel programs to maximize CPU utilization

    Load imbalance in parallel systems can be caused by factors external to the running applications, such as operating system noise, or by the underlying hardware, such as a heterogeneous cluster. HPC applications working on irregular data structures can also have difficulty balancing their computations across parallel tasks. In this article we extend, improve, and evaluate in more depth the Task Packing mechanism proposed in a previous work. The main idea of the mechanism is to concentrate the idle cycles of unbalanced applications in such a way that one or more CPUs are freed from execution. To achieve this, CPUs are loaded only with useful work from the tasks of the parallel application, provided performance is not degraded. The packing is solved by an algorithm based on the Knapsack problem, using a minimum number of CPUs and oversubscription. We design and implement a more efficient version of this mechanism. To that end, we perform the Task Packing “in place”, taking advantage of the idle cycles generated at synchronization points of unbalanced applications. Evaluations are carried out on a heterogeneous platform using the FT and miniFE benchmarks. Results show that our proposal generates low overhead. In addition, the number of freed CPUs is related to a load imbalance metric, which can be used to predict it.
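The packing idea in the abstract above can be illustrated with a small sketch. The abstract says only that the packing is "based on the Knapsack problem", so the code below swaps in first-fit decreasing bin packing as a stand-in and is not the authors' algorithm. The function name `pack_tasks`, the representation of each task as a fraction of one CPU's time spent on useful work, and the `capacity` parameter are all assumptions made for illustration.

```python
def pack_tasks(loads, capacity=1.0):
    """Pack tasks onto as few CPUs as first-fit decreasing manages.

    loads[i] is the assumed fraction of a CPU that task i keeps busy
    with useful work (hypothetical input, not taken from the paper).
    Returns a list of CPUs, each a list of task indices.
    """
    cpus = []   # task indices placed on each CPU
    used = []   # total load already placed on each CPU
    # Largest tasks first: the classic first-fit decreasing order.
    for idx in sorted(range(len(loads)), key=lambda i: loads[i], reverse=True):
        load = loads[idx]
        if load > capacity:
            raise ValueError(f"task {idx} does not fit on a single CPU")
        for c in range(len(cpus)):
            # Small tolerance guards against floating-point round-off.
            if used[c] + load <= capacity + 1e-12:
                cpus[c].append(idx)
                used[c] += load
                break
        else:
            # No existing CPU has room, so open a new one.
            cpus.append([idx])
            used.append(load)
    return cpus


if __name__ == "__main__":
    loads = [0.9, 0.5, 0.4, 0.3, 0.3, 0.2]  # one task per CPU originally
    packed = pack_tasks(loads)
    print("CPUs used:", len(packed), "CPUs freed:", len(loads) - len(packed))
```

In this toy model, the gap between the original CPU count and the packed count plays the role of the freed CPUs that the abstract relates to a load imbalance metric.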

    GPU Resource Optimization and Scheduling for Shared Execution Environments

    General purpose graphics processing units have become a computing workhorse for a variety of data- and compute-intensive applications, from large supercomputing systems for massive data analytics to small, mobile embedded devices for autonomous vehicles. Making effective and efficient use of these processors traditionally relies on extensive programmer expertise to design and develop kernel methods which simultaneously trade off task decomposition and resource exploitation. Often, new architecture designs force code refinements in order to continue to achieve optimal performance. At the same time, not all applications require full utilization of the system to achieve that optimal performance. In this case, the increased capability of new architectures introduces an ever-widening gap between the level of resources necessary for optimal performance and the level necessary to maintain system efficiency. The ability to schedule and execute multiple independent tasks on a GPU, known generally as concurrent kernel execution, enables application programmers and system developers to balance application performance and system efficiency. Various approaches to develop both coarse- and fine-grained scheduling mechanisms to achieve a high degree of resource utilization and improved application performance have been studied. Most of these works focus on mechanisms for the management of compute resources, while a small percentage consider the data transfer channels. In this dissertation, we propose a pragmatic approach to scheduling and managing both types of resources – data transfer and compute – that is transparent to an application programmer and capable of providing near-optimal system performance. Furthermore, the approaches described herein rely on reinforcement learning methods, which enable the scheduling solutions to be flexible to a variety of factors, such as transient application behaviors, changing system designs, and tunable objective functions. Finally, we describe a framework for the practical implementation of learned scheduling policies to achieve high resource utilization and efficient system performance.

    Packing Privacy Budget Efficiently

    Machine learning (ML) models can leak information about users, and differential privacy (DP) provides a rigorous way to bound that leakage under a given budget. This DP budget can be regarded as a new type of compute resource in workloads of multiple ML models training on user data. Once it is used, the DP budget is consumed forever. It is therefore crucial to allocate it as efficiently as possible in order to train as many models as possible. This paper presents a scheduler for privacy that optimizes for efficiency. We formulate privacy scheduling as a new type of multidimensional knapsack problem, called privacy knapsack, which maximizes DP budget efficiency. We show that privacy knapsack is NP-hard, hence practical algorithms are necessarily approximate. We develop an approximation algorithm for privacy knapsack, DPK, and evaluate it on microbenchmarks and on a new, synthetic private-ML workload we developed from the Alibaba ML cluster trace. We show that DPK: (1) often approaches the efficiency-optimal schedule, (2) consistently schedules more tasks than a state-of-the-art privacy scheduling algorithm that focused on fairness (1.3-1.7x in Alibaba, 1.0-2.6x in microbenchmarks), but (3) sacrifices some level of fairness for efficiency. Therefore, using DPK, DP ML operators should be able to train more models on the same amount of user data while offering the same privacy guarantee to their users.
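To make the multidimensional knapsack framing above concrete, here is a minimal sketch of a generic greedy heuristic. It is not DPK, whose details the abstract does not give. The function name `greedy_privacy_schedule`, the per-block budget dictionary, the task tuples, and the assumption that budget consumption simply adds up per data block (basic composition) are all illustrative choices.

```python
def greedy_privacy_schedule(tasks, budgets):
    """Greedy heuristic for a multidimensional knapsack (not DPK).

    tasks:   list of (name, weight, demand), where demand maps a data
             block id to the privacy budget the task would consume there.
    budgets: maps each block id to its total budget (assumed additive).
    Returns (names of scheduled tasks, remaining budget per block).
    """
    remaining = dict(budgets)

    def efficiency(task):
        _, weight, demand = task
        # Normalised cost: the share of each block's budget that is used.
        cost = sum(eps / budgets[block] for block, eps in demand.items())
        return weight / cost if cost > 0 else float("inf")

    scheduled = []
    # Consider the most "efficient" tasks first.
    for name, _, demand in sorted(tasks, key=efficiency, reverse=True):
        # A task runs only if every block it touches still has budget.
        if all(remaining[block] >= eps for block, eps in demand.items()):
            for block, eps in demand.items():
                remaining[block] -= eps  # consumed budget never comes back
            scheduled.append(name)
    return scheduled, remaining


if __name__ == "__main__":
    budgets = {"b0": 1.0, "b1": 1.0}
    tasks = [
        ("big", 1, {"b0": 0.75, "b1": 0.75}),
        ("a", 1, {"b0": 0.5}),
        ("b", 1, {"b0": 0.5}),
        ("c", 1, {"b1": 0.5}),
        ("d", 1, {"b1": 0.5}),
    ]
    print(greedy_privacy_schedule(tasks, budgets))
```

In the toy workload, scheduling the one large task first would leave room for nothing else, whereas ranking by efficiency fits four smaller tasks. That is the kind of trade-off between efficiency and fairness the abstract describes.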

    Performance Analysis and Design of Mobile Ad-Hoc Networks

    We focus on the performance analysis and design of a wireless ad-hoc network that uses a virtual-circuit or reservation-based medium access layer. In a reservation-based MAC network, source nodes reserve a session's link capacity end-to-end over the entire path before sending traffic over the established path. An example of a generic reservation-based MAC protocol is the Unifying Slot Assignment Protocol (USAP). Any reservation-based medium access protocol (including USAP) uses a simple set of rules to determine the cells or timeslots available at a node to reserve link capacity along the path to the next node. Given node locations, the traffic pattern between nodes, and link propagation matrices as inputs, we develop models to estimate blocking probability and throughput for reservation-based wireless ad-hoc networks. These models extend reduced load loss network models to a wireless network. For generic USAP with multiple frequency channels, the key effect of multiuser interference on a link is modeled via reduced available link capacity, where the effects of transmissions and receptions in the link neighborhood are modeled using USAP reservation rules. We compare our results with simulation and obtain good agreement when our extended reduced load loss network models use the reduced available link capacity distribution obtained by simulation. For generic USAP using a single frequency channel, we develop models for unicast traffic using reduced load loss network models in which the sharing of the wireless medium between a node and its neighbors is modeled by considering cliques of neighboring interfering links around a particular link. We compare the results of this model with simulation and show a good match. We also develop models to calculate source-destination throughput for the reservation MAC as used in the Joint Tactical Radio System to support both unicast and multicast traffic. These models extend reduced load loss network models to wireless multicast traffic, with the sharing of the wireless medium between a node and its (up to 2-hop) neighbors modeled by considering cliques of interfering nodes around a particular node. Again, the results of this model match simulation well. Once we have developed models to estimate throughput and blocking probabilities, we use them to optimize total network throughput. To do so, we compute throughput sensitivities of the reduced load loss network model using an implied cost formulation and use these sensitivities to choose the routing probabilities among multiple paths so that total network throughput is maximized. In any network scenario, MANETs can become disconnected into clusters. As part of the MANET design problem, we look at the problem of establishing network connectivity and satisfying the required traffic capacity between disconnected clusters by placing a minimum number of advantaged high-flying Aerial Platforms (APs) as relay nodes at appropriate places. We also extend the connectivity solution in order to make the network single-AP survivable. The problem of providing both connectivity and required capacity between disconnected ground clusters (which contain nodes that can communicate directly with each other) is formulated as a summation-form clustering problem of the ground clusters with the APs, along with inter-AP distance constraints that make the AP network connected and complexity costs that take care of ground-cluster-to-AP capacity constraints. The resulting clustering problem is solved using Deterministic Annealing to find (near) globally optimal solutions for the minimum number and locations of the APs needed to establish connectivity and provide the required traffic capacity between disconnected clusters. The basic connectivity constraints are extended to include conditions that make the resulting network survivable to a single AP failure: we add another summation-form constraint so that the AP network forms a biconnected network, and we make sure that each ground cluster is connected to at least two APs. We establish the validity of our algorithms by comparing them with optimal exhaustive search algorithms and show that our algorithms are near-optimal for the problem of establishing connectivity between disconnected clusters.
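The reduced load loss network models mentioned in the abstract above build on a classical fixed-point approximation (often called the Erlang fixed point). A minimal sketch of that classical form follows. It covers only the textbook fixed-capacity case and does not include the thesis's wireless extensions (USAP reservation rules, interference cliques, multicast). The function names and the route and capacity data structures are assumptions made for illustration.

```python
from math import prod


def erlang_b(load, capacity):
    """Erlang B blocking probability, computed by the standard recursion."""
    b = 1.0
    for n in range(1, capacity + 1):
        b = load * b / (n + load * b)
    return b


def reduced_load_blocking(routes, capacities, iterations=500, tol=1e-12):
    """Classical reduced load (Erlang fixed-point) approximation.

    routes:     list of (offered_load, [link ids on the route])
    capacities: maps link id -> integer number of circuits
    Returns (per-link blocking, per-route blocking).
    """
    blocking = {link: 0.0 for link in capacities}
    for _ in range(iterations):
        updated = {}
        for link, cap in capacities.items():
            # Load offered to this link, thinned by blocking on the
            # other links of each route that crosses it.
            reduced = sum(
                load * prod(1.0 - blocking[o] for o in links if o != link)
                for load, links in routes
                if link in links
            )
            updated[link] = erlang_b(reduced, cap)
        delta = max(abs(updated[l] - blocking[l]) for l in capacities)
        blocking = updated
        if delta < tol:
            break
    # Links are treated as blocking independently of one another.
    route_blocking = [
        1.0 - prod(1.0 - blocking[l] for l in links) for _, links in routes
    ]
    return blocking, route_blocking


if __name__ == "__main__":
    routes = [(1.0, ["a", "b"]), (1.0, ["a"])]
    capacities = {"a": 2, "b": 2}
    print(reduced_load_blocking(routes, capacities))
```

The independence assumption across links is what makes the approximation tractable. Per the abstract, the thesis replaces the fixed link capacity used here with a reduced available capacity that reflects interference from neighboring transmissions.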

    SALSA: A Formal Hierarchical Optimization Framework for Smart Grid

    The smart grid, through the integration of advanced control and optimization technologies, gives the traditional grid an opportunity to deliver and use electricity more efficiently. Building smart grid applications is a challenging task, which requires a formal modeling, integration, and validation framework for various smart grid domains. The design flow of such applications must adapt to the grid requirements and ensure the security of supply and demand. This dissertation proposes a formal framework for the customers and operations domains in the smart grid, aiming to provide a smooth way of: i) formalizing their interactions and functionalities, ii) upgrading their components independently, and iii) evaluating their performance quantitatively and qualitatively. The framework follows an event-driven demand response program that takes neither historical data nor forecasting services into account. A scalable neighborhood of prosumers (inside the customers domain), equipped with smart appliances, photovoltaics, and battery energy storage systems, is considered. The prosumers individually schedule their appliances and sell their surplus to, or purchase their demand from, the grid, with the purpose of maximizing their comfort and profit at each instant of time. To orchestrate these trade relations, a bilateral multi-issue negotiation approach is employed between a virtual power plant (on behalf of prosumers) and an aggregator (inside the operations domain) in a non-cooperative environment. The aggregator, with the objectives of maximizing its profit and minimizing the grid purchase, aims to match prosumers' supply with demand. As a result, the framework particularly addresses the challenges of: i) scalable and hierarchical load demand scheduling, and ii) matching the production and consumption of a large penetration of renewable energy sources. It comprises two generic multi-objective mixed integer nonlinear programming models, one for prosumers and one for the aggregator. These models support different scheduling mechanisms and electricity consumption threshold policies. The effectiveness of the framework is evaluated through various case studies based on economic and environmental assessment metrics. An interactive web service for the framework has also been developed and demonstrated.