
    Stochastic Optimal Control of Grid-Level Storage

    The primary focus of this dissertation is the design, analysis and implementation of stochastic optimal control of grid-level storage. It provides stochastic, quantitative models that equip decision-makers with rigorous, analytical tools capturing the high uncertainty of storage control problems. The first part of the dissertation presents a p-periodic Markov Decision Process (MDP) model, which is suitable for mitigating end-of-horizon effects. This is an extension of the basic MDP in which the process follows the same pattern every p time periods. We establish improved near-optimality bounds for a class of greedy policies and derive a corresponding value-iteration algorithm suitable for periodic problems. A parallel implementation of the algorithm is demonstrated on a grid-level storage control problem with stochastic electricity prices following a daily cycle. Additional analysis shows that the optimal policy is a threshold policy. The second part of the dissertation is concerned with grid-level battery storage operations, taking the battery aging phenomenon (battery degradation) into consideration. We again model the storage control problem as an MDP, with an extra state variable indicating the aging status of the battery. An algorithm that takes advantage of the problem structure and works directly on the continuous state space is developed to maximize the expected cumulative discounted reward over the life of the battery. The algorithm determines an optimal policy by solving a sequence of quasiconvex problems indexed by a battery-life state. Computational results are presented to compare the proposed approach to a standard dynamic programming method and to evaluate the impact of refinements in the battery model. Error bounds for the proposed algorithm are established to demonstrate its accuracy. A generalization of the price model to a class of Markovian regime-switching processes is also provided.
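The periodic structure described above can be sketched with a toy value iteration on the augmented state (phase, storage level); all numbers here (prices, storage capacity, discount factor) are illustrative and not taken from the dissertation:

```python
import numpy as np

# Toy p-periodic MDP for storage control: the price cycles through p
# phases, the state is (phase, storage level), and the action is
# charge (+1), hold (0), or discharge (-1).
p = 4                            # period length
levels = 5                       # storage levels 0..4
prices = [1.0, 3.0, 5.0, 2.0]    # cyclic price pattern (illustrative)
gamma = 0.95

V = np.zeros((p, levels))
for _ in range(500):             # value iteration on the augmented state
    V_new = np.zeros_like(V)
    for t in range(p):
        for s in range(levels):
            best = -np.inf
            for a in (-1, 0, 1):
                s2 = s + a
                if not 0 <= s2 < levels:
                    continue
                # discharging (a = -1) earns the current price,
                # charging (a = +1) pays it
                reward = -a * prices[t]
                best = max(best, reward + gamma * V[(t + 1) % p, s2])
            V_new[t, s] = best
    V = V_new

def greedy_action(t, s):
    """One-step greedy action with respect to the converged values."""
    best_a, best_q = 0, -np.inf
    for a in (-1, 0, 1):
        s2 = s + a
        if not 0 <= s2 < levels:
            continue
        q = -a * prices[t] + gamma * V[(t + 1) % p, s2]
        if q > best_q:
            best_a, best_q = a, q
    return best_a
```

Consistent with the threshold-policy result, the greedy policy charges at the cheapest phase when storage is low and discharges at the most expensive phase when storage is high.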
The last part of this dissertation is concerned with how the ownership of energy storage affects prices. Instead of the single player found in most storage control problems, we consider two players (a consumer and a supplier) in this market. Energy storage operations are modeled as an infinite-horizon Markov Game with random demand, in which each player maximizes its expected discounted cumulative welfare. A value-iteration framework with an embedded bimatrix game is provided to find equilibrium policies for the players. Computational results show that the gap between the optimal policies and the obtained policies is negligible. The assumption that storage levels are common knowledge is made without much loss of generality, because a learning algorithm is proposed that allows a player to ultimately identify the storage level of the other player. The expected value improvement from keeping the storage information private at the beginning of the game is then shown to be insignificant.
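A minimal illustration of the stage game embedded in such a value-iteration framework, restricted to pure strategies and with made-up payoff matrices (the thesis solves the full stochastic game, not a single stage):

```python
import numpy as np

# Find all pure-strategy Nash equilibria of a bimatrix stage game by
# enumeration: (i, j) is an equilibrium when i is a best response of the
# row player to j, and j a best response of the column player to i.
def pure_nash(A, B):
    A, B = np.asarray(A), np.asarray(B)
    eq = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            if A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max():
                eq.append((i, j))
    return eq

# Illustrative 2x2 coordination game between the two players.
A = [[2, 0], [0, 1]]   # consumer payoffs
B = [[1, 0], [0, 2]]   # supplier payoffs
equilibria = pure_nash(A, B)
```

In the full Markov Game each stage payoff would also include the discounted continuation values, and mixed-strategy equilibria would be needed in general; this sketch only shows the best-response logic.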

    Aggregate constrained inventory systems with independent multi-product demand: control practices and theoretical limitations

    In practice, inventory managers are often confronted with the need to consider one or more aggregate constraints. These aggregate constraints result from available workspace, workforce, maximum investment or target service levels. We consider independent multi-item inventory problems with aggregate constraints and one of the following characteristics: deterministic leadtime demand, newsvendor, base-stock policy, (r,Q) policy or (s,S) policy. We analyze recent relevant references and investigate the considered versions of the problem, the proposed model formulations and the algorithmic approaches. Finally, we highlight the limitations of these models from a practical viewpoint and point out some possible directions for future improvements.

    Stick-Breaking Policy Learning in Dec-POMDPs

    Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to maxima that are far from optimal. This paper considers a variable-size FSC to represent the local policy of each agent. These variable-size FSCs are constructed using a stick-breaking prior, leading to a new framework called decentralized stick-breaking policy representation (Dec-SBPR). This approach learns the controller parameters with a variational Bayesian algorithm without having to assume that the Dec-POMDP model is available. The performance of Dec-SBPR is demonstrated on several benchmark problems, showing that the algorithm scales to large problems while outperforming other state-of-the-art methods.
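The stick-breaking construction behind such a prior can be sketched as follows; the concentration parameter and truncation level here are illustrative, not the paper's hyperparameters:

```python
import numpy as np

# Stick-breaking sketch: the weight of each controller node is a
# Beta-distributed fraction broken off the remaining stick, so weights
# decay and the number of effectively used nodes adapts to the data.
def stick_breaking(alpha, truncation, rng):
    betas = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])
    return betas * remaining

rng = np.random.default_rng(0)
w = stick_breaking(alpha=2.0, truncation=50, rng=rng)
```

The truncated weights sum to strictly less than one; in a variational treatment the leftover mass is absorbed by the truncation, which is what lets the controller size remain variable.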

    Lost in optimisation of water distribution systems? A literature review of system operation

    This is the author accepted manuscript. The final version is available from Elsevier via the DOI in this record. Optimisation of the operation of water distribution systems has been an active research field for almost half a century. It has focused mainly on optimal pump operation to minimise pumping costs and optimal water quality management to ensure that standards at customer nodes are met. This paper provides a systematic review by bringing together over two hundred publications from the past three decades, which are relevant to operational optimisation of water distribution systems, particularly optimal pump operation, valve control and system operation for water quality purposes of both urban drinking and regional multiquality water distribution systems. Uniquely, it also contains substantial and thorough information for over one hundred publications in a tabular form, which lists optimisation models inclusive of objectives, constraints, decision variables, solution methodologies used and other details. Research challenges in terms of simulation models, optimisation model formulation, selection of optimisation method and postprocessing needs have also been identified.

    Resource Management for Distributed Estimation via Sparsity-Promoting Regularization

    Recent advances in wireless communications and electronics have enabled the development of low-cost, low-power, multifunctional sensor nodes that are small in size and communicate untethered in a sensor network. These sensor nodes can sense, measure, and gather information from the environment and, based on some local processing, they transmit the sensed data to a fusion center that is responsible for making the global inference. Sensor networks are often tasked to perform parameter estimation; example applications include battlefield surveillance, medical monitoring, and navigation. However, under limited resources, such as limited communication bandwidth and sensor battery power, it is important to design an energy-efficient estimation architecture. The goal of this thesis is to provide a fundamental understanding and characterization of the optimal tradeoffs between estimation accuracy and resource usage in sensor networks. In the thesis, two basic issues of resource management are studied, sensor selection/scheduling and sensor collaboration for distributed estimation, where the former refers to finding the best subset of sensors to activate for data acquisition in order to minimize the estimation error subject to a constraint on the number of activations, and the latter refers to seeking the optimal inter-sensor communication topology and energy allocation scheme for distributed estimation systems. Most research on resource management so far has been based on several key assumptions: a) independence of observations, b) strict resource constraints, and c) absence of inter-sensor communication, which lend analytical tractability to the problem but are often found lacking in practice. This thesis introduces novel techniques to relax these assumptions and provide new insights into addressing resource management problems.
The thesis analyzes how noise correlation affects solutions of sensor selection problems, and proposes both a convex relaxation approach and a greedy algorithm to find these solutions. Compared to the existing sensor selection approaches that are limited to the case of uncorrelated noise or weakly correlated noise, the methodology proposed in this thesis is valid for any arbitrary noise correlation regime. Moreover, this thesis shows a correspondence between active sensors and the nonzero columns of an estimator gain matrix. Based on this association, a sparsity-promoting optimization framework is established, where the desire to reduce the number of selected sensors is characterized by a sparsity-promoting penalty term in the objective function. Instead of placing a hard constraint on sensor activations, the promotion of sparsity leads to trade-offs between estimation performance and the number of selected sensors. To account for the individual power constraint of each sensor, a novel sparsity-promoting penalty function is presented to avoid scenarios in which the same sensors are successively selected. For solving the proposed optimization problem, we employ the alternating direction method of multipliers (ADMM), which allows the optimization problem to be decomposed into subproblems that can be solved analytically to obtain exact solutions. The problem of sensor collaboration arises when inter-sensor communication is incorporated in sensor networks, where sensors are allowed to update their measurements by taking a linear combination of the measurements of those they interact with prior to transmission to a fusion center. 
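A greedy selection step of the kind described above can be sketched for a linear Gaussian model with correlated noise. The log-determinant scoring rule, the matrices H and R, and the budget k are illustrative choices, not the thesis's formulation:

```python
import numpy as np

# Greedy sensor selection under correlated measurement noise: sensors
# are added one at a time to maximize a log-det information surrogate;
# the final error is the trace of the inverse Fisher information
# (H_S^T R_S^{-1} H_S)^{-1} of the chosen subset S.
def mse(H, R, S):
    S = list(S)
    Hs, Rs = H[S], R[np.ix_(S, S)]
    F = Hs.T @ np.linalg.inv(Rs) @ Hs      # Fisher information of subset
    return np.trace(np.linalg.inv(F))

def greedy_select(H, R, k):
    m, n = H.shape
    chosen, remaining = [], set(range(m))
    while len(chosen) < k:
        def gain(j):
            S = chosen + [j]
            Hs, Rs = H[S], R[np.ix_(S, S)]
            F = Hs.T @ np.linalg.inv(Rs) @ Hs
            # regularized log-det stays defined before n sensors are chosen
            return np.linalg.slogdet(F + 1e-9 * np.eye(n))[1]
        best = max(remaining, key=gain)
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(1)
H = rng.normal(size=(6, 2))                  # 6 candidate sensors, 2 parameters
R = 0.5 * np.eye(6) + 0.5 * np.ones((6, 6))  # correlated noise covariance
S = greedy_select(H, R, k=3)
```

Because R appears as a full submatrix inverse, the score of a candidate depends on how its noise correlates with the sensors already chosen, which is exactly what uncorrelated-noise selection rules cannot capture.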
In this thesis, a sparsity-aware optimization framework is presented for the joint design of optimal sensor collaboration and selection schemes, where the cost of sensor collaboration is associated with the number of nonzero entries of a collaboration matrix, and the cost of sensor selection is characterized by the number of nonzero rows of the collaboration matrix. It is shown that a) the presence of sensor collaboration smooths out the observation noise, thereby improving the quality of the signal and the eventual estimation performance, and b) there exists a trade-off between sensor selection and sensor collaboration. This thesis further addresses the problem of sensor collaboration for the estimation of time-varying parameters in dynamic networks that involve, for example, time-varying observation gains and channel gains. The impact of parameter correlation and of the temporal dynamics of sensor networks on estimation performance is illustrated from both theoretical and practical points of view. Last but not least, optimal energy allocation and storage control policies are designed in sensor networks with energy-harvesting nodes. We show that the resulting optimization problem can be solved as a special nonconvex problem, where the only source of nonconvexity can be isolated to a constraint that contains the difference of convex functions. This specific problem structure enables the use of a convex-concave procedure to obtain a near-optimal solution.

    Market-Based Scheduling in Distributed Computing Systems

    In distributed computing systems (e.g., cluster and grid computing), the available resources can become scarce. Here, market mechanisms have the potential to coordinate resource demand and supply through suitable incentive mechanisms and thereby increase the economic efficiency of the overall system. Based on four specific application scenarios, this thesis addresses the question of how market mechanisms for distributed computing systems should be designed.

    A Framework for Approximate Optimization of BoT Application Deployment in Hybrid Cloud Environment

    We adopt a systematic approach to investigating the efficiency of near-optimal deployment of large-scale CPU-intensive Bag-of-Tasks (BoT) applications running on cloud resources with non-proportional cost-to-performance ratios. Our analytical solutions apply whether the running time of the given application is known or unknown. The framework optimizes users' utility by choosing the most desirable trade-off between makespan and total incurred expense. We propose a scheme that provides a near-optimal deployment of a BoT application with respect to users' preferences. Our approach is to present the user with a set of Pareto-optimal solutions, from which she may select a scheduling point based on her internal utility function. Our framework can also cope with uncertainty in task execution times using two methods. First, an estimation method based on Monte Carlo sampling, called the AA algorithm, is presented; it uses the minimum possible number of samples to predict the average task running time. Second, assuming access to code analyzers, code profilers or estimation tools, a hybrid method is presented that evaluates the accuracy of each estimation tool over certain time intervals to improve resource allocation decisions. We propose approximate deployment strategies that run on a hybrid cloud. In essence, the proposed strategies first determine either an estimated or an exact optimal schema based on the information provided from the user's side and environmental parameters. Then, dynamic methods assign tasks to resources to approach the optimal schema as closely as possible, using two methods: a fast yet simple method based on the First Fit Decreasing algorithm, and a more complex approach based on an approximation solution of the problem transformed into a subset sum problem. Extensive experimental results on a hybrid cloud platform confirm that our framework can deliver a near-optimal solution respecting the user's utility function.
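The First Fit Decreasing step mentioned above can be sketched as follows; the task running times and VM capacity are illustrative, and the mapping of "bin" to "VM instance" is an assumption about the deployment setting:

```python
# First Fit Decreasing: sort task times in decreasing order, then place
# each task in the first VM (bin) that still has capacity, opening a new
# VM only when none fits.
def first_fit_decreasing(tasks, capacity):
    bins = []                        # each bin: list of task times on one VM
    for t in sorted(tasks, reverse=True):
        for b in bins:
            if sum(b) + t <= capacity:
                b.append(t)
                break
        else:                        # no existing VM fits: open a new one
            bins.append([t])
    return bins

vms = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)  # -> 2 VMs
```

FFD is a classical bin-packing heuristic with a worst-case guarantee of 11/9 OPT + 6/9 bins, which is why it serves well as the fast, simple option next to the subset-sum-based approach.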

    Optimal energy management for a grid-tied solar PV-battery microgrid: A reinforcement learning approach

    There has been a shift towards energy sustainability in recent years, and this shift should continue. The steady growth of energy demand due to population growth, heightened worries about the quantity of anthropogenic gases released into the atmosphere, and the deployment of advanced grid technologies have spurred the penetration of renewable energy resources (RERs) at different locations and scales in the power grid. As a result, the energy system is moving away from the centralized paradigm of large, controllable power plants and toward a decentralized network based on renewables. Microgrids, either grid-connected or islanded, provide a key solution for integrating RERs, load demand flexibility, and energy storage systems within this framework. Nonetheless, renewable energy resources, such as solar and wind energy, can be extremely stochastic as they are weather dependent. These resources, coupled with load demand uncertainties, lead to random variations on both the generation and load sides, thus challenging optimal energy management. This thesis develops an optimal energy management system (EMS) for a grid-tied solar PV-battery microgrid. The goal of the EMS is to obtain the minimum operational costs (cost of power exchange with the utility and battery wear cost) while still considering network constraints, which ensure grid violations are avoided. A reinforcement learning (RL) approach is proposed to minimize the operational cost of the microgrid under this stochastic setting. RL is a reward-motivated optimization technique derived from how animals learn to optimize their behaviour in new environments. Unlike conventional model-based optimization approaches, RL does not need an explicit model of the system being optimized to obtain optimal solutions. The EMS is modelled as a Markov Decision Process (MDP) to achieve optimality considering the state, action, and reward function.
Two RL algorithms, namely the conventional Q-learning algorithm and the deep Q network algorithm, are developed, and their efficacy in performing optimal energy management for the designed system is evaluated in this thesis. First, the energy management problem is expressed as a sequential decision-making process, after which two algorithms, a trading and a non-trading algorithm, are developed. In the trading case, the microgrid's excess energy can be sold back to the utility to increase revenue, while in the non-trading case constraining rules are embedded in the designed EMS to ensure that no excess energy is sold back to the utility. Then a Q-learning algorithm is developed to minimize the operational cost of the microgrid under unknown future information. Finally, to evaluate the performance of the proposed EMS, a comparison study between the trading-case EMS model and the non-trading case is performed using a typical commercial load curve and PV generation profile over a 24-hour horizon. Numerical simulation results indicated that the algorithm learned to select an optimized energy schedule that minimizes energy cost (cost of power purchased from the utility based on the time-varying tariff, plus battery wear cost) in both summer and winter case studies. Comparing the operational costs of the non-trading EMS to those of the trading EMS model, the latter decreased cost by 4.033% in the summer season and 2.199% in the winter season. Secondly, a deep Q network (DQN) method that uses recent learning algorithm enhancements, including experience replay and a target network, is developed to learn the system uncertainties, including load demand, grid prices and the volatile power supply from renewables, and to solve the optimal energy management problem.
Unlike the Q-learning method, which updates the Q-function using a lookup table (limiting its scalability and overall performance in stochastic optimization), the DQN method uses a deep neural network that approximates the Q-function via statistical regression. The performance of the proposed method is evaluated with differently fluctuating load profiles, i.e., slow, medium, and fast. Simulation results substantiated the efficacy of the proposed method, as the algorithm was shown to learn from experience to raise the battery state of charge and optimally shift loads from one time instance to another, thus supporting the utility grid in reducing the aggregate peak load. Furthermore, the performance of the proposed DQN approach was compared to the conventional Q-learning algorithm in terms of achieving a minimum global cost. Simulation results showed that the DQN algorithm outperformed the conventional Q-learning approach, reducing system operational costs by 15%, 24%, and 26% for the slow, medium, and fast fluctuating load profiles in the studied cases.
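A toy tabular Q-learning loop in the spirit of the EMS described above can be sketched as follows; the tariff, load and battery parameters are illustrative and not the thesis's commercial case study:

```python
import numpy as np

# Toy Q-learning EMS: the state is (hour, battery level), the action is
# discharge/idle/charge, and the reward is the negative cost of grid
# power under a two-level time-of-use tariff.
rng = np.random.default_rng(0)
hours, levels = 24, 4
hour_idx = np.arange(hours)
tariff = np.where((hour_idx >= 17) & (hour_idx < 21), 5.0, 1.0)  # peak 17-20h
load = 2.0                                   # constant load in kW

Q = np.zeros((hours, levels, 3))             # actions: 0=discharge, 1=idle, 2=charge
alpha, gamma, eps = 0.1, 0.99, 0.1
for episode in range(3000):
    b = 0                                    # start each day with an empty battery
    for h in range(hours):
        # epsilon-greedy action selection
        a = int(rng.integers(3)) if rng.random() < eps else int(Q[h, b].argmax())
        b2 = min(max(b + (a - 1), 0), levels - 1)
        grid = load + (b2 - b)               # grid covers the load plus charging
        r = -tariff[h] * max(grid, 0.0)
        # one-step temporal-difference update of the lookup table
        Q[h, b, a] += alpha * (r + gamma * Q[(h + 1) % hours, b2].max() - Q[h, b, a])
        b = b2
```

The DQN variant discussed above would replace the `Q` table with a neural network trained by regression on the same temporal-difference targets, sampled from an experience replay buffer and stabilized by a periodically updated target network.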