31 research outputs found

    Online Convex Optimization with Binary Constraints

    We consider online optimization with binary decision variables and convex loss functions. We design a new algorithm, binary online gradient descent (bOGD), and bound its expected dynamic regret. We provide a regret bound that holds for any time horizon and a specialized bound for finite time horizons. First, we present the regret as the sum of the relaxed, continuous round optimum tracking error and the rounding error of our update, in which the former asymptotically decreases with time under certain conditions. Then, we derive a finite-time bound that is sublinear in time and linear in the cumulative variation of the relaxed, continuous round optima. We apply bOGD to demand response with thermostatically controlled loads, in which binary constraints model discrete on/off settings. We also model uncertainty and varying load availability, which depend on temperature deadbands, lockout of cooling units and manual overrides. We test the performance of bOGD in several simulations based on demand response. The simulations corroborate that the use of randomization in bOGD does not significantly degrade performance while making the problem more tractable.
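The abstract above describes a gradient update on a relaxed, continuous problem combined with randomized rounding to binary decisions. The sketch below illustrates that general idea only and is not the authors' bOGD: the quadratic tracking loss, the step size, the independent Bernoulli rounding, and all function names (`project_unit_box`, `relaxed_ogd_step`, `randomized_round`, `run_sketch`) are assumptions made for illustration.

```python
import random


def project_unit_box(x):
    """Project each coordinate onto [0, 1], the relaxation of {0, 1}."""
    return [min(1.0, max(0.0, xi)) for xi in x]


def relaxed_ogd_step(x, grad, step_size):
    """One projected gradient step on the relaxed (continuous) iterate."""
    return project_unit_box([xi - step_size * gi for xi, gi in zip(x, grad)])


def randomized_round(x, rng):
    """Round each coordinate to 1 with probability x_i, so E[b_i] = x_i."""
    return [1 if rng.random() < xi else 0 for xi in x]


def run_sketch(targets, n, step_size=0.1, seed=0):
    """Track a sequence of aggregate targets with n on/off devices.

    Assumed loss (hypothetical): f_t(x) = (sum(x) - s_t)^2, where s_t is
    the number of devices that should be on in round t.
    """
    rng = random.Random(seed)
    x = [0.5] * n  # relaxed iterate
    decisions = []
    for s_t in targets:
        # Play a binary decision obtained by rounding the relaxed iterate.
        decisions.append(randomized_round(x, rng))
        # Gradient of the assumed loss, evaluated at the relaxed iterate.
        residual = sum(x) - s_t
        grad = [2.0 * residual] * n
        x = relaxed_ogd_step(x, grad, step_size)
    return decisions, x
```

For instance, `run_sketch([3.0] * 50, n=4)` returns fifty binary decision vectors and a relaxed iterate whose coordinates sum to approximately 3. The played decisions are random, but each coordinate is on with probability equal to the relaxed value.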

    Learning to Shift Thermostatically Controlled Loads

    Demand response is a key mechanism for accommodating renewable power in the electric grid. Models of loads in demand response programs are typically assumed to be known a priori, leaving the load aggregator the task of choosing the best command. However, accurate load models are often hard to obtain. To address this problem, we propose an online learning algorithm that performs demand response while learning the model of an aggregation of thermostatically controlled loads. Specifically, we combine an adversarial multi-armed bandit framework with a standard formulation of load-shifting. We develop an Exp3-like algorithm to solve the learning problems. Numerical examples based on Ontario load data confirm that the algorithm achieves sub-linear regret and performs within 1% of the ideal case when the load is perfectly known.
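For readers unfamiliar with the adversarial bandit setting mentioned above, the following is a minimal sketch of the textbook Exp3 algorithm, not the Exp3-like variant developed in the paper. The reward function, the exploration parameter `gamma`, and the function names are illustrative assumptions; rewards are assumed to lie in [0, 1].

```python
import math
import random


def exp3_probabilities(weights, gamma):
    """Mix the normalized weights with uniform exploration."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3(reward_fn, n_arms, n_rounds, gamma=0.1, seed=0):
    """Run textbook Exp3; reward_fn(t, arm) must return a value in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    pulls = [0] * n_arms
    for t in range(n_rounds):
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        # Importance-weighted estimate: unbiased for the pulled arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / n_arms)
        # Rescale to avoid overflow; the probabilities are unchanged.
        largest = max(weights)
        weights = [w / largest for w in weights]
        pulls[arm] += 1
    return pulls, weights
```

As a usage example, `exp3(lambda t, arm: 1.0 if arm == 0 else 0.0, n_arms=3, n_rounds=2000)` concentrates most pulls on arm 0 while still exploring the other arms at a rate governed by `gamma`.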

    Dynamic and Distributed Online Convex Optimization for Demand Response of Commercial Buildings

    We extend the regret analysis of the online distributed weighted dual averaging (DWDA) algorithm [1] to the dynamic setting and provide the tightest dynamic regret bound known to date with respect to the time horizon for a distributed online convex optimization (OCO) algorithm. Our bound is linear in the cumulative difference between consecutive optima and does not depend explicitly on the time horizon. We use dynamic-online DWDA (D-ODWDA) and formulate a performance-guaranteed distributed online demand response approach for heating, ventilation, and air-conditioning (HVAC) systems of commercial buildings. We show the performance of our approach for fast timescale demand response in numerical simulations and obtain demand response decisions that closely reproduce the centralized optimal ones.

    Approximate Multi-Agent Fitted Q Iteration

    We formulate an efficient approximation for multi-agent batch reinforcement learning, the approximate multi-agent fitted Q iteration (AMAFQI). We present a detailed derivation of our approach. We propose an iterative policy search and show that it yields a greedy policy with respect to multiple approximations of the centralized, standard Q-function. In each iteration and policy evaluation, AMAFQI requires a number of computations that scales linearly with the number of agents, whereas the analogous number of computations increases exponentially for the fitted Q iteration (FQI), one of the most commonly used approaches in batch reinforcement learning. This property of AMAFQI is fundamental for the design of a tractable multi-agent approach. We evaluate the performance of AMAFQI and compare it to FQI in numerical simulations. Numerical examples illustrate the significant computation time reduction when using AMAFQI instead of FQI in multi-agent problems and corroborate the similar decision-making performance of both approaches.

    An Online Newton's Method for Time-varying Linear Equality Constraints

    We consider online optimization problems with time-varying linear equality constraints. In this framework, an agent makes sequential decisions using only prior information. At every round, the agent suffers an environment-determined loss and must satisfy time-varying constraints. Both the loss functions and the constraints can be chosen adversarially. We propose the Online Projected Equality-constrained Newton Method (OPEN-M) to tackle this family of problems. We obtain sublinear dynamic regret and constraint violation bounds for OPEN-M under mild conditions. Namely, smoothness of the loss function and boundedness of the inverse Hessian at the optimum are required, but not convexity. Finally, we show OPEN-M outperforms state-of-the-art online constrained optimization algorithms in a numerical network flow application. Comment: Version takes into account reviewer comments. The contributions have been clarified. The assumptions regarding the variation of optima have been clarified. The figures have more explicit labeling of the axes. Several small typos were addressed including problems with parentheses and unnecessary line

    Second-order Online Nonconvex Optimization

    We present the online Newton's method, a single-step second-order method for online nonconvex optimization. We analyze its performance and obtain a dynamic regret bound that is linear in the cumulative variation between round optima. We show that if the variation between round optima is limited, the method leads to a constant regret bound. In the general case, the online Newton's method outperforms online convex optimization algorithms for convex functions and performs similarly to a specialized algorithm for strongly convex functions. We simulate the performance of the online Newton's method on a nonlinear, nonconvex moving target localization example and find that it outperforms a first-order approach.
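To make the idea of a single second-order step per round concrete, the sketch below applies one Newton update per round to a scalar, nonconvex tracking loss. It is an illustration under stated assumptions rather than the paper's method or experiment: the loss f_t(x) = 1 - cos(x - theta_t), the drifting target sequence, and the names `online_newton_step` and `track_moving_target` are all hypothetical.

```python
import math


def online_newton_step(x, grad, hess):
    """One Newton update using the derivatives of the loss just revealed."""
    return x - grad(x) / hess(x)


def track_moving_target(targets, x0):
    """Track a drifting scalar target theta_t with one Newton step per round.

    Assumed loss (hypothetical): f_t(x) = 1 - cos(x - theta_t), which is
    nonconvex in x but has a positive second derivative near its minimizer.
    """
    x = x0
    errors = []
    for theta in targets:
        # Distance to the round optimum at the time the decision is played.
        errors.append(abs(x - theta))
        grad = lambda z, th=theta: math.sin(z - th)
        hess = lambda z, th=theta: math.cos(z - th)
        x = online_newton_step(x, grad, hess)
    return x, errors
```

With `targets = [0.01 * t for t in range(50)]` and `x0 = 0.2`, the per-round error settles near the per-round drift of the target, which mirrors the abstract's point that performance is governed by the variation between round optima. The sketch does not guard against a zero second derivative, which a practical implementation would need to handle.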

    Optimally Scheduling Public Safety Power Shutoffs

    In an effort to reduce power system-caused wildfires, utilities carry out public safety power shutoffs (PSPS) in which portions of the grid are de-energized to mitigate the risk of ignition. The decision to call a PSPS must balance reducing ignition risks and the negative impact of service interruptions. In this work, we consider three PSPS scheduling scenarios, which we model as dynamic programs. In the first two scenarios, we assume that N PSPSs are budgeted as part of the investment strategy. In the first scenario, a penalty is incurred for each PSPS declared past the Nth event. In the second, we assume that some costs can be recovered if the number of PSPSs is below N while still being subject to a penalty if above N. In the third, the system operator wants to minimize the number of PSPSs such that the total expected cost is below a threshold. We provide optimal or asymptotically optimal policies for each case, the first two of which have closed-form expressions. Lastly, we establish the applicability of the first PSPS model's policy to critical-peak pricing, and obtain an optimal scheduling policy to reduce the peak demand based on weather observations.

    Tensor-based Space Debris Detection for Satellite Mega-constellations

    Thousands of satellites, asteroids, and rocket bodies break, collide, or degrade, resulting in large amounts of space debris in low Earth orbit. The presence of space debris poses a serious threat to satellite mega-constellations and to future space missions. Debris can be avoided if detected within the safety range of a satellite. In this paper, an integrated sensing and communication technique is proposed to detect space debris for satellite mega-constellations. The canonical polyadic (CP) tensor decomposition method is used to estimate the rank of the tensor that denotes the number of paths, including line-of-sight and non-line-of-sight, by exploiting the sparsity of the THz channel with limited scattering. The analysis reveals that reflected THz signals can be utilized for the detection of space debris. The CP decomposition is cast as an optimization problem and solved using the alternating least squares (ALS) algorithm. Simulation results show that the probability of detection of the proposed tensor-based scheme is higher than that of the conventional energy-based detection scheme for space debris detection.

    Evolution of High Throughput Satellite Systems: Vision, Requirements, and Key Technologies

    High throughput satellites (HTS), with their digital payload technology, are expected to play a key role as enablers of the upcoming 6G networks. HTS are mainly designed to provide higher data rates and capacities. Fueled by technological advancements including beamforming, advanced modulation techniques, reconfigurable phased array technologies, and electronically steerable antennas, HTS have emerged as a fundamental component for future network generations. This paper offers a comprehensive state-of-the-art review of HTS systems, with a focus on standardization, patents, channel multiple access techniques, routing, load balancing, and the role of software-defined networking (SDN). In addition, we provide a vision for next-generation satellite systems, which we name extremely-HTS (EHTS), toward autonomous satellites, supported by the main requirements and key technologies expected for these systems. The EHTS system will be designed such that it maximizes spectrum reuse and data rates, and flexibly steers the capacity to satisfy user demand. We introduce a novel architecture for future regenerative payloads while summarizing the challenges imposed by this architecture.