31 research outputs found
Online Convex Optimization with Binary Constraints
We consider online optimization with binary decision variables and convex
loss functions. We design a new algorithm, binary online gradient descent
(bOGD) and bound its expected dynamic regret. We provide a regret bound that
holds for any time horizon and a specialized bound for finite time horizons.
First, we present the regret as the sum of the relaxed, continuous round
optimum tracking error and the rounding error of our update in which the former
asymptotically decreases with time under certain conditions. Then, we derive
a finite-time bound that is sublinear in time and linear in the cumulative
variation of the relaxed, continuous round optima. We apply bOGD to demand
response with thermostatically controlled loads, in which binary constraints
model discrete on/off settings. We also model uncertainty and varying load
availability, which depend on temperature deadbands, lockout of cooling units
and manual overrides. We test the performance of bOGD in several simulations
based on demand response. The simulations corroborate that the use of
randomization in bOGD does not significantly degrade performance while making
the problem more tractable.
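The decomposition described in this abstract (a relaxed, continuous update followed by a randomized rounding step) can be sketched as follows. This is a minimal illustration and not the authors' bOGD: the function names, the unit-box relaxation, the quadratic tracking loss, and the step size are all assumptions made for the example.

```python
import random


def project_unit_box(x):
    """Clip each coordinate to [0, 1], the continuous relaxation of {0, 1}."""
    return [min(1.0, max(0.0, xi)) for xi in x]


def relaxed_ogd_step(x, grad, step_size):
    """One projected gradient step on the relaxed (continuous) variable."""
    return project_unit_box([xi - step_size * gi for xi, gi in zip(x, grad)])


def randomized_round(x, rng):
    """Sample a binary vector whose expectation equals the relaxed point x."""
    return [1 if rng.random() < xi else 0 for xi in x]


def run(targets, step_size=0.5, seed=0):
    """Track a sequence of targets; the loss 0.5*||x - target||^2 is an assumption."""
    rng = random.Random(seed)
    x = [0.5] * len(targets[0])
    decisions = []
    for target in targets:
        # Play a binary decision sampled from the current relaxed point.
        decisions.append(randomized_round(x, rng))
        # Gradient of the assumed quadratic loss at the relaxed point.
        grad = [xi - ti for xi, ti in zip(x, target)]
        x = relaxed_ogd_step(x, grad, step_size)
    return x, decisions


relaxed, binary_decisions = run([[1.0, 0.0]] * 20)
```

Because each binary decision is drawn with mean equal to the relaxed point, the expected decision follows the continuous iterate, which is the quantity the first regret term in the abstract refers to.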
Learning to Shift Thermostatically Controlled Loads
Demand response is a key mechanism for accommodating renewable power in the electric grid. Models of loads in demand response programs are typically assumed to be known a priori, leaving the load aggregator the task of choosing the best command. However, accurate load models are often hard to obtain. To address this problem, we propose an online learning algorithm that performs demand response while learning the model of an aggregation of thermostatically controlled loads. Specifically, we combine an adversarial multi-armed bandit framework with a standard formulation of load-shifting. We develop an Exp3-like algorithm to solve the learning problems. Numerical examples based on Ontario load data confirm that the algorithm achieves sub-linear regret and performs within 1% of the ideal case when the load is perfectly known.
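For readers unfamiliar with Exp3, the textbook algorithm for adversarial multi-armed bandits can be sketched as below. This is the standard form with rewards assumed to lie in [0, 1], not the Exp3-like variant developed in the paper; the class name, the exploration rate `gamma`, and the weight renormalization are choices made for this illustration.

```python
import math
import random


class Exp3:
    """Textbook Exp3 for rewards in [0, 1] (illustrative sketch)."""

    def __init__(self, n_arms, gamma, seed=0):
        self.n_arms = n_arms
        self.gamma = gamma  # exploration rate in (0, 1]
        self.weights = [1.0] * n_arms
        self.rng = random.Random(seed)

    def probabilities(self):
        """Mix the weight-proportional distribution with uniform exploration."""
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def select_arm(self):
        probs = self.probabilities()
        return self.rng.choices(range(self.n_arms), weights=probs)[0]

    def update(self, arm, reward):
        """Importance-weighted update for the arm that was actually played."""
        prob = self.probabilities()[arm]
        estimate = reward / prob  # unbiased estimate of the arm's reward
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale to avoid overflow; the probabilities are unchanged.
        largest = max(self.weights)
        self.weights = [w / largest for w in self.weights]


learner = Exp3(n_arms=3, gamma=0.1)
for _ in range(500):
    arm = learner.select_arm()
    learner.update(arm, 1.0 if arm == 2 else 0.0)  # hypothetical reward signal
```

In a load-shifting setting, each arm would correspond to one candidate command sent to the load aggregation, and the reward would measure how well the observed response matched the desired one; that mapping is an assumption here, not a detail taken from the abstract.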
Dynamic and Distributed Online Convex Optimization for Demand Response of Commercial Buildings
We extend the regret analysis of the online distributed weighted dual
averaging (DWDA) algorithm [1] to the dynamic setting and provide the tightest
dynamic regret bound known to date with respect to the time horizon for a
distributed online convex optimization (OCO) algorithm. Our bound is linear in
the cumulative difference between consecutive optima and does not depend
explicitly on the time horizon. We use dynamic-online DWDA (D-ODWDA) and
formulate a performance-guaranteed distributed online demand response approach
for heating, ventilation, and air-conditioning (HVAC) systems of commercial
buildings. We show the performance of our approach for fast timescale demand
response in numerical simulations and obtain demand response decisions that
closely reproduce the centralized optimal ones.
Approximate Multi-Agent Fitted Q Iteration
We formulate an efficient approximation for multi-agent batch reinforcement
learning, the approximate multi-agent fitted Q iteration (AMAFQI). We present a
detailed derivation of our approach. We propose an iterative policy search and
show that it yields a greedy policy with respect to multiple approximations of
the centralized, standard Q-function. In each iteration and policy evaluation,
AMAFQI requires a number of computations that scales linearly with the number
of agents whereas the analogous number of computations increases exponentially
for the fitted Q iteration (FQI), one of the most commonly used approaches in
batch reinforcement learning. This property of AMAFQI is fundamental for the
design of a tractable multi-agent approach. We evaluate the performance of
AMAFQI and compare it to FQI in numerical simulations. Numerical examples
illustrate the significant computation time reduction when using AMAFQI instead
of FQI in multi-agent problems and corroborate the similar decision-making
performance of both approaches.
An Online Newton's Method for Time-varying Linear Equality Constraints
We consider online optimization problems with time-varying linear equality
constraints. In this framework, an agent makes sequential decisions using only
prior information. At every round, the agent suffers an environment-determined
loss and must satisfy time-varying constraints. Both the loss functions and the
constraints can be chosen adversarially. We propose the Online Projected
Equality-constrained Newton Method (OPEN-M) to tackle this family of problems.
We obtain sublinear dynamic regret and constraint violation bounds for OPEN-M
under mild conditions. Namely, smoothness of the loss function and boundedness
of the inverse Hessian at the optimum are required, but not convexity. Finally,
we show OPEN-M outperforms state-of-the-art online constrained optimization
algorithms in a numerical network flow application.
Second-order Online Nonconvex Optimization
We present the online Newton's method, a single-step second-order method for
online nonconvex optimization. We analyze its performance and obtain a dynamic
regret bound that is linear in the cumulative variation between round optima.
We show that if the variation between round optima is limited, the method leads
to a constant regret bound. In the general case, the online Newton's method
outperforms online convex optimization algorithms for convex functions and
performs similarly to a specialized algorithm for strongly convex functions. We
simulate the performance of the online Newton's method on a nonlinear,
nonconvex moving target localization example and find that it outperforms a
first-order approach.
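A single-step second-order update of the kind this abstract describes can be sketched in one dimension as follows. This is only an illustration under stated assumptions: the scalar setting, the function names, and the quadratic moving-target loss are hypothetical and are not taken from the paper.

```python
def online_newton_step(x, grad, hess):
    """One Newton update x - f'(x) / f''(x) using the current round's loss.

    Assumes a scalar decision and a nonzero second derivative at x.
    """
    return x - grad(x) / hess(x)


def track(targets, x0=0.0, curvature=2.0):
    """Follow a moving target with the loss 0.5 * curvature * (x - c)^2."""
    x = x0
    iterates = []
    for c in targets:
        # The decision for this round is made before the loss is revealed.
        iterates.append(x)
        x = online_newton_step(
            x,
            grad=lambda z, c=c: curvature * (z - c),
            hess=lambda z: curvature,
        )
    return iterates


iterates = track([1.0, 1.5, 2.0, 2.5])
```

For this quadratic loss a Newton step lands on the current round's optimum, so each decision equals the previous round's target and the tracking error equals the variation between consecutive optima. That is consistent with a regret bound stated in terms of the cumulative variation between round optima.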
Optimally Scheduling Public Safety Power Shutoffs
In an effort to reduce power system-caused wildfires, utilities carry out
public safety power shutoffs (PSPS) in which portions of the grid are
de-energized to mitigate the risk of ignition. The decision to call a PSPS must
balance reducing ignition risks and the negative impact of service
interruptions. In this work, we consider three PSPS scheduling scenarios, which
we model as dynamic programs. In the first two scenarios, we assume that N
PSPSs are budgeted as part of the investment strategy. In the first scenario, a
penalty is incurred for each PSPS declared past the Nth event. In the second,
we assume that some costs can be recovered if the number of PSPSs is below N
while still being subject to a penalty if above N. In the third, the system
operator wants to minimize the number of PSPSs such that the total expected cost
is below a threshold. We provide optimal or asymptotically optimal policies for
each case, the first two of which have closed-form expressions. Lastly, we
establish the applicability of the first PSPS model's policy to critical-peak
pricing, and obtain an optimal scheduling policy to reduce the peak demand
based on weather observations.
Tensor-based Space Debris Detection for Satellite Mega-constellations
Thousands of satellites, asteroids, and rocket bodies break, collide, or
degrade, resulting in large amounts of space debris in low Earth orbit. The
presence of space debris poses a serious threat to satellite
mega-constellations and to future space missions. Debris can be avoided if
detected within the safety range of a satellite. In this paper, an integrated
sensing and communication technique is proposed to detect space debris for
satellite mega-constellations. The canonical polyadic (CP) tensor decomposition
method is used to estimate the rank of the tensor, which denotes the number of
paths, including line-of-sight and non-line-of-sight, by exploiting the sparsity
of the THz channel with limited scattering. The analysis reveals that the
reflected THz signals can be utilized for the detection of space debris. The CP
decomposition is cast as an optimization problem and solved using the
alternating least squares (ALS) algorithm. Simulation results show that the
probability of detection of the proposed tensor-based scheme is higher than that
of the conventional energy-based detection scheme for space debris detection.
Evolution of High Throughput Satellite Systems: Vision, Requirements, and Key Technologies
High throughput satellites (HTS), with their digital payload technology, are
expected to play a key role as enablers of the upcoming 6G networks. HTS are
mainly designed to provide higher data rates and capacities. Fueled by
technological advancements including beamforming, advanced modulation
techniques, reconfigurable phased array technologies, and electronically
steerable antennas, HTS have emerged as a fundamental component for future
network generations. This paper offers a comprehensive state-of-the-art review of HTS
systems, with a focus on standardization, patents, channel multiple access
techniques, routing, load balancing, and the role of software-defined
networking (SDN). In addition, we provide a vision for next-generation satellite
systems, which we name extremely-HTS (EHTS), toward autonomous satellites supported by
the main requirements and key technologies expected for these systems. The EHTS
system will be designed such that it maximizes spectrum reuse and data rates,
and flexibly steers the capacity to satisfy user demand. We introduce a novel
architecture for future regenerative payloads while summarizing the challenges
imposed by this architecture.