Solving Markov decision processes for network-level post-hazard recovery via simulation optimization and rollout
Computing optimal post-hazard recovery decisions to assure community resilience is a combinatorial decision-making problem under uncertainty: it requires solving a large-scale optimization problem whose difficulty is significantly aggravated by uncertainty. In this paper, we draw upon established tools from multiple research communities to provide an effective solution to this challenging problem. We present a stochastic model of damage to the water network (WN) of a testbed community following a severe earthquake and compute near-optimal recovery actions for restoring the WN. We formulate this stochastic decision-making problem as a Markov
Decision Process (MDP), and solve it using a popular class of heuristic
algorithms known as rollout. A simulation-based representation of MDPs is
utilized in conjunction with rollout and the Optimal Computing Budget
Allocation (OCBA) algorithm to address the resulting stochastic simulation
optimization problem. Our method employs non-myopic planning with efficient use
of simulation budget. We show, through simulation results, that rollout fused with OCBA performs competitively with rollout under total equal allocation (TEA) while using a meagre simulation budget of only 5-10% of TEA's, a crucial step towards addressing large-scale community recovery problems following natural disasters.
Comment: Submitted to Simulation Optimization for Cyber Physical Energy Systems (Special Session) at the 14th IEEE International Conference on Automation Science and Engineering
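For concreteness, here is a minimal Python sketch of the approach this abstract describes: rollout scores each candidate recovery action by Monte Carlo simulation under a heuristic base policy, and an OCBA-style rule concentrates replications on actions whose estimated values are close to the incumbent best. The `simulate_step` and `base_policy` interfaces and all parameter names are hypothetical, and the allocation shown is the standard simplified OCBA form, not necessarily the exact variant used in the paper.

```python
import math

def rollout_value(simulate_step, base_policy, state, action, horizon):
    """One noisy rollout: apply `action` now, then follow the heuristic
    base policy until the horizon; return the accumulated reward."""
    state, reward = simulate_step(state, action)
    total = reward
    for _ in range(horizon - 1):
        state, reward = simulate_step(state, base_policy(state))
        total += reward
    return total

def ocba_rollout(simulate_step, base_policy, state, actions,
                 total_budget, horizon=20, n0=5, batch=10):
    """Choose an action by rollout, spending the simulation budget
    OCBA-style: replications concentrate on actions whose estimated
    value is close to the current best, relative to their noise."""
    samples = {a: [rollout_value(simulate_step, base_policy, state, a, horizon)
                   for _ in range(n0)] for a in actions}
    spent = n0 * len(actions)
    while spent < total_budget:
        means = {a: sum(v) / len(v) for a, v in samples.items()}
        stds = {a: (sum((x - means[a]) ** 2 for x in v)
                    / max(len(v) - 1, 1)) ** 0.5 + 1e-9
                for a, v in samples.items()}
        best = max(means, key=means.get)  # assumes reward maximization
        # Simplified OCBA ratios: N_i proportional to (sigma_i / gap_i)^2
        # for i != best; N_best = sigma_best * sqrt(sum (N_i / sigma_i)^2).
        ratios = {a: (stds[a] / (abs(means[best] - means[a]) + 1e-9)) ** 2
                  for a in actions if a != best}
        ratios[best] = stds[best] * math.sqrt(
            sum((r / stds[a]) ** 2 for a, r in ratios.items()))
        z = sum(ratios.values()) or 1.0
        for a in actions:
            extra = max(1, round(batch * ratios[a] / z))
            samples[a].extend(rollout_value(simulate_step, base_policy,
                                            state, a, horizon)
                              for _ in range(extra))
            spent += extra
    means = {a: sum(v) / len(v) for a, v in samples.items()}
    return max(means, key=means.get)
```

The design point this illustrates: under equal allocation (TEA) every action receives `total_budget / len(actions)` replications, whereas the OCBA ratios shift budget away from clearly inferior actions, which is how the same selection quality can be reached with a small fraction of the simulations.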
Resource-Constrained Adaptive Search and Tracking for Sparse Dynamic Targets
This paper considers the problem of resource-constrained and noise-limited
localization and estimation of dynamic targets that are sparsely distributed
over a large area. We generalize an existing framework [Bashan et al., 2008] for
adaptive allocation of sensing resources to the dynamic case, accounting for
time-varying target behavior such as transitions to neighboring cells and
varying amplitudes over a potentially long time horizon. The proposed adaptive
sensing policy is driven by minimization of a modified version of the
previously introduced ARAP objective function, which is a surrogate function
for mean squared error within locations containing targets. We provide
theoretical upper bounds on the performance of adaptive sensing policies by
analyzing solutions with oracle knowledge of target locations, gaining insight
into the effect of target motion and amplitude variation as well as sparsity.
Exact minimization of the multi-stage objective function is infeasible, but
myopic optimization yields a closed-form solution. We propose a simple
non-myopic extension, the Dynamic Adaptive Resource Allocation Policy (D-ARAP),
that allocates a fraction of resources for exploring all locations rather than
solely exploiting the current belief state. Our numerical studies indicate that
D-ARAP has the following advantages: (a) it is more robust than the myopic
policy to noise, missing data, and model mismatch; (b) it performs comparably
to well-known approximate dynamic programming solutions but at significantly
lower computational complexity; and (c) it improves greatly upon non-adaptive
uniform resource allocation in terms of estimation error and probability of
detection.
Comment: 49 pages, 1 table, 11 figures
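As a rough illustration of the explore/exploit split described above, here is a short Python sketch. The function names, the `explore_frac` parameter, and the square-root-of-posterior exploitation rule are all assumptions standing in for the paper's actual closed-form myopic (ARAP-type) solution.

```python
import numpy as np

def d_arap_allocation(belief, total_effort, explore_frac=0.1):
    """One step of a D-ARAP-style allocation: spread a fixed fraction of
    the sensing budget uniformly over all cells (exploration) and focus
    the remainder on cells according to the current belief (exploitation)."""
    belief = np.asarray(belief, dtype=float)
    n = belief.size
    explore = np.full(n, explore_frac * total_effort / n)
    # Stand-in for the paper's closed-form myopic solution: weight each
    # cell by the square root of its posterior probability.
    w = np.sqrt(belief)
    w = w / w.sum() if w.sum() > 0 else np.full(n, 1.0 / n)
    exploit = (1.0 - explore_frac) * total_effort * w
    return explore + exploit

def propagate_belief(belief, transition):
    """Diffuse the posterior through a target-motion model, e.g. an
    n-by-n matrix of probabilities of moving to neighboring cells."""
    return transition @ belief

# Example: 5 cells, posterior concentrated on cell 2, unit budget.
b = np.array([0.05, 0.10, 0.70, 0.10, 0.05])
print(d_arap_allocation(b, total_effort=1.0, explore_frac=0.2))
```

The uniform exploration term is what distinguishes this from a purely myopic policy: even cells with near-zero posterior keep receiving some effort, which is the mechanism behind the robustness to missing data and model mismatch claimed in point (a).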