24,525 research outputs found
Sequential Design for Optimal Stopping Problems
We propose a new approach to solve optimal stopping problems via simulation.
Working within the backward dynamic programming/Snell envelope framework, we
augment the methodology of Longstaff-Schwartz that focuses on approximating the
stopping strategy. Namely, we introduce adaptive generation of the stochastic
grids anchoring the simulated sample paths of the underlying state process.
This allows for active learning of the classifiers partitioning the state space
into the continuation and stopping regions. To this end, we examine sequential
design schemes that adaptively place new design points close to the stopping
boundaries. We then discuss dynamic regression algorithms that can implement
such recursive estimation and local refinement of the classifiers. The new
algorithm is illustrated with a variety of numerical experiments, showing that
an order of magnitude savings in terms of design size can be achieved. We also
compare with existing benchmarks in the context of pricing multi-dimensional
Bermudan options.Comment: 24 page
Sequential Design for Ranking Response Surfaces
We propose and analyze sequential design methods for the problem of ranking
several response surfaces. Namely, given response surfaces over a
continuous input space , the aim is to efficiently find the index of
the minimal response across the entire . The response surfaces are not
known and have to be noisily sampled one-at-a-time. This setting is motivated
by stochastic control applications and requires joint experimental design both
in space and response-index dimensions. To generate sequential design
heuristics we investigate stepwise uncertainty reduction approaches, as well as
sampling based on posterior classification complexity. We also make connections
between our continuous-input formulation and the discrete framework of pure
regret in multi-armed bandits. To model the response surfaces we utilize
kriging surrogates. Several numerical examples using both synthetic data and an
epidemics control problem are provided to illustrate our approach and the
efficacy of respective adaptive designs.Comment: 26 pages, 7 figures (updated several sections and figures
Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
``single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 table
- …