Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
"single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.
Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 tables.
A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning
We present a tutorial on Bayesian optimization, a method of finding the
maximum of expensive cost functions. Bayesian optimization employs the Bayesian
technique of setting a prior over the objective function and combining it with
evidence to get a posterior function. This permits a utility-based selection of
the next observation to make on the objective function, which must take into
account both exploration (sampling from areas of high uncertainty) and
exploitation (sampling areas likely to offer improvement over the current best
observation). We also present two detailed extensions of Bayesian optimization,
with experiments---active user modelling with preferences, and hierarchical
reinforcement learning---and a discussion of the pros and cons of Bayesian
optimization based on our experiences.
Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making
In multi-objective decision planning and learning, much attention is paid to
producing optimal solution sets that contain an optimal policy for every
possible user preference profile. We argue that the step that follows, i.e.,
determining which policy to execute by maximising the user's intrinsic utility
function over this (possibly infinite) set, is under-studied. This paper aims
to fill this gap. We build on previous work on Gaussian processes and pairwise
comparisons for preference modelling, extend it to the multi-objective decision
support scenario, and propose new ordered preference elicitation strategies
based on ranking and clustering. Our main contribution is an in-depth
evaluation of these strategies using computer and human-based experiments. We
show that our proposed elicitation strategies outperform the currently used
pairwise methods, and find that users prefer ranking most. Our experiments
further show that utilising monotonicity information in GPs by using a linear
prior mean at the start and virtual comparisons to the nadir and ideal points
increases performance. We demonstrate our decision support framework in a
real-world study on traffic regulation, conducted with the city of Amsterdam.
Comment: AAMAS 2018. Source code at
https://github.com/lmzintgraf/gp_pref_elici
Tour recommendation for groups
Consider a group of people who are visiting a major tourist city, such as New York, Paris, or Rome. It is reasonable to assume that each member of the group has his or her own interests or preferences about places to visit, which in general may differ from those of other members. Still, people almost always want to stay together, and so the following question naturally arises: What is the best tour that the group could take together in the city? This problem poses several challenges, ranging from understanding people’s expected attitudes towards potential points of interest, to modeling and providing good and viable solutions. Formulating this problem is challenging because of multiple competing objectives. For example, making the entire group as happy as possible generally conflicts with the objective that no member becomes disappointed. In this paper, we address the algorithmic implications of the above problem by providing various formulations that take into account the overall group satisfaction, the individual satisfaction, and the length of the tour. We then study the computational complexity of these formulations, provide effective and efficient practical algorithms, and, finally, evaluate them on datasets constructed from real city data.
A tutorial on recursive models for analyzing and predicting path choice behavior
The problem at the heart of this tutorial consists in modeling the path
choice behavior of network users. This problem has been extensively studied in
transportation science, where it is known as the route choice problem. In this
literature, individuals' path choices are typically predicted using discrete
choice models. This article is a tutorial on a specific category of discrete
choice models called recursive, and it makes three main contributions: First,
for the purpose of assisting future research on route choice, we provide a
comprehensive background on the problem, linking it to different fields
including inverse optimization and inverse reinforcement learning. Second, we
formally introduce the problem and the recursive modeling idea along with an
overview of existing models, their properties and applications. Third, we
extensively analyze illustrative examples from different angles so that a
novice reader can gain intuition on the problem and the advantages provided by
recursive models in comparison to path-based ones.
Requirement Analysis and Implementation of Multicriteria Analysis in the NEEDS Project
This report specifies the requirements for and implementation of the multicriteria analysis of future energy technologies performed by a large number of stakeholders within the EU-funded integrated project NEEDS. The report is composed of two main parts and the appendix.
The first part starts with a summary of the objectives of the analysis, followed by a detailed specification of the analyzed problem, in particular the analysis context, a discussion of the sets of criteria and alternatives, and the participation of the stakeholders. Next, the planned problem analysis process is first outlined and then discussed in more detail. Finally, the requirements for the multicriteria analysis are specified.
The second part deals with the implementation of the dedicated website developed for this analysis and later extended to support analysis of any multicriteria choice between discrete alternatives. It starts with an overview of the problem analysis process and the corresponding basic assumptions. The architecture of the application and its features are then presented. Lessons learned from the development and use of this application conclude this part of the report.
The appendix contains a review of the state of the art in applying multicriteria analysis to energy problems, as well as characteristics of three applications that exploit the multicriteria analysis methods for energy problems considered relevant to the analysis reported in this paper.
The Green Choice: Learning and Influencing Human Decisions on Shared Roads
Autonomous vehicles have the potential to increase the capacity of roads via
platooning, even when human drivers and autonomous vehicles share roads.
However, when users of a road network choose their routes selfishly, the
resulting traffic configuration may be very inefficient. Because of this, we
consider how to influence human decisions so as to decrease congestion on these
roads. We consider a network of parallel roads with two modes of
transportation: (i) human drivers who will choose the quickest route available
to them, and (ii) a ride-hailing service that provides an array of autonomous
vehicle ride options, each with different prices, to users. In this work, we
seek to design these prices so that when autonomous service users choose from
these options and human drivers selfishly choose their resulting routes, road
usage is maximized and transit delay is minimized. To do so, we formalize a
model of how autonomous service users make choices between routes with
different price/delay values. Developing a preference-based algorithm to learn
the preferences of the users, and using a vehicle flow model related to the
Fundamental Diagram of Traffic, we formulate a planning optimization to
maximize a social objective and demonstrate the benefit of the proposed routing
and learning scheme.
Comment: Submitted to CDC 201