Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization
Creating impact in real-world settings requires artificial intelligence
techniques to span the full pipeline from data, to predictive models, to
decisions. These components are typically approached separately: a machine
learning model is first trained via a measure of predictive accuracy, and then
its predictions are used as input into an optimization algorithm which produces
a decision. However, the loss function used to train the model may easily be
misaligned with the end goal, which is to make the best decisions possible.
Hand-tuning the loss function to align with optimization is a difficult and
error-prone process (which is often skipped entirely).
We focus on combinatorial optimization problems and introduce a general
framework for decision-focused learning, where the machine learning model is
directly trained in conjunction with the optimization algorithm to produce
high-quality decisions. Technically, our contribution is a means of integrating
common classes of discrete optimization problems into deep learning or other
predictive models, which are typically trained via gradient descent. The main
idea is to use a continuous relaxation of the discrete problem to propagate
gradients through the optimization procedure. We instantiate this framework for
two broad classes of combinatorial problems: linear programs and submodular
maximization. Experimental results across a variety of domains show that
decision-focused learning often leads to improved optimization performance
compared to traditional methods. We find that standard measures of accuracy are
not a reliable proxy for a predictive model's utility in optimization, and our
method's ability to specify the true goal as the model's training objective
yields substantial dividends across a range of decision problems.
Comment: Full version of paper accepted at AAAI 201
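The core idea above, relaxing a discrete problem so gradients flow through the decision, can be sketched with a toy selection task. This is a minimal, dependency-light illustration, not the paper's method: it assumes a linear predictive model, a "pick the k highest-value items" decision, a softmax relaxation of top-k, and a finite-difference gradient where practical implementations would use automatic differentiation. All names are illustrative.

```python
import numpy as np

# Hypothetical end-to-end setup: a linear model predicts item values from
# features, and the decision is to pick the k highest-value items. The
# discrete top-k choice is relaxed to a temperature-controlled softmax so
# that decision quality is differentiable in the model parameters.

rng = np.random.default_rng(0)
n_items, n_feats, k = 20, 5, 3
X = rng.normal(size=(n_items, n_feats))                # item features
beta = np.array([1.0, -2.0, 0.5, 0.0, 1.5])            # hidden true weights
true_vals = X @ beta + 0.1 * rng.normal(size=n_items)  # true item values

w = np.zeros(n_feats)                                  # predictive model

def soft_select(pred, temp=0.5):
    """Continuous relaxation of 'select the k highest-predicted items'."""
    z = np.exp((pred - pred.max()) / temp)
    return k * z / z.sum()                             # fractional picks, sum to k

def decision_quality(weights):
    """True objective value achieved by a (possibly fractional) selection."""
    return weights @ true_vals

# Decision-focused training: gradient ascent on decision quality, with the
# gradient propagated through the softmax relaxation (finite differences
# keep the sketch dependency-free; autodiff would be used in practice).
eps, lr = 1e-5, 0.02
for _ in range(300):
    grad = np.zeros_like(w)
    for i in range(n_feats):
        wp, wm = w.copy(), w.copy()
        wp[i] += eps
        wm[i] -= eps
        grad[i] = (decision_quality(soft_select(X @ wp))
                   - decision_quality(soft_select(X @ wm))) / (2 * eps)
    w += lr * grad

# The hard (discrete) decision induced by the trained model.
chosen = np.argsort(X @ w)[-k:]
print("achieved:", true_vals[chosen].sum())
```

Note that the training signal here is decision quality itself, not predictive accuracy, which is exactly the misalignment the abstract describes: a model can predict values inaccurately in regions that do not affect which items get chosen, yet still induce near-optimal decisions.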
Three fundamental pillars of multi-agent team formation (Doctoral Consortium)
Teams of voting agents are a powerful tool for solving complex problems. When forming such teams, there are three fundamental issues that must be addressed: (i) selecting which agents should form the team; (ii) aggregating the opinions of the agents; and (iii) assessing the performance of the team. In this thesis we address all three of these points.
Unleashing the power of multi-agent voting teams
Teams of voting agents have great potential for finding optimal solutions. However, there are fundamental challenges to using such teams effectively: (i) selecting agents; (ii) aggregating opinions; and (iii) assessing performance. I address all of these challenges with theoretical and experimental contributions.
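As a concrete instance of issue (ii), opinion aggregation, a team can combine agent votes by plurality: each agent proposes a candidate solution and the team adopts the most-voted one. The agents and candidate names below are purely illustrative; the thesis studies aggregation far more generally.

```python
from collections import Counter

def plurality(votes):
    """Return the candidate with the most votes (ties: first seen wins)."""
    return Counter(votes).most_common(1)[0][0]

# Five agents each vote for a candidate move; the team adopts the winner.
team_votes = ["move_a", "move_b", "move_a", "move_c", "move_a"]
print(plurality(team_votes))  # → move_a
```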
Learning adversary behavior in security games: A PAC model perspective
Recent applications of Stackelberg Security Games (SSG), from wildlife crime
to urban crime, have employed machine learning tools to learn and predict
adversary behavior using available data about defender-adversary interactions.
Given these recent developments, this paper takes the approach of directly
learning the adversary's response function. Using the PAC model, it lays a
firm theoretical foundation for learning in SSGs (e.g., theoretically
answering how many samples are required to learn adversary behavior) and
provides utility guarantees when the learned adversary
model is used to plan the defender's strategy. The paper also aims to answer
practical questions such as how much more data is needed to improve an
adversary model's accuracy. Additionally, we explain a recently observed
phenomenon: high prediction accuracy for learned adversary behavior is not
sufficient to discover the utility-maximizing defender strategy. We provide four main
contributions: (1) a PAC model of learning adversary response functions in
SSGs; (2) PAC-model analysis of the learning of key, existing bounded
rationality models in SSGs; (3) an entirely new approach to adversary modeling
based on a non-parametric class of response functions with PAC-model analysis;
and (4) identification of conditions under which the best defender strategy
against the learned adversary behavior is indeed the optimal strategy.
Finally, we conduct experiments with real-world data from a national park in
Uganda, demonstrating the benefit of our new adversary modeling approach and
verifying our PAC model predictions.
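To make the flavor of contribution (1) concrete, here is the standard realizable-case PAC sample-complexity bound for a finite hypothesis class H: with m ≥ (1/ε)(ln|H| + ln(1/δ)) samples, any hypothesis consistent with the data has error at most ε with probability at least 1 − δ. This generic textbook bound is only a stand-in; the paper's actual bounds are tailored to its specific classes of adversary response functions.

```python
import math

def pac_sample_bound(h_size, eps, delta):
    """Samples sufficient to eps-learn a finite class (realizable case):
    m >= (ln|H| + ln(1/delta)) / eps."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / eps)

# e.g., 10^6 candidate adversary response functions, 5% error, 95% confidence
print(pac_sample_bound(10**6, 0.05, 0.05))  # → 337
```

Bounds of this form directly answer the practical question raised above, how much more data is needed to improve an adversary model's accuracy: halving ε roughly doubles the required sample size, while shrinking δ costs only logarithmically.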