Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization
Creating impact in real-world settings requires artificial intelligence
techniques to span the full pipeline from data, to predictive models, to
decisions. These components are typically approached separately: a machine
learning model is first trained via a measure of predictive accuracy, and then
its predictions are used as input into an optimization algorithm which produces
a decision. However, the loss function used to train the model may easily be
misaligned with the end goal, which is to make the best decisions possible.
Hand-tuning the loss function to align with optimization is a difficult and
error-prone process (which is often skipped entirely).
We focus on combinatorial optimization problems and introduce a general
framework for decision-focused learning, where the machine learning model is
directly trained in conjunction with the optimization algorithm to produce
high-quality decisions. Technically, our contribution is a means of integrating
common classes of discrete optimization problems into deep learning or other
predictive models, which are typically trained via gradient descent. The main
idea is to use a continuous relaxation of the discrete problem to propagate
gradients through the optimization procedure. We instantiate this framework for
two broad classes of combinatorial problems: linear programs and submodular
maximization. Experimental results across a variety of domains show that
decision-focused learning often leads to improved optimization performance
compared to traditional methods. We find that standard measures of accuracy are
not a reliable proxy for a predictive model's utility in optimization, and our
method's ability to specify the true goal as the model's training objective
yields substantial dividends across a range of decision problems.
Comment: Full version of paper accepted at AAAI 201
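The core idea above, relaxing a discrete problem so gradients flow through the decision, can be sketched with a toy selection task. This is a minimal, dependency-light illustration, not the paper's method: it assumes a linear predictive model, a "pick the k highest-value items" decision, a softmax relaxation of top-k, and a finite-difference gradient where practical implementations would use automatic differentiation. All names are illustrative.

```python
import numpy as np

# Hypothetical end-to-end setup: a linear model predicts item values from
# features, and the decision is to pick the k highest-value items. The
# discrete top-k choice is relaxed to a temperature-controlled softmax so
# that decision quality is differentiable in the model parameters.

rng = np.random.default_rng(0)
n_items, n_feats, k = 20, 5, 3
X = rng.normal(size=(n_items, n_feats))                # item features
beta = np.array([1.0, -2.0, 0.5, 0.0, 1.5])            # hidden true weights
true_vals = X @ beta + 0.1 * rng.normal(size=n_items)  # true item values

w = np.zeros(n_feats)                                  # predictive model

def soft_select(pred, temp=0.5):
    """Continuous relaxation of 'select the k highest-predicted items'."""
    z = np.exp((pred - pred.max()) / temp)
    return k * z / z.sum()                             # fractional picks, sum to k

def decision_quality(weights):
    """True objective value achieved by a (possibly fractional) selection."""
    return weights @ true_vals

# Decision-focused training: gradient ascent on decision quality, with the
# gradient propagated through the softmax relaxation (finite differences
# keep the sketch dependency-free; autodiff would be used in practice).
eps, lr = 1e-5, 0.02
for _ in range(300):
    grad = np.zeros_like(w)
    for i in range(n_feats):
        wp, wm = w.copy(), w.copy()
        wp[i] += eps
        wm[i] -= eps
        grad[i] = (decision_quality(soft_select(X @ wp))
                   - decision_quality(soft_select(X @ wm))) / (2 * eps)
    w += lr * grad

# The hard (discrete) decision induced by the trained model.
chosen = np.argsort(X @ w)[-k:]
print("achieved:", true_vals[chosen].sum())
```

Note that the training signal here is decision quality itself, not predictive accuracy, which is exactly the misalignment the abstract describes: a model can predict values inaccurately in regions that do not affect which items get chosen, yet still induce near-optimal decisions.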
Three fundamental pillars of multi-agent team formation (Doctoral Consortium)
Teams of voting agents are a powerful tool for solving complex problems. When forming such teams, there are three fundamental issues that must be addressed: (i) selecting which agents should form the team; (ii) aggregating the opinions of the agents; and (iii) assessing the performance of the team. In this thesis we address all three of these points.
Unleashing the power of multi-agent voting teams
Teams of voting agents have great potential for finding optimal solutions. However, there are fundamental challenges to using such teams effectively: (i) selecting agents; (ii) aggregating opinions; and (iii) assessing performance. I address all of these challenges with theoretical and experimental contributions.
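As a concrete instance of issue (ii), opinion aggregation, a team can combine agent votes by plurality: each agent proposes a candidate solution and the team adopts the most-voted one. The agents and candidate names below are purely illustrative; the thesis studies aggregation far more generally.

```python
from collections import Counter

def plurality(votes):
    """Return the candidate with the most votes (ties: first seen wins)."""
    return Counter(votes).most_common(1)[0][0]

# Five agents each vote for a candidate move; the team adopts the winner.
team_votes = ["move_a", "move_b", "move_a", "move_c", "move_a"]
print(plurality(team_votes))  # → move_a
```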
Learning adversary behavior in security games: A PAC model perspective
Recent applications of Stackelberg Security Games (SSG), from wildlife crime
to urban crime, have employed machine learning tools to learn and predict
adversary behavior using available data about defender-adversary interactions.
Given these recent developments, this paper takes the approach of directly
learning the adversary's response function. Using the PAC model, it lays a
firm theoretical foundation for learning in SSGs (e.g., theoretically
answering how many samples are required to learn adversary behavior) and
provides utility guarantees when the learned adversary
model is used to plan the defender's strategy. The paper also aims to answer
practical questions such as how much more data is needed to improve an
adversary model's accuracy. Additionally, we explain a recently observed
phenomenon: high prediction accuracy for learned adversary behavior is not
sufficient to discover the utility-maximizing defender strategy. We provide four main
contributions: (1) a PAC model of learning adversary response functions in
SSGs; (2) PAC-model analysis of the learning of key, existing bounded
rationality models in SSGs; (3) an entirely new approach to adversary modeling
based on a non-parametric class of response functions with PAC-model analysis;
and (4) identification of conditions under which the best defender strategy
against the learned adversary behavior is indeed the optimal strategy.
Finally, we conduct experiments with real-world data from a national park in
Uganda, demonstrating the benefit of our new adversary modeling approach and
verifying our PAC model predictions.
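To make the flavor of contribution (1) concrete, here is the standard realizable-case PAC sample-complexity bound for a finite hypothesis class H: with m ≥ (1/ε)(ln|H| + ln(1/δ)) samples, any hypothesis consistent with the data has error at most ε with probability at least 1 − δ. This generic textbook bound is only a stand-in; the paper's actual bounds are tailored to its specific classes of adversary response functions.

```python
import math

def pac_sample_bound(h_size, eps, delta):
    """Samples sufficient to eps-learn a finite class (realizable case):
    m >= (ln|H| + ln(1/delta)) / eps."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / eps)

# e.g., 10^6 candidate adversary response functions, 5% error, 95% confidence
print(pac_sample_bound(10**6, 0.05, 0.05))  # → 337
```

Bounds of this form directly answer the practical question raised above, how much more data is needed to improve an adversary model's accuracy: halving ε roughly doubles the required sample size, while shrinking δ costs only logarithmically.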