285 research outputs found
On the Combination of Game-Theoretic Learning and Multi Model Adaptive Filters
This paper casts the coordination of a team of robots within the framework of game-theoretic learning algorithms. In particular, a novel variant of fictitious play is proposed, which uses multi-model adaptive filters to estimate other players’ strategies. The proposed algorithm can be used as a coordination mechanism between players when they must make decisions under uncertainty. Each player chooses an action after taking into account the actions of the other players and also the uncertainty. Uncertainty can arise either from noisy observations or from the presence of various types of other players. In addition, in contrast to other game-theoretic and heuristic algorithms for distributed optimisation, it is not necessary to find the optimal parameters a priori. Various parameter values can be used initially as inputs to different models. Therefore, the resulting decisions will be aggregate results of all the parameter values. Simulations are used to test the performance of the proposed methodology against other game-theoretic learning algorithms.
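For orientation, here is a minimal sketch of classic two-player fictitious play, the baseline that the abstract's variant builds on. It is not the paper's algorithm: the multi-model adaptive filter is replaced by plain empirical action counts, and the function name, the uniform prior and the payoff matrices are assumptions made for illustration.

```python
import numpy as np

def fictitious_play(payoff_a, payoff_b, rounds=200):
    """Classic fictitious play for a two-player matrix game (illustrative only).

    payoff_a[i, j] and payoff_b[i, j] are the payoffs to players A and B
    when A plays action i and B plays action j.
    Returns the empirical action frequencies of A and of B.
    """
    n_a, n_b = payoff_a.shape
    counts_a = np.ones(n_a)  # observed counts of A's actions (uniform prior, an assumption)
    counts_b = np.ones(n_b)  # observed counts of B's actions (uniform prior, an assumption)
    for _ in range(rounds):
        belief_about_a = counts_a / counts_a.sum()
        belief_about_b = counts_b / counts_b.sum()
        # Each player best-responds to the empirical strategy of the other.
        action_a = int(np.argmax(payoff_a @ belief_about_b))
        action_b = int(np.argmax(belief_about_a @ payoff_b))
        counts_a[action_a] += 1
        counts_b[action_b] += 1
    return counts_a / counts_a.sum(), counts_b / counts_b.sum()

# Hypothetical coordination game: both players prefer to match on action 0.
coordination = np.array([[2.0, 0.0], [0.0, 1.0]])
freq_a, freq_b = fictitious_play(coordination, coordination)
```

In this toy game both players settle on action 0. The variant described in the abstract would replace the count-based beliefs with estimates from several filters run under different parameter values.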
Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
"single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.
Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 tables.
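The pairwise ranking idea in the abstract above can be sketched as follows: at each decision point, the task the expert scheduled is compared against every task the expert passed over, and a classifier is trained on the feature differences. This is a rough illustration under stated assumptions, not the authors' implementation. The linear logistic model, the function names and the two-feature task encoding are all hypothetical.

```python
import numpy as np

def pairwise_examples(candidates, chosen):
    """Turn one expert decision into labelled feature-difference vectors.

    candidates: array of shape (n_tasks, n_features) for one decision point.
    chosen: index of the task the expert scheduled.
    """
    X, y = [], []
    for i, feats in enumerate(candidates):
        if i == chosen:
            continue
        X.append(candidates[chosen] - feats)  # chosen ranked above other
        y.append(1.0)
        X.append(feats - candidates[chosen])  # other ranked below chosen
        y.append(0.0)
    return np.array(X), np.array(y)

def fit_logistic(X, y, lr=0.5, steps=500):
    """Plain gradient-descent logistic regression without a bias term."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def rank_best(candidates, w):
    """Pick the candidate with the highest learned score."""
    return int(np.argmax(candidates @ w))

# Synthetic demonstrations (an assumption): the "expert" always schedules the
# task with the smallest first feature, e.g. the earliest deadline.
rng = np.random.default_rng(0)
parts = [pairwise_examples(c, int(np.argmin(c[:, 0])))
         for c in (rng.random((4, 2)) for _ in range(30))]
weights = fit_logistic(np.vstack([X for X, _ in parts]),
                       np.concatenate([y for _, y in parts]))
```

Because only differences between candidates are used, the learner never enumerates the state space, which matches the model-free property claimed in the abstract.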
Fair collaborative vehicle routing: A deep multi-agent reinforcement learning approach
Collaborative vehicle routing occurs when carriers collaborate through
sharing their transportation requests and performing transportation requests on
behalf of each other. This achieves economies of scale, thus reducing cost,
greenhouse gas emissions and road congestion. But which carrier should partner
with whom, and how much should each carrier be compensated? Traditional game
theoretic solution concepts are expensive to calculate as the characteristic
function scales exponentially with the number of agents. This would require
solving the vehicle routing problem (NP-hard) an exponential number of times.
We therefore propose to model this problem as a coalitional bargaining game
solved using deep multi-agent reinforcement learning, where - crucially -
agents are not given access to the characteristic function. Instead, we
implicitly reason about the characteristic function; thus, when deployed in
production, we only need to evaluate the expensive post-collaboration vehicle
routing problem once. Our contribution is that we are the first to consider
both the route allocation problem and gain sharing problem simultaneously -
without access to the expensive characteristic function. Through decentralised
machine learning, our agents bargain with each other and agree to outcomes that
correlate well with the Shapley value - a fair profit allocation mechanism.
Importantly, we are able to achieve a reduction in run-time of 88%.
Comment: Final, published version can be found here:
https://www.sciencedirect.com/science/article/pii/S0968090X2300366
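To see why the abstract above calls the traditional approach expensive, consider a minimal sketch of the exact Shapley value. It must query the characteristic function for every coalition, so the number of evaluations grows exponentially with the number of agents. The function name and the toy characteristic function are assumptions for illustration. In the vehicle routing setting, each call to `value` would correspond to solving a routing problem.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition (illustrative only).

    players: list of hashable player identifiers.
    value: characteristic function mapping a frozenset of players to a number.
    """
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of a coalition of size k that p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p to coalition s.
                phi[p] += weight * (value(s | {p}) - value(s))
    return phi

# Hypothetical characteristic function: a coalition saves 10 units only if
# carrier "A" takes part and at least one other carrier joins.
def toy_value(coalition):
    return 10.0 if "A" in coalition and len(coalition) >= 2 else 0.0

allocation = shapley_values(["A", "B", "C"], toy_value)
```

In this toy game carrier A receives 20/3 and carriers B and C receive 5/3 each, which together sum to the grand-coalition value of 10.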
A survey of swarm intelligence for dynamic optimization: algorithms and applications
Swarm intelligence (SI) algorithms, including ant colony optimization, particle swarm optimization, bee-inspired algorithms, bacterial foraging optimization, firefly algorithms, fish swarm optimization and many more, have been proven to be good methods for addressing difficult optimization problems in stationary environments. Most SI algorithms have been developed to address stationary optimization problems and hence can converge on the (near-) optimum solution efficiently. However, many real-world problems have a dynamic environment that changes over time. For such dynamic optimization problems (DOPs), it is difficult for a conventional SI algorithm to track the changing optimum once the algorithm has converged on a solution. In the last two decades, there has been growing interest in addressing DOPs using SI algorithms due to their adaptation capabilities. This paper presents a broad review of SI dynamic optimization (SIDO) focused on several classes of problems, such as discrete, continuous, constrained, multi-objective and classification problems, and real-world applications. In addition, this paper focuses on the enhancement strategies integrated in SI algorithms to address dynamic changes, and on the performance measurements and benchmark generators used in SIDO. Finally, some considerations about future directions in the subject are given.
Combining Optimization and Machine Learning for the Formation of Collectives
This thesis considers the problem of forming collectives of agents for real-world applications aligned with Sustainable Development Goals (e.g., shared mobility and cooperative learning). Such problems require fast approaches that can produce solutions of high quality for hundreds of agents. With this goal in mind, existing solutions for the formation of collectives focus on enhancing the optimization approach by exploiting the characteristics of a domain. However, the resulting approaches rely on specific domain knowledge and are not transferable to other collective formation problems. Therefore, approaches that can be applied to various problems need to be studied in order to obtain general approaches that do not require prior knowledge of the domain. Along these lines, this thesis proposes a general approach for the formation of collectives based on a novel combination of machine learning and an Integer Linear Program. More precisely, a machine learning component is trained to generate a set of promising collectives that are likely to be part of a solution. Then, such collectives and their corresponding utility values are introduced into an Integer Linear Program which finds a solution to the collective formation problem. In that way, the machine learning component learns the structure shared by "good" collectives in a particular domain, making the whole approach valid for various applications. In addition, the empirical analysis conducted on two real-world domains (i.e., ridesharing and team formation) shows that the proposed approach provides solutions of comparable quality to state-of-the-art approaches specific to each domain. Finally, this thesis also shows that the proposed approach can be extended to problems that combine the formation of collectives with other optimization objectives.
Thus, this thesis proposes an extension of the collective formation approach for assigning pickup and delivery locations to robots in a warehouse environment. The experimental evaluation shows that, although it is possible to use the collective formation approach for that purpose, several improvements are required to compete with state-of-the-art approaches. Overall, this thesis aims to demonstrate that machine learning can be successfully intertwined with classical optimization approaches for the formation of collectives by learning the structure of a domain, reducing the need for ad-hoc algorithms devised for a specific application.
Quantifying and Visualizing City Truck Route Network Efficiency Using a Virtual Testbed: Models for an Urban Freight and Parcel Delivery Virtual Testbed in NYC
69A3551747119
First, this project explored routing app designs that can be of use to NYC DOT in informing truck drivers in NYC. This involved developing a prototype app and engaging in a hackathon in Fall 2022 to refine the visualization of the routing data. Second, we leveraged public data to construct a synthetic population of trucks that can be incorporated into a multiagent simulation that allows for dynamic passenger and commercial vehicle interactions. The synthetic truck population, which includes schedules of trip chains for each individual truck, will be incorporated into MATSim-NYC (He et al., 2021). Third, we proposed a new model for predicting zonal residential parcel delivery volumes and VMT that is applicable to large-scale scenarios, and validated the model using data from New York City (NYC).