SOLO: Search Online, Learn Offline for Combinatorial Optimization Problems
We study combinatorial problems with real world applications such as machine
scheduling, routing, and assignment. We propose a method that combines
Reinforcement Learning (RL) and planning. The method applies equally to the
offline and online variants of a combinatorial problem; in the online variant,
the problem components (e.g., jobs in scheduling problems) are not known in
advance but arrive during the decision-making process. Our solution
is generic and scalable, and leverages distributional knowledge of the
problem parameters. We frame the solution process as an MDP, and take a Deep
Q-Learning approach wherein states are represented as graphs, thereby allowing
our trained policies to deal with arbitrary changes in a principled manner.
Though learned policies work well in expectation, small deviations can have
substantial negative effects in combinatorial settings. We mitigate these
drawbacks by employing our graph-convolutional policies as non-optimal
heuristics in a compatible search algorithm, Monte Carlo Tree Search, to
significantly improve overall performance. We demonstrate our method on two
problems: Machine Scheduling and Capacitated Vehicle Routing. We show that our
method outperforms custom-tailored mathematical solvers, state-of-the-art
learning-based algorithms, and common heuristics, both in computation time and
performance.
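
The idea of wrapping a non-optimal policy in a search procedure can be
illustrated with a minimal sketch. Everything below is an assumption made for
illustration and is not taken from the paper: a hand-written shortest-job-first
rule stands in for the learned graph-convolutional policy, a one-step rollout
search stands in for full Monte Carlo Tree Search, and the toy problem is
single-machine scheduling with total weighted completion time as the cost. All
function names are hypothetical.

```python
from typing import List, Tuple

Job = Tuple[int, int]  # (processing_time, weight); toy data format, an assumption


def heuristic_pick(jobs: List[Job], remaining: List[int]) -> int:
    """Stand-in for a learned policy: pick the shortest remaining job."""
    return min(remaining, key=lambda j: (jobs[j][0], j))


def cost(jobs: List[Job], order: List[int]) -> int:
    """Total weighted completion time of a complete schedule."""
    t = total = 0
    for j in order:
        t += jobs[j][0]
        total += jobs[j][1] * t
    return total


def complete_with_heuristic(jobs: List[Job], prefix: List[int],
                            remaining: List[int]) -> List[int]:
    """Extend a partial schedule to a full one by following the heuristic."""
    order, rest = list(prefix), list(remaining)
    while rest:
        j = heuristic_pick(jobs, rest)
        order.append(j)
        rest.remove(j)
    return order


def greedy_schedule(jobs: List[Job]) -> List[int]:
    """Schedule produced by the heuristic alone, with no search."""
    return complete_with_heuristic(jobs, [], list(range(len(jobs))))


def rollout_schedule(jobs: List[Job]) -> List[int]:
    """One-step lookahead: try each next job, finish with the heuristic,
    and commit to the job whose completed schedule is cheapest."""
    prefix: List[int] = []
    remaining = list(range(len(jobs)))
    while remaining:
        def simulated_cost(j: int) -> int:
            rest = [r for r in remaining if r != j]
            return cost(jobs, complete_with_heuristic(jobs, prefix + [j], rest))

        best = min(remaining, key=simulated_cost)
        prefix.append(best)
        remaining.remove(best)
    return prefix


jobs = [(3, 1), (1, 1), (4, 10), (2, 5)]
greedy = greedy_schedule(jobs)
searched = rollout_schedule(jobs)
print("heuristic only:", greedy, cost(jobs, greedy))
print("heuristic + search:", searched, cost(jobs, searched))
```

On this toy instance the search-wrapped schedule is cheaper than the one the
heuristic produces on its own, which mirrors, in a much simplified form, the
motivation the abstract gives for combining a learned policy with search.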