Belief Tree Search for Active Object Recognition
Active Object Recognition (AOR) has been approached as an unsupervised
learning problem, in which optimal trajectories for object inspection are not
known and are to be discovered by reducing label uncertainty measures or
training with reinforcement learning. Such approaches have no guarantees of the
quality of their solution. In this paper, we treat AOR as a Partially
Observable Markov Decision Process (POMDP) and find near-optimal policies on
training data using Belief Tree Search (BTS) on the corresponding belief Markov
Decision Process (MDP). AOR then reduces to the problem of knowledge transfer
from near-optimal policies on the training set to the test set. We train a Long
Short Term Memory (LSTM) network to predict the best next action on the
training set rollouts. We show that the proposed AOR method generalizes well to
novel views of familiar objects and also to novel objects. We compare this
supervised scheme against guided policy search, and find that the LSTM network
reaches higher recognition accuracy than the guided policy method. We
further look into optimizing the observation function to increase the total
collected reward of optimal policy. In AOR, the observation function is known
only approximately. We propose a gradient-based update to this
approximate observation function to increase the total reward of any policy. We
show that by optimizing the observation function and retraining the supervised
LSTM network, the AOR performance on the test set improves significantly. Comment: IROS 201
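The belief MDP mentioned in this abstract tracks a probability distribution over object labels that is revised after each observation. As a minimal sketch only (not the authors' implementation), the Bayes update below assumes a static label and a hypothetical observation model supplied as a list of likelihoods; the names `belief_update`, `prior`, and `posterior` are illustrative.

```python
def belief_update(belief, likelihood):
    """Bayes update of a belief over object labels.

    belief[i]     : prior probability of label i
    likelihood[i] : P(observation | label i, action), taken from an
                    assumed (hypothetical) observation model
    """
    unnormalized = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Hypothetical example: three candidate labels with a uniform prior.
prior = [1 / 3, 1 / 3, 1 / 3]
# Assumed likelihoods of the observed view under each label.
posterior = belief_update(prior, [0.7, 0.2, 0.1])
```

A tree search over beliefs would apply this update once per simulated action and observation, so an inaccurate observation model propagates into every node of the tree.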
An Online Decision-Theoretic Pipeline for Responder Dispatch
The problem of dispatching emergency responders to service traffic accidents,
fire, distress calls and crimes plagues urban areas across the globe. While
such problems have been extensively looked at, most approaches are offline.
Such methodologies fail to capture the dynamically changing environments under
which critical emergency response occurs, and therefore, fail to be implemented
in practice. Any holistic approach towards creating a pipeline for effective
emergency response must also look at other challenges that it subsumes -
predicting when and where incidents happen and understanding the changing
environmental dynamics. We describe a system that collectively deals with all
these problems in an online manner, meaning that the models get updated with
streaming data sources. We highlight why such an approach is crucial to the
effectiveness of emergency response, and present an algorithmic framework that
can compute promising actions for a given decision-theoretic model for
responder dispatch. We argue that carefully crafted heuristic measures can
balance the trade-off between computational time and the quality of solutions
achieved and highlight why such an approach is more scalable and tractable than
traditional approaches. We also present an online mechanism for incident
prediction, as well as an approach based on recurrent neural networks for
learning and predicting environmental features that affect responder dispatch.
We compare our methodology with the prior state of the art and with existing
dispatch strategies in the field. The comparison shows that our approach reduces
response time with a drastic reduction in computational time. Comment: Appeared in ICCPS 201
Exploring search space trees using an adapted version of Monte Carlo tree search for combinatorial optimization problems
In this article, a novel approach to solve combinatorial optimization
problems is proposed. This approach makes use of a heuristic algorithm to
explore the search space tree of a problem instance. The algorithm is based on
Monte Carlo tree search, a popular algorithm in game playing that is used to
explore game trees. By leveraging the combinatorial structure of a problem,
several enhancements to the algorithm are proposed. These enhancements aim to
efficiently explore the search space tree by pruning subtrees, using a
heuristic simulation policy, reducing the domains of variables by eliminating
dominated value assignments and using a beam width. They are demonstrated for
two specific combinatorial optimization problems: the quay crane scheduling
problem with non-crossing constraints and the 0-1 knapsack problem.
Computational results show that the algorithm achieves promising results for
both problems and eight new best solutions for a benchmark set of instances are
found for the former problem. These results indicate that the algorithm is
competitive with the state-of-the-art. Apart from this, the results also show
evidence that the algorithm is able to learn to correct the incorrect choices
made by constructive heuristics.
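To illustrate the general idea of exploring a search space tree with Monte Carlo tree search, here is a minimal sketch for the 0-1 knapsack problem. It is plain UCT with a uniform random simulation policy and prunes only branches that exceed capacity; it does not include the article's heuristic simulation policy, domain reduction, or beam width. All names (`Node`, `mcts_knapsack`, the constant `c`) and the instance data are assumptions made for illustration.

```python
import math
import random


class Node:
    def __init__(self, depth, weight, value, parent=None):
        self.depth = depth      # number of items already decided
        self.weight = weight    # total weight of items taken so far
        self.value = value      # total value of items taken so far
        self.parent = parent
        self.children = {}      # decision (0 = skip, 1 = take) -> Node
        self.visits = 0
        self.total = 0.0        # sum of normalized rollout values


def legal_decisions(node, weights, capacity):
    """Skipping is always allowed; taking is pruned if it exceeds capacity."""
    if node.depth == len(weights):
        return []
    fits = node.weight + weights[node.depth] <= capacity
    return [0, 1] if fits else [0]


def rollout(node, weights, values, capacity, rng):
    """Randomly complete the partial assignment and return its value."""
    weight, value = node.weight, node.value
    for i in range(node.depth, len(weights)):
        if weight + weights[i] <= capacity and rng.random() < 0.5:
            weight += weights[i]
            value += values[i]
    return value


def uct_score(parent, child, c):
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent.visits) / child.visits)


def mcts_knapsack(weights, values, capacity, iterations=2000, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node(0, 0, 0)
    best = 0
    scale = float(sum(values)) or 1.0   # keeps rewards in [0, 1]
    for _ in range(iterations):
        node = root
        # Selection: descend by UCT until a node with an untried decision.
        while True:
            decisions = legal_decisions(node, weights, capacity)
            if not decisions:
                break               # every item has been decided
            untried = [d for d in decisions if d not in node.children]
            if untried:
                # Expansion: add one new child for an untried decision.
                d = rng.choice(untried)
                i = node.depth
                child = Node(i + 1, node.weight + d * weights[i],
                             node.value + d * values[i], node)
                node.children[d] = child
                node = child
                break
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: uct_score(parent, ch, c))
        # Simulation and backpropagation.
        result = rollout(node, weights, values, capacity, rng)
        best = max(best, result)
        while node is not None:
            node.visits += 1
            node.total += result / scale
            node = node.parent
    return best


# Hypothetical instance: five items, capacity 15.
best_value = mcts_knapsack([12, 2, 1, 1, 4], [4, 2, 1, 2, 10], 15)
```

Because every rollout respects the capacity constraint, the returned value always corresponds to a feasible packing, although it is not guaranteed to be optimal.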
Topological Phases: An Expedition off Lattice
Motivated by the goal to give the simplest possible microscopic foundation
for a broad class of topological phases, we study quantum mechanical lattice
models where the topology of the lattice is one of the dynamical variables.
However, a fluctuating geometry can remove the separation between the system
size and the range of local interactions, which is important for topological
protection and ultimately the stability of a topological phase. In particular,
it can open the door to a pathology, which has been studied in the context of
quantum gravity and goes by the name of `baby universe'. Here we discuss three
distinct approaches to suppressing these pathological fluctuations. We
complement this discussion by applying Cheeger's theory relating the geometry
of manifolds to their vibrational modes to study the spectra of Hamiltonians.
In particular, we present a detailed study of the statistical properties of
loop gas and string net models on fluctuating lattices, both analytically and
numerically. Comment: 38 pages, 22 figure
Game Specific Approaches to Monte Carlo Tree Search for Dots and Boxes
In this project, a Monte Carlo tree search player was designed and implemented for the children's game dots and boxes, the computational burden of which has left traditional artificial intelligence approaches like minimax ineffective. Two potential improvements to this player were implemented using game-specific information about dots and boxes: the lack of information for decision-making provided by the net score and the inherent symmetry in many states. The results of these two approaches are presented, along with details about the design of the Monte Carlo tree search player. The first improvement, removing net score from the state information, proved beneficial to both learning speed and memory requirements, while the second, accounting for symmetry in the state space, decreased memory requirements, but at the cost of learning speed.
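One common way to account for symmetry in a state space is to store each position under a canonical key shared by all its rotations and reflections. The sketch below is an assumption about how that could be done for a square dots and boxes board, not the project's actual code: a state is a set of drawn edges between dots on an `n` by `n` dot grid, and the names `rotate`, `reflect`, and `canonical` are illustrative.

```python
def rotate(dot, n):
    """Rotate a dot (row, col) by 90 degrees on an n x n dot grid."""
    r, c = dot
    return (c, n - 1 - r)


def reflect(dot, n):
    """Mirror a dot (row, col) left to right on an n x n dot grid."""
    r, c = dot
    return (r, n - 1 - c)


def transform_edges(edges, fn, n):
    """Apply a dot transform to both endpoints of every edge."""
    return frozenset(tuple(sorted((fn(a, n), fn(b, n)))) for a, b in edges)


def canonical(edges, n):
    """Return one key shared by all 8 symmetric variants of a position."""
    current = frozenset(tuple(sorted(e)) for e in edges)
    variants = []
    for _ in range(4):
        variants.append(current)
        variants.append(transform_edges(current, reflect, n))
        current = transform_edges(current, rotate, n)
    # The lexicographically smallest sorted edge list serves as the key.
    return min(tuple(sorted(v)) for v in variants)


# Hypothetical 3 x 3 dot grid: a top border edge and its 90-degree rotation.
top_edge = {((0, 0), (0, 1))}
right_edge = {((0, 2), (1, 2))}
same_key = canonical(top_edge, 3) == canonical(right_edge, 3)
```

Computing the key costs eight transforms per lookup, which is consistent with the trade-off the abstract reports: fewer stored states, but slower learning.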