Threat captures attention but does not affect learning of contextual regularities
Some of the stimulus features that guide visual attention are abstract properties of objects such as potential threat to one's survival, whereas others are complex configurations such as visual contexts that are learned through past experiences. The present study investigated the two functions that guide visual attention, threat detection and learning of contextual regularities, in visual search. Search arrays contained images of threat and non-threat objects, and their locations were fixed on some trials but random on other trials. Although they were irrelevant to the visual search task, threat objects facilitated attention capture and impaired attention disengagement. Search time improved for fixed configurations more than for random configurations, reflecting learning of visual contexts. Nevertheless, threat detection had little influence on learning of the contextual regularities. The results suggest that factors guiding visual attention are different from factors that influence learning to guide visual attention.
Bayesian optimization of the PC algorithm for learning Gaussian Bayesian networks
The PC algorithm is a popular method for learning the structure of Gaussian
Bayesian networks. It carries out statistical tests to determine absent edges
in the network. It is hence governed by two parameters: (i) The type of test,
and (ii) its significance level. These parameters are usually set to values
recommended by an expert. Nevertheless, such an approach can suffer from human
bias, leading to suboptimal reconstruction results. In this paper we consider a
more principled approach for choosing these parameters in an automatic way. For
this we optimize a reconstruction score evaluated on a set of different
Gaussian Bayesian networks. This objective is expensive to evaluate and lacks a
closed-form expression, which means that Bayesian optimization (BO) is a
natural choice. BO methods use a model to guide the search and are hence able
to exploit smoothness properties of the objective surface. We show that the
parameters found by a BO method outperform those found by a random search
strategy and the expert recommendation. Importantly, we have found that an
often overlooked statistical test provides the best overall reconstruction
results.
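As a rough illustration of the two parameters named in the abstract above (the type of test and its significance level), the sketch below implements one common choice for Gaussian data, the Fisher z-test on partial correlations. It is not taken from the paper: the function names, the default `alpha=0.05`, and the clamping constant are assumptions made for this example, and the abstract does not say which test the authors found to work best.

```python
import math
import numpy as np


def partial_correlation(corr, i, j, cond):
    """Partial correlation of variables i and j given the variables in cond."""
    idx = [i, j] + list(cond)
    sub = corr[np.ix_(idx, idx)]
    prec = np.linalg.inv(sub)  # precision matrix of the selected variables
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])


def fisher_z_independent(data, i, j, cond=(), alpha=0.05):
    """Return (independent?, p_value) for the test of i independent of j given cond.

    data is an (n_samples, n_variables) array; alpha is the significance level.
    """
    n = data.shape[0]
    corr = np.corrcoef(data, rowvar=False)
    r = partial_correlation(corr, i, j, cond)
    r = min(max(r, -0.999999), 0.999999)  # keep the log below finite
    z = 0.5 * math.log((1.0 + r) / (1.0 - r))  # Fisher z-transform
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    p_value = math.erfc(stat / math.sqrt(2.0))  # two-sided normal p-value
    return p_value > alpha, p_value
```

In a PC-style procedure, an edge between `i` and `j` would be removed whenever some conditioning set makes the test report independence, so a larger `alpha` tends to keep more edges and a smaller one tends to remove more.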
Approaches for rule discovery in a learning classifier system
To fill the increasing demand for explanations of decisions made by automated prediction systems, machine learning (ML) techniques that produce inherently transparent models are directly suited. Learning Classifier Systems (LCSs), a family of rule-based learners, produce transparent models by design. However, the usefulness of such models, both for predictions and analyses, heavily depends on the placement and selection of rules (combined constituting the ML task of model selection). In this paper, we investigate a variety of techniques to efficiently place good rules within the search space based on their local prediction errors as well as their generality. This investigation is done within a specific LCS, named SupRB, where the placement of rules and the selection of good subsets of rules are strictly separated, in contrast to other LCSs where these tasks sometimes blend. We compare a Random Search, a (1,λ)-ES and three Novelty Search variants. We find that there is a definitive need to guide the search based on some sensible criteria, i.e., error and generality, rather than just placing rules randomly and selecting better-performing ones, but also find that Novelty Search variants do not beat the easier-to-understand (1,λ)-ES.
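For readers unfamiliar with the (1,λ)-ES mentioned above, the following is a minimal generic sketch of the strategy on real-valued vectors. It is not SupRB's implementation: the function name, the Gaussian mutation with fixed `sigma`, the default parameter values, and the caller-supplied `fitness` (which in the paper's setting would combine a rule's error and generality) are all assumptions for illustration.

```python
import random


def one_comma_lambda_es(fitness, parent, lam=20, sigma=0.3, generations=50, seed=0):
    """Maximise fitness with a (1,lambda) evolution strategy.

    Each generation the single parent produces lam mutated children and the
    best child becomes the next parent, even if it is worse than the parent
    (comma selection). The best individual seen so far is tracked separately.
    """
    rng = random.Random(seed)
    parent = list(parent)
    best, best_fit = list(parent), fitness(parent)
    for _ in range(generations):
        children = [
            [x + rng.gauss(0.0, sigma) for x in parent]  # Gaussian mutation
            for _ in range(lam)
        ]
        parent = max(children, key=fitness)  # parent is always replaced
        parent_fit = fitness(parent)
        if parent_fit > best_fit:
            best, best_fit = list(parent), parent_fit
    return best, best_fit
```

Because the parent is discarded every generation, the strategy can move away from a local optimum, which is why the best-so-far individual is kept in a separate variable here.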
Expert iteration
In this thesis, we study how reinforcement learning algorithms can tackle classical board games without recourse to human knowledge. Specifically, we develop a framework and algorithms which learn to play the board game Hex starting from random play. We first describe Expert Iteration (ExIt), a novel reinforcement learning framework which extends Modified Policy Iteration. ExIt explicitly decomposes the reinforcement learning problem into two parts: planning and generalisation. A planning algorithm explores possible move sequences starting from a particular position to find good strategies from that position, while a parametric function approximator is trained to predict those plans, generalising to states not yet seen. Subsequently, planning is improved by using the approximated policy to guide search, increasing the strength of new plans. This decomposition allows ExIt to combine the benefits of both planning methods and function approximation methods. We demonstrate the effectiveness of the ExIt paradigm by implementing ExIt with two different planning algorithms. First, we develop a version based on Monte Carlo Tree Search (MCTS), a search algorithm which has been successful both in specific games, such as Go, Hex and Havannah, and in general game playing competitions. We then develop a new planning algorithm, Policy Gradient Search (PGS), which uses a model-free reinforcement learning algorithm for online planning. Unlike MCTS, PGS does not require an explicit search tree. Instead PGS uses function approximation within a single search, allowing it to be applied to problems with larger branching factors. Both MCTS-ExIt and PGS-ExIt defeated MoHex 2.0 - the most recent Hex Olympiad winner to be open sourced - in 9 × 9 Hex. More importantly, whereas MoHex makes use of many Hex-specific improvements and knowledge, all our programs were trained tabula rasa using general reinforcement learning methods. 
This bodes well for ExIt’s applicability to both other games and real-world decision-making problems.
Neural Informed RRT* with Point-based Network Guidance for Optimal Sampling-based Path Planning
Sampling-based planning algorithms like Rapidly-exploring Random Tree (RRT)
are versatile in solving path planning problems. RRT* offers asymptotical
optimality but requires growing the tree uniformly over the free space, which
leaves room for efficiency improvement. To accelerate convergence, informed
approaches sample states in an ellipsoidal subset of the search space
determined by current path cost during iteration. Learning-based alternatives
model the topology of the search space and infer the states close to the
optimal path to guide planning. We combine the strengths from both sides and
propose Neural Informed RRT* with Point-based Network Guidance. We introduce
Point-based Network to infer the guidance states, and integrate the network
into Informed RRT* for guidance state refinement. We use Neural Connect to
build connectivity of the guidance state set and further boost performance in
challenging planning problems. Our method surpasses previous works in path
planning benchmarks while preserving probabilistic completeness and
asymptotical optimality. We demonstrate the deployment of our method on mobile
robot navigation in the real world.
Comment: 7 pages, 6 figures
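The informed sampling step described in the abstract above (drawing states from an ellipsoidal subset determined by the current path cost) can be sketched in two dimensions as follows. This illustrates only the classical Informed RRT* sampling idea, not the paper's Point-based Network or Neural Connect; the function name and argument names are assumptions for this example.

```python
import math
import random


def sample_informed_2d(start, goal, c_best, rng):
    """Sample uniformly from the ellipse with foci start and goal.

    Every point p in this ellipse satisfies
    dist(start, p) + dist(p, goal) <= c_best, so only states that could
    shorten the current best path (of cost c_best) are drawn.
    """
    c_min = math.dist(start, goal)  # straight-line lower bound on path cost
    if c_best < c_min:
        raise ValueError("c_best cannot be shorter than the straight-line distance")

    # Uniform sample in the unit disk.
    r = math.sqrt(rng.random())
    phi = rng.uniform(0.0, 2.0 * math.pi)
    ux, uy = r * math.cos(phi), r * math.sin(phi)

    # Semi-axes of the ellipse.
    a = c_best / 2.0
    b = math.sqrt(c_best ** 2 - c_min ** 2) / 2.0

    # Rotate to align the major axis with start -> goal, then move to the midpoint.
    theta = math.atan2(goal[1] - start[1], goal[0] - start[0])
    cx, cy = (start[0] + goal[0]) / 2.0, (start[1] + goal[1]) / 2.0
    x = cx + a * ux * math.cos(theta) - b * uy * math.sin(theta)
    y = cy + a * ux * math.sin(theta) + b * uy * math.cos(theta)
    return (x, y)
```

As `c_best` shrinks toward the straight-line distance, the ellipse collapses onto the segment between start and goal, which is what concentrates samples near promising paths.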