Unsupervised Basis Function Adaptation for Reinforcement Learning
When using reinforcement learning (RL) algorithms it is common, given a large
state space, to introduce some form of approximation architecture for the value
function (VF). The exact form of this architecture can have a significant
effect on an agent's performance, however, and determining a suitable
approximation architecture can often be a highly complex task. Consequently
there is currently interest among researchers in the potential for allowing RL
algorithms to adaptively generate (i.e. to learn) approximation architectures.
One relatively unexplored method of adapting approximation architectures
involves using feedback regarding the frequency with which an agent has visited
certain states to guide which areas of the state space to approximate with
greater detail. In this article we will: (a) informally discuss the potential
advantages offered by such methods; (b) introduce a new algorithm based on such
methods which adapts a state aggregation approximation architecture on-line and
is designed for use in conjunction with SARSA; (c) provide theoretical results,
in a policy evaluation setting, regarding this particular algorithm's
complexity, convergence properties and potential to reduce VF error; and
finally (d) test experimentally the extent to which this algorithm can improve
performance given a number of different test problems. Taken together our
results suggest that our algorithm (and potentially such methods more
generally) can provide a versatile and computationally lightweight means of
significantly boosting RL performance given suitable conditions which are
commonly encountered in practice.
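The idea of letting visit frequency decide where the state space is approximated in more detail can be illustrated with a minimal sketch. This is not the algorithm from the article: the function name `split_most_visited`, the representation of cells as half-open integer ranges, and the rule of halving the most-visited cell are all assumptions made purely for illustration.

```python
from collections import Counter


def split_most_visited(cells, visits):
    """Refine a state aggregation where the agent spends most of its time.

    cells:  list of (lo, hi) half-open integer ranges that partition the states.
    visits: Counter mapping a state index to its observed visit count.
    Returns a new partition in which the most-visited splittable cell is halved.
    (Hypothetical refinement rule, chosen only to illustrate the idea.)
    """
    def cell_visits(cell):
        lo, hi = cell
        return sum(visits[s] for s in range(lo, hi))

    # Only cells holding more than one state can be split further.
    splittable = [c for c in cells if c[1] - c[0] > 1]
    if not splittable:
        return list(cells)

    target = max(splittable, key=cell_visits)
    lo, hi = target
    mid = (lo + hi) // 2

    refined = []
    for cell in cells:
        if cell == target:
            refined.extend([(lo, mid), (mid, hi)])
        else:
            refined.append(cell)
    return refined


# Sixteen states in two coarse cells; the agent mostly visits states 9 and 10.
cells = [(0, 8), (8, 16)]
visits = Counter({9: 40, 10: 35, 2: 5})
cells = split_most_visited(cells, visits)
```

Note that the refinement uses visit counts only and never looks at rewards or value estimates.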
Unsupervised Basis Function Adaptation for Reinforcement Learning
When using reinforcement learning (RL) algorithms to evaluate a policy it is
common, given a large state space, to introduce some form of approximation
architecture for the value function (VF). The exact form of this architecture
can have a significant effect on the accuracy of the VF estimate, however, and
determining a suitable approximation architecture can often be a highly complex
task. Consequently there is a large amount of interest in the potential for
allowing RL algorithms to adaptively generate approximation architectures.
We investigate a method of adapting approximation architectures which uses
feedback regarding the frequency with which an agent has visited certain states
to guide which areas of the state space to approximate with greater detail.
This method is "unsupervised" in the sense that it makes no direct reference to
reward or the VF estimate. We introduce an algorithm based upon this idea which
adapts a state aggregation approximation architecture on-line.
A common method of scoring a VF estimate is to weight the squared Bellman
error of each state-action by the probability of that state-action occurring.
Adopting this scoring method, and assuming states, we demonstrate
theoretically that - provided (1) the number of cells in the state
aggregation architecture is of order or greater, (2)
the policy and transition function are close to deterministic, and (3) the
prior for the transition function is uniformly distributed - our algorithm,
used in conjunction with a suitable RL algorithm, can guarantee a score which
is arbitrarily close to zero as becomes large. It is able to do this
despite having only space complexity and negligible time
complexity. The results take advantage of certain properties of the stationary
distributions of Markov chains.
Comment: Extended abstract submitted (3 March 2017) for 3rd Multidisciplinary
Conference on Reinforcement Learning and Decision Making (RLDM) 2017
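The scoring method described in the abstract above, in which the squared Bellman error of each state-action pair is weighted by the probability of that pair occurring, can be sketched as follows. This is an illustrative reading rather than the authors' code: the function name `weighted_bellman_score`, the nested-list layout of `Q`, `R`, `P` and `pi`, and the discount factor `gamma` are all assumptions.

```python
def weighted_bellman_score(Q, R, P, pi, weights, gamma):
    """Sum over (s, a) of weights[s][a] * (Bellman error of Q at (s, a)) ** 2.

    Q[s][a]       : value estimate being scored
    R[s][a]       : expected immediate reward
    P[s][a][s2]   : transition probability to state s2
    pi[s][a]      : probability that the evaluated policy picks a in s
    weights[s][a] : probability of the state-action pair occurring
    """
    n_states = len(Q)
    n_actions = len(Q[0])
    score = 0.0
    for s in range(n_states):
        for a in range(n_actions):
            # Expected value of the next state-action pair under P and pi.
            expected_next = sum(
                P[s][a][s2] * sum(pi[s2][a2] * Q[s2][a2] for a2 in range(n_actions))
                for s2 in range(n_states)
            )
            bellman_error = R[s][a] + gamma * expected_next - Q[s][a]
            score += weights[s][a] * bellman_error ** 2
    return score


# Two states, one action, deterministic cycle 0 -> 1 -> 0, reward 1 on leaving state 0.
R = [[1.0], [0.0]]
P = [[[0.0, 1.0]], [[1.0, 0.0]]]
pi = [[1.0], [1.0]]
weights = [[0.5], [0.5]]

score_zero = weighted_bellman_score([[0.0], [0.0]], R, P, pi, weights, gamma=0.5)
score_exact = weighted_bellman_score([[4 / 3], [2 / 3]], R, P, pi, weights, gamma=0.5)
```

In this toy chain the all-zero estimate is penalised only at state 0, while the exact solution of the Bellman equation scores zero up to rounding.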
An algorithm for finding Hamiltonian Cycles in Cubic Planar Graphs
We first prove a one-to-one correspondence between finding Hamiltonian cycles
in cubic planar graphs and finding trees with specific properties in dual
graphs. Using this information, we construct an exact algorithm for finding
Hamiltonian cycles in cubic planar graphs. The worst case time complexity of
our algorithm is O
Distance labellings of Cayley graphs of semigroups
This paper establishes connections between the structure of a semigroup and
the minimum spans of distance labellings of its Cayley graphs. We show that
certain general restrictions on the minimum spans are equivalent to the
semigroup being combinatorial, and that other restrictions are equivalent to
the semigroup being a right zero band. We obtain a description of the structure
of all semigroups S and their subsets C such that Cay(S,C) is a disjoint
union of complete graphs, and show that this description is also equivalent to
several restrictions on the minimum span of Cay(S,C). We then describe all
graphs with minimum spans satisfying the same restrictions, and give examples
to show that a fairly straightforward upper bound for the minimum spans of the
underlying undirected graphs of Cayley graphs turns out to be sharp even for
the class of combinatorial semigroups.
Degree Bounded Bottleneck Spanning Trees in Three Dimensions
The geometric $\delta$-minimum spanning tree problem ($\delta$-MST) is the
problem of finding a minimum spanning tree for a set of points in a normed
vector space, such that no vertex in the tree has a degree which exceeds
$\delta$, and the sum of the lengths of the edges in the tree is minimum. The
similarly defined geometric $\delta$-minimum bottleneck spanning tree problem
($\delta$-MBST) is the problem of finding a degree bounded spanning tree such
that the length of the longest edge is minimum. For point sets that lie in the
Euclidean plane, both of these problems have been shown to be NP-hard for
certain specific values of $\delta$. In this paper, we investigate the
$\delta$-MBST problem in 3-dimensional Euclidean space and 3-dimensional
rectilinear space. We show that the problems are NP-hard for certain values of
$\delta$, and we provide inapproximability results for these cases. We also
describe new approximation algorithms for solving these 3-dimensional
variants, and then analyse their worst-case performance.
Comment: 35 pages, 22 figures
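As a point of reference for the bottleneck objective, the unconstrained case is easy: any minimum spanning tree also minimises the longest edge. The sketch below computes that bottleneck for points in three-dimensional Euclidean space with Kruskal's algorithm and reports the maximum degree of the resulting tree. It is not one of the approximation algorithms from the paper and does not enforce any degree bound; the function name `mst_bottleneck` is an assumption.

```python
import math
from itertools import combinations


def mst_bottleneck(points):
    """Return (longest MST edge length, maximum vertex degree in the MST).

    Uses Kruskal's algorithm with a union-find structure over all point pairs.
    The longest MST edge equals the minimum bottleneck when degrees are unbounded.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Path-halving find.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    degree = [0] * n
    bottleneck = 0.0
    used = 0
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            degree[i] += 1
            degree[j] += 1
            bottleneck = length  # edges arrive in non-decreasing order
            used += 1
            if used == n - 1:
                break
    return bottleneck, max(degree, default=0)


points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
bottleneck, max_degree = mst_bottleneck(points)
```

If the reported maximum degree already satisfies the required bound, this tree is also optimal for the degree-bounded problem; otherwise the bound is what makes the problem hard.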
A Flow-dependent Quadratic Steiner Tree Problem in the Euclidean Plane
We introduce a flow-dependent version of the quadratic Steiner tree problem
in the plane. An instance of the problem on a set of embedded sources and a
sink asks for a directed tree spanning these nodes and a bounded number of
Steiner points, such that is a
minimum, where is the flow on edge . The edges are uncapacitated and
the flows are determined additively, i.e., the flow on an edge leaving a node
will be the sum of the flows on all edges entering . Our motivation for
studying this problem is its utility as a model for relay augmentation of
wireless sensor networks. In these scenarios one seeks to optimise power
consumption -- which is predominantly due to communication and, in free space,
is proportional to the square of transmission distance -- in the network by
introducing additional relays. We prove several geometric and combinatorial
results on the structure of optimal and locally optimal solution-trees (under
various strategies for bounding the number of Steiner points) and describe a
geometric linear-time algorithm for constructing such trees with known
topologies.
An exact algorithm for the bottleneck 2-connected -Steiner network problem in planes
We present the first exact polynomial time algorithm for constructing optimal
geometric bottleneck 2-connected Steiner networks containing at most
Steiner points, where is a constant. Given a set of vertices embedded
in an plane, the objective of the problem is to find a 2-connected
network, spanning the given vertices and at most additional vertices, such
that the length of the longest edge is minimised. In contrast to the discrete
version of this problem the additional vertices may be located anywhere in the
plane. The problem is motivated by the modelling of relay-augmentation for the
optimisation of energy consumption in wireless ad hoc networks. Our algorithm
employs Voronoi diagrams and properties of block-cut-vertex decompositions of
graphs to find an optimal solution in steps when
and in steps when
Explanation Methods in Deep Learning: Users, Values, Concerns and Challenges
Issues regarding explainable AI involve four components: users, laws &
regulations, explanations and algorithms. Together these components provide a
context in which explanation methods can be evaluated regarding their adequacy.
The goal of this chapter is to bridge the gap between expert users and lay
users. Different kinds of users are identified and their concerns revealed,
relevant statements from the General Data Protection Regulation are analyzed in
the context of Deep Neural Networks (DNNs), a taxonomy for the classification
of existing explanation methods is introduced, and finally, the various classes
of explanation methods are analyzed to verify if user concerns are justified.
Overall, it is clear that (visual) explanations can be given about various
aspects of the influence of the input on the output. However, it is noted that
explanation methods or interfaces for lay users are missing and we speculate
which criteria these methods / interfaces should satisfy. Finally it is noted
that two important concerns are difficult to address with explanation methods:
the concern about bias in datasets that leads to biased DNNs, as well as the
suspicion about unfair outcomes.
Comment: 14 pages, 1 figure. This article will appear as a chapter in
Explainable and Interpretable Models in Computer Vision and Machine Learning,
Springer series on Challenges in Machine Learning
A geometric characterisation of the quadratic min-power centre
For a given set of nodes in the plane the min-power centre is a point such
that the cost of the star centred at this point and spanning all nodes is
minimised. The cost of the star is defined as the sum of the costs of its
nodes, where the cost of a node is an increasing function of the length of its
longest incident edge. The min-power centre problem provides a model for
optimally locating a cluster-head amongst a set of radio transmitters; however,
the problem can also be formulated within a bicriteria location model involving
the 1-centre and a generalized Fermat-Weber point, making it suitable for a
variety of facility location problems. We use farthest point Voronoi diagrams
and Delaunay triangulations to provide a complete geometric description of the
min-power centre of a finite set of nodes in the Euclidean plane when cost is a
quadratic function. This leads to a new linear-time algorithm for its
construction when the convex hull of the nodes is given. We also provide an
upper bound for the performance of the centroid as an approximation to the
quadratic min-power centre. Finally, we briefly describe the relationship
between solutions under quadratic cost and solutions under more general cost
functions.
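The star cost described in this abstract can be evaluated directly for a candidate centre. The sketch below assumes the quadratic cost is exactly the squared length of a node's longest incident edge and that the centre itself counts as a node, so that the total is the sum of squared distances plus the largest squared distance. Both points are readings of the abstract rather than confirmed details, and the names `star_cost` and `centroid` are hypothetical.

```python
def star_cost(centre, nodes):
    """Quadratic cost of the star centred at `centre` and spanning `nodes`.

    Assumption: each leaf pays the squared length of its single edge, and the
    centre pays the squared length of its longest edge.
    """
    cx, cy = centre
    squared = [(x - cx) ** 2 + (y - cy) ** 2 for x, y in nodes]
    return sum(squared) + max(squared)


def centroid(nodes):
    """Arithmetic mean of the node coordinates."""
    n = len(nodes)
    return (sum(x for x, _ in nodes) / n, sum(y for _, y in nodes) / n)


nodes = [(0, 0), (2, 0), (0, 2), (2, 2)]
centre = centroid(nodes)
cost_at_centroid = star_cost(centre, nodes)
cost_at_corner = star_cost((0, 0), nodes)
```

For this symmetric square the centroid is the natural centre; in general the centroid minimises only the sum-of-squares term, which is why it serves as an approximation rather than an exact solution.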
REVIEW ARTICLE Ras Proteins: Recent Advances and New Functions
