42,639 research outputs found
Constraint-based reachability
Iterative imperative programs can be considered as infinite-state systems
computing over possibly unbounded domains. Studying reachability in these
systems is challenging as it requires to deal with an infinite number of states
with standard backward or forward exploration strategies. An approach that we
call Constraint-based reachability, is proposed to address reachability
problems by exploring program states using a constraint model of the whole
program. The keypoint of the approach is to interpret imperative constructions
such as conditionals, loops, array and memory manipulations with the
fundamental notion of constraint over a computational domain. By combining
constraint filtering and abstraction techniques, Constraint-based reachability
is able to solve reachability problems which are usually outside the scope of
backward or forward exploration strategies. This paper proposes an
interpretation of classical filtering consistencies used in Constraint
Programming as abstract domain computations, and shows how this approach can be
used to produce a constraint solver that efficiently generates solutions for
reachability problems that are unsolvable by other approaches.Comment: In Proceedings Infinity 2012, arXiv:1302.310
Reachability in Biochemical Dynamical Systems by Quantitative Discrete Approximation (extended abstract)
In this paper, a novel computational technique for finite discrete
approximation of continuous dynamical systems suitable for a significant class
of biochemical dynamical systems is introduced. The method is parameterized in
order to affect the imposed level of approximation provided that with
increasing parameter value the approximation converges to the original
continuous system. By employing this approximation technique, we present
algorithms solving the reachability problem for biochemical dynamical systems.
The presented method and algorithms are evaluated on several exemplary
biological models and on a real case study.Comment: In Proceedings CompMod 2011, arXiv:1109.104
Hi-Val: Iterative Learning of Hierarchical Value Functions for Policy Generation
Task decomposition is effective in manifold applications where the global complexity of a problem makes planning and decision-making too demanding. This is true, for example, in high-dimensional robotics domains, where (1) unpredictabilities and modeling limitations typically prevent the manual specification of robust behaviors, and (2) learning an action policy is challenging due to the curse of dimensionality. In this work, we borrow the concept of Hierarchical Task Networks (HTNs) to decompose the learning procedure, and we exploit Upper Confidence Tree (UCT) search to introduce HOP, a novel iterative algorithm for hierarchical optimistic planning with learned value functions. To obtain better generalization and generate policies, HOP simultaneously learns and uses action values. These are used to formalize constraints within the search space and to reduce the dimensionality of the problem. We evaluate our algorithm both on a fetching task using a simulated 7-DOF KUKA light weight arm and, on a pick and delivery task with a Pioneer robot
Learning Representations in Model-Free Hierarchical Reinforcement Learning
Common approaches to Reinforcement Learning (RL) are seriously challenged by
large-scale applications involving huge state spaces and sparse delayed reward
feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address
this scalability issue by learning action selection policies at multiple levels
of temporal abstraction. Abstraction can be had by identifying a relatively
small set of states that are likely to be useful as subgoals, in concert with
the learning of corresponding skill policies to achieve those subgoals. Many
approaches to subgoal discovery in HRL depend on the analysis of a model of the
environment, but the need to learn such a model introduces its own problems of
scale. Once subgoals are identified, skills may be learned through intrinsic
motivation, introducing an internal reward signal marking subgoal attainment.
In this paper, we present a novel model-free method for subgoal discovery using
incremental unsupervised learning over a small memory of the most recent
experiences (trajectories) of the agent. When combined with an intrinsic
motivation learning mechanism, this method learns both subgoals and skills,
based on experiences in the environment. Thus, we offer an original approach to
HRL that does not require the acquisition of a model of the environment,
suitable for large-scale applications. We demonstrate the efficiency of our
method on two RL problems with sparse delayed feedback: a variant of the rooms
environment and the first screen of the ATARI 2600 Montezuma's Revenge game
- …