Asymptotic Expansions for Stationary Distributions of Perturbed Semi-Markov Processes
New algorithms for computing asymptotic expansions for stationary
distributions of nonlinearly perturbed semi-Markov processes are presented. The
algorithms are based on special techniques of sequential phase space reduction,
which can be applied to processes with asymptotically coupled and uncoupled
finite phase spaces. Comment: 83 pages
A Simulation-based Approach for Solving Temporal Markov Problems
Time is a crucial variable in planning and often requires special attention, since it introduces a specific structure along with additional complexity, especially in decision making under uncertainty. In this paper, after reviewing and comparing MDP frameworks designed to deal with temporal problems, we focus on Generalized Semi-Markov Decision Processes (GSMDPs) with observable time. We highlight the inherent structure and complexity of these problems and show how they differ from classical reinforcement learning problems. Finally, we introduce a new simulation-based reinforcement learning method for solving GSMDPs, bringing together results from simulation-based policy iteration, regression techniques and simulation theory. We illustrate our approach on a subway network control example.
A Simulation-Based Approach for the Optimization of Generalized Semi-Markov Decision Processes (Une Approche basée sur la Simulation pour l'Optimisation des Processus Décisionnels Semi-Markoviens Généralisés)
Time is a crucial variable in planning and often requires special attention, since it introduces a specific structure along with additional complexity, especially in decision making under uncertainty. In this paper, after reviewing and comparing MDP frameworks designed to deal with temporal problems, we focus on Generalized Semi-Markov Decision Processes (GSMDPs) with observable time. We highlight the inherent structure and complexity of these problems and show how they differ from classical reinforcement learning problems. Finally, we introduce a new simulation-based reinforcement learning method for solving GSMDPs, bringing together results from simulation-based policy iteration, regression techniques and simulation theory. We illustrate our approach on a subway network control example.
Quantum walks: a comprehensive review
Quantum walks, the quantum mechanical counterpart of classical random walks,
are an advanced tool for building quantum algorithms and have recently been
shown to constitute a universal model of quantum computation. Quantum walks are
now a solid field of research in quantum computation, full of exciting open
problems for physicists, computer scientists, mathematicians and engineers.
In this paper we review theoretical advances on the foundations of both
discrete- and continuous-time quantum walks, together with the role that
randomness plays in quantum walks, the connections between the mathematical
models of coined discrete quantum walks and continuous quantum walks, the
quantumness of quantum walks, a summary of papers published on discrete quantum
walks and entanglement as well as a succinct review of experimental proposals
and realizations of discrete-time quantum walks. Furthermore, we review
several algorithms based on both discrete- and continuous-time quantum walks, as
well as a most important result: the computational universality of both
continuous- and discrete-time quantum walks. Comment: Paper accepted for publication in Quantum Information Processing
Journal
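As a rough illustration of the coined discrete-time quantum walk this abstract refers to, the following is a minimal sketch of a Hadamard-coin walk on the integer line. It is not taken from the paper: the function name `hadamard_walk`, the dictionary representation of amplitudes, and the balanced initial coin state are assumptions chosen for this example.

```python
import math


def hadamard_walk(steps):
    """Return {position: probability} after `steps` steps of a coined walk on the line.

    Illustrative sketch only; the state is stored as amp[(x, c)], the amplitude
    at position x with coin state c (0 = step left, 1 = step right).
    """
    s = 1 / math.sqrt(2)
    # Assumed initial state: walker at the origin, coin in (|0> + i|1>)/sqrt(2).
    amp = {(0, 0): complex(s, 0), (0, 1): complex(0, s)}
    for _ in range(steps):
        new_amp = {}
        for (x, c), a in amp.items():
            for c2 in (0, 1):
                # Hadamard coin: |0> -> (|0> + |1>)/sqrt(2), |1> -> (|0> - |1>)/sqrt(2).
                sign = -1 if (c == 1 and c2 == 1) else 1
                # Conditional shift: coin 0 moves left, coin 1 moves right.
                x2 = x - 1 if c2 == 0 else x + 1
                new_amp[(x2, c2)] = new_amp.get((x2, c2), 0j) + sign * s * a
        amp = new_amp
    # Measure position: sum the squared amplitudes over both coin states.
    probs = {}
    for (x, _), a in amp.items():
        probs[x] = probs.get(x, 0.0) + abs(a) ** 2
    return probs


if __name__ == "__main__":
    distribution = hadamard_walk(50)
    second_moment = sum(x * x * p for x, p in distribution.items())
    print("total probability:", sum(distribution.values()))
    print("second moment of position:", second_moment)
```

Unlike a classical random walk, where each step is an independent coin flip, here the coin and shift are applied to amplitudes, so paths interfere before the position is measured.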
Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation
The control of nonlinear dynamical systems remains a major challenge for
autonomous agents. Current trends in reinforcement learning (RL) focus on
complex representations of dynamics and policies, which have yielded impressive
results in solving a variety of hard control tasks. However, this new
sophistication and these heavily over-parameterized models have come at the cost
of an overall reduction in our ability to interpret the resulting policies. In
this paper, we take inspiration from the control community and apply the
principles of hybrid switching systems in order to break down complex dynamics
into simpler components. We exploit the rich representational power of
probabilistic graphical models and derive an expectation-maximization (EM)
algorithm for learning a sequence model to capture the temporal structure of
the data and automatically decompose nonlinear dynamics into stochastic
switching linear dynamical systems. Moreover, we show how this framework of
switching models enables extracting hierarchies of Markovian and
auto-regressive locally linear controllers from nonlinear experts in an
imitation learning scenario. Comment: 2nd Annual Conference on Learning for Dynamics and Control
Planning in Hybrid Structured Stochastic Domains
Efficient representations and solutions for large structured decision problems with continuous and discrete variables are among the important challenges faced by the designers of automated decision support systems. In this work, we describe a novel hybrid factored Markov decision process (MDP) model that allows for a compact representation of these problems, and a hybrid approximate linear programming (HALP) framework that permits their efficient solution. The central idea of HALP is to approximate the optimal value function of an MDP by a linear combination of basis functions and optimize its weights by linear programming. We study both theoretical and practical aspects of this approach, and demonstrate its scale-up potential on several hybrid optimization problems.
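For orientation, the central idea described in this abstract can be written in the form of the standard approximate linear programming formulation for a discounted MDP. This is a sketch under assumed notation, not the paper's exact HALP formulation: the basis functions $f_i$, weights $w_i$, state relevance weights $\alpha$, reward $R$, transition model $P$ and discount factor $\gamma$ are symbols introduced here, and in a hybrid model the sums over continuous state components would become integrals.

```latex
% Linear value function approximation with basis functions f_i and weights w_i
V^{w}(x) = \sum_{i} w_i f_i(x)

% Approximate linear program: minimize the weighted value
% subject to the Bellman inequality at every state-action pair
\begin{aligned}
\min_{w} \quad & \sum_{x} \alpha(x) \sum_{i} w_i f_i(x) \\
\text{s.t.} \quad & \sum_{i} w_i f_i(x) \;\ge\; R(x, a) + \gamma \sum_{x'} P(x' \mid x, a) \sum_{i} w_i f_i(x')
\qquad \forall\, x,\ a
\end{aligned}
```

The objective and the constraints are both linear in the weights $w_i$, which is what makes linear programming applicable.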