Smoothing Policies and Safe Policy Gradients
Policy gradient algorithms are among the best candidates for the much
anticipated application of reinforcement learning to real-world control tasks,
such as the ones arising in robotics. However, the trial-and-error nature of
these methods introduces safety issues whenever the learning phase itself must
be performed on a physical system. In this paper, we address a specific safety
formulation, where danger is encoded in the reward signal and the learning
agent is constrained to never worsen its performance. By studying actor-only
policy gradient from a stochastic optimization perspective, we establish
improvement guarantees for a wide class of parametric policies, generalizing
existing results on Gaussian policies. This, together with novel upper bounds
on the variance of policy gradient estimators, allows us to identify those
meta-parameter schedules that guarantee monotonic improvement with high
probability. The two key meta-parameters are the step size of the parameter
updates and the batch size of the gradient estimators. By a joint, adaptive
selection of these meta-parameters, we obtain a safe policy gradient algorithm
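To make the two meta-parameters concrete, the following is a minimal sketch of a plain (not safe) policy gradient update for a one-dimensional Gaussian policy. It is not the algorithm of the paper: the toy reward, the constants `TARGET` and `SIGMA`, and the function names `estimate_gradient` and `policy_gradient_step` are all assumptions made here for illustration. It only shows where the step size and the batch size enter.

```python
import random

TARGET = 1.0   # hypothetical optimum of the toy reward (assumption)
SIGMA = 0.5    # fixed standard deviation of the Gaussian policy (assumption)


def reward(action):
    """Toy reward: highest when the action hits TARGET."""
    return -(action - TARGET) ** 2


def estimate_gradient(theta, batch_size, rng):
    """REINFORCE estimate of dJ/dtheta averaged over batch_size sampled actions."""
    total = 0.0
    for _ in range(batch_size):
        action = rng.gauss(theta, SIGMA)
        # Score function: derivative of log N(action; theta, SIGMA^2) w.r.t. theta.
        score = (action - theta) / SIGMA ** 2
        total += reward(action) * score
    return total / batch_size


def policy_gradient_step(theta, step_size, batch_size, rng):
    """One gradient-ascent update of the policy mean."""
    return theta + step_size * estimate_gradient(theta, batch_size, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 0.0
    for _ in range(200):
        theta = policy_gradient_step(theta, step_size=0.05, batch_size=200, rng=rng)
    print(f"learned mean: {theta:.3f}")
```

In this sketch a larger batch size lowers the variance of the gradient estimate, while a smaller step size limits how far a noisy estimate can move the parameter. The adaptive schedules described in the abstract are not reproduced here.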
The geometric Cauchy problem for developable submanifolds
Given a smooth distribution of -dimensional planes along a
smooth regular curve in , we consider the following
problem: To find an -dimensional developable submanifold of
, that is, a ruled submanifold with constant tangent space
along the rulings, such that its tangent bundle along coincides with
. In particular, we give sufficient conditions for the local
well-posedness of the problem, together with a parametric description of the
solution.
Comment: 15 pages
Renormalisation of gauge theories on general anisotropic lattices and high-energy scattering in QCD
We study the renormalisation of gauge theories on general
anisotropic lattices, to one-loop order in perturbation theory, employing the
background field method. The results are then applied in the context of two
different approaches to hadronic high-energy scattering. In the context of the
Euclidean nonperturbative approach to soft high-energy scattering based on
Wilson loops, we refine the nonperturbative justification of the analytic
continuation relations of the relevant Wilson-loop correlators, required to
obtain physical results. In the context of longitudinally-rescaled actions, we
study the consequences of one-loop corrections on the relation between the
gauge theory and its effective description in terms of
two-dimensional principal chiral models.
Comment: Revised version with minor corrections, matches published version; 40 pages, 4 figures
An Energy-based Approach to Ensure the Stability of Learned Dynamical Systems
Non-linear dynamical systems represent a compact, flexible, and robust tool
for reactive motion generation. The effectiveness of dynamical systems relies
on their ability to accurately represent stable motions. Several approaches
have been proposed to learn stable and accurate motions from demonstration.
Some approaches work by separating accuracy and stability into two learning
problems, which increases the number of open parameters and the overall
training time. Alternative solutions exploit single-step learning but restrict
the applicability to one regression technique. This paper presents a
single-step approach to learning stable and accurate motions that works with any
regression technique. The approach applies energy considerations to the learned
dynamics to stabilize the system at run-time while introducing small deviations
from the demonstrated motion. Since the initial value of the energy injected
into the system affects the reproduction accuracy, it is estimated from
training data using an efficient procedure. Experiments on a real robot and a
comparison on a public benchmark show the effectiveness of the proposed
approach.
Comment: Accepted at the International Conference on Robotics and Automation 202
Stochastic thermodynamics of entropic transport
Seifert derived an exact fluctuation relation for diffusion processes using
the concept of "stochastic system entropy". In this note we extend his
formalism to entropic transport. We introduce the notion of relative stochastic
entropy, or "relative surprisal", and use it to generalize Seifert's
system/medium decomposition of the total entropy. This result allows one to apply
the concepts of stochastic thermodynamics to diffusion processes in confined
geometries, such as ion channels, cellular pores or nanoporous materials. It
can be seen as the equivalent for diffusion processes of Esposito and
Schaller's generalized fluctuation theorem for "Maxwell demon feedbacks".
Comment: 3 pages, wording change
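For orientation, the standard relations behind Seifert's result can be sketched as follows. The notation is an assumption made here and is not taken from the note: units with k_B = 1, p(x, t) for the time-dependent probability density, and Δs_m for the entropy change of the medium. The relative stochastic entropy introduced in the note is not reproduced.

```latex
s(t) = -\ln p\bigl(x(t), t\bigr), \qquad
\Delta s_{\mathrm{tot}} = \Delta s_{\mathrm{m}} + \Delta s, \qquad
\bigl\langle e^{-\Delta s_{\mathrm{tot}}} \bigr\rangle = 1 .
```

The first expression is the stochastic system entropy along a single trajectory, the second is the system/medium decomposition, and the third is the integral fluctuation relation for the total entropy.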
Preserving Co-Location Privacy in Geo-Social Networks
The number of people on social networks has grown exponentially. Users share
very large volumes of personal information and content every day. This
content may be tagged with geospatial and temporal coordinates that some users
consider sensitive. While there is clearly a demand for users
to share this information with each other, there is also substantial demand for
greater control over the conditions under which their information is shared.
Content published in a geo-aware social network (GeoSN) often involves
multiple users and is often accessible to multiple users, without the
publisher being aware of the privacy preferences of those users. This makes it
difficult for GeoSN users to control which information about them is available
and to whom. The lack of means to protect users' privacy
therefore deters privacy-conscious people. This paper addresses a particular
privacy threat that occurs in GeoSNs: the co-location privacy threat. It
concerns the availability of information about the presence of multiple users
at the same location at a given time, against their will. The challenge addressed
is that of supporting privacy while still enabling useful services.
Comment: 10 pages, 5 figures
