Smoothing Policies and Safe Policy Gradients
Policy gradient algorithms are among the best candidates for the much
anticipated application of reinforcement learning to real-world control tasks,
such as the ones arising in robotics. However, the trial-and-error nature of
these methods introduces safety issues whenever the learning phase itself must
be performed on a physical system. In this paper, we address a specific safety
formulation, where danger is encoded in the reward signal and the learning
agent is constrained to never worsen its performance. By studying actor-only
policy gradient from a stochastic optimization perspective, we establish
improvement guarantees for a wide class of parametric policies, generalizing
existing results on Gaussian policies. This, together with novel upper bounds
on the variance of policy gradient estimators, allows us to identify those
meta-parameter schedules that guarantee monotonic improvement with high
probability. The two key meta-parameters are the step size of the parameter
updates and the batch size of the gradient estimators. By a joint, adaptive
selection of these meta-parameters, we obtain a safe policy gradient algorithm
Possible Signatures of Inflationary Particle Content: Spin-2 Fields
We study the imprints of a massive spin-2 field on inflationary observables,
and in particular on the breaking of consistency relations. In this setup, the
minimal inflationary field content interacts with the massive spin-2 field
through dRGT interactions, thus guaranteeing the absence of Boulware-Deser
ghostly degrees of freedom. The unitarity requirement on spinning particles,
known as the Higuchi bound, plays a crucial role in determining the size of the
observable signal.
Comment: 24 pages, 6 figures
Symmetries, Holography and Quantum Phase Transition in Two-dimensional Dilaton AdS Gravity
We revisit the Almheiri-Polchinski dilaton gravity model
from a two-dimensional (2D) bulk perspective. We describe a peculiar feature of
the model, namely the pattern of conformal symmetry breaking using bulk Killing
vectors, a covariant definition of mass and the flow between different vacua of
the theory. We show that the effect of the symmetry breaking is both to
generate an infrared scale (a mass gap) and to make local the Goldstone
modes associated with the asymptotic symmetries of the 2D spacetime. In this
way a nonvanishing central charge is generated in the dual conformal theory,
which accounts for the microscopic entropy of the 2D black hole. The use of
covariant mass allows us to compare energetically the two different vacua of the
theory and to show that at zero temperature the vacuum with a constant dilaton
is energetically preferred. We also translate into the bulk language several
features of the dual CFT discussed by Maldacena et al. The uplifting of the 2D
model to dimensional theories exhibiting hyperscaling violation is
briefly discussed.
Comment: 7 pages, no figures
A reversible allelic partition process and Pitman sampling formula
We introduce a continuous-time Markov chain describing dynamic allelic
partitions which extends the branching process construction of the Pitman
sampling formula in Pitman (2006) and the birth-and-death process with
immigration studied in Karlin and McGregor (1967), in turn related to the
celebrated Ewens sampling formula. A biological basis for the scheme is
provided in terms of a population of individuals grouped into families, that
evolves according to a sequence of births, deaths and immigrations. We
investigate the asymptotic behaviour of the chain and show that, as opposed to
the birth-and-death process with immigration, this construction maintains in
the temporal limit the mutual dependence among the multiplicities. When the
death rate exceeds the birth rate, the system is shown to have a reversible
distribution identified as a mixture of Pitman sampling formulae, with negative
binomial mixing distribution on the population size. The population therefore
converges to a stationary random configuration, characterised by a finite
number of families and individuals.
Comment: 17 pages, to appear in ALEA, Latin American Journal of Probability
and Mathematical Statistics
Measurements by A LEAP-Based Virtual Glove for the hand rehabilitation
Hand rehabilitation is fundamental after stroke or surgery. Traditional rehabilitation
requires a therapist and implies high costs, stress for the patient, and subjective evaluation of
the therapy effectiveness. Alternative approaches, based on mechanical and tracking-based gloves,
can be really effective when used in virtual reality (VR) environments. Mechanical devices are often
expensive, cumbersome, patient specific and hand specific, while tracking-based devices are not
affected by these limitations but, especially if based on a single tracking sensor, could suffer from
occlusions. In this paper, the implementation of a multi-sensor approach, the Virtual Glove (VG),
based on the simultaneous use of two orthogonal LEAP motion controllers, is described. The VG is
calibrated and static positioning measurements are compared with those collected with an accurate
spatial positioning system. The positioning error is lower than 6 mm in a cylindrical region of interest
of radius 10 cm and height 21 cm. Real-time hand tracking measurements are also performed, analysed
and reported. Hand tracking measurements show that VG operated in real time (60 fps), reduced
occlusions, and managed two LEAP sensors correctly, without any temporal or spatial discontinuity
when switching from one sensor to the other. A video demonstrating the good performance of VG
is also collected and presented in the Supplementary Materials. Results are promising but further
work must be done to allow the calculation of the forces exerted by each finger when constrained by
mechanical tools (e.g., peg-boards) and for reducing occlusions when grasping these tools. Although
the VG is proposed for rehabilitation purposes, it could also be used for tele-operation of tools and
robots, and for other VR applications.
Quantum memories with zero-energy Majorana modes and experimental constraints
In this work we address the problem of realizing a reliable quantum memory
based on zero-energy Majorana modes in the presence of experimental constraints
on the operations aimed at recovering the information. In particular, we
characterize the best recovery operation acting only on the zero-energy
Majorana modes and the memory fidelity that can be therewith achieved. In order
to understand the effect of such restriction, we discuss two examples of noise
models acting on the topological system and compare the amount of information
that can be recovered by accessing either the whole system, or the zero-modes
only, with particular attention to the scaling with the size of the system and
the energy gap. We explicitly discuss the case of a thermal bosonic environment
inducing a parity-preserving Markovian dynamics in which the introduced memory
fidelity decays exponentially in time, independently of system size, thus
showing the impossibility of retrieving the information by acting on the
zero-modes only. We argue, however, that even in the presence of experimental
limitations, the Hamiltonian gap is still beneficial to the storage of
information.
Comment: 18 pages, 7 figures. Updated to published version
Weighted fast diffusion equations (Part I): Sharp asymptotic rates without symmetry and symmetry breaking in Caffarelli-Kohn-Nirenberg inequalities
In this paper we consider a family of Caffarelli-Kohn-Nirenberg interpolation
inequalities (CKN), with two radial power law weights and exponents in a
subcritical range. We address the question of symmetry breaking: are the
optimal functions radially symmetric, or not? Our intuition comes from a
weighted fast diffusion (WFD) flow: if symmetry holds, then an explicit entropy
- entropy production inequality which governs the intermediate asymptotics is
indeed equivalent to (CKN), and the self-similar profiles are optimal for
(CKN). We establish an explicit symmetry breaking condition by proving the
linear instability of the radial optimal functions for (CKN). Symmetry breaking
in (CKN) also has consequences on entropy - entropy production inequalities and
on the intermediate asymptotics for (WFD). Even when no symmetry holds in
(CKN), asymptotic rates of convergence of the solutions to (WFD) are determined
by a weighted Hardy-Poincaré inequality which is interpreted as a
linearized entropy - entropy production inequality. All our results rely on the
study of the bottom of the spectrum of the linearized diffusion operator around
the self-similar profiles, which is equivalent to the linearization of (CKN)
around the radial optimal functions, and on variational methods. Consequences
for the (WFD) flow will be studied in Part II of this work.