
    Smoothing Policies and Safe Policy Gradients

    Policy gradient algorithms are among the best candidates for the much anticipated application of reinforcement learning to real-world control tasks, such as the ones arising in robotics. However, the trial-and-error nature of these methods introduces safety issues whenever the learning phase itself must be performed on a physical system. In this paper, we address a specific safety formulation, where danger is encoded in the reward signal and the learning agent is constrained to never worsen its performance. By studying actor-only policy gradient from a stochastic optimization perspective, we establish improvement guarantees for a wide class of parametric policies, generalizing existing results on Gaussian policies. This, together with novel upper bounds on the variance of policy gradient estimators, allows us to identify those meta-parameter schedules that guarantee monotonic improvement with high probability. The two key meta-parameters are the step size of the parameter updates and the batch size of the gradient estimators. By a joint, adaptive selection of these meta-parameters, we obtain a safe policy gradient algorithm.

    Possible Signatures of Inflationary Particle Content: Spin-2 Fields

    We study the imprints of a massive spin-2 field on inflationary observables, and in particular on the breaking of consistency relations. In this setup, the minimal inflationary field content interacts with the massive spin-2 field through dRGT interactions, thus guaranteeing the absence of Boulware-Deser ghostly degrees of freedom. The unitarity requirement on spinning particles, known as the Higuchi bound, plays a crucial role in the size of the observable signal. Comment: 24 pages, 6 figures

    Symmetries, Holography and Quantum Phase Transition in Two-dimensional Dilaton AdS Gravity

    We present a revisitation of the Almheiri-Polchinski dilaton gravity model from a two-dimensional (2D) bulk perspective. We describe a peculiar feature of the model, namely the pattern of conformal symmetry breaking using bulk Killing vectors, a covariant definition of mass and the flow between different vacua of the theory. We show that the effect of the symmetry breaking is both to generate an infrared scale (a mass gap) and to make local the Goldstone modes associated with the asymptotic symmetries of the 2D spacetime. In this way a nonvanishing central charge is generated in the dual conformal theory, which accounts for the microscopic entropy of the 2D black hole. The use of covariant mass allows us to compare energetically the two different vacua of the theory and to show that at zero temperature the vacuum with a constant dilaton is energetically preferred. We also translate into the bulk language several features of the dual CFT discussed by Maldacena et al. The uplifting of the 2D model to $(d+2)$-dimensional theories exhibiting hyperscaling violation is briefly discussed. Comment: 7 pages, no figures

    A reversible allelic partition process and Pitman sampling formula

    We introduce a continuous-time Markov chain describing dynamic allelic partitions which extends the branching process construction of the Pitman sampling formula in Pitman (2006) and the birth-and-death process with immigration studied in Karlin and McGregor (1967), in turn related to the celebrated Ewens sampling formula. A biological basis for the scheme is provided in terms of a population of individuals grouped into families, that evolves according to a sequence of births, deaths and immigrations. We investigate the asymptotic behaviour of the chain and show that, as opposed to the birth-and-death process with immigration, this construction maintains in the temporal limit the mutual dependence among the multiplicities. When the death rate exceeds the birth rate, the system is shown to have a reversible distribution identified as a mixture of Pitman sampling formulae, with negative binomial mixing distribution on the population size. The population therefore converges to a stationary random configuration, characterised by a finite number of families and individuals. Comment: 17 pages, to appear in ALEA, Latin American Journal of Probability and Mathematical Statistics
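As a rough illustration of why a stationary population size arises when the death rate exceeds the birth rate, the sketch below simulates only the total population size of a classical linear birth-and-death process with immigration, using a Gillespie-style event loop. It does not reproduce the paper's family-level (partition) dynamics. The function name `simulate_bdi`, the rate parameters and the numerical values are all illustrative assumptions, not taken from the abstract.

```python
import random


def simulate_bdi(birth, death, immigration, t_end, seed=0):
    """Time-averaged population size of a linear birth-death-immigration chain.

    Assumed rates (hypothetical parametrisation): from state n, the chain
    jumps to n + 1 at rate birth * n + immigration and to n - 1 at rate
    death * n. Requires immigration > 0 so the total rate is never zero.
    """
    rng = random.Random(seed)
    t, n = 0.0, 0
    area = 0.0  # integral of n(t) over [0, t_end]
    while True:
        up = birth * n + immigration
        down = death * n
        total = up + down
        dt = rng.expovariate(total)  # exponential holding time in state n
        if t + dt >= t_end:
            area += n * (t_end - t)
            break
        area += n * dt
        t += dt
        # Choose the next event in proportion to its rate.
        n += 1 if rng.random() * total < up else -1
    return area / t_end


if __name__ == "__main__":
    # With death > birth, the mean solves dE[n]/dt = (birth - death) E[n] + immigration,
    # so the long-run average should be near immigration / (death - birth).
    print(simulate_bdi(birth=1.0, death=2.0, immigration=2.0, t_end=2000.0))
```

With these assumed rates the long-run average settles near immigration / (death - birth), whereas with death at or below birth the population would not stabilise.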

    Measurements by A LEAP-Based Virtual Glove for the hand rehabilitation

    Hand rehabilitation is fundamental after stroke or surgery. Traditional rehabilitation requires a therapist and implies high costs, stress for the patient, and subjective evaluation of the therapy effectiveness. Alternative approaches, based on mechanical and tracking-based gloves, can be really effective when used in virtual reality (VR) environments. Mechanical devices are often expensive, cumbersome, patient specific and hand specific, while tracking-based devices are not affected by these limitations but, especially if based on a single tracking sensor, could suffer from occlusions. In this paper, the implementation of a multi-sensor approach, the Virtual Glove (VG), based on the simultaneous use of two orthogonal LEAP motion controllers, is described. The VG is calibrated and static positioning measurements are compared with those collected with an accurate spatial positioning system. The positioning error is lower than 6 mm in a cylindrical region of interest of radius 10 cm and height 21 cm. Real-time hand tracking measurements are also performed, analysed and reported. Hand tracking measurements show that the VG operated in real time (60 fps), reduced occlusions, and managed two LEAP sensors correctly, without any temporal or spatial discontinuity when switching from one sensor to the other. A video demonstrating the good performance of the VG is also collected and presented in the Supplementary Materials. Results are promising but further work must be done to allow the calculation of the forces exerted by each finger when constrained by mechanical tools (e.g., peg-boards) and to reduce occlusions when grasping these tools. Although the VG is proposed for rehabilitation purposes, it could also be used for tele-operation of tools and robots, and for other VR applications.
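The stated accuracy applies only inside the cylindrical region of interest, so a consumer of the tracking data might want to flag samples that fall outside it. The following is a minimal sketch of such a check. The coordinate convention (cylinder axis along y, base at y = 0, units in millimetres) and the function name `in_region_of_interest` are assumptions for illustration; the abstract does not specify how the region is placed relative to the sensors.

```python
import math

RADIUS_MM = 100.0  # radius of 10 cm, from the abstract
HEIGHT_MM = 210.0  # height of 21 cm, from the abstract


def in_region_of_interest(x, y, z):
    """Return True if a point (in mm) lies inside the cylinder.

    Assumed convention (hypothetical): the cylinder axis is the y axis
    and the base sits at y = 0.
    """
    within_height = 0.0 <= y <= HEIGHT_MM
    within_radius = math.hypot(x, z) <= RADIUS_MM
    return within_height and within_radius


if __name__ == "__main__":
    print(in_region_of_interest(30.0, 100.0, 40.0))  # radial distance 50 mm
    print(in_region_of_interest(0.0, 250.0, 0.0))    # above the cylinder
```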

    Quantum memories with zero-energy Majorana modes and experimental constraints

    In this work we address the problem of realizing a reliable quantum memory based on zero-energy Majorana modes in the presence of experimental constraints on the operations aimed at recovering the information. In particular, we characterize the best recovery operation acting only on the zero-energy Majorana modes and the memory fidelity that can be therewith achieved. In order to understand the effect of such a restriction, we discuss two examples of noise models acting on the topological system and compare the amount of information that can be recovered by accessing either the whole system, or the zero-modes only, with particular attention to the scaling with the size of the system and the energy gap. We explicitly discuss the case of a thermal bosonic environment inducing a parity-preserving Markovian dynamics in which the introduced memory fidelity decays exponentially in time, independently of the system size, thus showing the impossibility of retrieving the information by acting on the zero-modes only. We argue, however, that even in the presence of experimental limitations, the Hamiltonian gap is still beneficial to the storage of information. Comment: 18 pages, 7 figures. Updated to published version

    Weighted fast diffusion equations (Part I): Sharp asymptotic rates without symmetry and symmetry breaking in Caffarelli-Kohn-Nirenberg inequalities

    In this paper we consider a family of Caffarelli-Kohn-Nirenberg interpolation inequalities (CKN), with two radial power law weights and exponents in a subcritical range. We address the question of symmetry breaking: are the optimal functions radially symmetric, or not? Our intuition comes from a weighted fast diffusion (WFD) flow: if symmetry holds, then an explicit entropy - entropy production inequality which governs the intermediate asymptotics is indeed equivalent to (CKN), and the self-similar profiles are optimal for (CKN). We establish an explicit symmetry breaking condition by proving the linear instability of the radial optimal functions for (CKN). Symmetry breaking in (CKN) also has consequences on entropy - entropy production inequalities and on the intermediate asymptotics for (WFD). Even when no symmetry holds in (CKN), asymptotic rates of convergence of the solutions to (WFD) are determined by a weighted Hardy-Poincaré inequality which is interpreted as a linearized entropy - entropy production inequality. All our results rely on the study of the bottom of the spectrum of the linearized diffusion operator around the self-similar profiles, which is equivalent to the linearization of (CKN) around the radial optimal functions, and on variational methods. Consequences for the (WFD) flow will be studied in Part II of this work.