Smoothing Policies and Safe Policy Gradients

Author: Papini Matteo
Pirotta Matteo
Restelli Marcello
Publication date: 08/05/2019

Policy gradient algorithms are among the best candidates for the much anticipated application of reinforcement learning to real-world control tasks, such as the ones arising in robotics. However, the trial-and-error nature of these methods introduces safety issues whenever the learning phase itself must be performed on a physical system. In this paper, we address a specific safety formulation, where danger is encoded in the reward signal and the learning agent is constrained to never worsen its performance. By studying actor-only policy gradient from a stochastic optimization perspective, we establish improvement guarantees for a wide class of parametric policies, generalizing existing results on Gaussian policies. This, together with novel upper bounds on the variance of policy gradient estimators, allows us to identify those meta-parameter schedules that guarantee monotonic improvement with high probability. The two key meta-parameters are the step size of the parameter updates and the batch size of the gradient estimators. By a joint, adaptive selection of these meta-parameters, we obtain a safe policy gradient algorithm
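To make the two meta-parameters named in this abstract concrete, here is a minimal sketch of a plain REINFORCE-style update for a one-step Gaussian policy. It is not the authors' safe algorithm: the reward function, the fixed policy standard deviation, and the names `reinforce_gradient` and `train` are assumptions made purely for illustration, and both the step size and the batch size are held fixed rather than selected adaptively.

```python
import numpy as np

def reinforce_gradient(theta, sigma, batch_size, rng):
    """Monte Carlo estimate of dJ/dtheta for a one-step Gaussian policy."""
    # Sample a batch of actions from the policy N(theta, sigma^2).
    actions = rng.normal(theta, sigma, size=batch_size)
    # Hypothetical reward (an assumption): highest when the action equals 2.
    rewards = -(actions - 2.0) ** 2
    # Score function: d/dtheta of log N(a; theta, sigma^2).
    scores = (actions - theta) / sigma ** 2
    # Averaging over the batch reduces the variance of the estimate.
    return float(np.mean(scores * rewards))

def train(step_size, batch_size, iterations=200, seed=0):
    """Gradient ascent on the policy mean with fixed meta-parameters."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    for _ in range(iterations):
        grad = reinforce_gradient(theta, 0.5, batch_size, rng)
        theta += step_size * grad
    return theta

if __name__ == "__main__":
    # The learned mean should end up close to the reward peak at 2.
    print(train(step_size=0.05, batch_size=500))
```

In this toy setting a larger batch size lowers the variance of each gradient estimate, while a smaller step size limits how far a noisy estimate can move the policy; the abstract describes choosing the two jointly and adaptively, which this sketch does not attempt.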

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

UPF Digital Repository

Possible Signatures of Inflationary Particle Content: Spin-2 Fields

Author: Biagetti Matteo
Dimastrogiovanni Emanuela
Fasiello Matteo
Publication venue: 'IOP Publishing'
Publication date: 04/08/2017

We study the imprints of a massive spin-2 field on inflationary observables, and in particular on the breaking of consistency relations. In this setup, the minimal inflationary field content interacts with the massive spin-2 field through dRGT interactions, thus guaranteeing the absence of Boulware-Deser ghostly degrees of freedom. The unitarity requirement on spinning particles, known as the Higuchi bound, plays a crucial role for the size of the observable signal. Comment: 24 pages, 6 figures

arXiv.org e-Print Archive

Portsmouth University Research Portal (Pure)

International Migration, Integration and Social Cohesion online publications

Symmetries, Holography and Quantum Phase Transition in Two-dimensional Dilaton AdS Gravity

Author: Cadoni Mariano
Ciulu Matteo
Tuveri Matteo
Publication venue: 'American Physical Society (APS)'
Publication date: 07/11/2017

We revisit the Almheiri-Polchinski dilaton gravity model from a two-dimensional (2D) bulk perspective. We describe a peculiar feature of the model, namely the pattern of conformal symmetry breaking using bulk Killing vectors, a covariant definition of mass and the flow between different vacua of the theory. We show that the symmetry breaking both generates an infrared scale (a mass gap) and makes local the Goldstone modes associated with the asymptotic symmetries of the 2D spacetime. In this way a nonvanishing central charge is generated in the dual conformal theory, which accounts for the microscopic entropy of the 2D black hole. The use of covariant mass allows us to compare energetically the two different vacua of the theory and to show that at zero temperature the vacuum with a constant dilaton is energetically preferred. We also translate into the bulk language several features of the dual CFT discussed by Maldacena et al. The uplifting of the 2D model to (d+2)-dimensional theories exhibiting hyperscaling violation is briefly discussed. Comment: 7 pages, no figures

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Cagliari

A reversible allelic partition process and Pitman sampling formula

Author: De Blasi Pierpaolo
Giordano Matteo
Ruggiero Matteo
Publication date: 01/01/2020

We introduce a continuous-time Markov chain describing dynamic allelic partitions which extends the branching process construction of the Pitman sampling formula in Pitman (2006) and the birth-and-death process with immigration studied in Karlin and McGregor (1967), in turn related to the celebrated Ewens sampling formula. A biological basis for the scheme is provided in terms of a population of individuals grouped into families, that evolves according to a sequence of births, deaths and immigrations. We investigate the asymptotic behaviour of the chain and show that, as opposed to the birth-and-death process with immigration, this construction maintains in the temporal limit the mutual dependence among the multiplicities. When the death rate exceeds the birth rate, the system is shown to have a reversible distribution identified as a mixture of Pitman sampling formulae, with negative binomial mixing distribution on the population size. The population therefore converges to a stationary random configuration, characterised by a finite number of families and individuals. Comment: 17 pages, to appear in ALEA, Latin American Journal of Probability and Mathematical Statistics
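As background for the sampling formulae mentioned in this abstract, the sketch below evaluates the classical one-parameter Ewens sampling formula, P(a_1, ..., a_n) = n! / (θ(θ+1)...(θ+n-1)) × Π_j θ^{a_j} / (j^{a_j} a_j!), where a_j is the number of families of size j. It covers only the Ewens special case, not the Pitman generalisation or the paper's Markov chain, and the function names are illustrative assumptions.

```python
from math import factorial

def rising_factorial(theta, n):
    """theta * (theta + 1) * ... * (theta + n - 1)."""
    result = 1.0
    for i in range(n):
        result *= theta + i
    return result

def ewens_probability(counts, theta):
    """Ewens probability; counts[j-1] is the number of families of size j."""
    n = sum(j * a for j, a in enumerate(counts, start=1))
    prob = factorial(n) / rising_factorial(theta, n)
    for j, a in enumerate(counts, start=1):
        prob *= theta ** a / (j ** a * factorial(a))
    return prob

def partitions(n, largest=None):
    """Yield every integer partition of n as a non-increasing list."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield [part] + rest

def to_counts(partition, n):
    """Convert a list of family sizes into multiplicities (a_1, ..., a_n)."""
    counts = [0] * n
    for size in partition:
        counts[size - 1] += 1
    return counts

if __name__ == "__main__":
    n, theta = 5, 1.5
    total = sum(ewens_probability(to_counts(p, n), theta) for p in partitions(n))
    # The probabilities over all allelic partitions of n should sum to one.
    print(total)
```

Summing over every partition of n is a simple sanity check that the formula defines a probability distribution on the multiplicities.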

arXiv.org e-Print Archive

Institutional Research Information System University of Turin

Measurements by A LEAP-Based Virtual Glove for the hand rehabilitation

Author: Cinque Luigi
Placidi Giuseppe
Polsinelli Matteo
Spezialetti Matteo
Publication venue: 'MDPI AG'
Publication date: 01/01/2018

Hand rehabilitation is fundamental after stroke or surgery. Traditional rehabilitation requires a therapist and implies high costs, stress for the patient, and subjective evaluation of the therapy effectiveness. Alternative approaches, based on mechanical and tracking-based gloves, can be very effective when used in virtual reality (VR) environments. Mechanical devices are often expensive, cumbersome, patient specific and hand specific, while tracking-based devices are not affected by these limitations but, especially if based on a single tracking sensor, could suffer from occlusions. In this paper, the implementation of a multi-sensor approach, the Virtual Glove (VG), based on the simultaneous use of two orthogonal LEAP motion controllers, is described. The VG is calibrated and static positioning measurements are compared with those collected with an accurate spatial positioning system. The positioning error is lower than 6 mm in a cylindrical region of interest of radius 10 cm and height 21 cm. Real-time hand tracking measurements are also performed, analysed and reported. Hand tracking measurements show that the VG operated in real time (60 fps), reduced occlusions, and managed two LEAP sensors correctly, without any temporal or spatial discontinuity when switching from one sensor to the other. A video demonstrating the good performance of the VG is also collected and presented in the Supplementary Materials. Results are promising but further work must be done to allow the calculation of the forces exerted by each finger when constrained by mechanical tools (e.g., peg-boards) and to reduce occlusions when grasping these tools. Although the VG is proposed for rehabilitation purposes, it could also be used for tele-operation of tools and robots, and for other VR applications.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Archivio della ricerca- Università di Roma La Sapienza

Quantum memories with zero-energy Majorana modes and experimental constraints

Author: Giovannetti Vittorio
Ippoliti Matteo
Mazza Leonardo
Rizzi Matteo
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2016

In this work we address the problem of realizing a reliable quantum memory based on zero-energy Majorana modes in the presence of experimental constraints on the operations aimed at recovering the information. In particular, we characterize the best recovery operation acting only on the zero-energy Majorana modes and the memory fidelity that can thereby be achieved. In order to understand the effect of such a restriction, we discuss two examples of noise models acting on the topological system and compare the amount of information that can be recovered by accessing either the whole system, or the zero-modes only, with particular attention to the scaling with the size of the system and the energy gap. We explicitly discuss the case of a thermal bosonic environment inducing a parity-preserving Markovian dynamics in which the introduced memory fidelity decays exponentially in time, independently of system size, thus showing the impossibility of retrieving the information by acting on the zero-modes only. We argue, however, that even in the presence of experimental limitations, the Hamiltonian gap is still beneficial to the storage of information. Comment: 18 pages, 7 figures. Updated to published version

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Weighted fast diffusion equations (Part I): Sharp asymptotic rates without symmetry and symmetry breaking in Caffarelli-Kohn-Nirenberg inequalities

Author: Bonforte Matteo
Dolbeault Jean
Muratori Matteo
Nazaret Bruno
Publication date: 20/06/2016

In this paper we consider a family of Caffarelli-Kohn-Nirenberg interpolation inequalities (CKN), with two radial power law weights and exponents in a subcritical range. We address the question of symmetry breaking: are the optimal functions radially symmetric, or not? Our intuition comes from a weighted fast diffusion (WFD) flow: if symmetry holds, then an explicit entropy - entropy production inequality which governs the intermediate asymptotics is indeed equivalent to (CKN), and the self-similar profiles are optimal for (CKN). We establish an explicit symmetry breaking condition by proving the linear instability of the radial optimal functions for (CKN). Symmetry breaking in (CKN) also has consequences on entropy - entropy production inequalities and on the intermediate asymptotics for (WFD). Even when no symmetry holds in (CKN), asymptotic rates of convergence of the solutions to (WFD) are determined by a weighted Hardy-Poincaré inequality which is interpreted as a linearized entropy - entropy production inequality. All our results rely on the study of the bottom of the spectrum of the linearized diffusion operator around the self-similar profiles, which is equivalent to the linearization of (CKN) around the radial optimal functions, and on variational methods. Consequences for the (WFD) flow will be studied in Part II of this work.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

HAL-Paris1