
    Geometry and convergence of natural policy gradient methods

    We study the convergence of several natural policy gradient (NPG) methods in infinite-horizon discounted Markov decision processes with regular policy parametrizations. For a variety of NPGs and reward functions, we show that the trajectories in state-action space are solutions of gradient flows with respect to Hessian geometries, based on which we obtain global convergence guarantees and convergence rates. In particular, we show linear convergence for unregularized and regularized NPG flows with the metrics proposed by Kakade and by Morimura and co-authors, by observing that these arise from the Hessian geometries of conditional entropy and entropy, respectively. Further, we obtain sublinear convergence rates for Hessian geometries arising from other convex functions such as log-barriers. Finally, we interpret the discrete-time NPG methods with regularized rewards as inexact Newton methods if the NPG is defined with respect to the Hessian geometry of the regularizer. This yields local quadratic convergence rates of these methods for step size equal to the penalization strength.
    Comment: 33 pages, 5 figures, under revie
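As a rough illustration of the kind of update the abstract refers to, the sketch below runs a discrete-time natural policy gradient with the Fisher (Kakade) metric on a toy one-state problem with a tabular softmax policy. This is not the paper's setting or code: the one-state reduction, the reward vector, the step size, and all function names are assumptions chosen for illustration. In this special case the Fisher matrix is `diag(pi) - pi pi^T`, and its pseudoinverse applied to the policy gradient equals the centered reward vector, which is what `npg_step` uses.

```python
import math

def softmax(theta):
    """Numerically stable softmax over a list of logits."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def npg_step(theta, rewards, step):
    """One natural-gradient step for a one-state softmax policy.

    Assumption (toy case only): with Fisher matrix F = diag(pi) - pi pi^T
    and gradient g = pi * (r - pi.r), the pseudoinverse solution F^+ g is
    the centered reward r - mean(r), so no matrix inversion is needed.
    """
    mean_r = sum(rewards) / len(rewards)
    return [t + step * (r - mean_r) for t, r in zip(theta, rewards)]

def run_npg(rewards, step=1.0, iters=30):
    """Iterate NPG from the uniform policy and record the suboptimality gap."""
    theta = [0.0] * len(rewards)
    best = max(rewards)
    gaps = []
    for _ in range(iters):
        pi = softmax(theta)
        gaps.append(best - sum(p * r for p, r in zip(pi, rewards)))
        theta = npg_step(theta, rewards, step)
    return softmax(theta), gaps

# Hypothetical three-action example.
policy, gaps = run_npg([1.0, 0.5, 0.0])
print("final policy:", policy)
print("last gap ratio:", gaps[-1] / gaps[-2])
```

In this toy run the ratio of successive gaps settles near a constant below one, which is the discrete analogue of a linear rate; it should not be read as a reproduction of the paper's results for general discounted MDPs.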

    Localizing and Estimating Causal Relations of Interacting Brain Rhythms

    Estimating brain connectivity, and especially causality between different brain regions, from EEG or MEG is limited by the fact that the data are a largely unknown superposition of the actual brain activities. Any method that is not robust to mixing artifacts is prone to yield false-positive results. We here review a number of methods that address this problem. They are all based on the insight that the imaginary part of the cross-spectra cannot be explained as a mixing artifact. First, a joint decomposition of these imaginary parts into pairwise activities separates subsystems containing different rhythmic activities. Second, assuming that the respective source estimates are least overlapping yields a separation of the rhythmic interacting subsystem into the source topographies themselves. Finally, a causal relation between these sources can be estimated using the newly proposed measure Phase Slope Index (PSI). This work, for the first time, presents the above methods in combination, all illustrated using a single simulated data set.
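The central insight, that the imaginary part of the cross-spectra cannot arise from mixing alone, can be checked algebraically. The sketch below assumes two sources mixed instantaneously into two sensors by a real matrix `A`, so that the sensor cross-spectrum at a given frequency is `A S A^T`, where `S` is the source cross-spectrum. The matrix entries, source powers, and the 90-degree phase lag are hypothetical values chosen for illustration, not data from the paper.

```python
import cmath
import math

def sensor_cross_spectrum(A, S):
    """Return A S A^T for a real mixing matrix A and source cross-spectrum S."""
    n = len(A)
    AS = [[sum(A[i][k] * S[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(AS[i][k] * A[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical real (instantaneous) mixing of two sources into two sensors.
A = [[1.0, 0.6], [0.4, 1.0]]

# Independent sources: the source cross-spectrum is real and diagonal.
S_independent = [[2 + 0j, 0j], [0j, 1 + 0j]]

# Interacting sources: a phase-lagged coupling gives a complex off-diagonal term.
c = 0.8 * cmath.exp(1j * math.pi / 2)
S_interacting = [[2 + 0j, c], [c.conjugate(), 1 + 0j]]

X_ind = sensor_cross_spectrum(A, S_independent)
X_int = sensor_cross_spectrum(A, S_interacting)

print("independent sources, sensor cross-spectrum:", X_ind[0][1])
print("interacting sources, sensor cross-spectrum:", X_int[0][1])
```

For independent sources the off-diagonal sensor term has a nonzero real part, produced purely by mixing, while its imaginary part vanishes. With a phase-lagged interaction the imaginary part survives the mixing, scaled by the determinant of `A`.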

    International Product Management 2011: Use and Trends – Results for Switzerland

    Study. In 2011 we are conducting, for the first time, a comprehensive study of the status quo and current trends in product management. The survey covers product management decision-makers from Switzerland and from abroad.