    CrossNorm: Normalization for Off-Policy TD Reinforcement Learning

    Off-policy temporal difference (TD) methods are a powerful class of reinforcement learning (RL) algorithms. Intriguingly, deep off-policy TD algorithms are not commonly used in combination with feature normalization techniques, despite the positive effects of normalization in other domains. We show that naive application of existing normalization techniques is indeed not effective, but that well-designed normalization improves optimization stability and removes the necessity of target networks. In particular, we introduce a normalization based on a mixture of on- and off-policy transitions, which we call cross-normalization. It can be regarded as an extension of batch normalization that re-centers data for two different distributions, as present in off-policy learning. Applied to DDPG and TD3, cross-normalization improves over the state of the art across a range of MuJoCo benchmark tasks.
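    As a rough illustration of the idea, the following PyTorch sketch re-centers features using the mean of a mixture of on- and off-policy activations. It is a minimal sketch of what the abstract describes, not the paper's implementation; the class name CrossNorm, the mixture weight alpha, and the running-mean momentum are illustrative assumptions.

        import torch
        import torch.nn as nn

        class CrossNorm(nn.Module):
            """Re-center features with the mean of a mixture of two
            distributions (on-policy and off-policy activations).
            Sketch only: the paper's exact weighting and statistics
            may differ."""

            def __init__(self, num_features, alpha=0.5, momentum=0.01):
                super().__init__()
                self.alpha = alpha        # assumed mixture weight between the two distributions
                self.momentum = momentum  # assumed running-mean update rate
                self.register_buffer("running_mean", torch.zeros(num_features))

            def forward(self, x_on, x_off):
                # x_on: features from on-policy transitions, shape (B1, F)
                # x_off: features from off-policy transitions, shape (B2, F)
                if self.training:
                    # Mean of the mixture of the two batches.
                    mixed_mean = (self.alpha * x_on.mean(dim=0)
                                  + (1 - self.alpha) * x_off.mean(dim=0))
                    with torch.no_grad():
                        # Track a running mean for use at evaluation time.
                        self.running_mean.mul_(1 - self.momentum).add_(self.momentum * mixed_mean)
                    mean = mixed_mean
                else:
                    mean = self.running_mean
                # Re-centering only (no variance scaling), per the abstract's
                # description of re-centering data for two distributions.
                return x_on - mean, x_off - mean

    For example, norm = CrossNorm(256) followed by h_on, h_off = norm(feat_on, feat_off) would normalize both batches against the shared mixture mean during training and against the running mean at evaluation time.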

    Did the European Free Movement of Persons and Residence Directive Change Migration Patterns within the EU? A First Glance

    Freedom of establishment, Community law, Freedom of movement, Mobility, European Economic and Monetary Union