457 research outputs found

    Long-term Stabilization of Fiber Laser Using Phase-locking Technique with Ultra-low Phase Noise and Phase Drift

    We review the conventional phase-locking technique for long-term stabilization of the mode-locked fiber laser (MLFL) and investigate the phase noise limitation of that technique. To overcome this limitation, we propose an improved phase-locking technique that uses an optic-microwave phase detector to achieve ultra-low phase noise and phase drift. The mechanism and theoretical model of the new phase-locking technique are also discussed. Long-term stabilization experiments demonstrate that the improved technique stabilizes the MLFL with ultra-low phase noise and phase drift. The excellent locking performance implies that this technique can be used to lock the mode-locked fiber laser to a highly stable H-maser or optical clock without loss of stability.

    Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes

    Lying at the heart of intelligent decision-making systems, how a policy is represented and optimized is a fundamental problem. The root challenge is the large scale and high complexity of the policy space, which exacerbates the difficulty of policy learning, especially in real-world scenarios. As a step towards a desirable surrogate policy space, policy representation in a low-dimensional latent space has recently shown potential for improving both the evaluation and the optimization of policies. The key question in these studies is by what criterion the policy space should be abstracted to obtain the desired compression and generalization. However, both the theory of policy abstraction and the methodology of policy representation learning are understudied in the literature. In this work, we make a first effort to fill this gap. First, we propose a unified policy abstraction theory containing three types of policy abstraction, each associated with policy features at a different level. Then, we generalize them to three policy metrics that quantify the distance (i.e., similarity) between policies, for more convenient use in learning policy representations. Further, we propose a policy representation learning approach based on deep metric learning. In the empirical study, we investigate the efficacy of the proposed policy metrics and representations in characterizing policy difference and conveying policy generalization, respectively. Our experiments cover both policy optimization and evaluation problems, including trust-region policy optimization (TRPO), diversity-guided evolution strategy (DGES) and off-policy evaluation (OPE). Somewhat naturally, the experimental results indicate that no abstraction is universally optimal for all downstream learning problems, while the influence-irrelevance policy abstraction can be a generally preferred choice. Comment: Preprint version

    Efficient Deep Reinforcement Learning via Adaptive Policy Transfer

    Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from previously learned policies of relevant tasks. Existing transfer approaches either explicitly compute the similarity between tasks or select appropriate source policies to provide guided exploration for the target task. However, a way to directly optimize the target policy by alternately utilizing knowledge from appropriate source policies, without explicitly measuring the similarity, is currently missing. In this paper, we propose a novel Policy Transfer Framework (PTF) that accelerates RL by taking advantage of this idea. Our framework learns when to reuse a source policy, which source policy is best for the target policy, and when to terminate it, by modeling multi-policy transfer as an option learning problem. PTF can be easily combined with existing deep RL approaches. Experimental results show that it significantly accelerates the learning process and surpasses state-of-the-art policy transfer methods in terms of learning efficiency and final performance in both discrete and continuous action spaces. Comment: Accepted by IJCAI'202