
    Unconventional Superconducting Symmetry in a Checkerboard Antiferromagnet

    We use a renormalized mean field theory to study the Gutzwiller projected BCS states of the extended Hubbard model in the large $U$ limit, or the $t$-$t'$-$J$-$J'$ model, on a two-dimensional checkerboard lattice. At small $t'/t$, the frustration due to the diagonal terms $t'$ and $J'$ does not alter the $d_{x^2-y^2}$-wave pairing symmetry, and negative (positive) $t'/t$ enhances (suppresses) the pairing order parameter. At large $t'/t$, the ground state has an extended $s$-wave symmetry. At intermediate $t'/t$, the ground state is $d+id$- or $d+is$-wave with time-reversal symmetry broken. Comment: 6 pages, 6 figures
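    For reference, the pairing channels named in this abstract are usually written in the standard square-lattice forms below; these are textbook expressions quoted only to fix notation, not formulas taken from the paper, whose checkerboard geometry modifies the details.

```latex
% Standard square-lattice singlet gap functions (notation only; the paper's
% checkerboard-lattice forms may differ):
\[
  \Delta_{d_{x^2-y^2}}(\mathbf{k}) = \Delta_0\,(\cos k_x - \cos k_y),
  \qquad
  \Delta_{s^*}(\mathbf{k}) = \Delta_0\,(\cos k_x + \cos k_y).
\]
% A time-reversal-symmetry-breaking d+id state combines two such channels
% with a relative phase of pi/2:
\[
  \Delta_{d+id}(\mathbf{k}) = \Delta_1\,(\cos k_x - \cos k_y)
    + i\,\Delta_2\,\sin k_x \sin k_y .
\]
```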

    Off-policy Maximum Entropy Deep Reinforcement Learning Algorithm Based on Randomly Weighted Triple Q-Learning

    Reinforcement learning is an important branch of machine learning. With the development of deep learning, deep reinforcement learning has gradually become a focus of reinforcement learning research. Model-free off-policy deep reinforcement learning algorithms for continuous control have attracted wide attention because of their strong practicality. Like Q-learning, actor-critic algorithms suffer from overestimation. Clipped double Q-learning mitigates the effect of overestimation in actor-critic algorithms to some extent, but it also introduces underestimation into the learning process. To further address both overestimation and underestimation in actor-critic algorithms, a new learning method, randomly weighted triple Q-learning, is proposed. Combining this method with the soft actor-critic (SAC) algorithm yields a new algorithm, SAC-RWTQ. The algorithm not only keeps the Q estimate close to the true Q value but also increases the randomness of the Q estimate through random weighting, thereby addressing both overestimation and underestimation of action values during learning. Experimental results show that, compared with SAC and other currently popular deep reinforcement learning algorithms such as DDPG, PPO, and TD3, SAC-RWTQ performs better on several MuJoCo tasks on the gym simulation platform.
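    The random-weighting idea lends itself to a short sketch. The following is a minimal, hypothetical illustration of how a randomly weighted target over three critics might be formed; the names (q1, q2, q3, rwtq_target), the convex-weight scheme, and the 0.5 mixing coefficient are assumptions made for illustration, not the authors' exact update rule.

```python
import torch

def rwtq_target(q1, q2, q3, reward, done, gamma=0.99):
    """Illustrative target for a randomly weighted triple-Q update.

    q1, q2, q3: next-state action-value estimates from three critics,
    tensors of shape [batch]. The weighting scheme below is an assumption,
    not the paper's exact formulation.
    """
    # Draw random convex weights over the three estimates, one set per sample.
    w = torch.rand(3, q1.shape[0])
    w = w / w.sum(dim=0, keepdim=True)
    q_mix = w[0] * q1 + w[1] * q2 + w[2] * q3

    # Blend toward the pessimistic (clipped) estimate to limit overestimation;
    # the 0.5 mixing coefficient is also an assumption.
    q_min = torch.min(torch.min(q1, q2), q3)
    q_next = 0.5 * q_mix + 0.5 * q_min

    # Standard bootstrapped target (entropy bonus of SAC omitted in this sketch).
    return reward + gamma * (1.0 - done) * q_next
```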

    Information Theory of Blockchain Systems

    In this paper, we apply information theory to provide an approximate expression for the steady-state probability distribution of blockchain systems. We achieve this goal by maximizing an entropy function subject to specific constraints. These constraints are based on prior information, including the average numbers of transactions in the block and in the transaction pool, respectively. Furthermore, we use numerical experiments to analyze how the key factors in this approximate expression depend on the crucial parameters of the blockchain system. As a result, this approximate expression has important theoretical significance in promoting practical applications of blockchain technology. At the same time, the method and results given in this paper not only provide a new line in the study of blockchain queueing systems, but also provide a theoretical basis and technical support for applying information theory to the investigation of blockchain queueing networks and stochastic models more broadly. Comment: 14 pages, 5 figures
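    For context, a maximum-entropy construction of the kind described here has the standard Lagrangian form below; the constraints are paraphrased (normalization plus a fixed mean value), and the actual constraint set and parameters used in the paper may differ.

```latex
% Maximize the entropy of a distribution p(n) subject to normalization and a
% fixed mean \bar{n} (e.g. an average number of transactions); the paper's
% constraint set may be richer than this sketch.
\[
  \max_{p}\; -\sum_{n\ge 0} p(n)\ln p(n)
  \quad\text{s.t.}\quad
  \sum_{n\ge 0} p(n) = 1,
  \qquad
  \sum_{n\ge 0} n\,p(n) = \bar{n}.
\]
% Introducing Lagrange multipliers yields an exponential-family solution,
\[
  p(n) = \frac{e^{-\lambda n}}{Z(\lambda)},
  \qquad
  Z(\lambda) = \sum_{n\ge 0} e^{-\lambda n},
\]
% with \lambda fixed by the mean constraint.
```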