126 research outputs found

    Effect of spin-orbit interaction on the critical temperature of an ideal Bose gas

    We consider Bose-Einstein condensation of an ideal Bose gas with an equal mixture of 'Rashba' and 'Dresselhaus' spin-orbit interactions and study its effect on the critical temperature. In a uniform Bose gas, a 'cusp' and a sharp drop in the critical temperature occur due to the change in the density of states at a critical Raman coupling where the degeneracy of the ground states is lifted. The relative drop in the critical temperature depends on the diluteness of the gas as well as on the spin-orbit coupling strength. In the presence of a harmonic trap, the cusp in the critical temperature is smoothed out and a minimum appears. Both the drop in the critical temperature and the lifting of the 'quasi-degeneracy' of the ground states exhibit crossover phenomena that are controlled by the trap frequency. By considering a 'Dicke'-like model, we extend our calculation to bosons with large spin and observe a similar minimum in the critical temperature near the critical Raman frequency, which becomes deeper for larger spin. Finally, in the limit of infinite spin, the critical temperature vanishes at the critical frequency, which is a manifestation of a Dicke-type quantum phase transition. Comment: 9 pages, 6 figures
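
For orientation, the textbook baselines against which such a drop in the critical temperature is usually measured can be written out. These are the standard ideal-gas results without spin-orbit coupling, not formulas taken from this abstract; the symbols $n$ (number density), $m$ (boson mass), $N_{\mathrm{p}}$ (particle number) and $\bar{\omega}$ (geometric-mean trap frequency) are assumptions introduced here for illustration.

```latex
% Uniform ideal Bose gas: condensation sets in when n \lambda_T^3 = \zeta(3/2),
% with thermal wavelength \lambda_T = \sqrt{2\pi\hbar^2 / (m k_B T)}.
k_B T_c^{\mathrm{uniform}}
  = \frac{2\pi\hbar^2}{m}\left(\frac{n}{\zeta(3/2)}\right)^{2/3},
  \qquad \zeta(3/2) \approx 2.612

% Ideal Bose gas in a 3D harmonic trap (thermodynamic limit):
k_B T_c^{\mathrm{trap}}
  = \hbar\bar{\omega}\left(\frac{N_{\mathrm{p}}}{\zeta(3)}\right)^{1/3},
  \qquad \zeta(3) \approx 1.202
```

The different exponents, 2/3 versus 1/3, come from the different densities of states in the two geometries. The abstract attributes the cusp to a change in this same quantity at the critical Raman coupling.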

    Misspecified Linear Bandits

    We consider the problem of online learning in misspecified linear stochastic multi-armed bandit problems. Regret guarantees for state-of-the-art linear bandit algorithms such as Optimism in the Face of Uncertainty Linear bandit (OFUL) hold under the assumption that the arms' expected rewards are perfectly linear in their features. It is, however, of interest to investigate the impact of potential misspecification in linear bandit models, where the expected rewards are perturbed away from the linear subspace determined by the arms' features. Although OFUL has recently been shown to be robust to relatively small deviations from linearity, we show that any linear bandit algorithm that enjoys optimal regret performance in the perfectly linear setting (e.g., OFUL) must suffer linear regret under a sparse additive perturbation of the linear model. In an attempt to overcome this negative result, we define a natural class of bandit models characterized by a non-sparse deviation from linearity. We argue that the OFUL algorithm can fail to achieve sublinear regret even under models that have non-sparse deviation. We finally develop a novel bandit algorithm, comprising a hypothesis test for linearity followed by a decision to use either the OFUL or Upper Confidence Bound (UCB) algorithm. For perfectly linear bandit models, the algorithm provably exhibits OFUL's favorable regret performance, while for misspecified models satisfying the non-sparse deviation property, the algorithm avoids the linear regret phenomenon and falls back on UCB's sublinear regret scaling. Numerical experiments on synthetic data, and on recommendation data from the public Yahoo! Learning to Rank Challenge dataset, empirically support our findings. Comment: Thirty-First AAAI Conference on Artificial Intelligence, 201
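
As a rough illustration of the fallback the abstract mentions, here is a minimal sketch of the classical UCB1 index policy on Bernoulli arms. It is not the paper's algorithm: the hypothesis test for linearity and OFUL are not reproduced, and the function name `ucb1`, the arm means, and the horizon are hypothetical choices made for this example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms (illustrative sketch, not the paper's method).

    Returns (pull counts per arm, cumulative pseudo-regret).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # total observed reward per arm
    best = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            # Initialisation: pull every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks with pulls.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Pseudo-regret uses the true means, which only the simulator knows.
        regret += best - arm_means[arm]

    return counts, regret


if __name__ == "__main__":
    counts, regret = ucb1([0.2, 0.5, 0.9], horizon=2000)
    print("pulls per arm:", counts)
    print("cumulative pseudo-regret:", round(regret, 1))
```

UCB1 ignores the arms' features entirely, so its guarantee does not depend on the linear model being correct. That is consistent with the abstract's use of UCB as the fallback when linearity is rejected.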

    No-Regret Reinforcement Learning with Value Function Approximation: a Kernel Embedding Approach

    We consider the regret minimization problem in reinforcement learning (RL) in the episodic setting. In many real-world RL environments, the state and action spaces are continuous or very large. Existing approaches establish regret guarantees by either a low-dimensional representation of the stochastic transition model or an approximation of the $Q$-functions. However, the understanding of function approximation schemes for state-value functions largely remains missing. In this paper, we propose an online model-based RL algorithm, namely the CME-RL, that learns representations of transition distributions as embeddings in a reproducing kernel Hilbert space while carefully balancing the exploitation-exploration tradeoff. We demonstrate the efficiency of our algorithm by proving a frequentist (worst-case) regret bound that is of order $\tilde{O}\big(H\gamma_N\sqrt{N}\big)$, where $H$ is the episode length, $N$ is the total number of time steps and $\gamma_N$ is an information theoretic quantity relating the effective dimension of the state-action feature space. Our method bypasses the need for estimating transition probabilities and applies to any domain on which kernels can be defined. It also brings new insights into the general theory of kernel methods for approximate inference and RL regret minimization.