9,125 research outputs found

    Deep Ordinal Reinforcement Learning

    Reinforcement learning usually makes use of numerical rewards, which have nice properties but also come with drawbacks and difficulties. Using rewards on an ordinal scale (ordinal rewards) is an alternative to numerical rewards that has received more attention in recent years. In this paper, a general approach to adapting reinforcement learning problems to the use of ordinal rewards is presented and motivated. We show how to convert common reinforcement learning algorithms to an ordinal variation by the example of Q-learning and introduce Ordinal Deep Q-Networks, which adapt deep reinforcement learning to ordinal rewards. Additionally, we run evaluations on problems provided by the OpenAI Gym framework, showing that our ordinal variants exhibit a performance that is comparable to the numerical variations for a number of problems. We also give first evidence that our ordinal variant is able to produce better results for problems with less engineered and simpler-to-design reward signals. Comment: replaced figures for better visibility, added github repository, more details about source of experimental results, updated target value calculation for standard and ordinal Deep Q-Network

    Lifshitz transitions and quasiparticle de-renormalization in YbRh2Si2

    We study the effect of magnetic fields up to 15 T on the heavy fermion state of YbRh2Si2 via Hall effect and magnetoresistance measurements down to 50 mK. Our data show anomalies at three different characteristic fields. We compare our data to renormalized band structure calculations through which we identify Lifshitz transitions associated with the heavy fermion bands. The Hall measurements indicate that the de-renormalization of the quasiparticles, i.e., the destruction of the local Kondo singlets, occurs smoothly while the Lifshitz transitions occur within rather confined regions of the magnetic field. Comment: 7 pages, 5 figures

    Interaction-induced chiral p_x \pm i p_y superfluid order of bosons in an optical lattice

    The study of superconductivity with unconventional order is complicated in condensed matter systems by their extensive complexity. Optical lattices with their exceptional precision and control allow one to emulate superfluidity avoiding many of the complications of condensed matter. A promising approach to realize unconventional superfluid order is to employ orbital degrees of freedom in higher Bloch bands. In recent work, indications were found that bosons condensed in the second band of an optical chequerboard lattice might exhibit p_x \pm i p_y order. Here we present experiments that provide strong evidence for the emergence of p_x \pm i p_y order driven by the interaction in the local p-orbitals. We compare our observations with a multi-band Hubbard model and find excellent quantitative agreement.

    Model-free preference-based reinforcement learning

    Specifying a numeric reward function for reinforcement learning typically requires a lot of hand-tuning from a human expert. In contrast, preference-based reinforcement learning (PBRL) utilizes only pairwise comparisons between trajectories as a feedback signal, which are often more intuitive to specify. Currently available approaches to PBRL for control problems with continuous state/action spaces require a known or estimated model, which is often not available and hard to learn. In this paper, we integrate preference-based estimation of the reward function into a model-free reinforcement learning (RL) algorithm, resulting in a model-free PBRL algorithm. Our new algorithm is based on Relative Entropy Policy Search (REPS), enabling us to utilize stochastic policies and to directly control the greediness of the policy update. REPS decreases exploration of the policy slowly by limiting the relative entropy of the policy update, which ensures that the algorithm is provided with a versatile set of trajectories, and consequently with informative preferences. The preference-based estimation is computed using a sample-based Bayesian method, which can also estimate the uncertainty of the utility. We also compare to a linear solvable approximation, based on inverse RL. We show that both approaches compare favourably to the current state of the art. The overall result is an algorithm that can learn non-parametric continuous action policies from a small number of preferences.

    Pressure-induced phase transitions and high-pressure tetragonal phase of Fe1.08Te

    We report the effects of hydrostatic pressure on the temperature-induced phase transitions in Fe1.08Te in the pressure range 0-3 GPa using synchrotron powder x-ray diffraction (XRD). The results reveal a plethora of phase transitions. At ambient pressure, Fe1.08Te undergoes simultaneous first-order structural symmetry-breaking and magnetic phase transitions, namely from the paramagnetic tetragonal (P4/nmm) to the antiferromagnetic monoclinic (P2_1/m) phase. We show that, at a pressure of 1.33 GPa, the low temperature structure adopts an orthorhombic symmetry. More importantly, for pressures of 2.29 GPa and higher, a symmetry-conserving tetragonal-tetragonal phase transition has been identified from a change in the c/a ratio of the lattice parameters. The succession of different pressure and temperature-induced structural and magnetic phases indicates the presence of strong magneto-elastic coupling effects in this material. Comment: 11 pages

    On the Nature of the Strong Emission-Line Galaxies in Cluster Cl 0024+1654: Are Some the Progenitors of Low Mass Spheroidals?

    We present new size, line ratio, and velocity width measurements for six strong emission-line galaxies in the galaxy cluster, Cl 0024+1654, at redshift z~0.4. The velocity widths from Keck spectra are all narrow (30<sigma<120 km/s), with three profiles showing double peaks. Four galaxies have low masses (M<10^{10} Mo). Whereas three galaxies were previously reported to be possible AGNs, none exhibit AGN-like emission line ratios or velocity widths. Two or three appear as very blue spirals with the remainder more akin to luminous H-II galaxies undergoing a strong burst of star formation. We propose that after the burst subsides, these galaxies will transform into quiescent dwarfs, and are thus progenitors of some cluster spheroidals (We adopt the nomenclature suggested by Kormendy & Bender (1994), i.e., low-density, dwarf ellipsoidal galaxies like NGC 205 are called `spheroidals' instead of `dwarf ellipticals') seen today. Comment: 14 pages + 2 figures + 1 table, LaTeX, Acc. for publ. in ApJL also available at http://www.ucolick.org/~deep/papers/papers.htm

    A Search for X-Ray Bright Distant Clusters of Galaxies

    We present the results of a search for X-ray luminous distant clusters of galaxies. We found extended X-ray emission characteristic of a cluster towards two of our candidate clusters of galaxies. They both have a luminosity in the ROSAT bandpass of \simeq 10^{44} erg s^{-1} and a redshift of >0.5, thus making them two of the most distant X-ray clusters ever observed. Furthermore, we show that both clusters are optically rich and have a known radio source associated with them. We compare our result with other recent searches for distant X-ray luminous clusters and present a lower limit of 1.2\times10^{-7} Mpc^{-3} for the number density of such high redshift clusters. This limit is consistent with the expected abundance of such clusters in a standard (b=2) Cold Dark Matter Universe. Finally, our clusters provide important high redshift targets for further study into the origin and evolution of massive clusters of galaxies. Accepted for publication in the 10th September 1994 issue of ApJ. Comment: 20 pages Latex file + 1 postscript figure file appended