Deep Ordinal Reinforcement Learning
Reinforcement learning usually makes use of numerical rewards, which have
nice properties but also come with drawbacks and difficulties. Using rewards on
an ordinal scale (ordinal rewards) is an alternative to numerical rewards that
has received more attention in recent years. In this paper, a general approach
to adapting reinforcement learning problems to the use of ordinal rewards is
presented and motivated. We show how to convert common reinforcement learning
algorithms to an ordinal variation by the example of Q-learning and introduce
Ordinal Deep Q-Networks, which adapt deep reinforcement learning to ordinal
rewards. Additionally, we run evaluations on problems provided by the OpenAI
Gym framework, showing that our ordinal variants exhibit a performance that is
comparable to the numerical variations for a number of problems. We also give
first evidence that our ordinal variant is able to produce better results for
problems with less engineered and simpler-to-design reward signals.
Comment: replaced figures for better visibility, added GitHub repository, more
details about source of experimental results, updated target value
calculation for standard and ordinal Deep Q-Networks
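The abstract above does not say how actions are compared when rewards are only ordinal. A minimal sketch of one possible approach, assuming each action keeps a histogram of observed reward ranks and that actions are scored by their probability of yielding a higher rank than a competitor (ties counted as one half). The function names and this scoring rule are illustrative assumptions, not necessarily the paper's method.

```python
def normalize(counts):
    """Turn a histogram of ordinal reward ranks into a probability distribution."""
    total = sum(counts)
    if total == 0:
        # No observations yet: assume a uniform distribution over ranks.
        return [1.0 / len(counts)] * len(counts)
    return [c / total for c in counts]


def superiority(dist_a, dist_b):
    """P(rank_a > rank_b) + 0.5 * P(rank_a == rank_b) for independent draws."""
    score = 0.0
    for i, p_a in enumerate(dist_a):
        for j, p_b in enumerate(dist_b):
            if i > j:
                score += p_a * p_b
            elif i == j:
                score += 0.5 * p_a * p_b
    return score


def best_action(rank_counts_per_action):
    """Pick the action with the highest mean superiority over all other actions."""
    dists = [normalize(c) for c in rank_counts_per_action]
    scores = []
    for a, dist_a in enumerate(dists):
        others = [superiority(dist_a, dist_b)
                  for b, dist_b in enumerate(dists) if b != a]
        # With a single action there is nothing to compare against.
        scores.append(sum(others) / len(others) if others else 0.5)
    return max(range(len(scores)), key=scores.__getitem__)
```

Only the order of the ranks enters the comparison, so no numerical reward values need to be designed. For example, `best_action([[5, 0, 0], [0, 0, 5], [0, 5, 0]])` selects the second action, whose observations all fall in the highest rank.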
Lifshitz transitions and quasiparticle de-renormalization in YbRh2Si2
We study the effect of magnetic fields up to 15 T on the heavy fermion state
of YbRh2Si2 via Hall effect and magnetoresistance measurements down to 50
mK. Our data show anomalies at three different characteristic fields. We
compare our data to renormalized band structure calculations through which we
identify Lifshitz transitions associated with the heavy fermion bands. The Hall
measurements indicate that the de-renormalization of the quasiparticles, i.e.,
the destruction of the local Kondo singlets, occurs smoothly while the
Lifshitz transitions occur within rather confined regions of the magnetic
field.
Comment: 7 pages, 5 figures
Interaction-induced chiral p_x \pm i p_y superfluid order of bosons in an optical lattice
The study of superconductivity with unconventional order is complicated in
condensed matter systems by their extensive complexity. Optical lattices with
their exceptional precision and control allow one to emulate superfluidity
avoiding many of the complications of condensed matter. A promising approach to
realize unconventional superfluid order is to employ orbital degrees of freedom
in higher Bloch bands. In recent work, indications were found that bosons
condensed in the second band of an optical chequerboard lattice might exhibit
p_x \pm i p_y order. Here we present experiments that provide strong evidence
for the emergence of p_x \pm i p_y order driven by the interaction in the local
p-orbitals. We compare our observations with a multi-band Hubbard model and
find excellent quantitative agreement.
3-5-man chess: Maximals and mzugs
This article reports the combined results of several initiatives in creating and surveying complete suites of endgame tables (EGTs) to the Depth to Mate (DTM) and Depth to Conversion (DTC) metrics. Data on percentage results, maximals and mutual zugzwangs, mzugs, has been filed and made available on the web, as have the DTM EGTs.
Model-free preference-based reinforcement learning
Specifying a numeric reward function for reinforcement learning typically requires a lot of hand-tuning from a human expert. In contrast, preference-based reinforcement learning (PBRL) utilizes only pairwise comparisons between trajectories as a feedback signal, which are often more intuitive to specify. Currently available approaches to PBRL for control problems with continuous state/action spaces require a known or estimated model, which is often not available and hard to learn. In this paper, we integrate preference-based estimation of the reward function into a model-free reinforcement learning (RL) algorithm, resulting in a model-free PBRL algorithm. Our new algorithm is based on Relative Entropy Policy Search (REPS), enabling us to utilize stochastic policies and to directly control the greediness of the policy update. REPS decreases exploration of the policy slowly by limiting the relative entropy of the policy update, which ensures that the algorithm is provided with a versatile set of trajectories, and consequently with informative preferences. The preference-based estimation is computed using a sample-based Bayesian method, which can also estimate the uncertainty of the utility. We also compare to a linearly solvable approximation based on inverse RL. We show that both approaches compare favourably to the current state of the art. The overall result is an algorithm that can learn non-parametric continuous action policies from a small number of preferences.
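To make the idea of pairwise trajectory feedback concrete, here is a minimal sketch of a logistic preference likelihood over a linear trajectory utility. It is a commonly used stand-in and is not the sample-based Bayesian method described in the abstract. The feature vectors, weights, and function names are hypothetical.

```python
import math


def trajectory_utility(weights, features):
    """Linear utility of a trajectory, given its (assumed) feature vector."""
    return sum(w * f for w, f in zip(weights, features))


def preference_probability(weights, features_a, features_b):
    """Probability that trajectory A is preferred over B under a logistic model."""
    diff = (trajectory_utility(weights, features_a)
            - trajectory_utility(weights, features_b))
    # Numerically stable sigmoid of the utility difference.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def log_likelihood(weights, preferences):
    """Log-likelihood of observed preferences.

    `preferences` is a list of (preferred_features, other_features) pairs.
    """
    return sum(math.log(preference_probability(weights, preferred, other))
               for preferred, other in preferences)
```

Under these assumptions, weights that rank the preferred trajectory higher in each labelled pair receive a higher log-likelihood, so a utility estimate can be fitted from comparisons alone without a hand-tuned numeric reward.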
Pressure-induced phase transitions and high-pressure tetragonal phase of Fe1.08Te
We report the effects of hydrostatic pressure on the temperature-induced
phase transitions in Fe1.08Te in the pressure range 0-3 GPa using synchrotron
powder x-ray diffraction (XRD). The results reveal a plethora of phase
transitions. At ambient pressure, Fe1.08Te undergoes simultaneous first-order
structural symmetry-breaking and magnetic phase transitions, namely from the
paramagnetic tetragonal (P4/nmm) to the antiferromagnetic monoclinic (P2_1/m)
phase. We show that, at a pressure of 1.33 GPa, the low temperature structure
adopts an orthorhombic symmetry. More importantly, for pressures of 2.29 GPa
and higher, a symmetry-conserving tetragonal-tetragonal phase transition has
been identified from a change in the c/a ratio of the lattice parameters. The
succession of different pressure and temperature-induced structural and
magnetic phases indicates the presence of strong magneto-elastic coupling
effects in this material.
Comment: 11 pages
On the Nature of the Strong Emission-Line Galaxies in Cluster Cl 0024+1654: Are Some the Progenitors of Low Mass Spheroidals?
We present new size, line ratio, and velocity width measurements for six
strong emission-line galaxies in the galaxy cluster, Cl 0024+1654, at redshift
z~0.4. The velocity widths from Keck spectra are all narrow (30<sigma<120
km/s), with three profiles showing double peaks. Four galaxies have low masses
(M < 10^{10} M_\odot). Whereas three galaxies were previously reported to be possible
AGNs, none exhibit AGN-like emission line ratios or velocity widths. Two or
three appear as very blue spirals with the remainder more akin to luminous H-II
galaxies undergoing a strong burst of star formation. We propose that after the
burst subsides, these galaxies will transform into quiescent dwarfs, and are
thus progenitors of some cluster spheroidals (We adopt the nomenclature
suggested by Kormendy & Bender (1994), i.e., low-density, dwarf ellipsoidal
galaxies like NGC 205 are called `spheroidals' instead of `dwarf ellipticals')
seen today.
Comment: 14 pages + 2 figures + 1 table, LaTeX, Acc. for publ. in ApJL; also
available at http://www.ucolick.org/~deep/papers/papers.htm
A metasynthesis of studies of patients’ experience of living with terminal cancer
Objective: The aim of this research was to produce a synthesis of phenomenological studies of the experience of living with the awareness of having terminal cancer in order to gain a more complete understanding of the parameters of this experience.
Methods: This research used metasynthesis as a method for integrating the results of 23 phenomenological studies of the experience of living with the awareness of having terminal cancer published between 2011 and 2016.
Results: The metasynthesis generated 19 theme clusters which informed the construction of four master themes: trauma, liminality, holding on to life and life as a cancer patient. Each master theme captures a distinct experiential dimension of living with the awareness of having terminal cancer. Each dimension brings with it significant and distinctive psychological challenges.
Conclusion: The results from the present metasynthesis suggest that the experience of living with the awareness of having terminal cancer is a multi-dimensional experience which patients actively negotiate as they search for ways in which they can rise to the psychological challenges associated with it. A better understanding of the parameters of this experience can help health care professionals provide appropriate support for this client group.
A Search for X-Ray Bright Distant Clusters of Galaxies
We present the results of a search for X-ray luminous distant clusters of
galaxies. We found extended X-ray emission characteristic of a cluster towards
two of our candidate clusters of galaxies. They both have a luminosity in the
ROSAT bandpass of and a redshift of ;
thus making them two of the most distant X-ray clusters ever observed.
Furthermore, we show that both clusters are optically rich and have a known
radio source associated with them. We compare our result with other recent
searches for distant X-ray luminous clusters and present a lower limit of
for the number density of such high redshift
clusters. This limit is consistent with the expected abundance of such clusters
in a standard (b=2) Cold Dark Matter Universe. Finally, our clusters provide
important high redshift targets for further study into the origin and evolution
of massive clusters of galaxies. Accepted for publication in the 10th September
1994 issue of ApJ.
Comment: 20 pages LaTeX file + 1 PostScript figure file appended
