Deep Ordinal Reinforcement Learning
Reinforcement learning usually makes use of numerical rewards, which have
nice properties but also come with drawbacks and difficulties. Using rewards on
an ordinal scale (ordinal rewards) is an alternative to numerical rewards that
has received more attention in recent years. In this paper, a general approach
to adapting reinforcement learning problems to the use of ordinal rewards is
presented and motivated. We show how to convert common reinforcement learning
algorithms to an ordinal variation by the example of Q-learning and introduce
Ordinal Deep Q-Networks, which adapt deep reinforcement learning to ordinal
rewards. Additionally, we run evaluations on problems provided by the OpenAI
Gym framework, showing that our ordinal variants exhibit a performance that is
comparable to the numerical variations for a number of problems. We also give
first evidence that our ordinal variant is able to produce better results for
problems with less engineered and simpler-to-design reward signals.
Comment: replaced figures for better visibility, added github repository, more
details about source of experimental results, updated target value
calculation for standard and ordinal Deep Q-Networ
KQQKQQ and the Kasparov-World Game
The 1999 Kasparov-World game enabled anyone, for the first time, to join a team playing against a World Chess Champion via the web. It included a surprise in the opening, complex middle-game strategy and a deep ending. As the game headed for its mysterious finale, the World Team requested a KQQKQQ endgame table and was provided with two by the authors. This paper describes their work, compares the methods used, examines the issues raised and summarises the concepts involved for the benefit of future workers in the endgame field. It also notes the contribution of this endgame to chess itself.
Vehicle test report: Jet Industries Electra Van 600
The Electra Van 600, an electric vehicle, was tested to characterize its parameters and to provide baseline data for comparison with improved batteries that are to be incorporated into the vehicle. The tests concentrated on the electrical drive subsystem (the batteries, controller, and motor), on coastdowns to characterize the road load, and on range evaluation under cyclic and constant-speed conditions; performance was also evaluated qualitatively. The range performance of the Electra Van 600 was found to be approximately equal to that of the majority of the vehicles tested previously.
Vehicle test report: Electric Vehicle Associates electric conversion of an AMC Pacer
Tests were performed to characterize certain parameters of the EVA Pacer and to provide baseline data that can be used for the comparison of improved batteries that may be incorporated into the vehicle at a later time. The vehicle tests were concentrated on the electrical drive subsystem, i.e., the batteries, controller and motor. The tests included coastdowns to characterize the road load, and range evaluations for both cyclic and constant-speed conditions. A qualitative evaluation of the vehicle's performance was made by comparing its constant-speed range performance with that of other electric and hybrid vehicles. The Pacer's performance was approximately equal to that of the majority of the vehicles assessed in 1977.
Vehicle test report: Electric Vehicle Associates electric conversion of an AMC Pacer
The Change of Pace, an electric vehicle, was tested. These tests were performed to characterize certain parameters of the electric Pacer and to provide baseline data that can be used for the comparison of improved batteries that may be incorporated into the vehicle at a later time. The vehicle tests were concentrated on the electrical drive subsystem: the batteries, controller and motor. Coastdowns to characterize the road load, and range evaluations for both cyclic and constant-speed conditions, were performed. The vehicle's performance was evaluated by comparing its constant-speed range performance with that of other vehicles. The Pacer's performance was found to be approximately equal to that of the majority of the vehicles tested in the 1977 assessment.
Definition of "banner clouds" based on time lapse movies
Banner clouds appear on the leeward side of a mountain and resemble a banner or a flag. This article provides a comprehensive definition of "banner clouds". It is based primarily on an extensive collection of time lapse movies, but previous attempts at an explanation of this phenomenon are also taken into account. The following ingredients are considered essential: the cloud must be attached to the mountain but not appear on the windward side; the cloud must originate from condensation of water vapour contained in the air (rather than consist of blowing snow); the cloud must be persistent; and the cloud must not be of convective nature. The definition is illustrated and discussed with the help of still images and time lapse movies taken at Mount Zugspitze in the Bavarian Alps.
Possible experimental signature of octupole correlations in the $0^{+}$ states of the actinides
$J^{\pi} = 0^{+}$ states have been investigated in the actinide nucleus
$^{240}$Pu up to an excitation energy of 3 MeV with a high-resolution (p,t)
experiment at $E_p = 24$ MeV. To test the recently proposed $J^{\pi} = 0^{+}$
double-octupole structure, the phenomenological approach of the
spdf-interacting boson model has been chosen. In addition, the total $0^{+}$
strength distribution and the $0^{+}$ strength fragmentation have been compared
to the model predictions as well as to the previously studied (p,t) reactions
in the actinides. The results suggest that the structure of the $0^{+}$ states
in the actinides might be more complex than the usually discussed pairing
isomers. Instead, the octupole degree of freedom might contribute
significantly. The signature of two close-lying $0^{+}$ states below the
2-quasiparticle energy is presented as a possible manifestation of strong
octupole correlations in the structure of the $0^{+}$ states in the actinides.
Comment: 6 pages, 5 figures, published in Phys. Rev. C 88, 041303(R) (2013)