Deep Ordinal Reinforcement Learning
Reinforcement learning usually makes use of numerical rewards, which have
nice properties but also come with drawbacks and difficulties. Using rewards on
an ordinal scale (ordinal rewards) is an alternative to numerical rewards that
has received more attention in recent years. In this paper, a general approach
to adapting reinforcement learning problems to the use of ordinal rewards is
presented and motivated. We show how to convert common reinforcement learning
algorithms to an ordinal variation by the example of Q-learning and introduce
Ordinal Deep Q-Networks, which adapt deep reinforcement learning to ordinal
rewards. Additionally, we run evaluations on problems provided by the OpenAI
Gym framework, showing that our ordinal variants exhibit a performance that is
comparable to the numerical variations for a number of problems. We also give
first evidence that our ordinal variant is able to produce better results for
problems with less engineered and simpler-to-design reward signals.
Comment: replaced figures for better visibility, added GitHub repository, more
details about source of experimental results, updated target value
calculation for standard and ordinal Deep Q-Networ
Crawling in Rogue's dungeons with (partitioned) A3C
Rogue is a famous dungeon-crawling video game of the 1980s, the ancestor of
its genre. Rogue-like games are known for the necessity to explore partially
observable and always different randomly-generated labyrinths, preventing any
form of level replay. As such, they serve as a very natural and challenging
task for reinforcement learning, requiring the acquisition of complex,
non-reactive behaviors involving memory and planning. In this article we show
how, exploiting a version of A3C partitioned on different situations, the agent
is able to reach the stairs and descend to the next level in 98% of cases.
Comment: Accepted at the Fourth International Conference on Machine Learning,
Optimization, and Data Science (LOD 2018
The Dreaming Variational Autoencoder for Reinforcement Learning Environments
Reinforcement learning has shown great potential in generalizing over raw
sensory data using only a single neural network for value optimization. There
are several challenges in the current state-of-the-art reinforcement learning
algorithms that prevent them from converging towards the global optimum. It is
likely that the solution to these problems lies in short- and long-term
planning, exploration and memory management for reinforcement learning
algorithms. Games are often used to benchmark reinforcement learning algorithms
as they provide a flexible, reproducible, and easy-to-control environment.
Regardless, few games feature a state-space where results in exploration,
memory, and planning are easily perceived. This paper presents The Dreaming
Variational Autoencoder (DVAE), a neural network based generative modeling
architecture for exploration in environments with sparse feedback. We further
present Deep Maze, a novel and flexible maze engine that challenges DVAE in
partially and fully observable state-spaces, long-horizon tasks, and
deterministic and stochastic problems. We show initial findings and encourage
further work in reinforcement learning driven by generative exploration.
Comment: Best Student Paper Award, Proceedings of the 38th SGAI International
Conference on Artificial Intelligence, Cambridge, UK, 2018, Artificial
Intelligence XXXV, 201
Systematic review and meta-analysis: small intestinal bacterial overgrowth in chronic pancreatitis
BACKGROUND:
Evidence on small intestinal bacterial overgrowth (SIBO) in patients with chronic pancreatitis (CP) is conflicting.
AIM:
The purpose of this study was to perform a systematic review and meta-analysis on the prevalence of SIBO in CP and to examine the relationship of SIBO with symptoms and nutritional status.
METHODS:
Case-control and cross-sectional studies investigating SIBO in CP patients were analysed. The prevalence of positive tests was pooled across studies, and the rate of positivity between CP cases and controls was calculated.
RESULTS:
In nine studies containing 336 CP patients, the pooled prevalence of SIBO was 36% (95% confidence interval (CI) 17-60%) with considerable heterogeneity (I² = 91%). A sensitivity analysis excluding studies employing the lactulose breath test gave a pooled prevalence of 21.7% (95% CI 12.7-34.5%) with lower heterogeneity (I² = 56%). The odds ratio for a positive test in CP vs controls was 4.1 (95% CI 1.6-10.4) (I² = 59.7%). The relationship between symptoms and SIBO in CP patients varied across studies, and the treatment of SIBO was associated with clinical improvement.
CONCLUSIONS:
One-third of CP patients have SIBO, with a significantly increased risk over controls, although results are heterogeneous, and studies carry several limitations. The impact of SIBO and its treatment in CP patients deserves further investigation.
Visualising gas heating from an RF plasma loudspeaker
In an electro-acoustic transduction mechanism, an ac modulation (here in the audio frequency range) of the electric field in an atmospheric pressure air plasma gives rise to a rapid increase in the gas temperature and dimensions of the gas volume. As in natural lightning, the rapid expansion in the ionised column through the air produces external pressure variations at the modulation frequency.
Spatial and temporal measurement of the gas temperature can identify the nature of the thermal expansion and provide a direct approach to understanding its relationship to the sound pressure wave that is generated. However, the established method through spectroscopic measurement of rotational line emission from nitrogen molecules is limited to the main current channel, where relaxation and subsequent optical emission of the excited nitrogen molecules occurs. The wider picture is revealed through the use of the Schlieren method, where the refractive index gradients caused by gas heating in the plasma are imaged.