The Dreaming Variational Autoencoder for Reinforcement Learning Environments
Reinforcement learning has shown great potential in generalizing over raw
sensory data using only a single neural network for value optimization. There
are several challenges in the current state-of-the-art reinforcement learning algorithms that prevent them from converging towards the global optimum. It is likely that the solution to these problems lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms as they provide a flexible, reproducible, and easy-to-control environment. However, few games feature a state-space in which results in exploration, memory, and planning are easily perceived. This paper presents The Dreaming
Variational Autoencoder (DVAE), a neural network based generative modeling
architecture for exploration in environments with sparse feedback. We further
present Deep Maze, a novel and flexible maze engine that challenges DVAE in
partial and fully-observable state-spaces, long-horizon tasks, and
deterministic and stochastic problems. We show initial findings and encourage
further work in reinforcement learning driven by generative exploration.
Comment: Best Student Paper Award, Proceedings of the 38th SGAI International Conference on Artificial Intelligence, Cambridge, UK, 2018, Artificial Intelligence XXXV, 2018
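As a rough illustration of the idea sketched in this abstract, the following is a minimal, hedged PyTorch sketch of a VAE that can "dream" states by decoding samples from its latent prior, assuming flattened observations scaled to [0, 1]. The class name, layer sizes, and loss weighting are illustrative assumptions and do not reproduce the paper's DVAE architecture.

```python
import torch
import torch.nn as nn

class DreamingVAE(nn.Module):
    """Toy VAE over flattened state observations in [0, 1] (illustrative only)."""

    def __init__(self, state_dim, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, state_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

    @torch.no_grad()
    def dream(self, n_states):
        # Decode latent samples from the prior into imagined ("dreamed") states
        # that an agent could use for exploration when real feedback is sparse.
        z = torch.randn(n_states, self.mu.out_features)
        return self.decoder(z)

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior.
    recon_loss = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl
```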
Crawling in Rogue's dungeons with (partitioned) A3C
Rogue is a famous dungeon-crawling video game of the 1980s, the ancestor of its genre. Rogue-like games are known for requiring the exploration of partially observable, always different, randomly generated labyrinths, which prevents any form of level replay. As such, they serve as a very natural and challenging task for reinforcement learning, requiring the acquisition of complex, non-reactive behaviors involving memory and planning. In this article we show how, by exploiting a version of A3C partitioned across different situations, the agent is able to reach the stairs and descend to the next level in 98% of cases.
Comment: Accepted at the Fourth International Conference on Machine Learning, Optimization, and Data Science (LOD 2018)
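A hedged sketch of what a situation-partitioned actor-critic could look like: one small model per situation and a dispatcher that routes the current observation to it. The `situation()` criterion, network sizes, and two-way partition below are placeholder assumptions; the paper's actual partitioning and A3C training loop are not reproduced.

```python
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    """Tiny actor-critic head (illustrative sizes)."""

    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.policy = nn.Linear(64, n_actions)
        self.value = nn.Linear(64, 1)

    def forward(self, obs):
        h = self.body(obs)
        return torch.softmax(self.policy(h), dim=-1), self.value(h)

def situation(obs):
    # Placeholder partition criterion (assumption): e.g. whether the stairs
    # are currently visible in the observation.
    return "stairs_visible" if obs[-1] > 0 else "exploring"

# One independent model per situation; each would be trained with A3C on the
# transitions collected while that situation is active.
models = {s: ActorCritic(obs_dim=16, n_actions=4)
          for s in ("exploring", "stairs_visible")}

def act(obs):
    probs, _ = models[situation(obs)](torch.as_tensor(obs, dtype=torch.float32))
    return torch.multinomial(probs, 1).item()

print(act([0.0] * 15 + [1.0]))  # routes to the "stairs_visible" policy
```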
Unsupervised, Efficient and Semantic Expertise Retrieval
We introduce an unsupervised discriminative model for the task of retrieving
experts in online document collections. We exclusively employ textual evidence
and avoid explicit feature engineering by learning distributed word
representations in an unsupervised way. We compare our model to
state-of-the-art unsupervised statistical vector space and probabilistic
generative approaches. Our proposed log-linear model achieves the retrieval
performance levels of state-of-the-art document-centric methods with the low
inference cost of so-called profile-centric approaches. It yields a
statistically significant improved ranking over vector space and generative
models in most cases, matching the performance of supervised methods on various
benchmarks. That is, by using solely text we can do as well as methods that
work with external evidence and/or relevance feedback. A contrastive analysis
of rankings produced by discriminative and generative approaches shows that
they have complementary strengths due to the ability of the unsupervised
discriminative model to perform semantic matching.
Comment: WWW2016, Proceedings of the 25th International Conference on World Wide Web, 2016
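To illustrate the flavor of log-linear scoring over distributed representations, here is a small, hedged sketch that ranks candidate experts by summing log-softmax word-expert affinities, assuming word and expert embeddings have already been learned in an unsupervised way. Variable names and the toy data are illustrative, not the paper's model.

```python
import numpy as np

def score_experts(query_words, word_vecs, expert_vecs):
    """Log-linear relevance scores: sum over query words of log p(expert | word),
    where p is a softmax over expert-word dot products."""
    scores = np.zeros(len(expert_vecs))
    for w in query_words:
        if w not in word_vecs:
            continue                                    # ignore out-of-vocabulary words
        logits = expert_vecs @ word_vecs[w]             # one logit per expert
        scores += logits - np.logaddexp.reduce(logits)  # add log-softmax
    return scores

# Toy usage: 3 candidate experts, 5-dimensional embeddings.
rng = np.random.default_rng(0)
word_vecs = {w: rng.normal(size=5) for w in ["neural", "retrieval"]}
expert_vecs = rng.normal(size=(3, 5))
print(score_experts(["neural", "retrieval"], word_vecs, expert_vecs).argsort()[::-1])
```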
Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning
Swarm systems constitute a challenging problem for reinforcement learning
(RL) as the algorithm needs to learn decentralized control policies that can
cope with limited local sensing and communication abilities of the agents.
While it is often difficult to directly define the behavior of the agents,
simple communication protocols can be defined more easily using prior knowledge
about the given task. In this paper, we propose a number of simple
communication protocols that can be exploited by deep reinforcement learning to
find decentralized control policies in a multi-robot swarm environment. The
protocols are based on histograms that encode the local neighborhood relations
of the agents and can also transmit task-specific information, such as the
shortest distance and direction to a desired target. In our framework, we use
an adaptation of Trust Region Policy Optimization to learn complex
collaborative tasks, such as formation building and building a communication
link. We evaluate our findings in a simulated 2D-physics environment, and
compare the implications of different communication protocols.
Comment: 13 pages, 4 figures, version 2, accepted at ANTS 2018
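A hedged sketch of one way a histogram-based local observation could be built: bin the neighbors' relative bearings into sectors and append task-specific features such as the distance and direction to a target. The bin count and feature choices are assumptions, not the exact protocols studied in the paper.

```python
import numpy as np

def histogram_observation(own_pos, own_heading, neighbor_pos, target_pos, n_bins=8):
    """Local observation: histogram of neighbor bearings plus target features."""
    rel = neighbor_pos - own_pos
    bearings = np.arctan2(rel[:, 1], rel[:, 0]) - own_heading
    bearings = np.mod(bearings + np.pi, 2 * np.pi) - np.pi      # wrap to [-pi, pi)
    hist, _ = np.histogram(bearings, bins=n_bins, range=(-np.pi, np.pi))
    to_target = target_pos - own_pos
    target_dir = np.arctan2(to_target[1], to_target[0]) - own_heading
    return np.concatenate([hist / max(len(neighbor_pos), 1),    # normalized sector counts
                           [np.linalg.norm(to_target),           # distance to target
                            np.cos(target_dir), np.sin(target_dir)]])

# Toy usage: an agent at the origin with three neighbors and one target.
obs = histogram_observation(np.zeros(2), 0.0,
                            np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]),
                            np.array([5.0, 5.0]))
print(obs.shape)  # (11,) = 8 histogram bins + 3 target features
```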
Reinforced Axial Refinement Network for Monocular 3D Object Detection
Monocular 3D object detection aims to extract the 3D position and properties
of objects from a 2D input image. This is an ill-posed problem with a major
difficulty lying in the information loss by depth-agnostic cameras.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in 3D space. To improve the efficiency of sampling, we propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step. This requires designing a policy which gets a
reward after several steps, and thus we adopt reinforcement learning to
optimize it. The proposed framework, Reinforced Axial Refinement Network
(RAR-Net), serves as a post-processing stage which can be freely integrated
into existing monocular 3D detection methods, and improve the performance on
the KITTI dataset with small extra computational costs.
Comment: Accepted by ECCV 2020
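The refinement idea can be illustrated with a short, hedged sketch: starting from an initial 3D box, a policy repeatedly picks one parameter and a signed step, moving the estimate toward the ground truth. The parameterization, discrete action set, and step size below are assumptions, and a random policy stands in for the learned one.

```python
import numpy as np

PARAMS = ["x", "y", "z", "w", "h", "l", "yaw"]    # one possible 3D box parameterization
STEP = 0.1                                         # illustrative step size

def apply_action(box, action):
    """Action = (parameter index, direction in {-1, +1}); adjust one value only."""
    idx, direction = action
    new_box = box.copy()
    new_box[idx] += direction * STEP
    return new_box

def refine(box, policy, n_steps=20):
    # In the RL setting the policy is rewarded only after several steps
    # (e.g. by the improvement in 3D IoU with the ground truth).
    for _ in range(n_steps):
        box = apply_action(box, policy(box))
    return box

# Toy usage: a random policy stands in for the learned one.
rng = np.random.default_rng(0)
random_policy = lambda box: (rng.integers(len(PARAMS)), rng.choice([-1, 1]))
print(refine(np.zeros(len(PARAMS)), random_policy))
```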
Explainable AI: The new 42?
Explainable AI is not a new field. Since at least the early exploitation of C.S. Peirce's abductive reasoning in expert systems of the 1980s, there have been reasoning architectures to support an explanation function for complex AI systems, including applications in medical diagnosis, complex multi-component design, and reasoning about the real world. So explainability is at least as old as early AI, and a natural consequence of the design of AI systems. While early expert systems consisted of handcrafted knowledge bases that enabled reasoning over narrowly well-defined domains (e.g., INTERNIST, MYCIN), such systems had no learning capabilities and had only primitive uncertainty handling. But the evolution of formal reasoning architectures to incorporate principled probabilistic reasoning helped address the capture and use of uncertain knowledge.
The recent and relatively rapid success of AI/machine learning solutions arises from neural network architectures. A new generation of neural methods now scales to exploit the practical applicability of statistical and algebraic learning approaches in arbitrarily high-dimensional spaces. But despite their huge successes, largely in problems which can be cast as classification problems, their effectiveness is still limited by their un-debuggability and their inability to “explain” their decisions in a human-understandable and reconstructable way. So while AlphaGo or DeepStack can crush the best humans at Go or Poker, neither program has any internal model of its task; their representations defy interpretation by humans, there is no mechanism to explain their actions and behaviour, and furthermore, there is no obvious instructional value: the high-performance systems cannot help humans improve. Even when we understand the underlying mathematical scaffolding of current machine learning architectures, it is often impossible to get insight into the internal working of the models; we need explicit modeling and reasoning tools to explain how and why a result was achieved. We also know that a significant challenge for future AI is contextual adaptation, i.e., systems that incrementally help to construct explanatory models for solving real-world problems. Here it would be beneficial not to exclude human expertise, but to augment human intelligence with artificial intelligence.
Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC
Despite having various attractive qualities such as high prediction accuracy
and the ability to quantify uncertainty and avoid over-fitting, Bayesian Matrix
Factorization has not been widely adopted because of the prohibitive cost of
inference. In this paper, we propose a scalable distributed Bayesian matrix
factorization algorithm using stochastic gradient MCMC. Our algorithm, based on
Distributed Stochastic Gradient Langevin Dynamics, can not only match the
prediction accuracy of standard MCMC methods like Gibbs sampling, but at the
same time is as fast and simple as stochastic gradient descent. In our
experiments, we show that our algorithm can achieve the same level of
prediction accuracy as Gibbs sampling an order of magnitude faster. We also
show that our method reduces the prediction error as fast as distributed
stochastic gradient descent, achieving a 4.1% improvement in RMSE for the Netflix dataset and a 1.8% improvement for the Yahoo Music dataset.
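A minimal, single-machine sketch of a stochastic gradient Langevin dynamics (SGLD) update for Bayesian matrix factorization with Gaussian likelihood and priors is given below; the distributed block scheduling described in the paper is omitted, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def sgld_step(U, V, batch, eps, n_total, tau=2.0, lam=0.1):
    """One SGLD update of the user/item factors on a minibatch of ratings.

    batch: array of (user, item, rating) rows; n_total: total number of ratings,
    used to rescale the minibatch gradient into an unbiased full-data estimate.
    """
    scale = n_total / len(batch)
    grad_U, grad_V = -lam * U, -lam * V                 # gradient of the Gaussian log prior
    for u, i, r in batch:
        u, i = int(u), int(i)
        err = r - U[u] @ V[i]
        grad_U[u] += scale * tau * err * V[i]           # gradient of the Gaussian log likelihood
        grad_V[i] += scale * tau * err * U[u]
    # Langevin dynamics: half-step along the gradient plus Gaussian noise of variance eps.
    U += 0.5 * eps * grad_U + np.random.normal(0.0, np.sqrt(eps), U.shape)
    V += 0.5 * eps * grad_V + np.random.normal(0.0, np.sqrt(eps), V.shape)
    return U, V

# Toy usage: 50 users x 40 items, rank-5 factors, random ratings.
rng = np.random.default_rng(0)
U, V = 0.1 * rng.normal(size=(50, 5)), 0.1 * rng.normal(size=(40, 5))
ratings = np.column_stack([rng.integers(50, size=1000),
                           rng.integers(40, size=1000),
                           rng.normal(3.0, 1.0, size=1000)])
for _ in range(100):
    batch = ratings[rng.choice(len(ratings), size=64, replace=False)]
    U, V = sgld_step(U, V, batch, eps=1e-4, n_total=len(ratings))
```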
Skin Lesion Analyser: An Efficient Seven-Way Multi-Class Skin Cancer Classification Using MobileNet
Skin cancer, a major form of cancer, is a critical public health problem with
123,000 newly diagnosed melanoma cases and between 2 and 3 million non-melanoma
cases worldwide each year. The leading cause of skin cancer is high exposure of
skin cells to UV radiation, which can damage the DNA inside skin cells leading
to uncontrolled growth of skin cells. Skin cancer is primarily diagnosed
visually employing clinical screening, a biopsy, dermoscopic analysis, and
histopathological examination. It has been demonstrated that dermoscopic analysis in the hands of inexperienced dermatologists may actually reduce diagnostic accuracy. Early detection and screening of skin cancer have the potential to reduce mortality and morbidity. Previous studies have shown Deep Learning's ability to perform better than human experts in several visual recognition tasks. In this paper, we propose an efficient seven-way automated multi-class skin cancer classification system with performance comparable to that of expert dermatologists. We used a pretrained MobileNet model trained over the HAM10000 dataset using transfer learning. The model classifies skin lesion images with a categorical accuracy of 83.1 percent, a top-2 accuracy of 91.36 percent, and a top-3 accuracy of 95.34 percent. The weighted averages of precision, recall, and F1-score were found to be 0.89, 0.83, and 0.83, respectively. The
model has been deployed as a web application for public use at
(https://saketchaturvedi.github.io). This fast, expansible method holds the
potential for substantial clinical impact, including broadening the scope of
primary care practice and augmenting clinical decision-making for dermatology
specialists.
Comment: This is a pre-copyedited version of a contribution by Chaturvedi S.S., Gupta K., Prasad P.S., published in Advances in Intelligent Systems and Computing, Hassanien A., Bhatnagar R., Darwish A. (eds). The definitive authenticated version is available online via https://doi.org/10.1007/978-981-15-3383-9_1
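For orientation, the following is a hedged tf.keras sketch of seven-way transfer learning with a pretrained MobileNet, assuming 224x224 RGB inputs and one-hot labels for the seven HAM10000 classes; the classification head, optimizer, and training schedule are illustrative and not necessarily the paper's exact setup.

```python
import tensorflow as tf

# Frozen ImageNet MobileNet backbone with a small seven-class softmax head.
base = tf.keras.applications.MobileNet(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Dense(7, activation="softmax"),   # 7 HAM10000 lesion classes
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss="categorical_crossentropy",
    metrics=["accuracy",
             tf.keras.metrics.TopKCategoricalAccuracy(k=2, name="top2_accuracy"),
             tf.keras.metrics.TopKCategoricalAccuracy(k=3, name="top3_accuracy")])

# model.fit(train_ds, validation_data=val_ds, epochs=10)   # datasets assumed to exist
```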