Deep Quality-Value (DQV) Learning
We introduce a novel Deep Reinforcement Learning (DRL) algorithm called Deep
Quality-Value (DQV) Learning. DQV uses temporal-difference learning to train a
Value neural network and uses this network for training a second Quality-value
network that learns to estimate state-action values. We first test DQV's update
rules with Multilayer Perceptrons as function approximators on two classic RL
problems, and then extend DQV with the use of Deep Convolutional Neural
Networks, 'Experience Replay' and 'Target Neural Networks' for tackling four
games of the Atari Arcade Learning environment. Our results show that DQV
learns significantly faster and better than Deep Q-Learning and Double Deep
Q-Learning, suggesting that our algorithm may be a better-performing
synchronous temporal-difference algorithm than those currently available in
DRL.
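The update scheme described in this abstract can be illustrated with a minimal tabular sketch. It assumes, following the abstract's description, that a single temporal-difference target built from the state-value estimate is used to update both the V and Q estimates. The function name `dqv_update`, the dictionary-based tables, and the learning rates `alpha` and `beta` are hypothetical choices for illustration, not the authors' implementation, which uses neural networks.

```python
from collections import defaultdict


def dqv_update(V, Q, s, a, r, s_next, done, alpha=0.1, beta=0.1, gamma=0.99):
    """Apply one tabular DQV-style update (illustrative sketch only).

    Assumption: V and Q are both regressed towards the same TD target,
    r + gamma * V(s_next), so Q is trained with the help of V.
    """
    # The target is computed once, before either table is modified.
    target = r if done else r + gamma * V[s_next]
    V[s] += alpha * (target - V[s])            # state-value update
    Q[(s, a)] += beta * (target - Q[(s, a)])   # state-action value update
    return target


# Usage on a single hypothetical transition (s=0, a=1, r=1.0, s_next=1).
V = defaultdict(float)
Q = defaultdict(float)
target = dqv_update(V, Q, s=0, a=1, r=1.0, s_next=1, done=False)
```

In a deep variant, the two tables would be replaced by two networks and the increments by gradient steps on a squared error against the same target.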
Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms
Peer reviewed. This paper takes a step towards characterizing a new family of model-free Deep Reinforcement Learning (DRL) algorithms. These algorithms aim to jointly learn an approximation of the state-value function (V) alongside an approximation of the state-action value function (Q). Our analysis starts with a thorough study of the Deep Quality-Value Learning (DQV) algorithm, a DRL algorithm which has been shown to outperform popular techniques such as Deep-Q-Learning (DQN) and Double-Deep-Q-Learning (DDQN). To investigate why DQV's learning dynamics allow it to perform so well, we formulate a set of research questions that help us characterize this new family of DRL algorithms. Among our results, we present specific cases in which DQV's performance can be harmed and introduce a novel off-policy DRL algorithm, called DQV-Max, which can outperform DQV. We then study the behavior of the V and Q functions learned by DQV and DQV-Max and show that both algorithms might perform so well on several DRL test-beds because they are less prone to the overestimation bias of the Q function.
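The abstract does not state the DQV-Max update rule, so the following is only a sketch under an assumption: the V estimate is regressed towards a target that uses the maximum Q estimate in the next state (which is what would make the method off-policy), while the Q estimate is regressed towards a target that uses the V estimate of the next state. The function name `dqv_max_targets` and its arguments are hypothetical.

```python
def dqv_max_targets(r, gamma, v_next, q_next, done):
    """Return (v_target, q_target) for one transition (illustrative sketch).

    v_next: current estimate of V at the next state.
    q_next: list of current Q estimates, one per action, at the next state.
    Assumption: V bootstraps from max_a Q(s', a); Q bootstraps from V(s').
    """
    if done:
        # No bootstrapping on terminal transitions.
        return r, r
    v_target = r + gamma * max(q_next)
    q_target = r + gamma * v_next
    return v_target, q_target


# Usage with hypothetical numbers.
v_t, q_t = dqv_max_targets(r=1.0, gamma=0.5, v_next=2.0,
                           q_next=[1.0, 4.0, 3.0], done=False)
```

Because the Q target here never takes a maximum over Q itself, a scheme of this shape is one plausible reading of why the abstract reports reduced overestimation bias.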
Semantic data ingestion for intelligent, value-driven big data analytics
In this position paper we describe a conceptual
model for intelligent Big Data analytics based on both semantic
and machine learning AI techniques (called AI ensembles). These
processes are linked to business outcomes by explicitly modelling
data value and using semantic technologies as the underlying
mode for communication between the diverse processes and
organisations creating AI ensembles. Furthermore, we show
how data governance can direct and enhance these ensembles
by providing recommendations and insights that ensure the
output generated produces the highest possible value for the
organisation.
Linked Data Quality Assessment: A Survey
Data is of high quality if it is fit for its intended use in operations, decision-making, and planning. There is a colossal amount of linked data available on the web. However, it is difficult to understand how well the linked data fits modeling tasks because of the defects present in the data. Faults that emerge in linked data spread far and wide, affecting all the services designed for it. Addressing linked data quality deficiencies requires identifying quality problems, assessing quality, and refining the data to improve its quality. This study aims to identify existing end-to-end frameworks for the assessment and improvement of data quality. One important finding is that most of the work deals with only one aspect rather than a combined approach. Another finding is that most of the frameworks aim at solving problems related to DBpedia. Therefore, a standard scalable system is required that integrates the identification of quality issues, their evaluation, and the improvement of linked data quality. This survey contributes to understanding the state of the art of data quality evaluation and data quality improvement. An ontology-based solution is also proposed to build an end-to-end system that analyzes the root causes of quality violations.
MAN: Multi-Action Networks Learning
Learning control policies with large action spaces is a challenging problem
in the field of reinforcement learning due to present inefficiencies in
exploration. In this work, we introduce a Deep Reinforcement Learning (DRL)
algorithm called Multi-Action Networks (MAN) Learning that addresses the
challenge of large discrete action spaces. We propose separating the action
space into two components, creating a Value Neural Network for each sub-action.
Then, MAN uses temporal-difference learning to train the networks
synchronously, which is simpler than training a single network with a large
action output directly. To evaluate the proposed method, we test MAN on a block
stacking task, and then extend MAN to handle 12 games from the Atari Arcade
Learning environment with 18 action spaces. Our results indicate that MAN
learns faster than both Deep Q-Learning and Double Deep Q-Learning, implying
our method is a better-performing synchronous temporal-difference algorithm
than those currently available for large action spaces.
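The idea of separating a large discrete action space into two components can be sketched as follows. The abstract does not specify how MAN factors the actions or combines the two networks' outputs, so the 3 x 6 split of 18 actions, the independent greedy choice per component, and all function names below are assumptions made purely for illustration.

```python
def split_action(action, n_second):
    """Map a joint action index to a (first, second) sub-action pair."""
    return divmod(action, n_second)


def join_action(first, second, n_second):
    """Map a (first, second) sub-action pair back to a joint action index."""
    return first * n_second + second


def greedy_joint_action(values_first, values_second):
    """Pick each sub-action greedily from its own value estimates.

    Assumption: each component has its own list of value estimates
    (standing in for the output of one network per sub-action).
    """
    first = max(range(len(values_first)), key=values_first.__getitem__)
    second = max(range(len(values_second)), key=values_second.__getitem__)
    return join_action(first, second, len(values_second))


# Usage: 18 joint actions factored hypothetically as 3 x 6.
values_first = [0.1, 0.7, 0.2]
values_second = [0.0, 0.3, 0.9, 0.1, 0.2, 0.4]
joint = greedy_joint_action(values_first, values_second)
pair = split_action(joint, len(values_second))
```

Under this factorization, each value estimator outputs 3 or 6 values instead of a single estimator outputting 18, which is the kind of simplification the abstract describes.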