35 research outputs found
Large-scale Interactive Recommendation with Tree-structured Policy Gradient
Reinforcement learning (RL) has recently been introduced to interactive
recommender systems (IRS) because of its nature of learning from dynamic
interactions and planning for long-run performance. As IRS is always with
thousands of items to recommend (i.e., thousands of actions), most existing
RL-based methods, however, fail to handle such a large discrete action space
problem and thus become inefficient. The existing work that tries to deal with
the large discrete action space problem by utilizing the deep deterministic
policy gradient framework suffers from the inconsistency between the continuous
action representation (the output of the actor network) and the real discrete
action. To avoid such inconsistency and achieve high efficiency and
recommendation effectiveness, in this paper, we propose a Tree-structured
Policy Gradient Recommendation (TPGR) framework, where a balanced hierarchical
clustering tree is built over the items and picking an item is formulated as
seeking a path from the root to a certain leaf of the tree. Extensive
experiments on carefully-designed environments based on two real-world datasets
demonstrate that our model provides superior recommendation performance and
significant efficiency improvement over state-of-the-art methods
Applications of Deep Neural Networks
Deep learning is a group of exciting new technologies for neural networks.
Through a combination of advanced training techniques and neural network
architectural components, it is now possible to create neural networks that can
handle tabular data, images, text, and audio as both input and output. Deep
learning allows a neural network to learn hierarchies of information in a way
that is like the function of the human brain. This course will introduce the
student to classic neural network structures, Convolution Neural Networks
(CNN), Long Short-Term Memory (LSTM), Gated Recurrent Neural Networks (GRU),
General Adversarial Networks (GAN), and reinforcement learning. Application of
these architectures to computer vision, time series, security, natural language
processing (NLP), and data generation will be covered. High-Performance
Computing (HPC) aspects will demonstrate how deep learning can be leveraged
both on graphical processing units (GPUs), as well as grids. Focus is primarily
upon the application of deep learning to problems, with some introduction to
mathematical foundations. Readers will use the Python programming language to
implement deep learning using Google TensorFlow and Keras. It is not necessary
to know Python prior to this book; however, familiarity with at least one
programming language is assumed
Data-driven action-value functions for evaluating players in professional team sports
As more and larger event stream datasets for professional sports become available, there is growing interest in modeling the complex play dynamics to evaluate player performance. Among these models, a common player evaluation method is assigning values to player actions. Traditional action-values metrics, however, consider very limited game context and player information. Furthermore, they provide directly related to goals (e.g., shots), not all actions. Recent work has shown that reinforcement learning provided powerful methods for addressing quantifying the value of player actions in sports. This dissertation develops deep reinforcement learning (DRL) methods for estimating action values in sports. We make several contributions to DRL for sports. First, we develop neural network architectures that learn an action-value Q-function from sports events logs to estimate each team\u27s expected success given the current match context. Specifically, our architecture models the game history with a recurrent network and predicts the probability that a team scores the next goal. From the learned Q-values, we derive a Goal Impact Metric (GIM) for evaluating a player\u27s performance over a game season. We show that the resulting player rankings are consistent with standard player metrics and temporally consistent within and across seasons. Second, we address the interpretability of the learned Q-values. While neural networks provided accurate estimates, the black-box structure prohibits understanding the influence of different game features on the action values. To interpret the Q-function and understand the influence of game features on action values, we design an interpretable mimic learning framework for the DRL. The framework is based on a Linear Model U-Tree (LMUT) as a transparent mimic model, which facilitates extracting the function rules and computing the feature importance for action values. Third, we incorporate information about specific players into the action values, by introducing a deep player representation framework. In this framework, each player is assigned a latent feature vector called an embedding, with the property that statistically similar players are mapped to nearby embeddings. To compute embeddings that summarize the statistical information about players, we implement a Variational Recurrent Ladder Agent Encoder (VaRLAE) to learn a contextualized representation for when and how players are likely to act. We learn and evaluate deep Q-functions from event data for both ice hockey and soccer. These are challenging continuous-flow games where game context and medium-term consequences are crucial for properly assessing the impact of a player\u27s actions
Planning and Learning: Path-Planning for Autonomous Vehicles, a Review of the Literature
This short review aims to make the reader familiar with state-of-the-art
works relating to planning, scheduling and learning. First, we study
state-of-the-art planning algorithms. We give a brief introduction of neural
networks. Then we explore in more detail graph neural networks, a recent
variant of neural networks suited for processing graph-structured inputs. We
describe briefly the concept of reinforcement learning algorithms and some
approaches designed to date. Next, we study some successful approaches
combining neural networks for path-planning. Lastly, we focus on temporal
planning problems with uncertainty.Comment: AAAI-format & update
Gaining Insight into Determinants of Physical Activity using Bayesian Network Learning
Contains fulltext :
228326pre.pdf (preprint version ) (Open Access)
Contains fulltext :
228326pub.pdf (publisher's version ) (Open Access)BNAIC/BeneLearn 202