Belief Tree Search for Active Object Recognition
Active Object Recognition (AOR) has been approached as an unsupervised
learning problem, in which optimal trajectories for object inspection are not
known and are to be discovered by reducing label uncertainty measures or
training with reinforcement learning. Such approaches have no guarantees of the
quality of their solution. In this paper, we treat AOR as a Partially
Observable Markov Decision Process (POMDP) and find near-optimal policies on
training data using Belief Tree Search (BTS) on the corresponding belief Markov
Decision Process (MDP). AOR then reduces to the problem of knowledge transfer
from near-optimal policies on the training set to the test set. We train a Long
Short Term Memory (LSTM) network to predict the best next action on the
training set rollouts. We show that the proposed AOR method generalizes well to
novel views of familiar objects and also to novel objects. We compare this
supervised scheme against guided policy search, and find that the LSTM network
reaches higher recognition accuracy compared to the guided policy method. We
further look into optimizing the observation function to increase the total
collected reward of optimal policy. In AOR, the observation function is known
only approximately. We propose a gradient-based update to this
approximate observation function to increase the total reward of any policy. We
show that by optimizing the observation function and retraining the supervised
LSTM network, the AOR performance on the test set improves significantly.
Comment: IROS 201
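The belief MDP mentioned in the abstract tracks a probability distribution over object labels and updates it after each observation. The following is a minimal sketch of that Bayes update, assuming the object label stays fixed while the viewpoint changes. The function name `update_belief`, the label names, and the likelihood values are hypothetical and do not come from the paper.

```python
def update_belief(belief, likelihoods):
    """Bayes update of a belief over object labels.

    belief:      dict mapping label -> prior probability b(c)
    likelihoods: dict mapping label -> O(o | c, a), the (approximate)
                 observation-function value for the observation received
                 after taking action a. Values here are illustrative.
    """
    unnormalized = {c: belief[c] * likelihoods[c] for c in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under every label")
    # Normalize so the posterior sums to one.
    return {c: p / total for c, p in unnormalized.items()}


# Hypothetical two-class example: the new view favors "mug".
prior = {"mug": 0.5, "bowl": 0.5}
obs_likelihood = {"mug": 0.8, "bowl": 0.2}
posterior = update_belief(prior, obs_likelihood)
```

A tree search over this belief space would call such an update once per simulated action and observation, then score the resulting belief (for example, by the probability assigned to the true label).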
Fréchet ChemNet Distance: A metric for generative models for molecules in drug discovery
The new wave of successful generative models in machine learning has
increased the interest in deep learning driven de novo drug design. However,
assessing the performance of such generative models is notoriously difficult.
Metrics that are typically used to assess the performance of such generative
models are the percentage of chemically valid molecules or the similarity to
real molecules in terms of particular descriptors, such as the partition
coefficient (logP) or druglikeness. However, method comparison is difficult
because of the inconsistent use of evaluation metrics, the necessity for
multiple metrics, and the fact that some of these measures can easily be
tricked by simple rule-based systems. We propose a novel distance measure
between two sets of molecules, called Fréchet ChemNet distance (FCD), that
can be used as an evaluation metric for generative models. The FCD is similar
to a recently established performance metric for comparing image generation
methods, the Fréchet Inception Distance (FID). Whereas the FID uses one of
the hidden layers of InceptionNet, the FCD utilizes the penultimate layer of a
deep neural network called ChemNet, which was trained to predict drug
activities. Thus, the FCD metric takes into account chemically and biologically
relevant information about molecules, and also measures the diversity of the
set via the distribution of generated molecules. The FCD's advantage over
previous metrics is that it can detect whether generated molecules are a) diverse
and have b) chemical and c) biological properties similar to those of real molecules. We
further provide an easy-to-use implementation that only requires the SMILES
representation of the generated molecules as input to calculate the FCD.
Implementations are available at: https://www.github.com/bioinf-jku/FCD
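A Fréchet-style metric fits a Gaussian to the network activations of each molecule set and compares the two Gaussians. The sketch below illustrates the idea under a loudly simplifying assumption: covariances are treated as diagonal, so the trace term reduces to a sum of squared differences of per-dimension standard deviations. The full metric uses full covariance matrices and a matrix square root. The function name and the toy activations are hypothetical; this is not the API of the FCD package.

```python
import math
from statistics import fmean, pstdev


def frechet_distance_diag(acts_a, acts_b):
    """Squared Frechet distance between two Gaussians with DIAGONAL covariance.

    acts_a, acts_b: lists of equal-length activation vectors, one per molecule.
    Simplification (assumption): off-diagonal covariance is ignored, and the
    population standard deviation is used for each dimension.
    """
    dims = len(acts_a[0])
    total = 0.0
    for i in range(dims):
        col_a = [row[i] for row in acts_a]
        col_b = [row[i] for row in acts_b]
        mean_gap = fmean(col_a) - fmean(col_b)
        std_gap = pstdev(col_a) - pstdev(col_b)
        # ||mu_a - mu_b||^2 plus the diagonal form of the trace term.
        total += mean_gap ** 2 + std_gap ** 2
    return total


# Toy activations standing in for penultimate-layer outputs.
real = [[0.0, 1.0], [2.0, 3.0]]
generated = [[1.0, 1.0], [3.0, 3.0]]
distance = frechet_distance_diag(real, generated)
```

Because both the means and the spreads enter the distance, a generator that matches the average molecule but collapses to a few samples is still penalized, which is the diversity property the abstract highlights.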
A Recurrent Neural Network Survival Model: Predicting Web User Return Time
The size of a website's active user base directly affects its value. Thus, it
is important to monitor and influence a user's likelihood to return to a site.
Essential to this is predicting when a user will return. Current
state-of-the-art approaches to this problem come in two flavors: (1) Recurrent Neural
Network (RNN) based solutions and (2) survival analysis methods. We observe
that both techniques are severely limited when applied to this problem.
Survival models can only incorporate aggregate representations of users instead
of automatically learning a representation directly from a raw time series of
user actions. RNNs can automatically learn features, but cannot be directly
trained with examples of non-returning users who have no target value for their
return time. We develop a novel RNN survival model that removes the limitations
of the state-of-the-art methods. We demonstrate that this model can
successfully be applied to return time prediction on a large e-commerce dataset
with a superior ability to discriminate between returning and non-returning
users than either method applied in isolation.
Comment: Accepted into ECML PKDD 2018; 8 figures and 1 tabl
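Survival analysis can learn from non-returning users because a censored example contributes the probability of not having returned yet, rather than a target return time. The sketch below shows such a censored log-likelihood, assuming a constant hazard rate (an exponential model) purely for illustration. The abstract does not specify the paper's loss, and the function name, rate, and data are hypothetical.

```python
import math


def censored_log_likelihood(rate, durations, returned):
    """Log-likelihood of return times under a constant hazard `rate`.

    durations: time until return, or time observed so far if no return
    returned:  True if the user came back, False if the example is censored
    """
    total = 0.0
    for t, did_return in zip(durations, returned):
        if did_return:
            # Observed return: log density, log(rate) - rate * t.
            total += math.log(rate) - rate * t
        else:
            # Censored: log survival probability, -rate * t.
            total += -rate * t
    return total


# One user returned after 2 days; another had not returned after 4 days.
ll = censored_log_likelihood(0.5, [2.0, 4.0], [True, False])
```

In a neural variant, the hazard would presumably be produced by the network from the user's action history instead of being a single constant, while the two-case structure of the loss stays the same.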
A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning
Automatic decision-making approaches, such as reinforcement learning (RL),
have been applied to (partially) solve the resource allocation problem
adaptively in the cloud computing system. However, a complete cloud resource
allocation framework exhibits high dimensions in state and action spaces, which
limit the usefulness of traditional RL techniques. In addition, high power
consumption has become one of the critical concerns in design and control of
cloud computing systems, which degrades system reliability and increases
cooling cost. An effective dynamic power management (DPM) policy should
minimize power consumption while maintaining performance degradation within an
acceptable level. Thus, a joint virtual machine (VM) resource allocation and
power management framework is critical to the overall cloud computing system.
Moreover, a novel solution framework is necessary to address the even higher
dimensions in state and action spaces. In this paper, we propose a novel
hierarchical framework for solving the overall resource allocation and power
management problem in cloud computing systems. The proposed hierarchical
framework comprises a global tier for VM resource allocation to the servers and
a local tier for distributed power management of local servers. The emerging
deep reinforcement learning (DRL) technique, which can deal with complicated
control problems with large state space, is adopted to solve the global tier
problem. Furthermore, an autoencoder and a novel weight sharing structure are
adopted to handle the high-dimensional state space and accelerate the
convergence speed. On the other hand, the local tier of distributed server
power management comprises an LSTM-based workload predictor and a model-free
RL-based power manager, operating in a distributed manner.
Comment: accepted by 37th IEEE International Conference on Distributed
Computing (ICDCS 2017
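The local tier is described as a model-free RL power manager. The abstract does not name the algorithm, so the sketch below assumes tabular Q-learning as one common model-free choice. The state names, the action set, the reward, and the function `q_update` are all hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step; returns the updated Q(state, action).

    q is a mapping from (state, action) pairs to value estimates.
    alpha is the learning rate and gamma the discount factor.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical server power states: the manager chooses to sleep or stay active.
ACTIONS = ("sleep", "active")
q_table = defaultdict(float)  # unseen pairs start at 0.0
new_value = q_update(q_table, "idle", "sleep", reward=1.0,
                     next_state="idle", actions=ACTIONS)
```

In a setup like the one described, the reward would presumably trade power savings against performance degradation, and the predicted workload would form part of the state.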