Pretrain Soft Q-Learning with Imperfect Demonstrations
Pretraining reinforcement learning methods with demonstrations has been an
important concept in the study of reinforcement learning since a large amount
of computing power is spent on online simulations with existing reinforcement
learning algorithms. Pretraining reinforcement learning nevertheless remains a
significant challenge: exploiting expert demonstrations while preserving
exploration potential, especially for value-based methods. In this paper, we propose a
pretraining method for soft Q-learning. Our work is inspired by pretraining
methods for actor-critic algorithms, since soft Q-learning is a value-based
algorithm that is equivalent to policy gradient. The proposed method is based
on γ-discounted biased policy evaluation with entropy regularization,
which is also the update target of soft Q-learning. Our method is evaluated
on various tasks from Atari 2600. Experiments show that our method effectively
learns from imperfect demonstrations, and outperforms other state-of-the-art
methods that learn from expert demonstrations.
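As a rough illustration of the entropy-regularized target underlying soft Q-learning, the one-step soft Bellman backup can be sketched as follows; the discount gamma, temperature alpha, and the function names are illustrative assumptions, not the paper's pretraining rule.

```python
import math

def soft_value(q_values, alpha=1.0):
    # Soft (entropy-regularized) state value:
    # V(s) = alpha * log sum_a exp(Q(s, a) / alpha).
    m = max(q / alpha for q in q_values)  # shift for numerical stability
    return alpha * (m + math.log(sum(math.exp(q / alpha - m) for q in q_values)))

def soft_q_target(reward, next_q_values, gamma=0.99, alpha=1.0):
    # One-step soft Bellman target: r + gamma * V_soft(s').
    return reward + gamma * soft_value(next_q_values, alpha)
```

With uniform Q-values the soft value reduces to the entropy bonus alone, e.g. `soft_value([0.0, 0.0])` equals `log 2`.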
A Review of Learning with Deep Generative Models from Perspective of Graphical Modeling
This document aims to provide a review on learning with deep generative
models (DGMs), which is a highly active area in machine learning and, more
generally, artificial intelligence. This review is not meant to be a tutorial,
but when necessary, we provide self-contained derivations for completeness.
This review has two features. First, though there are different perspectives to
classify DGMs, we choose to organize this review from the perspective of
graphical modeling, because the learning methods for directed DGMs and
undirected DGMs are fundamentally different. Second, we differentiate model
definitions from model learning algorithms, since different learning algorithms
can be applied to solve the learning problem on the same model, and an
algorithm can be applied to learn different models. We thus separate model
definition and model learning, with more emphasis on reviewing, differentiating
and connecting different learning algorithms. We also discuss promising future
research directions.
Comment: add SN-GANs, SA-GANs, conditional generation (cGANs, AC-GANs). arXiv
admin note: text overlap with arXiv:1606.00709, arXiv:1801.03558 by other
authors
Human Attention Estimation for Natural Images: An Automatic Gaze Refinement Approach
Photo collections and their applications today attempt to reflect user
interactions in various forms. Moreover, these applications aim to capture the
users' intentions with minimum effort.
Human interest regions in an image carry powerful information about
the user's behavior and can be used in many photo applications. Research on
human visual attention has been conducted in the form of gaze tracking and
computational saliency models in the computer vision community, and has shown
considerable progress. This paper presents an integration between implicit gaze
estimation and a computational saliency model to effectively estimate human
attention regions in images on the fly. Furthermore, our method estimates human
attention via implicit calibration and incremental model updating without any
active participation from the user. We also present extensive analysis and
possible applications for personal photo collections.
Fourier Phase Retrieval with Extended Support Estimation via Deep Neural Network
We consider the problem of sparse phase retrieval from Fourier transform
magnitudes to recover a k-sparse signal vector x and its support S. We exploit
an extended support estimate E, with size larger than k and satisfying E ⊇ S,
obtained by a trained deep neural network (DNN). To make the DNN learnable,
training provides E as the union of equivalent solutions of S obtained by
utilizing modulo Fourier invariances. The set E can be estimated with short
running time via the DNN, and the support S can then be determined from the
DNN output, rather than from the full index set, by applying hard
thresholding. Thus, the DNN-based extended support estimation improves the
reconstruction performance of the signal with a low complexity burden
dependent on the size of E. Numerical results verify that the proposed scheme
has a superior performance with lower complexity compared to local
search-based greedy sparse phase retrieval and a state-of-the-art variant of
the Fienup method.
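The hard-thresholding step used to pick a support from a candidate vector can be illustrated generically; this sketch simply keeps the indices of the k largest-magnitude entries (the function name and toy vector are assumptions, not the paper's DNN pipeline).

```python
def hard_threshold_support(x, k):
    # Generic hard thresholding: return the (sorted) indices of the
    # k largest-magnitude entries of x as the estimated support.
    idx = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    return sorted(idx[:k])
```

For example, `hard_threshold_support([0.1, -2.0, 0.0, 1.5], 2)` selects indices 1 and 3.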
Verification for Machine Learning, Autonomy, and Neural Networks Survey
This survey presents an overview of verification techniques for autonomous
systems, with a focus on safety-critical autonomous cyber-physical systems
(CPS) and subcomponents thereof. Autonomy in CPS is enabled by recent advances
in artificial intelligence (AI) and machine learning (ML) through approaches
such as deep neural networks (DNNs), embedded in so-called learning-enabled
components (LECs) that accomplish tasks from classification to control.
Recently, the formal methods and formal verification community has developed
methods to characterize behaviors in these LECs with eventual goals of formally
verifying specifications for LECs, and this article presents a survey of many
of these recent approaches.
Continuously heterogeneous hyper-objects in cryo-EM and 3-D movies of many temporal dimensions
Single particle cryo-electron microscopy (EM) is an increasingly popular
method for determining the 3-D structure of macromolecules from noisy 2-D
images of single macromolecules whose orientations and positions are random and
unknown. One of the great opportunities in cryo-EM is to recover the structure
of macromolecules in heterogeneous samples, where multiple types or multiple
conformations are mixed together. Indeed, in recent years, many tools have been
introduced for the analysis of multiple discrete classes of molecules mixed
together in a cryo-EM experiment. However, many interesting structures have a
continuum of conformations which do not fit discrete models nicely; the
analysis of such continuously heterogeneous models has remained a more elusive
goal. In this manuscript, we propose to represent heterogeneous molecules and
similar structures as higher dimensional objects. We generalize the basic
operations used in many existing reconstruction algorithms, making our approach
generic in the sense that, in principle, existing algorithms can be adapted to
reconstruct those higher dimensional objects. As proof of concept, we present a
prototype of a new algorithm which we use to solve simulated reconstruction
problems.
Hybrid optimization and Bayesian inference techniques for a non-smooth radiation detection problem
In this investigation, we propose several algorithms to recover the location
and intensity of a radiation source located in a simulated 250 m x 180 m block
in an urban center based on synthetic measurements. Radioactive decay and
detection are Poisson random processes, so we employ likelihood functions based
on this distribution. Due to the domain geometry and the proposed response
model, the negative logarithm of the likelihood is only piecewise continuously
differentiable, and it has multiple local minima. To address these
difficulties, we investigate three hybrid algorithms comprised of mixed
optimization techniques. For global optimization, we consider Simulated
Annealing (SA), Particle Swarm (PS) and Genetic Algorithm (GA), which rely
solely on objective function evaluations; i.e., they do not evaluate the
gradient of the objective function. By employing early stopping criteria for
the global optimization methods, a pseudo-optimum point is obtained. This is
subsequently utilized as the initial value by the deterministic Implicit
Filtering method (IF), which is able to find local extrema in non-smooth
functions, to finish the search in a narrow domain. These new hybrid techniques
combining global optimization and Implicit Filtering address difficulties
associated with the non-smooth response, and their performances are shown to
significantly decrease the computational time over the global optimization
methods alone. To quantify uncertainties associated with the source location
and intensity, we employ the Delayed Rejection Adaptive Metropolis (DRAM) and
DiffeRential Evolution Adaptive Metropolis (DREAM) algorithms. Marginal
densities of the source properties are obtained, and the means of the chains
compare accurately with the estimates produced by the hybrid algorithms.
Comment: 36 pages, 14 figures
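For intuition about the Poisson likelihood functions mentioned above, a minimal negative log-likelihood for a single point source can be sketched as follows; the inverse-square detector response, background rate, and coordinates are illustrative assumptions, not the paper's response model.

```python
import math

def neg_log_likelihood(source, detectors, counts, background=1.0):
    # source = (x, y, intensity); expected count at each detector follows
    # an assumed inverse-square response plus a constant background rate.
    x, y, s = source
    nll = 0.0
    for (dx, dy), n in zip(detectors, counts):
        r2 = (x - dx) ** 2 + (y - dy) ** 2
        lam = background + s / max(r2, 1e-6)
        # Poisson log-likelihood term: n*log(lam) - lam (log(n!) dropped)
        nll -= n * math.log(lam) - lam
    return nll
```

A global optimizer (SA, PS, GA) would minimize this over (x, y, intensity) before handing the pseudo-optimum to Implicit Filtering.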
Application of Machine Learning in Wireless Networks: Key Techniques and Open Issues
As a key technique for enabling artificial intelligence, machine learning
(ML) is capable of solving complex problems without explicit programming.
Motivated by its successful applications to many practical tasks like image
recognition, both industry and the research community have advocated the
applications of ML in wireless communication. This paper comprehensively
surveys the recent advances of the applications of ML in wireless
communication, which are classified as: resource management in the MAC layer,
networking and mobility management in the network layer, and localization in
the application layer. The applications in resource management further include
power control, spectrum management, backhaul management, cache management,
beamformer design and computation resource management, while ML based
networking focuses on the applications in clustering, base station switching
control, user association and routing. Moreover, the literature in each aspect is
organized according to the adopted ML techniques. In addition, several
conditions for applying ML to wireless communication are identified to help
readers decide whether to use ML and which kind of ML techniques to use, and
traditional approaches are also summarized together with their performance
comparison with ML based approaches, based on which the motivations of the
surveyed works to adopt ML are clarified. Given the extensiveness of the research
area, challenges and unresolved issues are presented to facilitate future
studies, where ML based network slicing, infrastructure update to support ML
based paradigms, open data sets and platforms for researchers, theoretical
guidance for ML implementation and so on are discussed.
Comment: 34 pages, 8 figures
A textual transform of multivariate time-series for prognostics
Prognostics or early detection of incipient faults is an important industrial
challenge for condition-based and preventive maintenance. Physics-based
approaches to modeling fault progression are infeasible due to multiple
interacting components, uncontrolled environmental factors and observability
constraints. Moreover, such approaches to prognostics do not generalize to new
domains. Consequently, domain-agnostic data-driven machine learning approaches
to prognostics are desirable. Damage progression is a path-dependent process
and explicitly modeling the temporal patterns is critical for accurate
estimation of both the current damage state and its progression leading to
total failure. In this paper, we present a novel data-driven approach to
prognostics that employs a novel textual representation of multivariate
temporal sensor observations for predicting the future health state of the
monitored equipment early in its life. This representation enables us to
utilize well-understood concepts from text-mining for modeling, prediction and
understanding distress patterns in a domain agnostic way. The approach has been
deployed and successfully tested on large scale multivariate time-series data
from commercial aircraft engines. We report experiments on well-known publicly
available benchmark datasets and simulation datasets. The proposed approach is
shown to be superior in terms of prediction accuracy, lead time to prediction
and interpretability.
Comment: 10 pages
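As a generic illustration of turning numeric sensor readings into text, a SAX-style binning into letters can be sketched; the bin count, alphabet, and normalization are assumptions, since the paper's exact textual transform is not detailed here.

```python
def to_text(series, n_bins=4, alphabet="abcd"):
    # Z-normalize the series, then map each value to a letter by
    # equal-width binning, producing a "word" for text-mining tools.
    mean = sum(series) / len(series)
    var = sum((v - mean) ** 2 for v in series) / len(series)
    std = var ** 0.5 or 1.0       # guard against a constant series
    z = [(v - mean) / std for v in series]
    lo, hi = min(z), max(z)
    width = (hi - lo) / n_bins or 1.0
    return "".join(alphabet[min(int((v - lo) / width), n_bins - 1)] for v in z)
```

A monotonically degrading sensor such as `[1.0, 2.0, 3.0, 4.0]` becomes the word "abcd", on which n-gram or bag-of-words methods can operate.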
Learning from Conditional Distributions via Dual Embeddings
Many machine learning tasks, such as learning with invariance and policy
evaluation in reinforcement learning, can be characterized as problems of
learning from conditional distributions. In such problems, each sample x is
itself associated with a conditional distribution p(z|x) represented by
samples {z_i}, and the goal is to learn a function f that links these
conditional distributions to target values y. These learning problems become
very challenging when we only have limited samples, or in the extreme case
only one sample, from each conditional distribution. Commonly used approaches
either assume that z is independent of x, or require an overwhelmingly large
number of samples from each conditional distribution.
To address these challenges, we propose a novel approach which employs a new
min-max reformulation of the learning from conditional distribution problem.
With this new reformulation, we only need to deal with the joint distribution
p(x, z). We also design an efficient learning algorithm, Embedding-SGD, and
establish theoretical sample complexity for such problems. Finally, our
numerical experiments on both synthetic and real-world datasets show that the
proposed approach can significantly improve over the existing algorithms.
Comment: 24 pages, 11 figures
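A min-max reformulation of this kind can be made concrete for the squared loss, where the objective over the conditional mean, E[(E_{z|x} f(z) - y)^2 / 2], equals min over f and max over a dual function u of E_{x,z}[u(x)(f(z) - y) - u(x)^2 / 2], which needs only joint samples. The primal-dual SGD sketch below uses linear parameterizations and toy features as assumptions; it is not the paper's actual Embedding-SGD implementation.

```python
import random

def embedding_sgd(samples, dim, steps=2000, lr=0.05, seed=0):
    # Schematic primal-dual SGD for
    #   min_f max_u E[ u(x) * (f(z) - y) - u(x)^2 / 2 ],
    # the saddle-point (Fenchel-dual) form of a squared loss on E_{z|x}[f(z)].
    # f and u are linear in supplied feature vectors (an assumption).
    rng = random.Random(seed)
    w = [0.0] * dim  # primal weights: f(z) = w . phi_z
    v = [0.0] * dim  # dual weights:   u(x) = v . psi_x
    for _ in range(steps):
        phi_z, psi_x, y = rng.choice(samples)  # one joint (x, z, y) sample
        f = sum(wi * p for wi, p in zip(w, phi_z))
        u = sum(vi * p for vi, p in zip(v, psi_x))
        # descend in the primal, ascend in the dual
        w = [wi - lr * u * p for wi, p in zip(w, phi_z)]
        v = [vi + lr * ((f - y) - u) * p for vi, p in zip(v, psi_x)]
    return w, v
```

On a trivial one-sample problem with target y = 2 and unit features, the primal weight converges to 2 while the dual residual shrinks to zero.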