Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous driving, a model-based reinforcement learning
algorithm is proposed for the design of neural network-parameterized
controllers. Classical model-based control methods, which include sampling- and
lattice-based algorithms and model predictive control, suffer from the
trade-off between model complexity and computational burden required for the
online solution of expensive optimization or search problems at every short
sampling time. To circumvent this trade-off, a two-step procedure is motivated:
first, a controller is learned during offline training based on an arbitrarily
complicated mathematical system model; then, the trained controller is evaluated
online by a fast feedforward pass. The contribution of this paper is the
proposition of a simple gradient-free and model-based algorithm for deep
reinforcement learning using task separation with hill climbing (TSHC). In
particular, (i) simultaneous training on separate deterministic tasks with the
purpose of encoding many motion primitives in a neural network, and (ii) the
employment of maximally sparse rewards in combination with virtual velocity
constraints (VVCs) in setpoint proximity are advocated.
Comment: 10 pages, 6 figures, 1 tabl
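The gradient-free idea in this abstract can be illustrated with a minimal sketch of plain hill climbing over several deterministic tasks with a sparse reward. This is not the paper's TSHC algorithm: the 1-D point task, the two-parameter linear controller (standing in for a neural network), the tolerance, and all function names are assumptions made only for illustration.

```python
import random


def rollout(params, target, steps=20):
    """Run one deterministic task: drive a 1-D point from 0 toward `target`.

    `params` is a hypothetical (gain, bias) pair for a linear feedback law.
    The reward is sparse: 1.0 if the point ends near the target, else 0.0.
    """
    gain, bias = params
    x = 0.0
    for _ in range(steps):
        u = gain * (target - x) + bias
        u = max(-1.0, min(1.0, u))  # actuator limit (assumed)
        x += u
    return 1.0 if abs(x - target) < 0.05 else 0.0


def total_reward(params, tasks):
    """Sum the sparse rewards over all separate tasks."""
    return sum(rollout(params, t) for t in tasks)


def hill_climb(tasks, iterations=500, sigma=0.5, seed=0):
    """Gradient-free search: perturb the parameters, keep non-worse candidates."""
    rng = random.Random(seed)
    best = (0.0, 0.0)
    best_score = total_reward(best, tasks)
    for _ in range(iterations):
        cand = tuple(p + rng.gauss(0.0, sigma) for p in best)
        score = total_reward(cand, tasks)
        if score >= best_score:  # accept ties so reward plateaus can be crossed
            best, best_score = cand, score
    return best, best_score


tasks = [-3.0, 1.5, 4.0]  # three separate setpoints (assumed)
params, score = hill_climb(tasks)
print(params, score)
```

Because the reward is sparse, most candidates score the same, so the sketch accepts ties and lets the search drift across plateaus. The score it returns never decreases over the iterations.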
Multigrid methods for two-player zero-sum stochastic games
We present a fast numerical algorithm for large scale zero-sum stochastic
games with perfect information, which combines policy iteration and algebraic
multigrid methods. This algorithm can be applied either to a true finite state
space zero-sum two player game or to the discretization of an Isaacs equation.
We present numerical tests on discretizations of Isaacs equations or
variational inequalities. We also present a full multi-level policy iteration,
similar to FMG, which substantially improves the computation time for
solving some variational inequalities.
Comment: 31 page
Subsampling Algorithms for Semidefinite Programming
We derive a stochastic gradient algorithm for semidefinite optimization using
randomization techniques. The algorithm uses subsampling to reduce the
computational cost of each iteration and the subsampling ratio explicitly
controls granularity, i.e. the tradeoff between cost per iteration and total
number of iterations. Furthermore, the total computational cost is directly
proportional to the complexity (i.e. rank) of the solution. We study numerical
performance on some large-scale problems arising in statistical learning.
Comment: Final version, to appear in Stochastic System
Enhancing Domain Word Embedding via Latent Semantic Imputation
We present a novel method named Latent Semantic Imputation (LSI) to transfer
external knowledge into semantic space for enhancing word embedding. The method
integrates graph theory to extract the latent manifold structure of the
entities in the affinity space and leverages non-negative least squares with
standard simplex constraints and power iteration method to derive spectral
embeddings. It provides an effective and efficient approach to combining entity
representations defined in different Euclidean spaces. Specifically, our
approach generates and imputes reliable embedding vectors for low-frequency
words in the semantic space and benefits downstream language tasks that depend
on word embedding. We conduct comprehensive experiments on a carefully designed
classification problem and language modeling and demonstrate the superiority of
the enhanced embedding via LSI over several well-known benchmark embeddings. We
also confirm the consistency of the results under different parameter settings
of our method.
Comment: ACM SIGKDD 201
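The abstract above mentions the power iteration method as one ingredient. As a minimal sketch of that generic technique only (not the LSI pipeline itself), the following estimates the dominant eigenvalue and eigenvector of a small symmetric matrix. The function name, the stopping tolerance, and the example matrix are assumptions.

```python
import math


def power_iteration(matrix, iterations=200, tol=1e-12):
    """Estimate the dominant eigenpair of a square matrix (list of rows)."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n  # uniform unit-length start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply, then renormalise to unit Euclidean length.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("matrix maps the iterate to zero")
        w = [x / norm for x in w]
        # Rayleigh quotient of the unit vector w gives the eigenvalue estimate.
        mw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        new_eigenvalue = sum(w[i] * mw[i] for i in range(n))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        v, eigenvalue = w, new_eigenvalue
        if converged:
            break
    return eigenvalue, v


value, vector = power_iteration([[2.0, 0.0], [0.0, 1.0]])
print(value, vector)
```

The iteration converges at a rate set by the ratio of the two largest eigenvalue magnitudes, so it is slow when those are close together.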
Valuation Perspectives and Decompositions for Variable Annuities with GMWB riders
The guaranteed minimum withdrawal benefit (GMWB) rider, as an add-on to a
variable annuity (VA), guarantees the return of premiums in the form of
periodic withdrawals while allowing policyholders to participate fully in any
market gains. GMWB riders represent an embedded option on the account value
with a fee structure that is different from typical financial derivatives. We
consider fair pricing of the GMWB rider from a financial economic perspective.
Particular focus is placed on the distinct perspectives of the insurer and
policyholder and the unifying relationship. We extend a decomposition of the VA
contract into components that reflect term-certain payments and embedded
derivatives to the case where the policyholder has the option to surrender, or
lapse, the contract early.
Comment: 18 pages, proof of Lemma A.1 expanded for clarit
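The GMWB mechanism described in this abstract can be illustrated with a minimal sketch that projects one account along a single path of returns. It is not the paper's valuation or decomposition. It assumes fixed periodic withdrawals, a proportional fee on the account, no lapse, and no discounting, and the function and parameter names are hypothetical.

```python
def project_gmwb(premium, withdrawal, returns, fee_rate=0.0):
    """Project a VA account with a GMWB rider along one path of returns.

    Returns one tuple per period:
    (account value after withdrawal, paid from account, paid by insurer).
    Withdrawals continue until the premium has been returned in full.
    """
    account = premium
    remaining_guarantee = premium
    rows = []
    for r in returns:
        if remaining_guarantee <= 0:
            break
        # Market return, then proportional rider fee (simplifying assumption).
        account *= (1.0 + r) * (1.0 - fee_rate)
        due = min(withdrawal, remaining_guarantee)
        from_account = min(due, account)
        from_insurer = due - from_account  # guarantee covers any shortfall
        account -= from_account
        remaining_guarantee -= due
        rows.append((account, from_account, from_insurer))
    return rows


# Poor market: the account is exhausted and the insurer funds the rest.
bad = project_gmwb(100.0, 25.0, [-0.5, 0.0, 0.0, 0.0])
# Good market: the policyholder keeps the gains left in the account.
good = project_gmwb(100.0, 25.0, [0.5, 0.0, 0.0, 0.0])
print(bad)
print(good)
```

In both scenarios the withdrawals add up to the premium. The two differ in who pays the withdrawals and in the account value left at the end.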