Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Offline reinforcement learning, which aims at optimizing sequential
decision-making strategies with historical data, has been extensively applied
in real-life applications. State-of-the-art algorithms usually leverage
powerful function approximators (e.g. neural networks) to alleviate the sample
complexity hurdle for better empirical performance. Despite the successes, a
more systematic understanding of the statistical complexity for function
approximation remains lacking. Towards bridging the gap, we take a step by
considering offline reinforcement learning with differentiable function class
approximation (DFA). This function class naturally incorporates a wide range of
models with nonlinear/nonconvex structures. Most importantly, we show offline
RL with differentiable function approximation is provably efficient by
analyzing the pessimistic fitted Q-learning (PFQL) algorithm, and our results
provide the theoretical basis for understanding a variety of practical
heuristics that rely on Fitted Q-Iteration style design. In addition, we
further improve our guarantee with a tighter instance-dependent
characterization. We hope our work can draw interest to the study of
reinforcement learning with differentiable function approximation beyond the
scope of current research.
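As a hypothetical illustration of the pessimism principle behind such algorithms (the paper's PFQL works with a general differentiable function class; the tabular setting and count-based penalty below merely stand in for its uncertainty quantifier):

```python
# Minimal tabular sketch of pessimistic fitted Q-iteration: regress
# Bellman targets on the offline dataset, then subtract an uncertainty
# penalty that shrinks with the number of visits to each (state, action).
from collections import defaultdict
import math

def pessimistic_fqi(dataset, n_states, n_actions, gamma=0.9, iters=50, c=1.0):
    counts = defaultdict(int)
    for s, a, r, s2 in dataset:
        counts[(s, a)] += 1
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iters):
        targets = defaultdict(list)
        for s, a, r, s2 in dataset:
            targets[(s, a)].append(r + gamma * max(Q[s2]))
        new_q = [[0.0] * n_actions for _ in range(n_states)]
        for (s, a), ys in targets.items():
            bonus = c / math.sqrt(counts[(s, a)])  # pessimism penalty
            new_q[s][a] = max(0.0, sum(ys) / len(ys) - bonus)
        Q = new_q
    return Q
```

The penalty ensures that rarely observed state-action pairs receive conservative value estimates, which is the mechanism the instance-dependent guarantees quantify.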
Reduction of complexity and variational data assimilation
Invited plenary lecture. Reduced basis methods belong to a class of \emph{model reduction} approaches for approximating the solution of mathematical models involved in many fields of research or decision making, and in data assimilation. These approaches make it possible to tackle, in (close to) real time, problems that a priori require a large number of computations, by formalizing two steps: an "offline stage", a preparation step that is quite costly, and an "online stage" that is used on demand and is very cheap.
The strategy uses the fact that the solutions we are interested in belong to a family, a manifold, parametrized by input coefficients, shapes, or stochastic data, that has small complexity. The complexity is measured by a quantity like the "Kolmogorov width" which, when small, formalizes the fact that some low-dimensional vector spaces provide a good approximation of the elements of the manifold.
We shall review the fundamental background and state results proving that this dimension is small for a large class of problems of interest, then use this fact to propose approximation strategies in various cases, depending on the knowledge we have of the solution we want to approximate: either explicit, through values at points or through outputs evaluated from the solution, or implicit, through the partial differential equation it satisfies. We shall also present a strategy available when a mix of the above information is available, allowing us to propose new efficient approaches in data assimilation and data mining.
The theory on the numerical analysis (a priori and a posteriori) of these approaches will also be presented together with results on numerical simulations.
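As a toy illustration of the offline/online split (a hypothetical minimal sketch, not the actual reduced basis machinery reviewed in the lecture): offline, a few solution snapshots are orthonormalized into a small basis at some cost; online, any new solution is approximated cheaply by projection onto that basis.

```python
# Offline stage: build a small orthonormal basis from solution snapshots
# via Gram-Schmidt.  Online stage: project a query vector onto the basis;
# the online cost scales with the basis size, not the full dimension.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def offline_basis(snapshots, tol=1e-12):
    basis = []
    for s in snapshots:
        r = list(s)
        for b in basis:                      # remove components already spanned
            c = dot(r, b)
            r = [x - c * y for x, y in zip(r, b)]
        n = dot(r, r) ** 0.5
        if n > tol:                          # keep only new directions
            basis.append([x / n for x in r])
    return basis

def online_project(u, basis):
    coeffs = [dot(u, b) for b in basis]      # cheap, on-demand step
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(len(u))]
```

When the manifold of solutions has small Kolmogorov width, a handful of snapshots already yields projections close to the true solutions.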
This work was done in close collaboration with A. T. Patera (MIT, Cambridge) and has benefited from collaborations with A. Buffa (IAN, Pavia), R. Chakir (IFSTAR, Paris), Y. Chen (U. of Massachusetts, Dartmouth), Y. Hesthaven (EPFL, Lausanne), E. Lovgren (Simula, Oslo), O. Mula (UPMC, Paris), NC Nguyen (MIT, Cambridge), J. Pen (MIT, Cambridge), C. Prud'homme (U. Strasbourg), J. Rodriguez (U. Santiago de Compostela), E. M. Ronquist (U. Trondheim), B. Stamm (UPMC, Paris), G. Turinici (Dauphine, Paris), M. Yano (MIT, Cambridge).
Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech; lecture within the University of Málaga's own research plan.
Rigorous Born Approximation and beyond for the Spin-Boson Model
Within the lowest-order Born approximation, we present an exact calculation
of the time dynamics of the spin-boson model in the ohmic regime. We observe
non-Markovian effects at zero temperature that scale with the system-bath
coupling strength and cause qualitative changes in the evolution of coherence
at intermediate times of order of the oscillation period. These changes could
significantly affect the performance of these systems as qubits. In the biased
case, we find a prompt loss of coherence at these intermediate times, whose
decay rate is set by $\alpha$, where $\alpha$ is the coupling strength
to the environment. We also explore the calculation of the next-order Born
approximation: we show that, at the expense of very large computational
complexity, interesting physical quantities can be rigorously computed at
fourth order using computer algebra, presented completely in an accompanying
Mathematica file. We compute the $O(\alpha^2)$ corrections to the long-time
behavior of the system density matrix; the result is identical to the reduced
density matrix of the equilibrium state to the same order in $\alpha$. All
these calculations indicate precision experimental tests that could confirm or
refute the validity of the spin-boson model in a variety of systems.
Comment: Greatly extended version of short paper cond-mat/0304118.
Accompanying Mathematica notebook fop5.nb, available in Source, is an
essential part of this work; it gives full details of the fourth-order Born
calculation summarized in the text. fop5.nb is prepared in arXiv style
(available from Wolfram Research).
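For reference, the model discussed above is conventionally written as follows (one common textbook convention; the paper's own notation may differ):

```latex
H = \frac{\epsilon}{2}\,\sigma_z + \frac{\Delta}{2}\,\sigma_x
  + \frac{\sigma_z}{2}\sum_k \lambda_k\left(b_k + b_k^{\dagger}\right)
  + \sum_k \omega_k\, b_k^{\dagger} b_k ,
\qquad
J(\omega) = \frac{\pi}{2}\sum_k \lambda_k^{2}\,\delta(\omega - \omega_k)
          = \frac{\pi}{2}\,\alpha\,\omega\, e^{-\omega/\omega_c}
\quad \text{(ohmic)} .
```

Here $\epsilon$ is the bias, $\Delta$ the tunneling amplitude, $\alpha$ the dimensionless system-bath coupling, and $\omega_c$ the bath cutoff frequency; the ohmic regime is precisely the linear-in-$\omega$ form of the spectral density $J(\omega)$.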
FREDE: Linear-Space Anytime Graph Embeddings
Low-dimensional representations, or embeddings, of a graph's nodes facilitate
data mining tasks. Known embedding methods explicitly or implicitly rely on a
similarity measure among nodes. As the similarity matrix is quadratic, a
tradeoff between space complexity and embedding quality arises; past research
initially opted for heuristics and linear-transform factorizations, which allow
for linear space but compromise on quality; recent research has proposed a
quadratic-space solution as a viable option too.
In this paper we observe that embedding methods effectively aim to preserve
the covariance among the rows of a similarity matrix, and raise the question:
is there a method that combines (i) linear space complexity, (ii) a nonlinear
transform as its basis, and (iii) nontrivial quality guarantees? We answer this
question in the affirmative, with FREDE (FREquent Directions Embedding), a
sketching-based method that iteratively improves on quality while processing
rows of the similarity matrix individually; thereby, it provides, at any
iteration, column-covariance approximation guarantees that are, in due course,
almost indistinguishable from those of the optimal row-covariance approximation
by SVD. Our experimental evaluation on variably sized networks shows that FREDE
performs as well as SVD and competitively against current state-of-the-art
methods in diverse data mining tasks, even when it derives an embedding based
on only 10% of node similarities.
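The Frequent Directions sketch at the heart of such methods fits in a few lines (a generic sketch of the standard doubled-buffer variant, not FREDE itself): rows stream in one at a time, and whenever the buffer fills, a shrink step discards the weakest directions while provably preserving column covariance in linear space.

```python
import numpy as np

def frequent_directions(rows, ell):
    """Sketch a stream of d-dimensional rows into a (2*ell) x d matrix B
    such that B.T @ B approximates A.T @ A, using O(ell * d) space."""
    d = len(rows[0])
    B = np.zeros((2 * ell, d))
    nxt = 0
    for row in rows:
        if nxt == 2 * ell:                       # buffer full: shrink
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            k = min(ell, len(s))
            delta = s[k - 1] ** 2                # mass of the ell-th direction
            s2 = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = np.zeros((2 * ell, d))
            B[:len(s2)] = s2[:, None] * Vt       # weakest rows become zero
            nxt = ell
        B[nxt] = np.asarray(row, dtype=float)
        nxt += 1
    return B
```

The classic guarantee is that the covariance error satisfies ||A.T A - B.T B||_2 <= ||A||_F^2 / ell, which is the kind of column-covariance approximation bound the abstract refers to.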
Personalized Purchase Prediction of Market Baskets with Wasserstein-Based Sequence Matching
Personalization in marketing aims at improving the shopping experience of
customers by tailoring services to individuals. In order to achieve this,
businesses must be able to make personalized predictions regarding the next
purchase. That is, one must forecast the exact list of items that will comprise
the next purchase, i.e., the so-called market basket. Despite its relevance to
firm operations, this problem has received surprisingly little attention in
prior research, largely due to its inherent complexity. In fact,
state-of-the-art approaches are limited to intuitive decision rules for pattern
extraction. However, the simplicity of the pre-coded rules impedes performance,
since decision rules operate in an autoregressive fashion: the rules can only
make inferences from past purchases of a single customer without taking into
account the knowledge transfer that takes place between customers. In contrast,
our research overcomes the limitations of pre-set rules by contributing a novel
predictor of market baskets from sequential purchase histories: our predictions
are based on similarity matching in order to identify similar purchase habits
among the complete shopping histories of all customers. Our contributions are
as follows: (1) We propose similarity matching based on subsequential dynamic
time warping (SDTW) as a novel predictor of market baskets. Thereby, we can
effectively identify cross-customer patterns. (2) We leverage the Wasserstein
distance for measuring the similarity among embedded purchase histories. (3) We
develop a fast approximation algorithm for computing a lower bound of the
Wasserstein distance in our setting. An extensive series of computational
experiments demonstrates the effectiveness of our approach. The accuracy of
identifying the exact market baskets based on state-of-the-art decision rules
from the literature is outperformed by a factor of 4.0.
Comment: Accepted for oral presentation at the 25th ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD 2019).
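The similarity-matching backbone of such an approach can be illustrated with plain dynamic time warping (a hypothetical minimal sketch: the paper uses subsequential DTW over embedded baskets with a Wasserstein ground distance, whereas here an absolute difference between scalars stands in for that ground metric):

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping distance between two sequences:
    the minimum cumulative ground-distance cost over all monotone
    alignments, allowing elements to be stretched (repeated)."""
    inf = float("inf")
    n, m = len(a), len(b)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(a[i - 1], b[j - 1])
            # extend the cheapest of: advance a, advance b, advance both
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

Because DTW tolerates stretching, two purchase histories with the same habits at different paces still match closely, which is what enables cross-customer pattern transfer.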
A Polynomial-Time Algorithm for 1/3-Approximate Nash Equilibria in Bimatrix Games
Since the celebrated PPAD-completeness result for Nash equilibria in bimatrix games, a long line of research has focused on polynomial-time algorithms that compute ε-approximate Nash equilibria. Finding the best approximation guarantee achievable in polynomial time has been a fundamental and non-trivial pursuit in settling the complexity of approximate equilibria. Despite a significant amount of effort, the algorithm of Tsaknakis and Spirakis [Tsaknakis and Spirakis, 2008], with an approximation guarantee of (0.3393 + δ), has remained the state of the art over the last 15 years. In this paper, we propose a new refinement of the Tsaknakis-Spirakis algorithm, resulting in a polynomial-time algorithm that computes a (1/3 + δ)-Nash equilibrium for any constant δ > 0. The main idea of our approach is to go beyond the use of convex combinations of primal and dual strategies, as defined in the optimization framework of [Tsaknakis and Spirakis, 2008], and enrich the pool of strategies from which we build the strategy profiles that we output in certain bottleneck cases of the algorithm.
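For concreteness, the ε attained by a mixed profile (x, y) in a bimatrix game (R, C) with payoffs in [0, 1] is the larger of the two players' regrets against their best pure responses; a small checker (a hypothetical helper for evaluating candidate profiles, not the Tsaknakis-Spirakis machinery itself):

```python
def approximation_guarantee(R, C, x, y):
    """Return eps such that (x, y) is an eps-approximate Nash equilibrium
    of the bimatrix game (R, C): the larger of the two players' regrets,
    i.e. best pure-response payoff minus realized payoff."""
    n, m = len(R), len(C[0])
    Ry = [sum(R[i][j] * y[j] for j in range(m)) for i in range(n)]   # row payoffs
    xC = [sum(x[i] * C[i][j] for i in range(n)) for j in range(m)]   # col payoffs
    pay_row = sum(x[i] * Ry[i] for i in range(n))
    pay_col = sum(xC[j] * y[j] for j in range(m))
    return max(max(Ry) - pay_row, max(xC) - pay_col)
```

An exact Nash equilibrium gives eps = 0; the refined algorithm guarantees that the returned profile always satisfies eps <= 1/3 + δ.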