435 research outputs found
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
The question of the optimality of Thompson Sampling for solving the
stochastic multi-armed bandit problem had been open since 1933. In this paper
we answer it positively for the case of Bernoulli rewards by providing the
first finite-time analysis that matches the asymptotic rate given in the Lai
and Robbins lower bound for the cumulative regret. The proof is accompanied by
a numerical comparison with other optimal policies, experiments that have been
lacking in the literature until now for the Bernoulli case.
Comment: 15 pages, 2 figures, submitted to ALT (Algorithmic Learning Theory)
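As a rough illustration of the policy this abstract analyzes, the following is a minimal sketch of Thompson Sampling for Bernoulli rewards. It is not the authors' code: the uniform Beta(1, 1) prior, the function name `thompson_sampling`, and the example arm means are assumptions chosen for illustration. The function tracks the cumulative pseudo-regret, the quantity bounded by the Lai and Robbins result.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Run Thompson Sampling on a simulated Bernoulli bandit.

    Assumption: each arm starts with a uniform Beta(1, 1) prior.
    Returns (cumulative pseudo-regret, number of pulls per arm).
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # observed rewards equal to 1, per arm
    failures = [0] * k   # observed rewards equal to 0, per arm
    pulls = [0] * k
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(horizon):
        # Draw one sample from each arm's Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(k)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=lambda a: samples[a])

        # Simulate a Bernoulli reward and update that arm's posterior.
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

        pulls[arm] += 1
        regret += best_mean - true_means[arm]

    return regret, pulls


if __name__ == "__main__":
    # Hypothetical two-armed instance, for illustration only.
    regret, pulls = thompson_sampling([0.2, 0.8], horizon=2000)
    print("pseudo-regret:", regret, "pulls per arm:", pulls)
```

On an instance like this one, with a large gap between the arms, most pulls should concentrate on the better arm as the posteriors sharpen.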
Study of Interaction Modes in Pyrene-Based Fluorescent Organogels
Precautionary Measures for Credit Risk Management in Jump Models
Sustaining efficiency and stability by properly controlling the equity to
asset ratio is one of the most important and difficult challenges in bank
management. Because asset values can decline unexpectedly and abruptly, a bank must
closely monitor its net worth as well as market conditions, and one of its
important concerns is when to raise more capital so as not to violate capital
adequacy requirements. In this paper, we model the tradeoff between avoiding
costs of delay and premature capital raising, and solve the corresponding
optimal stopping problem. In order to model defaults in a bank's loan/credit
business portfolios, we represent its net worth by a Lévy process, and solve the
problem explicitly for the double exponential jump diffusion process and for a
general spectrally negative Lévy process.
Comment: 31 pages, 4 figures
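To make the modeling setup concrete, here is a minimal simulation sketch of a double exponential jump diffusion used as a net-worth process. It does not implement the paper's optimal stopping solution. All parameter names and values, the one-jump-per-step approximation, and the naive threshold check in `first_passage_index` are assumptions made for illustration.

```python
import math
import random


def simulate_jump_diffusion(x0, mu, sigma, lam, p_up, eta_up, eta_down,
                            horizon, steps, seed=0):
    """Simulate a double exponential jump diffusion on a regular time grid.

    Between jumps the process follows a Brownian motion with drift mu and
    volatility sigma. Jumps arrive at rate lam; a jump is upward with
    probability p_up (exponential size, rate eta_up) and downward otherwise
    (exponential size, rate eta_down).

    Assumption: at most one jump per time step, which is a reasonable
    approximation only when lam * dt is small.
    """
    rng = random.Random(seed)
    dt = horizon / steps
    x = x0
    path = [x0]
    for _ in range(steps):
        # Diffusion part: drift plus Gaussian increment.
        x += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Jump part: Bernoulli approximation of the Poisson arrivals.
        if rng.random() < lam * dt:
            if rng.random() < p_up:
                x += rng.expovariate(eta_up)
            else:
                x -= rng.expovariate(eta_down)
        path.append(x)
    return path


def first_passage_index(path, threshold):
    """Return the first grid index where the path is at or below threshold.

    This is a naive monitoring rule, not the optimal stopping rule derived
    in the paper. Returns None if the threshold is never reached.
    """
    for i, value in enumerate(path):
        if value <= threshold:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    path = simulate_jump_diffusion(
        x0=1.0, mu=0.05, sigma=0.2, lam=1.0, p_up=0.3,
        eta_up=10.0, eta_down=5.0, horizon=5.0, steps=5000,
    )
    print("first index at or below 0.5:", first_passage_index(path, 0.5))
```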
Asymptotic Normality of a Class of Adaptive Statistics with Applications to Synthetic Data Methods for Censored Regression
Motivated by regression analysis of censored survival data, we develop a general asymptotic distribution theory for estimators defined by estimating equations of the form $\sum_{i=1}^{n} \xi(w_i, \theta, \hat{G}_n) = 0$, in which $w_i$ represents observed data, $\theta$ is an unknown parameter to be estimated, and $\hat{G}_n$ represents an estimate of some unknown underlying distribution. This general theory is used to establish asymptotic normality of synthetic least squares estimates in censored regression models and to evaluate the covariance matrices of the limiting normal distributions.
A Neural Networks Committee for the Contextual Bandit Problem
This paper presents a new contextual bandit algorithm, NeuralBandit, which
does not require any assumption of stationarity of contexts and rewards.
Several neural networks are trained to model the value of rewards given the
context. Two variants, based on a multi-expert approach, are proposed to choose
the parameters of the multi-layer perceptrons online. The proposed algorithms
are successfully tested on a large dataset with and without stationarity of
rewards.
Comment: 21st International Conference on Neural Information Processing
Revisiting urea-based gelators: strong solvent- and casting-microstructure dependencies and organogel processing using an alumina template
Urea-based gelators have been thoroughly characterized through various techniques, and their structure depends strongly on the solvent in both the gel and the xerogel states. In a ground-breaking manner, gels were introduced into alumina membranes, which act as templates, in order to shape these materials and force the alignment of the corresponding self-assembled nanofibers by confinement.
Internal Probing of the Supramolecular Organization of Pyrene-Based Organogelators
The unexpected spectroscopic behavior of two new luminescent pyrene-urea-based organogelators is studied thoroughly and rationalized as a function of their aggregation state, providing a key method to probe the supramolecular organization of the material.
Study of Photoactive Organic Gels and their Structure
Conference date: 06/2012