
    Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design

    Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multi-armed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low RKHS norm. We resolve the important open problem of deriving regret bounds for this setting, which imply novel convergence rates for GP optimization. We analyze GP-UCB, an intuitive upper-confidence-based algorithm, and bound its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design. Moreover, by bounding the latter in terms of operator spectra, we obtain explicit sublinear regret bounds for many commonly used covariance functions. In some important cases, our bounds have surprisingly weak dependence on the dimensionality. In our experiments on real sensor data, GP-UCB compares favorably with other heuristic GP optimization approaches.
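
    The selection rule at the heart of GP-UCB is simple enough to sketch. The following is a minimal illustration, not the paper's implementation: it assumes a one-dimensional search grid, a squared-exponential kernel, and a fixed confidence parameter beta in place of the paper's schedule for beta_t.

```python
# Minimal GP-UCB sketch: fit a GP posterior, then pick the grid point
# maximizing mean + sqrt(beta) * stddev. Assumptions (not from the paper):
# fixed beta, RBF kernel with hand-set lengthscale, 1-D grid.
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    # Squared-exponential covariance between two sets of 1-D points.
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(X, y, Xs, noise=1e-2):
    # Standard GP regression posterior mean and variance at test points Xs.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    alpha = np.linalg.solve(K, y)
    mu = Ks.T @ alpha
    v = np.linalg.solve(K, Ks)
    var = np.diag(rbf_kernel(Xs, Xs)) - np.sum(Ks * v, axis=0)
    return mu, np.maximum(var, 0.0)

def gp_ucb(f, grid, T=30, beta=4.0, noise=1e-2):
    # Seed with one arbitrary point, then maximize the upper confidence bound.
    rng = np.random.default_rng(0)
    X = [grid[len(grid) // 2]]
    y = [f(X[0]) + noise * rng.standard_normal()]
    for _ in range(T - 1):
        mu, var = gp_posterior(np.array(X), np.array(y), grid, noise)
        x_next = grid[np.argmax(mu + np.sqrt(beta * var))]
        X.append(x_next)
        y.append(f(x_next) + noise * rng.standard_normal())
    return np.array(X), np.array(y)

# Example: maximize a smooth unknown function on [0, 1].
grid = np.linspace(0.0, 1.0, 200)
X, y = gp_ucb(lambda x: -(x - 0.3) ** 2 + 0.1 * np.sin(8 * x), grid)
print("best point found:", X[np.argmax(y)])
```

    At each round the rule favors points that are either predicted to be good (high mean) or poorly explored (high variance), which is exactly the exploration-exploitation trade-off the regret analysis quantifies.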

    Residual Weighted Learning for Estimating Individualized Treatment Rules

    Personalized medicine has received increasing attention among statisticians, computer scientists, and clinical practitioners. A major component of personalized medicine is the estimation of individualized treatment rules (ITRs). Recently, Zhao et al. (2012) proposed outcome weighted learning (OWL) to construct ITRs that directly optimize the clinical outcome. Although OWL opens the door to introducing machine learning techniques to optimal treatment regimes, its finite-sample performance can suffer. In this article, we propose a general framework, called Residual Weighted Learning (RWL), to improve finite-sample performance. Unlike OWL, which weights misclassification errors by clinical outcomes, RWL weights these errors by residuals of the outcome from a regression fit on clinical covariates excluding treatment assignment. We utilize the smoothed ramp loss function in RWL, and provide a difference-of-convex (d.c.) algorithm to solve the corresponding non-convex optimization problem. By estimating residuals with linear models or generalized linear models, RWL can effectively deal with different types of outcomes, such as continuous, binary, and count outcomes. We also propose variable selection methods for linear and nonlinear rules to further improve performance. We show that the resulting estimator of the treatment rule is consistent. We further obtain a rate of convergence for the difference between the expected outcome under the estimated ITR and that under the optimal treatment rule. The performance of the proposed RWL methods is illustrated in simulation studies and in an analysis of cystic fibrosis clinical trial data.
    Comment: 48 pages, 3 figures
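
    A minimal sketch of the residual-weighting step may help. It assumes a randomized trial with a known propensity of 0.5, uses a plain linear regression for the residual fit, and substitutes a hinge-loss classifier with the usual sign-flip trick for negative weights in place of the paper's smoothed ramp loss and d.c. algorithm.

```python
# Residual-weighting sketch: weight misclassification by residuals from a
# covariates-only regression, not by raw outcomes. Assumptions (not from the
# paper): propensity 0.5, hinge loss via LinearSVC instead of smoothed ramp loss.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import LinearSVC

def rwl_sketch(X, A, R, propensity=0.5):
    # Step 1: residuals of the outcome from a regression on covariates only
    # (treatment assignment A is deliberately excluded from this fit).
    resid = R - LinearRegression().fit(X, R).predict(X)
    # Step 2: weighted classification. A negative residual flips the target
    # label, and |residual| / propensity becomes the sample weight.
    target = np.where(resid >= 0, A, -A)
    w = np.abs(resid) / propensity
    clf = LinearSVC().fit(X, target, sample_weight=w)
    return clf  # clf.predict(x) in {-1, +1} is the estimated treatment rule

# Toy data: treatment (A = +1) helps exactly when the first covariate is positive.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
A = rng.choice([-1, 1], size=500)
R = 1.0 + A * np.sign(X[:, 0]) + 0.5 * rng.standard_normal(500)
rule = rwl_sketch(X, A, R)
```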

    Censored Quantile Regression Redux

    Quantile regression for censored survival (duration) data offers a more flexible alternative to the Cox proportional hazards model for some applications. We describe three estimation methods for such applications that have recently been incorporated into the R package quantreg: the Powell (1986) estimator for fixed censoring, and two methods for random censoring, one introduced by Portnoy (2003) and the other by Peng and Huang (2008). The Portnoy and Peng-Huang estimators can be viewed, respectively, as generalizations to regression of the Kaplan-Meier and Nelson-Aalen estimators of univariate quantiles for censored observations. Some asymptotic and simulation comparisons are made to highlight advantages and disadvantages of the three methods.
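
    To make the fixed-censoring case concrete, here is a rough numeric sketch of the Powell (1986) objective: minimize the check (pinball) loss of y - max(c, x'beta) under left-censoring at c. This is only illustrative; the objective is non-convex, so the result depends on the starting point, and quantreg uses specialized algorithms rather than the generic optimizer below.

```python
# Powell-objective sketch for left-censored quantile regression.
# Assumption: generic Nelder-Mead minimization, purely for illustration.
import numpy as np
from scipy.optimize import minimize

def check_loss(u, tau):
    # rho_tau(u) = u * (tau - 1{u < 0})
    return u * (tau - (u < 0))

def powell_objective(beta, X, y, tau, c=0.0):
    # Censored conditional quantile model: max(c, x'beta).
    fitted = np.maximum(c, X @ beta)
    return check_loss(y - fitted, tau).sum()

# Toy left-censored data: latent y* = 1 + 2x + noise, observed y = max(0, y*).
rng = np.random.default_rng(2)
x = rng.standard_normal(400)
X = np.column_stack([np.ones_like(x), x])
y = np.maximum(0.0, 1.0 + 2.0 * x + rng.standard_normal(400))

tau = 0.5
res = minimize(powell_objective, x0=np.zeros(2), args=(X, y, tau),
               method="Nelder-Mead")
print("estimated median-regression coefficients:", res.x)
```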

    Distributed stochastic optimization via matrix exponential learning

    In this paper, we investigate a distributed learning scheme for a broad class of stochastic optimization problems and games that arise in signal processing and wireless communications. The proposed algorithm relies on the method of matrix exponential learning (MXL) and only requires locally computable gradient observations that are possibly imperfect and/or obsolete. To analyze it, we introduce the notion of a stable Nash equilibrium and show that the algorithm is globally convergent to such equilibria (or locally convergent when an equilibrium is only locally stable). We also derive an explicit linear bound for the algorithm's convergence speed, which remains valid under measurement errors and uncertainty of arbitrarily high variance. To validate our theoretical analysis, we test the algorithm in realistic multi-carrier/multiple-antenna wireless scenarios where several users seek to maximize their energy efficiency. Our results show that learning allows users to attain a net increase in energy efficiency of between 100% and 500%, even under very high uncertainty.
    Comment: 31 pages, 3 figures
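
    A minimal sketch of the core MXL update follows, under simplifying assumptions: the feasible set is taken to be positive-semidefinite matrices with unit trace, and the gradient is exact rather than the noisy or delayed feedback the paper allows.

```python
# MXL sketch: accumulate gradient scores in Y, then map back to the feasible
# set via the trace-normalized matrix exponential. Assumptions (not from the
# paper): unit-trace feasible set, exact gradients, constant step size.
import numpy as np
from scipy.linalg import expm

def mxl(grad, n, T=100, step=0.5):
    Y = np.zeros((n, n))
    X = np.eye(n) / n  # start at the "uniform" density matrix
    for _ in range(T):
        Y += step * grad(X)
        # Shift by the top eigenvalue before exponentiating for numerical
        # stability; the normalization below cancels the shift.
        Ys = Y - np.max(np.linalg.eigvalsh(Y)) * np.eye(n)
        E = expm(Ys)
        X = E / np.trace(E)
    return X

# Toy problem: maximize tr(QX) over unit-trace PSD matrices; the optimum is
# the projector onto Q's leading eigenvector.
rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
Q = (M + M.T) / 2
X = mxl(lambda _: Q, 4)
print("achieved value:", np.trace(Q @ X), "optimum:", np.linalg.eigvalsh(Q)[-1])
```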

    Variance Optimal Hedging for discrete time processes with independent increments. Application to Electricity Markets

    We consider the discretized version of a (continuous-time) two-factor model introduced by Benth and coauthors for the electricity markets. In this model, the underlying is the exponential of a sum of independent random variables. We provide and test an algorithm, based on the celebrated Foellmer-Schweizer decomposition, for solving the mean-variance hedging problem; in particular, we establish that decomposition explicitly for a large class of vanilla contingent claims. Particular attention is devoted to the choice of rebalancing dates and their impact on the hedging error, with respect to the regularity of the payoff and the non-stationarity of the log-price process.
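
    As a rough illustration of the mean-variance hedging recursion (not the paper's explicit decomposition), the sketch below assumes i.i.d. Gaussian log-increments as a one-factor stand-in for the two-factor model, and approximates the conditional regression coefficient Cov(V, dS)/Var(dS) by simulation on a spot grid.

```python
# Backward-recursion sketch of discrete-time variance-optimal hedging.
# Assumptions (not from the paper): one-factor i.i.d. Gaussian log-increments,
# grid-plus-Monte-Carlo conditional moments, flat interpolation beyond the grid.
import numpy as np

def hedge_recursion(payoff, S0=1.0, sigma=0.3, steps=4, n_z=20000, n_grid=60):
    rng = np.random.default_rng(4)
    Z = np.exp(sigma * rng.standard_normal(n_z))  # one-step price ratio exp(X)
    prev_grid, V = None, None
    for t in range(steps - 1, -1, -1):
        # Spot grid wide enough to cover the distribution of S_t.
        s_grid = S0 * np.exp(np.linspace(-3, 3, n_grid) * sigma * np.sqrt(t + 1))
        xi = np.empty(n_grid)
        v = np.empty(n_grid)
        for i, s in enumerate(s_grid):
            # Next-period value: terminal payoff at the last date, else
            # interpolate the previously computed value function.
            nxt = payoff(s * Z) if V is None else np.interp(s * Z, prev_grid, V)
            dS = s * (Z - 1.0)
            C = np.cov(nxt, dS)
            xi[i] = C[0, 1] / C[1, 1]              # regression hedge ratio
            v[i] = nxt.mean() - xi[i] * dS.mean()  # local risk-minimizing value
        prev_grid, V = s_grid, v
        last_xi = xi
    return s_grid, last_xi, V  # time-0 spot grid, hedge ratio, value function

# Example: a call struck at 1, rebalanced at four dates.
grid, xi0, V0 = hedge_recursion(lambda s: np.maximum(s - 1.0, 0.0))
print("hedge ratio at S0=1:", np.interp(1.0, grid, xi0))
```

    The recursion makes the role of the rebalancing dates visible: with fewer dates, each regression must absorb a larger price increment, which is exactly the hedging-error trade-off the abstract discusses.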