2,319 research outputs found
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
Many applications require optimizing an unknown, noisy function that is
expensive to evaluate. We formalize this task as a multi-armed bandit problem,
where the payoff function is either sampled from a Gaussian process (GP) or has
low RKHS norm. We resolve the important open problem of deriving regret bounds
for this setting, which imply novel convergence rates for GP optimization. We
analyze GP-UCB, an intuitive upper-confidence based algorithm, and bound its
cumulative regret in terms of maximal information gain, establishing a novel
connection between GP optimization and experimental design. Moreover, by
bounding the latter in terms of operator spectra, we obtain explicit sublinear
regret bounds for many commonly used covariance functions. In some important
cases, our bounds have surprisingly weak dependence on the dimensionality. In
our experiments on real sensor data, GP-UCB compares favorably with other
heuristical GP optimization approaches
Residual Weighted Learning for Estimating Individualized Treatment Rules
Personalized medicine has received increasing attention among statisticians,
computer scientists, and clinical practitioners. A major component of
personalized medicine is the estimation of individualized treatment rules
(ITRs). Recently, Zhao et al. (2012) proposed outcome weighted learning (OWL)
to construct ITRs that directly optimize the clinical outcome. Although OWL
opens the door to introducing machine learning techniques to optimal treatment
regimes, it still has some problems in performance. In this article, we propose
a general framework, called Residual Weighted Learning (RWL), to improve finite
sample performance. Unlike OWL which weights misclassification errors by
clinical outcomes, RWL weights these errors by residuals of the outcome from a
regression fit on clinical covariates excluding treatment assignment. We
utilize the smoothed ramp loss function in RWL, and provide a difference of
convex (d.c.) algorithm to solve the corresponding non-convex optimization
problem. By estimating residuals with linear models or generalized linear
models, RWL can effectively deal with different types of outcomes, such as
continuous, binary and count outcomes. We also propose variable selection
methods for linear and nonlinear rules, respectively, to further improve the
performance. We show that the resulting estimator of the treatment rule is
consistent. We further obtain a rate of convergence for the difference between
the expected outcome using the estimated ITR and that of the optimal treatment
rule. The performance of the proposed RWL methods is illustrated in simulation
studies and in an analysis of cystic fibrosis clinical trial data.Comment: 48 pages, 3 figure
Censored Quantile Regression Redux
Quantile regression for censored survival (duration) data offers a more flexible alternative to the Cox proportional hazard model for some applications. We describe three estimation methods for such applications that have been recently incorporated into the R package quantreg: the Powell (1986) estimator for fixed censoring, and two methods for random censoring, one introduced by Portnoy (2003), and the other by Peng and Huang (2008). The Portnoy and Peng-Huang estimators can be viewed, respectively, as generalizations to regression of the Kaplan-Meier and Nelson-Aalen estimators of univariate quantiles for censored observations. Some asymptotic and simulation comparisons are made to highlight advantages and disadvantages of the three methods.
Distributed stochastic optimization via matrix exponential learning
In this paper, we investigate a distributed learning scheme for a broad class
of stochastic optimization problems and games that arise in signal processing
and wireless communications. The proposed algorithm relies on the method of
matrix exponential learning (MXL) and only requires locally computable gradient
observations that are possibly imperfect and/or obsolete. To analyze it, we
introduce the notion of a stable Nash equilibrium and we show that the
algorithm is globally convergent to such equilibria - or locally convergent
when an equilibrium is only locally stable. We also derive an explicit linear
bound for the algorithm's convergence speed, which remains valid under
measurement errors and uncertainty of arbitrarily high variance. To validate
our theoretical analysis, we test the algorithm in realistic
multi-carrier/multiple-antenna wireless scenarios where several users seek to
maximize their energy efficiency. Our results show that learning allows users
to attain a net increase between 100% and 500% in energy efficiency, even under
very high uncertainty.Comment: 31 pages, 3 figure
Variance Optimal Hedging for discrete time processes with independent increments. Application to Electricity Markets
We consider the discretized version of a (continuous-time) two-factor model
introduced by Benth and coauthors for the electricity markets. For this model,
the underlying is the exponent of a sum of independent random variables. We
provide and test an algorithm, which is based on the celebrated
Foellmer-Schweizer decomposition for solving the mean-variance hedging problem.
In particular, we establish that decomposition explicitely, for a large class
of vanilla contingent claims. Interest is devoted in the choice of rebalancing
dates and its impact on the hedging error, regarding the payoff regularity and
the non stationarity of the log-price process
- …