
    Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design

    Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multi-armed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low RKHS norm. We resolve the important open problem of deriving regret bounds for this setting, which imply novel convergence rates for GP optimization. We analyze GP-UCB, an intuitive upper-confidence-based algorithm, and bound its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design. Moreover, by bounding the latter in terms of operator spectra, we obtain explicit sublinear regret bounds for many commonly used covariance functions. In some important cases, our bounds have surprisingly weak dependence on the dimensionality. In our experiments on real sensor data, GP-UCB compares favorably with other heuristic GP optimization approaches.
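
    The selection rule at the heart of GP-UCB is simple enough to sketch. The following is a minimal illustration, not the paper's implementation: it assumes a one-dimensional search grid, a squared-exponential kernel, and a fixed confidence parameter beta in place of the paper's schedule for beta_t.

```python
# Minimal GP-UCB sketch: fit a GP posterior, then pick the grid point
# maximizing mean + sqrt(beta) * stddev. Assumptions (not from the paper):
# fixed beta, RBF kernel with hand-set lengthscale, 1-D grid.
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    # Squared-exponential covariance between two sets of 1-D points.
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(X, y, Xs, noise=1e-2):
    # Standard GP regression posterior mean and variance at test points Xs.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    alpha = np.linalg.solve(K, y)
    mu = Ks.T @ alpha
    v = np.linalg.solve(K, Ks)
    var = np.diag(rbf_kernel(Xs, Xs)) - np.sum(Ks * v, axis=0)
    return mu, np.maximum(var, 0.0)

def gp_ucb(f, grid, T=30, beta=4.0, noise=1e-2):
    # Seed with one arbitrary point, then maximize the upper confidence bound.
    rng = np.random.default_rng(0)
    X = [grid[len(grid) // 2]]
    y = [f(X[0]) + noise * rng.standard_normal()]
    for _ in range(T - 1):
        mu, var = gp_posterior(np.array(X), np.array(y), grid, noise)
        x_next = grid[np.argmax(mu + np.sqrt(beta * var))]
        X.append(x_next)
        y.append(f(x_next) + noise * rng.standard_normal())
    return np.array(X), np.array(y)

# Example: maximize a smooth unknown function on [0, 1].
grid = np.linspace(0.0, 1.0, 200)
X, y = gp_ucb(lambda x: -(x - 0.3) ** 2 + 0.1 * np.sin(8 * x), grid)
print("best point found:", X[np.argmax(y)])
```

    At each round the rule favors points that are either predicted to be good (high mean) or poorly explored (high variance), which is exactly the exploration-exploitation trade-off the regret analysis quantifies.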

    Residual Weighted Learning for Estimating Individualized Treatment Rules

    Personalized medicine has received increasing attention among statisticians, computer scientists, and clinical practitioners. A major component of personalized medicine is the estimation of individualized treatment rules (ITRs). Recently, Zhao et al. (2012) proposed outcome weighted learning (OWL) to construct ITRs that directly optimize the clinical outcome. Although OWL opens the door to introducing machine learning techniques to optimal treatment regimes, its finite-sample performance can suffer. In this article, we propose a general framework, called Residual Weighted Learning (RWL), to improve finite-sample performance. Unlike OWL, which weights misclassification errors by clinical outcomes, RWL weights these errors by residuals of the outcome from a regression fit on clinical covariates excluding treatment assignment. We utilize the smoothed ramp loss function in RWL, and provide a difference-of-convex (d.c.) algorithm to solve the corresponding non-convex optimization problem. By estimating residuals with linear models or generalized linear models, RWL can effectively deal with different types of outcomes, such as continuous, binary, and count outcomes. We also propose variable selection methods for linear and nonlinear rules to further improve performance. We show that the resulting estimator of the treatment rule is consistent. We further obtain a rate of convergence for the difference between the expected outcome under the estimated ITR and that under the optimal treatment rule. The performance of the proposed RWL methods is illustrated in simulation studies and in an analysis of cystic fibrosis clinical trial data.
    Comment: 48 pages, 3 figures
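
    A minimal sketch of the residual-weighting step may help. It assumes a randomized trial with a known propensity of 0.5, uses a plain linear regression for the residual fit, and substitutes a hinge-loss classifier with the usual sign-flip trick for negative weights in place of the paper's smoothed ramp loss and d.c. algorithm.

```python
# Residual-weighting sketch: weight misclassification by residuals from a
# covariates-only regression, not by raw outcomes. Assumptions (not from the
# paper): propensity 0.5, hinge loss via LinearSVC instead of smoothed ramp loss.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import LinearSVC

def rwl_sketch(X, A, R, propensity=0.5):
    # Step 1: residuals of the outcome from a regression on covariates only
    # (treatment assignment A is deliberately excluded from this fit).
    resid = R - LinearRegression().fit(X, R).predict(X)
    # Step 2: weighted classification. A negative residual flips the target
    # label, and |residual| / propensity becomes the sample weight.
    target = np.where(resid >= 0, A, -A)
    w = np.abs(resid) / propensity
    clf = LinearSVC().fit(X, target, sample_weight=w)
    return clf  # clf.predict(x) in {-1, +1} is the estimated treatment rule

# Toy data: treatment (A = +1) helps exactly when the first covariate is positive.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
A = rng.choice([-1, 1], size=500)
R = 1.0 + A * np.sign(X[:, 0]) + 0.5 * rng.standard_normal(500)
rule = rwl_sketch(X, A, R)
```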

    Censored Quantile Regression Redux

    Quantile regression for censored survival (duration) data offers a more flexible alternative to the Cox proportional hazards model for some applications. We describe three estimation methods for such applications that have recently been incorporated into the R package quantreg: the Powell (1986) estimator for fixed censoring, and two methods for random censoring, one introduced by Portnoy (2003) and the other by Peng and Huang (2008). The Portnoy and Peng-Huang estimators can be viewed, respectively, as generalizations to regression of the Kaplan-Meier and Nelson-Aalen estimators of univariate quantiles for censored observations. Some asymptotic and simulation comparisons are made to highlight advantages and disadvantages of the three methods.
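
    To make the fixed-censoring case concrete, here is a rough numeric sketch of the Powell (1986) objective: minimize the check (pinball) loss of y - max(c, x'beta) under left-censoring at c. This is only illustrative; the objective is non-convex, so the result depends on the starting point, and quantreg uses specialized algorithms rather than the generic optimizer below.

```python
# Powell-objective sketch for left-censored quantile regression.
# Assumption: generic Nelder-Mead minimization, purely for illustration.
import numpy as np
from scipy.optimize import minimize

def check_loss(u, tau):
    # rho_tau(u) = u * (tau - 1{u < 0})
    return u * (tau - (u < 0))

def powell_objective(beta, X, y, tau, c=0.0):
    # Censored conditional quantile model: max(c, x'beta).
    fitted = np.maximum(c, X @ beta)
    return check_loss(y - fitted, tau).sum()

# Toy left-censored data: latent y* = 1 + 2x + noise, observed y = max(0, y*).
rng = np.random.default_rng(2)
x = rng.standard_normal(400)
X = np.column_stack([np.ones_like(x), x])
y = np.maximum(0.0, 1.0 + 2.0 * x + rng.standard_normal(400))

tau = 0.5
res = minimize(powell_objective, x0=np.zeros(2), args=(X, y, tau),
               method="Nelder-Mead")
print("estimated median-regression coefficients:", res.x)
```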

    Distributed stochastic optimization via matrix exponential learning

    In this paper, we investigate a distributed learning scheme for a broad class of stochastic optimization problems and games that arise in signal processing and wireless communications. The proposed algorithm relies on the method of matrix exponential learning (MXL) and only requires locally computable gradient observations that are possibly imperfect and/or obsolete. To analyze it, we introduce the notion of a stable Nash equilibrium and show that the algorithm is globally convergent to such equilibria (or locally convergent when an equilibrium is only locally stable). We also derive an explicit linear bound for the algorithm's convergence speed, which remains valid under measurement errors and uncertainty of arbitrarily high variance. To validate our theoretical analysis, we test the algorithm in realistic multi-carrier/multiple-antenna wireless scenarios where several users seek to maximize their energy efficiency. Our results show that learning allows users to attain a net increase in energy efficiency of between 100% and 500%, even under very high uncertainty.
    Comment: 31 pages, 3 figures
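
    A minimal sketch of the core MXL update follows, under simplifying assumptions: the feasible set is taken to be positive-semidefinite matrices with unit trace, and the gradient is exact rather than the noisy or delayed feedback the paper allows.

```python
# MXL sketch: accumulate gradient scores in Y, then map back to the feasible
# set via the trace-normalized matrix exponential. Assumptions (not from the
# paper): unit-trace feasible set, exact gradients, constant step size.
import numpy as np
from scipy.linalg import expm

def mxl(grad, n, T=100, step=0.5):
    Y = np.zeros((n, n))
    X = np.eye(n) / n  # start at the "uniform" density matrix
    for _ in range(T):
        Y += step * grad(X)
        # Shift by the top eigenvalue before exponentiating for numerical
        # stability; the normalization below cancels the shift.
        Ys = Y - np.max(np.linalg.eigvalsh(Y)) * np.eye(n)
        E = expm(Ys)
        X = E / np.trace(E)
    return X

# Toy problem: maximize tr(QX) over unit-trace PSD matrices; the optimum is
# the projector onto Q's leading eigenvector.
rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
Q = (M + M.T) / 2
X = mxl(lambda _: Q, 4)
print("achieved value:", np.trace(Q @ X), "optimum:", np.linalg.eigvalsh(Q)[-1])
```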

    Variance Optimal Hedging for discrete time processes with independent increments. Application to Electricity Markets

    We consider the discretized version of a (continuous-time) two-factor model introduced by Benth and coauthors for the electricity markets. In this model, the underlying is the exponential of a sum of independent random variables. We provide and test an algorithm, based on the celebrated Foellmer-Schweizer decomposition, for solving the mean-variance hedging problem; in particular, we establish that decomposition explicitly for a large class of vanilla contingent claims. Particular attention is devoted to the choice of rebalancing dates and their impact on the hedging error, with respect to the regularity of the payoff and the non-stationarity of the log-price process.
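
    As a rough illustration of the mean-variance hedging recursion (not the paper's explicit decomposition), the sketch below assumes i.i.d. Gaussian log-increments as a one-factor stand-in for the two-factor model, and approximates the conditional regression coefficient Cov(V, dS)/Var(dS) by simulation on a spot grid.

```python
# Backward-recursion sketch of discrete-time variance-optimal hedging.
# Assumptions (not from the paper): one-factor i.i.d. Gaussian log-increments,
# grid-plus-Monte-Carlo conditional moments, flat interpolation beyond the grid.
import numpy as np

def hedge_recursion(payoff, S0=1.0, sigma=0.3, steps=4, n_z=20000, n_grid=60):
    rng = np.random.default_rng(4)
    Z = np.exp(sigma * rng.standard_normal(n_z))  # one-step price ratio exp(X)
    prev_grid, V = None, None
    for t in range(steps - 1, -1, -1):
        # Spot grid wide enough to cover the distribution of S_t.
        s_grid = S0 * np.exp(np.linspace(-3, 3, n_grid) * sigma * np.sqrt(t + 1))
        xi = np.empty(n_grid)
        v = np.empty(n_grid)
        for i, s in enumerate(s_grid):
            # Next-period value: terminal payoff at the last date, else
            # interpolate the previously computed value function.
            nxt = payoff(s * Z) if V is None else np.interp(s * Z, prev_grid, V)
            dS = s * (Z - 1.0)
            C = np.cov(nxt, dS)
            xi[i] = C[0, 1] / C[1, 1]              # regression hedge ratio
            v[i] = nxt.mean() - xi[i] * dS.mean()  # local risk-minimizing value
        prev_grid, V = s_grid, v
        last_xi = xi
    return s_grid, last_xi, V  # time-0 spot grid, hedge ratio, value function

# Example: a call struck at 1, rebalanced at four dates.
grid, xi0, V0 = hedge_recursion(lambda s: np.maximum(s - 1.0, 0.0))
print("hedge ratio at S0=1:", np.interp(1.0, grid, xi0))
```

    The recursion makes the role of the rebalancing dates visible: with fewer dates, each regression must absorb a larger price increment, which is exactly the hedging-error trade-off the abstract discusses.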