
    Online Bandit Learning for a Special Class of Non-Convex Losses

    In online bandit learning, the learner aims to minimize a sequence of losses, while only observing the value of each loss at a single point. Although various algorithms and theories have been developed for online bandit learning, most of them are limited to convex losses. In this paper, we investigate the problem of online bandit learning with non-convex losses, and develop an efficient algorithm with formal theoretical guarantees. To be specific, we consider a class of losses which is a composition of a non-increasing scalar function and a linear function. This setting models a wide range of supervised learning applications such as online classification with a non-convex loss. Theoretical analysis shows that our algorithm achieves an O(poly(d) T^{2/3}) regret bound when the variation of the loss function is small. To the best of our knowledge, this is the first work in online bandit learning that does not rely on convexity.
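The loss class described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. The sigmoid loss, the margin form y * <w, x>, and the names `sigmoid_loss` and `bandit_feedback` are assumptions chosen for illustration. The sketch only shows what "a non-increasing scalar function composed with a linear function" and "observing the value of each loss at a single point" could look like for online classification.

```python
import math

def sigmoid_loss(z):
    """Hypothetical scalar loss: 1 / (1 + exp(z)), non-increasing and non-convex in z."""
    return 1.0 / (1.0 + math.exp(z))

def bandit_feedback(w, x, y, scalar_loss=sigmoid_loss):
    """Return only the loss value at the played point w (no gradient, no label).

    The loss is scalar_loss composed with the linear map w -> y * <w, x>.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return scalar_loss(margin)

# One round: the learner plays w and sees a single number.
w = [0.0, 0.0, 0.0]
x = [1.0, 2.0, 3.0]
observed = bandit_feedback(w, x, +1)
```

Under these assumptions the learner never sees `x`, `y`, or the gradient, only `observed`, which is what separates the bandit setting from full-information online learning.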

    Convex Optimization and Online Learning: Their Applications in Discrete Choice Modeling and Pricing

    University of Minnesota Ph.D. dissertation. May 2018. Major: Industrial and Systems Engineering. Advisors: Shuzhong Zhang, Zizhuo Wang. 1 computer file (PDF); ix, 129 pages. The discrete choice model has been an important tool for modeling customers' demand when they face a set of substitutable choices. The random utility model, which is the most commonly used discrete choice framework, assumes that the utility of each alternative is random and follows a prescribed distribution. Due to the popularity of the random utility model, the probabilistic approach has been the major method to construct and analyze choice models. In recent years, several choice frameworks based on convex optimization have been studied. Among them, the most widely used frameworks are the representative agent model and the semi-parametric choice model. In this dissertation, we first study a special class of the semi-parametric choice model - the cross moment model (CMM) - and reformulate it as a representative agent model. We also propose an efficient algorithm to calculate the choice probabilities in the CMM. Then, motivated by the reformulation of the CMM, we propose a new choice framework - the welfare-based choice model - and establish the equivalence between this framework and the other two choice frameworks: the representative agent model and the semi-parametric choice model. Lastly, motivated by the multi-product pricing problem, which is an important application of discrete choice models, we develop an online learning framework where the learning problem shares some similarities with the multi-product pricing problem. We propose efficient online learning algorithms and establish convergence rate results for these algorithms. The main techniques underlying our studies are continuous optimization and convex analysis.