Revenue Maximization and Learning in Products Ranking
We consider the revenue maximization problem for an online retailer who plans
to display a set of products differing in their prices and qualities and rank
them in order. The consumers have random attention spans and view the products
sequentially before purchasing a "satisficing" product or leaving the
platform empty-handed when the attention span is exhausted. Our framework
extends the cascade model in two directions: the consumers have random
attention spans instead of fixed ones and the firm maximizes revenues instead
of clicking probabilities. We show a nested structure of the optimal product
ranking as a function of the attention span when the attention span is fixed
and accordingly design an approximation algorithm for the random attention
spans. When the conditional purchase probabilities are not known and may depend
on consumer and product features, we devise an online learning algorithm with
regret guarantees relative to the approximation algorithm, despite the
censoring of information: the attention span of a
customer who purchases an item is not observable. Numerical experiments
demonstrate the outstanding performance of the approximation and online
learning algorithms.
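The revenue objective in this model can be sketched concretely. The following is a hypothetical simplification (function and parameter names are ours, not the paper's): the consumer scans the ranking in order, buys the first satisficing product, and abandons once the attention span runs out.

```python
def expected_revenue(ranking, prices, probs, span_tail):
    """Expected revenue of a product ranking under a cascade-style model.

    Hypothetical sketch, not the paper's algorithm.
    probs[j]:       probability product j satisfices the consumer when viewed
    span_tail[i]:   P(attention span > i), i.e. the consumer reaches slot i
    The consumer views products in order and buys the first satisficing one.
    """
    no_buy_yet = 1.0   # probability no earlier product was purchased
    total = 0.0
    for slot, j in enumerate(ranking):
        total += span_tail[slot] * no_buy_yet * probs[j] * prices[j]
        no_buy_yet *= 1.0 - probs[j]
    return total
```

The approximation algorithm in the abstract optimizes over rankings; this sketch only evaluates a fixed one.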
Federated Multi-Level Optimization over Decentralized Networks
Multi-level optimization has gained increasing attention in recent years, as
it provides a powerful framework for solving complex optimization problems that
arise in many fields, such as meta-learning, multi-player games, reinforcement
learning, and nested composition optimization. In this paper, we study the
problem of distributed multi-level optimization over a network, where agents
can only communicate with their immediate neighbors. This setting is motivated
by the need for distributed optimization in large-scale systems, where
centralized optimization may not be practical or feasible. To address this
problem, we propose a novel gossip-based distributed multi-level optimization
algorithm that enables networked agents to solve optimization problems at
different levels in a single timescale and share information through network
propagation. Our algorithm achieves optimal sample complexity, scaling linearly
with the network size, and demonstrates state-of-the-art performance on various
applications, including hyper-parameter tuning, decentralized reinforcement
learning, and risk-averse optimization.
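As a rough, hypothetical illustration of the gossip mechanism (a one-level simplification; the paper's algorithm handles nested multi-level objectives in a single timescale), each agent mixes its iterate with its neighbors' via a doubly stochastic matrix and then takes a local gradient step:

```python
import numpy as np

def gossip_step(X, W, grads, lr):
    """One gossip-based update: mix iterates with neighbors, then step.

    Illustrative sketch, not the paper's multi-level algorithm.
    X:     (n_agents, dim) matrix of local iterates
    W:     doubly stochastic mixing matrix respecting the network topology
    grads: (n_agents, dim) local stochastic gradients
    """
    return W @ X - lr * grads
```

Repeated mixing drives the iterates toward consensus while the local gradient terms drive the optimization.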
Optimization and revenue management in complex networks
This thesis consists of three papers in optimization and revenue management over complex networks: Robust Linear Control in Transmission Systems, Online Learning and Optimization Under a New Linear-Threshold Model with Negative Influence, and Revenue Management with Complementary Products. This thesis contributes to analytical methods for optimization problems in complex networks, namely, power networks, social networks, and product networks.
In Chapter 2, we describe a robust multiperiod transmission planning model including renewables and batteries, where battery output is used to partly offset renewable output deviations from the forecast. A central element is a nonconvex battery operation model, combined with a robust model of forecast errors and a linear control scheme. Even though the problem is nonconvex, we provide an efficient and theoretically valid algorithm that effectively solves cases on large transmission systems.
In Chapter 3, we propose a new class of Linear-Threshold-Model-based information-diffusion models that incorporate the formation and spread of negative attitudes. We call such models negativity-aware. We show that in these models, the expected positive influence is a monotone submodular function of the seed set. Thus, we can use a greedy algorithm to construct a solution with a constant-factor approximation guarantee when the objective is to select a seed set of fixed size to maximize positive influence. Our models are flexible enough to account for both the features of local users and the features of the information being propagated in the diffusion. We analyze an online-learning setting for a multi-round influence-maximization problem, where an agent is actively learning the diffusion parameters over time while trying to maximize total cumulative positive influence. We develop a class of online learning algorithms and provide theoretical upper bounds on the regret.
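The greedy selection referred to above is the classical algorithm for monotone submodular maximization; a generic sketch (the coverage objective in the test is illustrative, not the thesis's influence function) looks like:

```python
def greedy_max(f, universe, k):
    """Greedy maximization of a monotone submodular set function f.

    Picks, k times, the element with the largest marginal gain. For
    monotone submodular f this attains a (1 - 1/e) approximation to the
    best size-k set (Nemhauser et al.).
    """
    S = set()
    for _ in range(k):
        gains = {x: f(S | {x}) - f(S) for x in universe if x not in S}
        best = max(gains, key=gains.get)   # ties broken by insertion order
        S.add(best)
    return S
```

In the influence setting, `f` would be the expected positive influence of a seed set, typically estimated by simulation.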
In Chapter 4, we propose a tractable information-diffusion-based framework to capture complementary relationships among products. Using this framework, we investigate how various revenue-management decisions can be optimized. In particular, we prove that several fundamental problems involving complementary products, such as promotional pricing, product recommendation, and category planning, can be formulated as submodular maximization problems, and can be solved by tractable greedy algorithms with guarantees on the quality of the solutions. We validate our model using a dataset that contains product reviews and metadata from Amazon from May 1996 to July 2014.
We also analyze an online-learning setting for revenue maximization with complementary products. In this setting, we assume that the retailer has access only to sales observations. That is, she can only observe whether a product is purchased from her. This assumption leads to diffusion models with novel node-level feedback, in contrast to classical models that have edge-level feedback. We conduct confidence region analysis on the maximum likelihood estimator for our models, develop online-learning algorithms, and analyze their performance from both theoretical and practical perspectives.
Bridging Adversarial and Nonstationary Multi-armed Bandit
In the multi-armed bandit framework, there are two formulations that are
commonly employed to handle time-varying reward distributions: adversarial
bandit and nonstationary bandit. Although their oracles, algorithms, and regret
analysis differ significantly, we provide a unified formulation in this paper
that smoothly bridges the two as special cases. The formulation uses an oracle
that selects the best fixed arm within each time window. Depending on the
window size, it recovers the oracle in hindsight of the adversarial bandit or
the dynamic oracle of the nonstationary bandit. We provide algorithms that
attain the optimal regret, matched by a corresponding lower bound.
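The windowed oracle can be made concrete with a small sketch (names are illustrative; rows of `rewards` are rounds, columns are arms):

```python
def windowed_oracle_reward(rewards, window):
    """Total reward of the best fixed arm within each window of the horizon.

    window = T (the full horizon) recovers the best-arm-in-hindsight oracle
    of the adversarial bandit; window = 1 recovers the dynamic oracle of the
    nonstationary bandit. Intermediate sizes interpolate between the two.
    """
    total = 0.0
    for start in range(0, len(rewards), window):
        block = rewards[start:start + window]
        # best fixed arm within this window only
        total += max(sum(arm_rewards) for arm_rewards in zip(*block))
    return total
```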
Multi-Level Stochastic Gradient Methods for Nested Composition Optimization
Stochastic gradient methods are scalable for solving large-scale optimization
problems that involve empirical expectations of loss functions. Existing
results mainly apply to optimization problems where the objectives are one- or
two-level expectations. In this paper, we consider the multi-level
compositional optimization problem that involves compositions of multi-level
component functions and nested expectations over a random path. It finds
applications in risk-averse optimization and sequential planning. We propose a
class of multi-level stochastic gradient methods that are motivated from the
method of multi-timescale stochastic approximation. First, we propose a basic
multi-level stochastic compositional gradient algorithm, establish its almost
sure convergence, and obtain a non-asymptotic iteration error bound. Then
we develop accelerated multi-level stochastic gradient methods by using an
extrapolation-interpolation scheme to take advantage of the smoothness of
individual component functions. When all component functions are smooth, we
show that the convergence rate improves for general objectives, with a
further improvement for strongly convex objectives. We also
provide almost sure convergence and rate of convergence results for nonconvex
problems. The proposed methods and theoretical results are validated using
numerical experiments.
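A hypothetical two-level special case of such a scheme (the paper treats general multi-level compositions; step sizes and names here are illustrative) tracks the inner expectation with an auxiliary variable updated on a faster timescale:

```python
import numpy as np

def scgd(x0, sample_g, sample_jac_g, grad_f, steps, alpha, beta):
    """Two-level stochastic compositional gradient sketch for min f(E[g(x)]).

    Illustrative two-timescale scheme: y tracks the inner expectation E[g(x)]
    with step size beta, while x takes chained-gradient steps with step
    size alpha using the tracked value y in place of the true inner value.
    """
    x = np.asarray(x0, dtype=float)
    y = sample_g(x)                               # initialize inner estimate
    for _ in range(steps):
        y = (1 - beta) * y + beta * sample_g(x)   # faster timescale: track g
        x = x - alpha * sample_jac_g(x).T @ grad_f(y)
    return x
```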
Optimality Conditions and Algorithms for Top-K Arm Identification
We consider the top-k arm identification problem for multi-armed bandits with
rewards belonging to a one-parameter canonical exponential family. The
objective is to select the set of k arms with the highest mean rewards by
sequential allocation of sampling efforts. We propose a unified optimal
allocation problem that identifies the complexity measures of this problem
under the fixed-confidence and fixed-budget settings, as well as the posterior
convergence rate from the Bayesian perspective. We provide the first
characterization of its optimality. We provide the first provably optimal
algorithm in the fixed-confidence setting for k>1. We also propose an efficient
heuristic algorithm for the top-k arm identification problem. Extensive
numerical experiments demonstrate superior performance compared to existing
methods in all three settings.
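For contrast with the optimal allocation, a naive uniform-allocation baseline for top-k identification (our illustration, not the paper's algorithm) can be sketched as:

```python
def naive_top_k(pull, n_arms, k, budget):
    """Uniform-allocation baseline for top-k arm identification.

    Splits the sampling budget equally across arms and returns the k arms
    with the highest empirical means. A naive benchmark only; it ignores
    the problem-dependent complexity measures that drive optimal allocation.
    pull(a) draws one reward sample from arm a.
    """
    per_arm = budget // n_arms
    means = [sum(pull(a) for _ in range(per_arm)) / per_arm
             for a in range(n_arms)]
    return sorted(range(n_arms), key=lambda a: means[a], reverse=True)[:k]
```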
Data-Driven Minimax Optimization with Expectation Constraints
Attention to data-driven optimization approaches, including the well-known
stochastic gradient descent method, has grown significantly over recent
decades, but data-driven constraints have rarely been studied, because of the
computational challenges of projections onto the feasible set defined by these
hard constraints. In this paper, we focus on the non-smooth convex-concave
stochastic minimax regime and formulate the data-driven constraints as
expectation constraints. The minimax expectation constrained problem subsumes a
broad class of real-world applications, including two-player zero-sum games and
data-driven robust optimization. We propose a class of efficient primal-dual
algorithms to tackle the minimax expectation-constrained problem, and show that
they converge at the optimal rate. We demonstrate the practical efficiency of
our algorithms by conducting numerical experiments on large-scale real-world
applications.
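A minimal primal-dual sketch for a single deterministic constraint (the paper's setting has stochastic expectation constraints and minimax structure; names and step size are illustrative) alternates a primal descent step with a projected dual ascent step:

```python
def primal_dual(x0, grad_f, g, grad_g, steps, eta):
    """Primal-dual (sub)gradient sketch for min f(x) s.t. g(x) <= 0.

    Descends the Lagrangian L(x, lam) = f(x) + lam * g(x) in x and ascends
    in lam, projecting the multiplier onto lam >= 0 after each step.
    Deterministic single-constraint simplification of the paper's setting.
    """
    x, lam = x0, 0.0
    for _ in range(steps):
        x = x - eta * (grad_f(x) + lam * grad_g(x))   # primal descent
        lam = max(0.0, lam + eta * g(x))              # projected dual ascent
    return x, lam
```

For example, min x^2 subject to x >= 1 (i.e. g(x) = 1 - x) has optimum x = 1 with multiplier lam = 2, which the iteration approaches.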