A Data-Driven State Aggregation Approach for Dynamic Discrete Choice Models
We study dynamic discrete choice models, where a commonly studied problem
involves estimating parameters of agent reward functions (also known as
"structural" parameters), using agent behavioral data. Maximum likelihood
estimation for such models requires dynamic programming, which is limited by
the curse of dimensionality. In this work, we present a novel algorithm that
provides a data-driven method for selecting and aggregating states, which
lowers the computational and sample complexity of estimation. Our method works
in two stages. In the first stage, we use a flexible inverse reinforcement
learning approach to estimate agent Q-functions. We use these estimated
Q-functions, along with a clustering algorithm, to select a subset of states
that are the most pivotal for driving changes in Q-functions. In the second
stage, with these selected "aggregated" states, we conduct maximum likelihood
estimation using a commonly used nested fixed-point algorithm. The proposed
two-stage approach mitigates the curse of dimensionality by reducing the
problem dimension. Theoretically, we derive finite-sample bounds on the
associated estimation error, which also characterize the trade-off of
computational complexity, estimation error, and sample complexity. We
demonstrate the empirical performance of the algorithm in two classic dynamic
discrete choice estimation applications.
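To make the two-stage idea concrete, here is a minimal, hypothetical sketch (not the authors' implementation): assume first-stage Q-function estimates are already available from an inverse reinforcement learning step, cluster states with similar Q-value profiles via k-means, and relabel observed trajectories onto the aggregated states before running the nested fixed-point MLE on the smaller problem. All names and dimensions below are illustrative.

```python
# Illustrative sketch of the state-aggregation idea (not the authors' code).
# Assumes Q-hat estimates already exist from a first-stage IRL method; states
# with similar Q-value profiles are merged before the nested fixed-point MLE
# is run on the reduced state space.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_states, n_actions, n_aggregated = 1000, 2, 20

# Hypothetical first-stage output: estimated Q-values, one row per state.
Q_hat = rng.normal(size=(n_states, n_actions))

# Cluster states whose Q-value profiles are similar.
kmeans = KMeans(n_clusters=n_aggregated, n_init=10, random_state=0).fit(Q_hat)
state_to_cluster = kmeans.labels_  # maps each raw state to an aggregated state

# Downstream, the MLE would be run over the n_aggregated states, e.g. by
# relabeling the observed state trajectories:
observed_states = rng.integers(0, n_states, size=5000)
aggregated_trajectory = state_to_cluster[observed_states]
```

The clustering step is where the dimension reduction happens: only states whose estimated Q-values drive meaningfully different choice probabilities remain distinguishable in the second stage.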
An Efficient Bandit Algorithm for Realtime Multivariate Optimization
Optimization is commonly employed to determine the content of web pages, such
as to maximize conversions on landing pages or click-through rates on search
engine result pages. Often the layout of these pages can be decoupled into
several separate decisions. For example, the composition of a landing page may
involve deciding which image to show, which wording to use, what color
background to display, etc. Such optimization is a combinatorial problem over
an exponentially large decision space. Randomized experiments do not scale well
to this setting, and therefore, in practice, one is typically limited to
optimizing a single aspect of a web page at a time. This represents a missed
opportunity in both the speed of experimentation and the exploitation of
possible interactions between layout decisions.
Here we focus on multivariate optimization of interactive web pages. We
formulate an approach where the possible interactions between different
components of the page are modeled explicitly. We apply bandit methodology to
explore the layout space efficiently and use hill-climbing to select optimal
content in realtime. Our algorithm also extends to contextualization and
personalization of layout selection. Simulation results show the suitability of
our approach to large decision spaces with strong interactions between content.
We further apply our algorithm to optimize a message that promotes adoption of
an Amazon service. After only a single week of online optimization, we saw a
21% conversion increase compared to the median layout. Our technique is
currently being deployed to optimize content across several locations at
Amazon.com.
Comment: KDD'17 Audience Appreciation Award.
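As a rough illustration of the approach described above (assumed details only; this is not Amazon's production system), the sketch below scores a layout with a linear model over one-hot main effects plus pairwise interaction features, draws a parameter vector as a stand-in for a Thompson-sampling posterior draw, and hill-climbs one slot at a time to pick content.

```python
# Minimal sketch of multivariate layout optimization with explicit pairwise
# interactions (hypothetical slot counts and model).
import itertools
import numpy as np

rng = np.random.default_rng(1)
slots = [3, 4, 2]  # e.g. 3 images, 4 wordings, 2 background colors

def featurize(layout):
    """One-hot main effects plus pairwise interaction indicators."""
    feats = []
    for k, choice in enumerate(layout):
        one_hot = np.zeros(slots[k]); one_hot[choice] = 1.0
        feats.append(one_hot)
    for (i, a), (j, b) in itertools.combinations(enumerate(layout), 2):
        inter = np.zeros(slots[i] * slots[j]); inter[a * slots[j] + b] = 1.0
        feats.append(inter)
    return np.concatenate(feats)

dim = len(featurize([0] * len(slots)))
theta_sample = rng.normal(size=dim)  # stand-in for a posterior draw

def hill_climb(theta, n_restarts=5):
    """Greedily improve one slot at a time from random starting layouts."""
    best, best_val = None, -np.inf
    for _ in range(n_restarts):
        layout = [int(rng.integers(s)) for s in slots]
        improved = True
        while improved:
            improved = False
            for k in range(len(slots)):
                for c in range(slots[k]):
                    cand = layout.copy(); cand[k] = c
                    if theta @ featurize(cand) > theta @ featurize(layout):
                        layout, improved = cand, True
        val = theta @ featurize(layout)
        if val > best_val:
            best, best_val = layout, val
    return best

print(hill_climb(theta_sample))
```

Because the score is linear in the features, evaluating a single-slot change is cheap, which is what makes layout selection feasible in realtime even though the joint decision space is exponentially large.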
Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits
We propose improved fixed-design confidence bounds for the linear logistic
model. Our bounds significantly improve upon the state-of-the-art bound by Li
et al. (2017) via recent developments of the self-concordant analysis of the
logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a
direct dependence on $1/\kappa$, where $\kappa$ is the minimal variance over
all arms' reward distributions. In general, $1/\kappa$ scales exponentially
with the norm of the unknown linear parameter $\theta_*$. Instead of relying on
this worst-case quantity, our confidence bound for the reward of any given arm
depends directly on the variance of that arm's reward distribution. We present
two applications of our novel bounds to pure exploration and regret
minimization logistic bandits, improving upon state-of-the-art performance
guarantees. For pure exploration, we also provide a lower bound highlighting a
dependence on $1/\kappa$ for a family of instances.
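The following toy computation (not the paper's actual bound) illustrates why avoiding a $1/\kappa$ dependence matters: as the norm of $\theta$ grows, the minimal logistic reward variance $\kappa$ shrinks and $1/\kappa$ blows up, while individual arms' own variances can stay moderate.

```python
# Toy illustration of the quantity discussed above (not the paper's bound):
# kappa is the minimal Bernoulli reward variance over the arms under a logistic
# model, and 1/kappa grows exponentially with the norm of theta.
import numpy as np

def logistic_variance(z):
    p = 1.0 / (1.0 + np.exp(-z))
    return p * (1.0 - p)  # variance of a Bernoulli reward with mean sigmoid(z)

arms = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
for scale in [1.0, 3.0, 6.0]:  # growing norm of theta
    theta = scale * np.array([1.0, 1.0])
    variances = logistic_variance(arms @ theta)
    kappa = variances.min()
    print(f"||theta||={np.linalg.norm(theta):.2f}  "
          f"1/kappa={1.0 / kappa:.1f}  "
          f"per-arm variances={np.round(variances, 4)}")
```

A bound whose width depends on each arm's own variance, rather than uniformly on $1/\kappa$, can therefore be far tighter for arms whose rewards are not near-deterministic.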
Bayesian Meta-Prior Learning Using Empirical Bayes
Adding domain knowledge to a learning system is known to improve results. In
multi-parameter Bayesian frameworks, such knowledge is incorporated as a prior.
On the other hand, various model parameters can have different learning rates
in real-world problems, especially with skewed data. Two often-faced challenges
in Operations Management and Management Science applications are the absence of
informative priors, and the inability to control parameter learning rates. In
this study, we propose a hierarchical Empirical Bayes approach that addresses
both challenges, and that can generalize to any Bayesian framework. Our method
learns empirical meta-priors from the data itself and uses them to decouple the
learning rates of first-order and second-order features (or any other given
feature grouping) in a Generalized Linear Model. As the first-order features
are likely to have a more pronounced effect on the outcome, focusing on
learning first-order weights first is likely to improve performance and
convergence time. Our Empirical Bayes method clamps features in each group
together and uses the deployed model's observed data to empirically compute a
hierarchical prior in hindsight. We report theoretical results for the
unbiasedness, strong consistency, and optimal frequentist cumulative regret
properties of our meta-prior variance estimator. We apply our method to a
standard supervised learning optimization problem, as well as an online
combinatorial optimization problem in a contextual bandit setting implemented
in an Amazon production system. Both during simulations and live experiments,
our method shows marked improvements, especially in cases of small traffic. Our
findings are promising, as optimizing over sparse data is often a challenge.
Comment: Expanded discussions on applications and extended literature review section. Forthcoming in the Management Science Journal.
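A hedged sketch of the meta-prior idea follows (all names, group sizes, and data are hypothetical, and a Gaussian linear model stands in for the paper's Generalized Linear Model setting): estimate one prior variance per feature group from previously learned weights, then use those group-level variances as Gaussian priors so that first-order and interaction features effectively learn at different rates.

```python
# Hedged sketch of an Empirical Bayes meta-prior (assumed details, not the
# deployed system): compute a per-group prior variance in hindsight from a
# deployed model's weights, then use it as a group-specific Gaussian prior.
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical weights observed from a deployed model, split into groups.
first_order_w = rng.normal(0.0, 1.0, size=10)    # stronger signal
second_order_w = rng.normal(0.0, 0.1, size=45)   # weaker interaction signal

# Empirical-Bayes meta-prior: one variance estimate per feature group.
meta_prior_var = {
    "first_order": first_order_w.var(ddof=1),
    "second_order": second_order_w.var(ddof=1),
}

def map_estimate(X, y, group_sizes, noise_var=1.0):
    """MAP estimate with a diagonal Gaussian prior: smaller prior variance
    means a larger penalty, hence a slower effective learning rate."""
    prior_prec = np.concatenate([
        np.full(group_sizes["first_order"], 1.0 / meta_prior_var["first_order"]),
        np.full(group_sizes["second_order"], 1.0 / meta_prior_var["second_order"]),
    ])
    A = X.T @ X / noise_var + np.diag(prior_prec)
    return np.linalg.solve(A, X.T @ y / noise_var)

d = 10 + 45
X = rng.normal(size=(200, d))
true_w = np.concatenate([first_order_w, second_order_w])
y = X @ true_w + rng.normal(size=200)
w_hat = map_estimate(X, y, {"first_order": 10, "second_order": 45})
print(w_hat[:3])
```

The key design choice is that the prior is learned from the data itself, so no hand-specified informative prior is required, yet the two groups are still regularized at different strengths.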
Factor learning portfolio optimization informed by continuous-time finance models
https://openreview.net/forum?id=6TLqwuyg2s
First author draft.
Predicting invasive breast cancer versus DCIS in different age groups.
Background: Increasing focus on potentially unnecessary diagnosis and treatment of certain breast cancers prompted our investigation of whether clinical and mammographic features predictive of invasive breast cancer versus ductal carcinoma in situ (DCIS) differ by age.
Methods: We analyzed 1,475 malignant breast biopsies, 1,063 invasive and 412 DCIS, from 35,871 prospectively collected consecutive diagnostic mammograms interpreted at the University of California, San Francisco between 1/6/1997 and 6/29/2007. We constructed three logistic regression models to predict the probability of invasive cancer versus DCIS for the following groups: women ≥ 65 (older group), women 50-64 (middle age group), and women < 50 (younger group). We identified significant predictors and measured the performance of all models using the area under the receiver operating characteristic curve (AUC).
Results: The models for the older and middle age groups performed significantly better than the model for the younger group (AUC = 0.848 vs. 0.778, p = 0.049 and AUC = 0.851 vs. 0.778, p = 0.022, respectively). Palpability and principal mammographic finding were significant predictors distinguishing invasive cancer from DCIS in all age groups. Family history of breast cancer, mass shape, and mass margins were significant positive predictors of invasive cancer in the older group, whereas calcification distribution was a negative predictor of invasive cancer (i.e., it predicted DCIS). Mass margins were a positive predictor of invasive cancer in the middle age group, and mass size in the younger group.
Conclusions: Clinical and mammographic features predict invasive breast cancer versus DCIS better in older women than in younger women. The specific predictive variables differ by age.
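For readers who want to reproduce the general analysis pattern (not the study's data or exact models), a brief sketch with synthetic data: fit a separate logistic regression of invasive versus DCIS within each age group and compare discrimination via AUC. Feature names and coefficients are made up for illustration.

```python
# Illustrative sketch of per-age-group logistic regression with AUC comparison
# (synthetic data; variable names are hypothetical).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 1475
df = pd.DataFrame({
    "age": rng.integers(30, 90, size=n),
    "palpable": rng.integers(0, 2, size=n),
    "mass_margins_spiculated": rng.integers(0, 2, size=n),
    "family_history": rng.integers(0, 2, size=n),
})
logit = -1.0 + 1.2 * df.palpable + 0.8 * df.mass_margins_spiculated
df["invasive"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

groups = {"<50": df.age < 50, "50-64": df.age.between(50, 64), ">=65": df.age >= 65}
features = ["palpable", "mass_margins_spiculated", "family_history"]
for name, mask in groups.items():
    sub = df[mask]
    model = LogisticRegression().fit(sub[features], sub.invasive)
    auc = roc_auc_score(sub.invasive, model.predict_proba(sub[features])[:, 1])
    print(f"age group {name}: in-sample AUC = {auc:.3f}")
```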