Bayesian Inference of Arrival Rate and Substitution Behavior from Sales Transaction Data with Stockouts
When an item goes out of stock, sales transaction data no longer reflect the
original customer demand, since some customers leave with no purchase while
others substitute alternative products for the one that was out of stock. Here
we develop a Bayesian hierarchical model for inferring the underlying customer
arrival rate and choice model from sales transaction data and the corresponding
stock levels. The model uses a nonhomogeneous Poisson process to allow the
arrival rate to vary throughout the day, and allows for a variety of choice
models. Model parameters are inferred using a stochastic gradient MCMC
algorithm that can scale to large transaction databases. We fit the model to
data from a local bakery and show that it is able to make accurate
out-of-sample predictions, and to provide actionable insight into lost cookie
sales.
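The abstract above builds on a nonhomogeneous Poisson process, in which the arrival rate varies over time. As a minimal illustration (not the paper's actual hierarchical model or its MCMC inference), the sketch below simulates customer arrivals from a time-varying rate via Lewis-Shedler thinning; the bell-shaped `rate` function and all constants are illustrative assumptions.

```python
import math
import random

def simulate_nhpp(rate_fn, t_end, rate_max, rng=random.Random(0)):
    """Simulate arrival times on [0, t_end) from a nonhomogeneous
    Poisson process via thinning. rate_fn is the intensity lambda(t)
    and must satisfy rate_fn(t) <= rate_max everywhere."""
    arrivals = []
    t = 0.0
    while True:
        # Propose the next event from a homogeneous process at the
        # dominating rate rate_max.
        t += rng.expovariate(rate_max)
        if t >= t_end:
            break
        # Accept the proposal with probability lambda(t) / rate_max.
        if rng.random() < rate_fn(t) / rate_max:
            arrivals.append(t)
    return arrivals

# Hypothetical arrival rate that peaks mid-day (t in hours on [0, 12)).
rate = lambda t: 5.0 + 15.0 * math.exp(-((t - 6.0) ** 2) / 4.0)
times = simulate_nhpp(rate, 12.0, rate_max=20.0)
```

The same thinned-process likelihood is what a Bayesian model would invert to infer the rate from observed transaction timestamps.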
Scalable Meta-Learning for Bayesian Optimization
Bayesian optimization has become a standard technique for hyperparameter
optimization, including data-intensive models such as deep neural networks that
may take days or weeks to train. We consider the setting where previous
optimization runs are available, and we wish to use their results to warm-start
a new optimization run. We develop an ensemble model that can incorporate the
results of past optimization runs, while avoiding the poor scaling that comes
with putting all results into a single Gaussian process model. The ensemble
combines models from past runs according to estimates of their generalization
performance on the current optimization. Results from a large collection of
hyperparameter optimization benchmark problems and from optimization of a
production computer vision platform at Facebook show that the ensemble can
substantially reduce the time it takes to obtain near-optimal configurations,
and is useful for warm-starting expensive searches or running quick
re-optimizations.
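The key idea above is to weight past models by how well they generalize to the current run. The sketch below is a simplified stand-in, not the paper's method: base models are plain callables rather than Gaussian processes, and "generalization performance" is measured by a pairwise ranking loss on the current run's observations; the inverse-loss weighting is an illustrative assumption.

```python
def ranking_loss(model, xs, ys):
    """Count pairwise misrankings: pairs of observed points whose
    order under the model's predictions disagrees with the order of
    the true objective values (lower is better)."""
    loss = 0
    n = len(xs)
    for i in range(n):
        for j in range(n):
            if i != j and (model(xs[i]) < model(xs[j])) != (ys[i] < ys[j]):
                loss += 1
    return loss

def ensemble_weights(models, xs, ys):
    """Weight each past model by (a transform of) its ranking loss on
    the current run's data, so better-transferring models dominate."""
    losses = [ranking_loss(m, xs, ys) for m in models]
    scores = [1.0 / (1 + l) for l in losses]
    total = sum(scores)
    return [s / total for s in scores]

# Two hypothetical past surrogates: one aligned with the current
# objective, one anti-aligned.
models = [lambda x: x, lambda x: -x]
weights = ensemble_weights(models, [0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
```

Because each past run keeps its own model, prediction cost grows linearly in the number of runs instead of cubically in the total number of points, which is the scaling problem a single pooled Gaussian process would hit.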
Sequential Event Prediction
In sequential event prediction, we are given a “sequence database” of past event sequences to learn from, and we aim to predict the next event within a current event sequence. We focus on applications where the set of past events, rather than their specific order, has predictive power. Such applications arise in recommender systems, equipment maintenance, medical informatics, and other domains. Our formalization of sequential event prediction draws on ideas from supervised ranking. We show how specific choices within this approach lead to different sequential event prediction problems and algorithms. In recommender system applications, the observed sequence of events depends on user choices, which may be influenced by the recommendations, which are themselves tailored to the user's choices. This leads to sequential event prediction algorithms involving a non-convex optimization problem. We apply our approach to an online grocery store recommender system, email recipient recommendation, and a novel application in the health event prediction domain.
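To make the set-based prediction setting concrete, here is a minimal sketch (not the paper's ranking formulation, which involves learned weights and a non-convex objective): it trains simple pairwise co-occurrence scores from a sequence database and ranks candidate next events by summing scores from all events observed so far, ignoring their order. The grocery items and helper names are illustrative assumptions.

```python
from collections import defaultdict

def train_pairwise_scores(sequences):
    """Count, for each ordered pair (a, b), how often event b occurs
    after a has already appeared in the same sequence."""
    w = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        seen = set()
        for event in seq:
            for prior in seen:
                w[prior][event] += 1
            seen.add(event)
    return w

def predict_next(w, observed, candidates):
    """Rank candidates by summed scores from all observed events.
    Only the SET of observed events matters, not their order."""
    return max(candidates, key=lambda c: sum(w[a][c] for a in observed))

db = [["bread", "milk", "eggs"], ["bread", "eggs"], ["milk", "tea"]]
w = train_pairwise_scores(db)
# With {"bread"} observed, "eggs" accumulates the highest score,
# since it follows bread in two of the three past sequences.
pred = predict_next(w, {"bread"}, ["milk", "eggs", "tea"])
```

In the recommender-system setting the abstract describes, the scores would instead be learned by optimizing a ranking objective, and the feedback loop between recommendations and user choices is what makes that optimization non-convex.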
Learning Theory Analysis for Association Rules and Sequential Event Prediction
We present a theoretical analysis for prediction algorithms based on association rules. As part of this analysis, we introduce a problem for which rules are particularly natural, called “sequential event prediction.” In sequential event prediction, events in a sequence are revealed one by one, and the goal is to determine which event will next be revealed. The training set is a collection of past sequences of events. An example application is to predict which item will next be placed into a customer's online shopping cart, given his/her past purchases. In the context of this problem, algorithms based on association rules have distinct advantages over classical statistical and machine learning methods: they look at correlations based on subsets of co-occurring past events (items a and b imply item c), they can be applied to the sequential event prediction problem in a natural way, they can potentially handle the “cold start” problem where the training set is small, and they yield interpretable predictions. In this work, we present two algorithms that incorporate association rules. These algorithms can be used both for sequential event prediction and for supervised classification, and they are simple enough that they can possibly be understood by users, customers, patients, managers, etc. We provide generalization guarantees on these algorithms based on algorithmic stability analysis from statistical learning theory. We include a discussion of the strict minimum support threshold often used in association rule mining, and introduce an “adjusted confidence” measure that provides a weaker minimum support condition that has advantages over the strict minimum support. The paper brings together ideas from statistical learning theory, association rule mining, and Bayesian analysis.
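To illustrate how a softened minimum-support condition can work, here is a sketch of one shrinkage-style adjustment of rule confidence: dividing by the antecedent count plus a constant K, so thinly supported rules are penalized without being discarded outright by a hard threshold. The exact functional form used in the paper may differ; this is an illustrative assumption, and the variable names are hypothetical.

```python
def adjusted_confidence(n_ab, n_a, K):
    """Shrunken confidence of rule a -> b, where n_ab counts
    sequences containing both a and b, and n_a counts sequences
    containing a. K >= 0 pulls low-support rules toward zero;
    K = 0 recovers the plain confidence n_ab / n_a."""
    return n_ab / (n_a + K)

# A rule observed 1 time out of 1 has perfect plain confidence and
# outranks a rule observed 9 times out of 10 ...
plain_thin = adjusted_confidence(1, 1, 0)       # 1.0
plain_solid = adjusted_confidence(9, 10, 0)     # 0.9
# ... but with K = 5 the well-supported rule wins.
adj_thin = adjusted_confidence(1, 1, 5)
adj_solid = adjusted_confidence(9, 10, 5)
```

This is the qualitative behavior the abstract attributes to the adjusted confidence: it acts as a weaker, continuous alternative to a strict minimum support cut-off.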