Uplift Modeling from Separate Labels
Uplift modeling aims to estimate the incremental impact of an action on an individual's behavior, which is useful in various application domains such as targeted marketing (advertisement campaigns) and personalized medicine (medical treatments). Conventional methods of uplift modeling require every instance to be jointly equipped with two types of labels: the taken action and its outcome. However, obtaining two labels for each instance at the same time is difficult or expensive in many real-world problems. In this paper, we propose a novel method of uplift modeling that is applicable to a more practical setting where only one type of label is available for each instance. We show a mean squared error bound for the proposed estimator and demonstrate its effectiveness through experiments.
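As a rough illustration of the conventional setting described above, where every instance carries both the action taken and its outcome, uplift within a group can be estimated as the difference between the mean outcome of treated and untreated instances. The sketch below assumes randomized treatment assignment; the function name, the record layout, and the toy data are hypothetical, and this is not the estimator proposed in the paper.

```python
from collections import defaultdict


def estimate_uplift_by_segment(records):
    """Estimate uplift per segment from jointly labeled data.

    records: iterable of (segment, treated, outcome) tuples (hypothetical layout).
    Returns {segment: mean outcome if treated - mean outcome if not treated}.
    """
    # sums[segment][treated] holds [outcome_total, count]
    sums = defaultdict(lambda: {True: [0.0, 0], False: [0.0, 0]})
    for segment, treated, outcome in records:
        cell = sums[segment][bool(treated)]
        cell[0] += outcome
        cell[1] += 1

    uplift = {}
    for segment, cells in sums.items():
        (t_sum, t_n), (c_sum, c_n) = cells[True], cells[False]
        if t_n == 0 or c_n == 0:
            continue  # no comparison possible without both groups
        uplift[segment] = t_sum / t_n - c_sum / c_n
    return uplift


# Toy data, invented for illustration only.
toy = [
    ("new", True, 1), ("new", True, 1), ("new", False, 0), ("new", False, 1),
    ("loyal", True, 1), ("loyal", True, 0), ("loyal", False, 1), ("loyal", False, 0),
]
result = estimate_uplift_by_segment(toy)
```

On the toy data, the "new" segment responds to the action while the "loyal" segment does not, which is the kind of distinction an uplift model is meant to surface.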
A Practically Competitive and Provably Consistent Algorithm for Uplift Modeling
Randomized experiments have been critical tools of decision making for
decades. However, subjects can show significant heterogeneity in response to
treatments in many important applications. Therefore it is not enough to simply
know which treatment is optimal for the entire population. What we need is a
model that correctly customizes treatment assignment based on subject
characteristics. The problem of constructing such models from randomized
experiment data is known as Uplift Modeling in the literature. Many algorithms
have been proposed for uplift modeling and some have generated promising
results on various data sets. Yet little is known about the theoretical
properties of these algorithms. In this paper, we propose a new tree-based
ensemble algorithm for uplift modeling. Experiments show that our algorithm can
achieve competitive results on both synthetic and industry-provided data. In
addition, by properly tuning the "node size" parameter, our algorithm is proved
to be consistent under mild regularity conditions. This is the first consistent
algorithm for uplift modeling that we are aware of. Comment: Accepted by 2017 IEEE International Conference on Data Minin
Predictive User Modeling with Actionable Attributes
Different machine learning techniques have been proposed and used for
modeling individual and group user needs, interests and preferences. In
traditional predictive modeling, instances are described by observable
variables, called attributes. The goal is to learn a model for predicting the
target variable for unseen instances. For example, for marketing purposes a
company may consider profiling a new user based on her observed web browsing
behavior, referral keywords or other relevant information. In many real world
applications the values of some attributes are not only observable, but can be
actively decided by a decision maker. Furthermore, in some such applications
the decision maker is interested not only in generating accurate predictions, but
also in maximizing the probability of the desired outcome. For example, a direct
marketing manager can choose which type of a special offer to send to a client
(actionable attribute), hoping that the right choice will result in a positive
response with a higher probability. We study how to learn to choose the value
of an actionable attribute in order to maximize the probability of a desired
outcome in predictive modeling. We emphasize that not all instances are equally
sensitive to changes in actions. Accurate choice of an action is critical for
those instances that are on the borderline (e.g. users who do not have a
strong opinion one way or the other). We formulate three supervised learning
approaches for learning to select the value of an actionable attribute at an
instance level. We also introduce a focused training procedure which puts more
emphasis on the situations where varying the action is most likely to have
an effect. The proof-of-concept experimental validation on two real-world case
studies in web analytics and e-learning domains highlights the potential of the
proposed approaches.
Uplift Modeling with High Class Imbalance
Uplift modeling refers to estimating the causal effect of a treatment on an individual observation, used for instance to identify customers worth targeting with a discount in e-commerce. We introduce a simple yet effective undersampling strategy for dealing with the prevalent problem of high class imbalance (low conversion rate) in such applications. Our strategy is agnostic to the base learners and produces a 6.5% improvement over the best published benchmark for the largest public uplift data, which incidentally exhibits high class imbalance. We also introduce a new metric on calibration for uplift modeling and present a strategy to improve the calibration of the proposed method. Peer reviewe
Sample Complexity of Sample Average Approximation for Conditional Stochastic Optimization
In this paper, we study a class of stochastic optimization problems, referred
to as Conditional Stochastic Optimization (CSO), in the form of
$\min_{x \in \mathcal{X}} \mathbb{E}_{\xi} f_\xi\big(\mathbb{E}_{\eta|\xi}[g_\eta(x,\xi)]\big)$,
which finds a wide
spectrum of applications including portfolio selection, reinforcement learning,
robust learning, causal inference and so on. Assuming availability of samples
from the distribution $\mathbb{P}(\xi)$ and samples from the conditional distribution
$\mathbb{P}(\eta|\xi)$, we establish the sample complexity of the sample average
approximation (SAA) for CSO, under a variety of structural assumptions, such as
Lipschitz continuity, smoothness, and error bound conditions. We show that the
total sample complexity improves from $\mathcal{O}(d/\epsilon^4)$ to $\mathcal{O}(d/\epsilon^3)$ when
assuming smoothness of the outer function, and further to $\mathcal{O}(1/\epsilon^2)$ when
the empirical function satisfies the quadratic growth condition. We also
establish the sample complexity of a modified SAA, when $\xi$ and $\eta$ are
independent. Several numerical experiments further support our theoretical
findings.
Keywords: stochastic optimization, sample average approximation, large
deviations theory. Comment: Typo corrected. Reference added. Revision comments handle
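For orientation, a nested sample average approximation of the CSO objective can be written as follows. This is a sketch under assumed notation, not taken from the abstract: $n$ is the number of outer samples $\xi_i$ drawn from the distribution of $\xi$, and $m$ is the number of inner samples $\eta_{ij}$ drawn from the conditional distribution of $\eta$ given $\xi_i$.

```latex
\min_{x \in \mathcal{X}} \; \hat{F}_{n,m}(x)
  = \frac{1}{n} \sum_{i=1}^{n}
    f_{\xi_i}\!\Big( \frac{1}{m} \sum_{j=1}^{m} g_{\eta_{ij}}(x, \xi_i) \Big)
```

Under this form each outer sample needs its own batch of inner samples, so the sampling cost grows with the product $nm$, which is why the abstract speaks of a total sample complexity.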
Joint inversion estimate of regional glacial isostatic adjustment in Antarctica considering a lateral varying Earth structure (ESA STSE Project REGINA)
A major uncertainty in determining the mass balance of the Antarctic ice sheet from measurements of satellite gravimetry, and to a lesser extent satellite altimetry, is the poorly known correction for the ongoing deformation of the solid Earth caused by glacial isostatic adjustment (GIA). Although much progress has been made in consistently modelling the ice-sheet evolution throughout the last glacial cycle, as well as the induced bedrock deformation caused by these load changes, forward models of GIA remain ambiguous due to the lack of observational constraints on the ice sheet's past extent and thickness and on mantle rheology beneath the continent. As an alternative to forward modelling GIA, we estimate GIA from multiple space-geodetic observations: GRACE, Envisat/ICESat and GPS. Making use of the different sensitivities of the respective satellite observations to current and past surface mass (ice mass) change and solid Earth processes, we estimate GIA based on viscoelastic response functions to disc load forcing. We calculate and distribute the viscoelastic response functions according to estimates of the variability of lithosphere thickness and mantle viscosity in Antarctica. We compare our GIA estimate with published GIA corrections and evaluate its impact in determining the ice mass balance in Antarctica from GRACE and satellite altimetry. Particular focus is applied to the Amundsen Sea Sector in West Antarctica, where uplift rates of several cm/yr have been measured by GPS. We show that most of this uplift is caused by the rapid viscoelastic response to recent ice-load changes, enabled by the presence of a low-viscosity upper mantle in West Antarctica. This paper presents the second and final contribution summarizing the work carried out within a European Space Agency funded study, REGINA (www.regina-science.eu).