Data Mining in Electronic Commerce
Modern business is rushing toward e-commerce. If the transition is done
properly, it enables better management, new services, lower transaction costs
and better customer relations. Success depends on skilled information
technologists, among whom are statisticians. This paper focuses on some of the
contributions that statisticians are making to help change the business world,
especially through the development and application of data mining methods. This
is a very large area, and the topics we cover are chosen to avoid overlap with
other papers in this special issue, as well as to respect the limitations of
our expertise. Inevitably, electronic commerce has raised and is raising fresh
research problems in a very wide range of statistical areas, and we try to
emphasize those challenges.
Comment: Published at http://dx.doi.org/10.1214/088342306000000204 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
PREDICTING CROSS-GAMING PROPENSITY USING E-CHAID ANALYSIS
Cross-selling different types of games could provide an opportunity for casino operators to generate additional time and money spent on gaming from existing patrons. One way to identify the patrons who are likely to cross-play is mining individual players’ gaming data using predictive analytics. Hence, this study aims to predict casino patrons’ propensity to play both slots and table games, also known as cross-gaming, by applying a data-mining algorithm to patrons’ gaming data. The Exhaustive Chi-squared Automatic Interaction Detector (E-CHAID) method was employed to predict cross-gaming propensity. The E-CHAID models based on the gaming-related behavioral data produced actionable model accuracy rates for classifying cross-gamers and non-cross-gamers along with the cross-gaming propensity scores for each patron. Using these scores, casino managers can accurately identify likely cross-gamers and develop a more targeted approach to market to them. Furthermore, the results of this study would enable casino managers to estimate incremental gaming revenues through cross-gaming. This, in turn, will assist them in spending marketing dollars more efficiently while maximizing gaming revenues.
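CHAID-family methods, including E-CHAID, choose splits by testing the association between a candidate predictor and the target with a chi-squared test. The sketch below computes only the Pearson chi-squared statistic for one candidate split; it is an illustration under assumed, made-up counts and not the study's model. The function name `chi_squared` and the table layout (rows are predictor categories, columns are cross-gamer versus non-cross-gamer) are hypothetical.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table.

    table: list of rows, each a list of observed counts.
    Rows are categories of a candidate predictor (assumed layout),
    columns are target classes (e.g. cross-gamer / non-cross-gamer).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and target
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts, not taken from the study
candidate_split = [[30, 10],
                   [10, 30]]
statistic = chi_squared(candidate_split)
```

A larger statistic indicates a stronger association for that candidate split; a full E-CHAID procedure would additionally merge categories and compare adjusted p-values across predictors, which this sketch omits.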
Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling
We evaluate the impact of probabilistically-constructed digital identity data
collected from Sep. to Dec. 2017 (approx.), in the context of
Lookalike-targeted campaigns. The backbone of this study is a large set of
probabilistically-constructed "identities", represented as small bags of
cookies and mobile ad identifiers with associated metadata, that are likely all
owned by the same underlying user. The identity data allows us to generate
"identity-based", rather than "identifier-based", user models, giving a fuller
picture of the interests of the users underlying the identifiers. We employ
off-policy techniques to evaluate the potential of identity-powered lookalike
models without incurring the risk of allowing untested models to direct large
amounts of ad spend or the large cost of performing A/B tests. We add to
historical work on off-policy evaluation by noting a significant type of
"finite-sample bias" that occurs for studies combining modestly-sized datasets
and evaluation metrics involving rare events (e.g., conversions). We illustrate
this bias using a simulation study that later informs the handling of inverse
propensity weights in our analyses on real data. We demonstrate significant
lift in identity-powered lookalikes versus an identity-ignorant baseline: on
average ~70% lift in conversion rate. This rises to factors of ~(4-32)x for
identifiers having little data themselves, but that can be inferred to belong
to users with substantial data to aggregate across identifiers. This implies
that identity-powered user modeling is especially important in the context of
identifiers having very short lifespans (i.e., frequently churned cookies). Our
work motivates and informs the use of probabilistically-constructed identities
in marketing. It also deepens the canon of examples in which off-policy
learning has been employed to evaluate the complex systems of the internet
economy.
Comment: Accepted by WSDM 201
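The abstract does not say which off-policy estimator the authors use. As a minimal sketch, assuming the standard inverse-propensity-scoring (IPS) estimator with optional weight clipping, the idea can be illustrated as follows. The function name `ips_estimate`, its parameters, and the numbers are hypothetical.

```python
def ips_estimate(rewards, logging_probs, target_probs, clip=None):
    """IPS estimate of the mean reward a target policy would obtain.

    rewards:        observed outcomes (e.g. 1 for a conversion, 0 otherwise)
    logging_probs:  probability the logging policy gave to the logged action
    target_probs:   probability the target policy gives to the same action
    clip:           optional upper bound on each importance weight
                    (assumed here; it trades variance for bias)
    """
    if not rewards:
        raise ValueError("need at least one logged event")
    total = 0.0
    for r, p_log, p_tgt in zip(rewards, logging_probs, target_probs):
        if p_log <= 0:
            raise ValueError("logging probabilities must be positive")
        weight = p_tgt / p_log
        if clip is not None:
            weight = min(weight, clip)
        total += weight * r
    return total / len(rewards)


# Hypothetical logged data, not taken from the paper
rewards = [1, 0, 0, 1]
logging_probs = [0.5, 0.5, 0.25, 0.25]
target_probs = [0.5, 0.25, 0.25, 0.5]

unclipped = ips_estimate(rewards, logging_probs, target_probs)
clipped = ips_estimate(rewards, logging_probs, target_probs, clip=1.5)
```

When the reward is a rare event such as a conversion, a few large weights can dominate the sum, which is one reason the handling of the weights matters on modestly sized datasets.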
Tackling food marketing to children in a digital world: trans-disciplinary perspectives. Children’s rights, evidence of impact, methodological challenges, regulatory options and policy implications for the WHO European Region
There is unequivocal evidence that childhood obesity is influenced by marketing of foods and non-alcoholic beverages high in saturated fat, salt and/or free sugars (HFSS), and a core recommendation of the WHO Commission on Ending Childhood Obesity is to reduce children’s exposure to all such marketing. As a result, WHO has called on Member States to introduce restrictions on marketing of HFSS foods to children, covering all media, including digital, and to close any regulatory loopholes. This publication provides up-to-date information on the marketing of foods and non-alcoholic beverages to children and the changes that have occurred in recent years, focusing in particular on the major shift to digital marketing. It examines trends in media use among children, marketing methods in the new digital media landscape and children’s engagement with such marketing. It also considers the impact on children and their ability to counter marketing as well as the implications for children’s rights and digital privacy. Finally, the report discusses the policy implications and some of the recent policy action by WHO European Member States.
Antipsychotics and Torsadogenic Risk: Signals Emerging from the US FDA Adverse Event Reporting System Database
Background: Drug-induced torsades de pointes (TdP) and related clinical entities represent a current regulatory and clinical burden. Objective: As part of the FP7 ARITMO (Arrhythmogenic Potential of Drugs) project, we explored the publicly available US FDA Adverse Event Reporting System (FAERS) database to detect signals of torsadogenicity for antipsychotics (APs). Methods: Four groups of events in decreasing order of drug-attributable risk were identified: (1) TdP, (2) QT-interval abnormalities, (3) ventricular fibrillation/tachycardia, and (4) sudden cardiac death. The reporting odds ratio (ROR) with 95 % confidence interval (CI) was calculated through a cumulative analysis from group 1 to 4. For groups 1+2, ROR was adjusted for age, gender, and concomitant drugs (e.g., antiarrhythmics) and stratified for AZCERT drugs, lists I and II (http://www.azcert.org, as of June 2011). A potential signal of torsadogenicity was defined if a drug met all the following criteria: (a) four or more cases in group 1+2; (b) significant ROR in group 1+2 that persists through the cumulative approach; (c) significant adjusted ROR for group 1+2 in the stratum without AZCERT drugs; (d) not included in AZCERT lists (as of June 2011). Results: Over the 7-year period, 37 APs were reported in 4,794 cases of arrhythmia: 140 (group 1), 883 (group 2), 1,651 (group 3), and 2,120 (group 4). Based on our criteria, the following potential signals of torsadogenicity were found: amisulpride (25 cases; adjusted ROR in the stratum without AZCERT drugs = 43.94, 95 % CI 22.82-84.60), cyamemazine (11; 15.48, 6.87-34.91), and olanzapine (189; 7.74, 6.45-9.30). Conclusions: This pharmacovigilance analysis on the FAERS found 3 potential signals of torsadogenicity for drugs previously unknown for this risk
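For readers unfamiliar with the reporting odds ratio, the following is a minimal sketch of the crude (unadjusted) ROR and its approximate 95% confidence interval from a 2x2 table of spontaneous reports. It does not reproduce the adjusted or stratified analyses described in the abstract, and the function name `reporting_odds_ratio` and the counts are hypothetical.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Crude ROR with an approximate 95% CI (log-normal approximation).

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and any other event
    c: reports with other drugs and the event of interest
    d: reports with other drugs and any other event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    ror = (a * d) / (b * c)
    # Standard error of log(ROR)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


# Hypothetical counts, not taken from the FAERS analysis
ror, ci_lower, ci_upper = reporting_odds_ratio(20, 980, 100, 98900)
```

A common convention treats the ROR as statistically significant when the lower bound of the interval exceeds 1; whether the study used exactly this rule is not stated in the abstract.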
Social Bots for Online Public Health Interventions
According to the Centers for Disease Control and Prevention, in the United
States hundreds of thousands of people initiate smoking each year, and millions
live with smoking-related diseases. Many tobacco users discuss their habits and
preferences on social media. This work conceptualizes a framework for targeted
health interventions to inform tobacco users about the consequences of tobacco
use. We designed a Twitter bot named Notobot (short for No-Tobacco Bot) that
leverages machine learning to identify users posting pro-tobacco tweets and
select individualized interventions to address their interest in tobacco use.
We searched the Twitter feed for tobacco-related keywords and phrases, and
trained a convolutional neural network using over 4,000 tweets manually
labeled as either pro-tobacco or not pro-tobacco. This model achieves
a 90% recall rate on the training set and 74% on test data. Users posting
pro-tobacco tweets are matched with former smokers with similar interests who
posted anti-tobacco tweets. Algorithmic matching, based on the power of peer
influence, allows for the systematic delivery of personalized interventions
based on real anti-tobacco tweets from former smokers. Experimental evaluation
suggests that our system would perform well if deployed. This research offers
opportunities for public health researchers to increase health awareness at
scale. Future work entails deploying the fully operational Notobot system in a
controlled experiment within a public health campaign.
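The recall figures quoted above measure the share of truly pro-tobacco tweets that the classifier flags. A minimal sketch of that metric, assuming binary labels where 1 means pro-tobacco (the labels, the function name `recall`, and the example data are hypothetical):

```python
def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no positive examples in y_true")
    return true_pos / (true_pos + false_neg)


# Hypothetical labels and predictions, not taken from the paper
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = recall(y_true, y_pred)
```

Recall ignores false positives, so a gap between training and test recall, as reported above, says nothing by itself about how many non-pro-tobacco tweets are wrongly flagged.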