    Data Mining in Electronic Commerce

    Modern business is rushing toward e-commerce. If the transition is done properly, it enables better management, new services, lower transaction costs and better customer relations. Success depends on skilled information technologists, among whom are statisticians. This paper focuses on some of the contributions that statisticians are making to help change the business world, especially through the development and application of data mining methods. This is a very large area, and the topics we cover are chosen to avoid overlap with other papers in this special issue, as well as to respect the limitations of our expertise. Inevitably, electronic commerce has raised and is raising fresh research problems in a very wide range of statistical areas, and we try to emphasize those challenges. Comment: Published at http://dx.doi.org/10.1214/088342306000000204 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    PREDICTING CROSS-GAMING PROPENSITY USING E-CHAID ANALYSIS

    Cross-selling different types of games could provide an opportunity for casino operators to generate additional time and money spent on gaming from existing patrons. One way to identify the patrons who are likely to cross-play is to mine individual players' gaming data using predictive analytics. Hence, this study aims to predict casino patrons' propensity to play both slots and table games, also known as cross-gaming, by applying a data-mining algorithm to patrons' gaming data. The Exhaustive Chi-squared Automatic Interaction Detector (E-CHAID) method was employed to predict cross-gaming propensity. The E-CHAID models based on the gaming-related behavioral data produced actionable model accuracy rates for classifying cross-gamers and non-cross-gamers, along with cross-gaming propensity scores for each patron. Using these scores, casino managers can accurately identify likely cross-gamers and develop a more targeted approach to marketing to them. Furthermore, the results of this study would enable casino managers to estimate incremental gaming revenues through cross-gaming. This, in turn, will assist them in spending marketing dollars more efficiently while maximizing gaming revenues.
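As a rough illustration of the statistic that CHAID-style methods rely on, the sketch below computes the Pearson chi-squared statistic for a contingency table of a predictor category against the cross-gaming outcome. It is not the full E-CHAID procedure (which also merges categories and adjusts p-values), and the function name and the counts are hypothetical, not taken from the study.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a contingency table.

    `table` is a list of rows of observed counts, e.g. rows = predictor
    categories, columns = (cross-gamer, non-cross-gamer). Hypothetical helper.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table must contain at least one observation")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and outcome.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic


# Made-up counts: two player segments vs. (cross-gamer, non-cross-gamer).
example_table = [[10, 20], [30, 40]]
example_statistic = chi_squared_statistic(example_table)
```

A tree-growing method of this family would compare such statistics (via their p-values) across candidate predictors and split on the most significant one.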

    Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling

    We evaluate the impact of probabilistically-constructed digital identity data collected from Sep. to Dec. 2017 (approx.), in the context of Lookalike-targeted campaigns. The backbone of this study is a large set of probabilistically-constructed "identities", represented as small bags of cookies and mobile ad identifiers with associated metadata, that are likely all owned by the same underlying user. The identity data makes it possible to generate "identity-based", rather than "identifier-based", user models, giving a fuller picture of the interests of the users underlying the identifiers. We employ off-policy techniques to evaluate the potential of identity-powered lookalike models without incurring the risk of allowing untested models to direct large amounts of ad spend or the large cost of performing A/B tests. We add to historical work on off-policy evaluation by noting a significant type of "finite-sample bias" that occurs for studies combining modestly-sized datasets and evaluation metrics involving rare events (e.g., conversions). We illustrate this bias using a simulation study that later informs the handling of inverse propensity weights in our analyses on real data. We demonstrate significant lift in identity-powered lookalikes versus an identity-ignorant baseline: on average ~70% lift in conversion rate. This rises to factors of ~(4-32)x for identifiers that have little data themselves but can be inferred to belong to users with substantial data to aggregate across identifiers. This implies that identity-powered user modeling is especially important in the context of identifiers having very short lifespans (i.e., frequently churned cookies). Our work motivates and informs the use of probabilistically-constructed identities in marketing. It also deepens the canon of examples in which off-policy learning has been employed to evaluate the complex systems of the internet economy. Comment: Accepted by WSDM 201
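For readers unfamiliar with inverse propensity weighting, the following is a minimal sketch of a generic inverse propensity scoring (IPS) estimator for off-policy evaluation. It is not the authors' implementation: the function name, the optional weight clipping, and the toy numbers are assumptions introduced here for illustration, and the abstract does not say how the weights were actually handled.

```python
def ips_estimate(rewards, logging_probs, target_probs, clip=None):
    """Estimate the mean reward of a target policy from logged data.

    rewards:        observed outcomes (e.g. 1 for a conversion, 0 otherwise)
    logging_probs:  probability the logging policy gave to each logged action
    target_probs:   probability the target policy gives to the same action
    clip:           optional cap on the weights (an assumption of this sketch)
    """
    if not (len(rewards) == len(logging_probs) == len(target_probs)):
        raise ValueError("all inputs must have the same length")
    if not rewards:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for reward, p_log, p_target in zip(rewards, logging_probs, target_probs):
        if p_log <= 0:
            raise ValueError("logging probabilities must be positive")
        weight = p_target / p_log  # inverse propensity weight
        if clip is not None:
            weight = min(weight, clip)  # capping trades variance for bias
        total += weight * reward
    return total / len(rewards)


# Toy logged data (made up): four impressions, two conversions.
toy_rewards = [1, 0, 0, 1]
toy_logging = [0.5, 0.5, 0.25, 0.25]
toy_target = [0.25, 0.5, 0.5, 0.5]
unclipped = ips_estimate(toy_rewards, toy_logging, toy_target)
clipped = ips_estimate(toy_rewards, toy_logging, toy_target, clip=1.5)
```

With rare events such as conversions, a handful of large weights can dominate the sum, which is one reason small datasets make such estimates unstable.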

    Antipsychotics and Torsadogenic Risk: Signals Emerging from the US FDA Adverse Event Reporting System Database

    Background: Drug-induced torsades de pointes (TdP) and related clinical entities represent a current regulatory and clinical burden. Objective: As part of the FP7 ARITMO (Arrhythmogenic Potential of Drugs) project, we explored the publicly available US FDA Adverse Event Reporting System (FAERS) database to detect signals of torsadogenicity for antipsychotics (APs). Methods: Four groups of events in decreasing order of drug-attributable risk were identified: (1) TdP, (2) QT-interval abnormalities, (3) ventricular fibrillation/tachycardia, and (4) sudden cardiac death. The reporting odds ratio (ROR) with 95% confidence interval (CI) was calculated through a cumulative analysis from group 1 to 4. For groups 1+2, ROR was adjusted for age, gender, and concomitant drugs (e.g., antiarrhythmics) and stratified for AZCERT drugs, lists I and II (http://www.azcert.org, as of June 2011). A potential signal of torsadogenicity was defined if a drug met all of the following criteria: (a) four or more cases in group 1+2; (b) significant ROR in group 1+2 that persists through the cumulative approach; (c) significant adjusted ROR for group 1+2 in the stratum without AZCERT drugs; (d) not included in AZCERT lists (as of June 2011). Results: Over the 7-year period, 37 APs were reported in 4,794 cases of arrhythmia: 140 (group 1), 883 (group 2), 1,651 (group 3), and 2,120 (group 4). Based on our criteria, the following potential signals of torsadogenicity were found: amisulpride (25 cases; adjusted ROR in the stratum without AZCERT drugs = 43.94, 95% CI 22.82-84.60), cyamemazine (11; 15.48, 6.87-34.91), and olanzapine (189; 7.74, 6.45-9.30). Conclusions: This pharmacovigilance analysis on the FAERS found 3 potential signals of torsadogenicity for drugs previously unknown for this risk.
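To make the reporting odds ratio concrete, here is a small sketch of the standard crude (unadjusted) ROR from a 2x2 table of spontaneous reports, with a Wald-type 95% CI on the log scale. It does not reproduce the adjusted, stratified analysis described in the abstract, and the function name and counts are hypothetical.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Crude ROR and confidence interval from a 2x2 table of reports.

    a: reports of the event of interest with the drug of interest
    b: reports of all other events with the drug of interest
    c: reports of the event of interest with all other drugs
    d: reports of all other events with all other drugs
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR) for a 2x2 table.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


# Made-up counts, not taken from FAERS.
ror, ci_lower, ci_upper = reporting_odds_ratio(25, 75, 10, 90)
# A lower CI bound above 1 is the usual disproportionality criterion.
is_disproportionate = ci_lower > 1
```

In the study's terms, a "significant ROR" would correspond to an interval that excludes 1; the adjusted estimates reported above would come from a regression model rather than from this closed-form calculation.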

    Social Bots for Online Public Health Interventions

    According to the Centers for Disease Control and Prevention, in the United States hundreds of thousands of people initiate smoking each year, and millions live with smoking-related diseases. Many tobacco users discuss their habits and preferences on social media. This work conceptualizes a framework for targeted health interventions to inform tobacco users about the consequences of tobacco use. We designed a Twitter bot named Notobot (short for No-Tobacco Bot) that leverages machine learning to identify users posting pro-tobacco tweets and select individualized interventions to address their interest in tobacco use. We searched the Twitter feed for tobacco-related keywords and phrases, and trained a convolutional neural network using over 4,000 tweets manually labeled as either pro-tobacco or not pro-tobacco. This model achieves a 90% recall rate on the training set and 74% on test data. Users posting pro-tobacco tweets are matched with former smokers with similar interests who posted anti-tobacco tweets. Algorithmic matching, based on the power of peer influence, allows for the systematic delivery of personalized interventions based on real anti-tobacco tweets from former smokers. Experimental evaluation suggests that our system would perform well if deployed. This research offers opportunities for public health researchers to increase health awareness at scale. Future work entails deploying the fully operational Notobot system in a controlled experiment within a public health campaign.
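The recall figures quoted above measure the share of truly pro-tobacco tweets that the classifier flags. A minimal sketch of that calculation follows; the helper name and the labels are hypothetical and unrelated to the Notobot data.

```python
def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    true_positives = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    false_negatives = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p != positive
    )
    if true_positives + false_negatives == 0:
        raise ValueError("no positive examples in y_true")
    return true_positives / (true_positives + false_negatives)


# Made-up labels: 1 = pro-tobacco, 0 = not pro-tobacco.
example_recall = recall([1, 1, 1, 0, 0], [1, 0, 1, 1, 0])
```

Recall ignores false positives, so it says nothing about how many users would be contacted in error; precision would be needed for that.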