
    Data Mining in Electronic Commerce

    Modern business is rushing toward e-commerce. If the transition is done properly, it enables better management, new services, lower transaction costs and better customer relations. Success depends on skilled information technologists, among whom are statisticians. This paper focuses on some of the contributions that statisticians are making to help change the business world, especially through the development and application of data mining methods. This is a very large area, and the topics we cover are chosen to avoid overlap with other papers in this special issue, as well as to respect the limitations of our expertise. Inevitably, electronic commerce has raised and is raising fresh research problems in a very wide range of statistical areas, and we try to emphasize those challenges. Comment: Published at http://dx.doi.org/10.1214/088342306000000204 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    What Makes them Click: Empirical Analysis of Consumer Demand for Search Advertising

    We study users' response to sponsored-search advertising using data from Microsoft's Live AdCenter distributed in the "Beyond Search" initiative. We estimate a structural model of utility-maximizing users, which quantifies "user experience" based on their "revealed preferences" and predicts user responses to counterfactual ad placements. In the model, each user chooses clicks sequentially to maximize his expected utility under incomplete information about the relevance of ads. We estimate the substitutability of ads in users' utility function, the fixed effects of different ads and positions, user uncertainty about ads' relevance, and user heterogeneity. We find substantial substitutability of ads, which generates large negative externalities: 40% more clicks would occur in a hypothetical world in which each ad faces no competition. As for counterfactual ad placements, our simulations indicate that CTR-optimal matching increases CTR by 10.1%, while user-optimal matching increases user welfare by 13.3%. Moreover, targeting ad placement to specific users could raise user welfare by 59%. We also find a significant suboptimality (up to 16% of total welfare) when the search engine implements a sophisticated matching policy using a misspecified model that does not account for externalities. Finally, user welfare could be raised by 14% if users had full information about the relevance of ads to them.
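
    To make the sequential-click mechanism described above concrete, here is a minimal, hypothetical sketch of a user who scans ads top-down and clicks whenever the expected payoff of another click exceeds its cost, with earlier clicks lowering the value of later ones (the substitutability the paper estimates). It is not the authors' estimated structural model; the relevance values, position weights, click cost and noise level are all made-up parameters.

        import random

        def simulate_user(relevance, position_weight, click_cost=0.3, noise=0.2, seed=0):
            """Return the 0-indexed positions a simulated user clicks."""
            rng = random.Random(seed)
            clicks = []
            satisfaction = 0.0  # utility already obtained from earlier clicks
            for pos, rel in enumerate(relevance):
                signal = rel + rng.gauss(0.0, noise)             # noisy signal of ad relevance
                expected_utility = signal * position_weight[pos] - satisfaction
                if expected_utility > click_cost:                # click only while it still looks worthwhile
                    clicks.append(pos)
                    satisfaction += rel                          # substitutability: later ads are worth less
            return clicks

        # Four ads in decreasing order of relevance, with decaying position weights
        print(simulate_user([0.9, 0.7, 0.4, 0.2], [1.0, 0.8, 0.6, 0.5]))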

    The Pennsylvania reemployment bonus experiments: how a survival model helps in the analysis of the data

    Survival models for lifetime data and other time-to-event data are widely used in many fields, including medicine, the environmental sciences, and engineering. They have also found recognition in the analysis of economic duration data. This paper provides a reanalysis of the Pennsylvania Reemployment Bonus Experiments, which were conducted in 1988-89 to examine the effect of different types of reemployment bonus offers on the duration of unemployment spells. A Cox proportional hazards survival model is fitted to the data, and the results are compared to those of a linear regression approach and of a quantile regression approach. The Cox proportional hazards model provides a remarkable goodness of fit and yields weaker estimated treatment responses, and therefore lower expectations concerning the overall implications of the Pennsylvania experiment. An influence analysis is proposed for obtaining qualitative information on the influence of the covariates at different quantiles. The results of the quantile regression and of the influence analysis show that both the linear regression and the Cox model still impose stringent restrictions on the way covariates influence the duration distribution; however, due to its flexibility, the Cox proportional hazards model is more appropriate for analysing the data.
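
    As a hedged illustration of the modelling approach (not a reproduction of the Pennsylvania analysis), the sketch below fits a Cox proportional hazards model to synthetic unemployment-duration data using the lifelines Python package; the covariates, effect sizes and censoring rule are invented for the example.

        import numpy as np
        import pandas as pd
        from lifelines import CoxPHFitter

        rng = np.random.default_rng(42)
        n = 500
        bonus = rng.integers(0, 2, n)                  # 1 = offered a reemployment bonus (hypothetical covariate)
        age = rng.normal(35, 10, n)

        # Hypothetical data-generating process: the bonus offer raises the reemployment hazard
        hazard = 0.05 * np.exp(0.4 * bonus - 0.01 * (age - 35))
        weeks = rng.exponential(1.0 / hazard)
        observed = (weeks < 52).astype(int)            # right-censor spells longer than one year

        df = pd.DataFrame({
            "weeks": np.minimum(weeks, 52),
            "observed": observed,
            "bonus": bonus,
            "age": age,
        })

        cph = CoxPHFitter()
        cph.fit(df, duration_col="weeks", event_col="observed")
        cph.print_summary()   # exp(coef) > 1 for "bonus" means shorter unemployment spells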

    Non-convex Optimization for Machine Learning

    A vast majority of machine learning algorithms train their models and perform inference by solving optimization problems. In order to capture the learning and prediction problems accurately, structural constraints such as sparsity or low rank are frequently imposed, or else the objective itself is designed to be a non-convex function. This is especially true of algorithms that operate in high-dimensional spaces or that train non-linear models such as tensor models and deep networks. The freedom to express the learning problem as a non-convex optimization problem gives immense modeling power to the algorithm designer, but such problems are often NP-hard to solve. A popular workaround has been to relax non-convex problems to convex ones and use traditional methods to solve the (convex) relaxed optimization problems. However, this approach may be lossy and nevertheless presents significant challenges for large-scale optimization. On the other hand, direct approaches to non-convex optimization have met with resounding success in several domains and remain the methods of choice for the practitioner, as they frequently outperform relaxation-based techniques; popular heuristics include projected gradient descent and alternating minimization. However, these heuristics are often poorly understood in terms of their convergence and other properties. This monograph presents a selection of recent advances that bridge a long-standing gap in our understanding of these heuristics. It leads the reader through several widely used non-convex optimization techniques, as well as applications thereof. The goal of this monograph is both to introduce the rich literature in this area and to equip the reader with the tools and techniques needed to analyze these simple procedures for non-convex problems. Comment: The official publication is available from now publishers via http://dx.doi.org/10.1561/220000005
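
    As a concrete instance of the projected gradient descent heuristic mentioned in the abstract, the sketch below (an illustrative example, not code from the monograph) minimizes a least-squares objective under a non-convex sparsity constraint by alternating a gradient step with projection onto the set of k-sparse vectors; the problem dimensions, sparsity level and noise are arbitrary choices. The projection, which keeps only the k largest-magnitude coordinates, is exactly the step that makes the constraint set non-convex.

        import numpy as np

        def project_sparse(v, k):
            """Project onto the non-convex set of k-sparse vectors: keep the k largest-magnitude entries."""
            out = np.zeros_like(v)
            keep = np.argsort(np.abs(v))[-k:]
            out[keep] = v[keep]
            return out

        def projected_gradient_descent(A, y, k, iters=200):
            """Minimize ||Ax - y||^2 subject to ||x||_0 <= k via gradient steps followed by projection."""
            step = 1.0 / np.linalg.norm(A, 2) ** 2   # conservative step size from the largest singular value
            x = np.zeros(A.shape[1])
            for _ in range(iters):
                grad = A.T @ (A @ x - y)             # gradient of the least-squares objective
                x = project_sparse(x - step * grad, k)
            return x

        # Hypothetical demo: recover a 5-sparse vector from 100 noisy random measurements
        rng = np.random.default_rng(0)
        A = rng.standard_normal((100, 300))
        x_true = np.zeros(300)
        support = rng.choice(300, 5, replace=False)
        x_true[support] = rng.standard_normal(5)
        y = A @ x_true + 0.01 * rng.standard_normal(100)
        x_hat = projected_gradient_descent(A, y, k=5)
        print("recovered support:", sorted(np.flatnonzero(x_hat)))
        print("true support:     ", sorted(support))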