    CASP-DM: Context Aware Standard Process for Data Mining

    We propose an extension of the Cross Industry Standard Process for Data Mining (CRISP-DM) that addresses specific challenges of machine learning and data mining in handling context and model reuse. This new general context-aware process model is mapped to the CRISP-DM reference model, proposing some new or enhanced outputs.

    An Overview of Issues in Developing Industrial Data Mining and Knowledge Discovery Applications

    This paper surveys the growing number of industrial applications of data mining and knowledge discovery. We look at the existing tools, describe some representative applications, and discuss the major issues and problems for building and deploying successful applications and their adoption by business users. Finally, we examine how to assess the potential of a knowledge discovery application.

    Data Mining and Decision Support: An Integrative Approach

    On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection

    Humans are the final decision makers in critical tasks that involve ethical and legal concerns, ranging from recidivism prediction, to medical diagnosis, to fighting against fake news. Although machine learning models can sometimes achieve impressive performance in these tasks, these tasks are not amenable to full automation. To realize the potential of machine learning for improving human decisions, it is important to understand how assistance from machine learning models affects human performance and human agency. In this paper, we use deception detection as a testbed and investigate how we can harness explanations and predictions of machine learning models to improve human performance while retaining human agency. We propose a spectrum between full human agency and full automation, and develop varying levels of machine assistance along the spectrum that gradually increase the influence of machine predictions. We find that without showing predicted labels, explanations alone slightly improve human performance in the end task. In comparison, human performance is greatly improved by showing predicted labels (>20% relative improvement) and can be further improved by explicitly suggesting strong machine performance. Interestingly, when predicted labels are shown, explanations of machine predictions induce a similar level of accuracy as an explicit statement of strong machine performance. Our results demonstrate a tradeoff between human performance and human agency and show that explanations of machine predictions can moderate this tradeoff. Comment: 17 pages, 19 figures, in Proceedings of ACM FAT* 2019, dataset & demo available at https://deception.machineintheloop.co

    Understanding and Improving Continuous Experimentation: From A/B Testing to Continuous Software Optimization

    Controlled experiments (i.e. A/B tests) are used by many companies with user-intensive products to improve their software with user data. Some companies adopt an experiment-driven approach to software development with continuous experimentation (CE). With CE, every user-affecting software change is evaluated in an experiment and specialized roles seek out opportunities to experiment with functionality. The goal of the thesis is to describe current practice and support CE in industry. The main contributions are threefold. First, a review of the CE literature on: infrastructure and processes, the problem-solution pairs applied in industry practice, and the benefits and challenges of the practice. Second, a multi-case study with 12 companies to analyze how experimentation is used and why some companies fail to fully realize the benefits of CE. A theory for Factors Affecting Continuous Experimentation (FACE) is constructed to realize this goal. Finally, a toolkit called Constraint Oriented Multi-variate Bandit Optimization (COMBO) is developed for supporting automated experimentation with many variables simultaneously, live in a production environment. The research in the thesis is conducted under the design science paradigm using empirical research methods, with simulation experiments of tool proposals and a multi-case study on company usage of CE. Other research methods include systematic literature review and theory building. From FACE we derive three factors that explain CE utility: (1) investments in data infrastructure, (2) user problem complexity, and (3) incentive structures for experimentation. Guidelines are provided on how to strive towards state-of-the-art CE based on company factors. All three factors are relevant for companies wanting to use CE, in particular, for those companies wanting to apply algorithms such as those in COMBO to support personalization of software to users' context in a process of continuous optimization.

    Consumer Life Cycle and Profiling: A Data Mining Perspective

    With the development of technology and continuously increasing market demand, companies are driven to produce better merchandise. Each customer wants an individual approach or an exclusive product, which gives rise to the concept of “one customer, one product.” Implementing this one-to-one approach is currently the main and most exciting task for companies. From the manufacturers’ point of view, millions of customers lead to millions of exclusive products. In a market economy, the primary step is to study the needs of customers. The main task for a company is to know its customers and to provide the products and services they desire. To learn customers’ wishes ahead of time, a system for profiling potential customers is created. This chapter reviews the customer lifetime from reaching the customer (claiming a future customer’s attention) to customer loyalty (turning a customer into a company advocate). In the discussion of the customer lifetime, readers will become acquainted with technologies such as funnel analysis, data management platforms, customer profiling, customer behavior analysis, and others. Used in combination, the listed technologies make it possible to create a one-to-one product or service with a high Return on Investment (ROI).

    Linkage Knowledge Management and Data Mining in E-business: Case study

    Knowledge-Driven CRM: Issues and Challenges

    In this paper, we examine the issues surrounding the convergence of KDD (Knowledge Discovery in Databases) and CRM (Customer Relationship Management) in building knowledge-driven CRM. By understanding these issues and challenges, we hope to achieve a better understanding of customers and thus create a better CRM solution.