1,513 research outputs found

    Modeling Customer Experience in a Contact Center through Process Log Mining

    Get PDF
    The use of data mining and modeling methods in service industry is a promising avenue for optimizing current processes in a targeted manner, ultimately reducing costs and improving customer experience. However, the introduction of such tools in already established pipelines often must adapt to the way data is sampled and to its content. In this study, we tackle the challenge of characterizing and predicting customer experience having available only process log data with time-stamp information, without any ground truth feedback from the customers. As a case study, we consider the context of a contact center managed by TeleWare and analyze phone call logs relative to a two months span. We develop an approach to interpret the phone call process events registered in the logs and infer concrete points of improvement in the service management. Our approach is based on latent tree modeling and multi-class NaĂŻve Bayes classification, which jointly allow us to infer a spectrum of customer experiences and test their predictability based on the current data sampling strategy. Moreover, such approach can overcome limitations in customer feedback collection and sharing across organizations, thus having wide applicability and being complementary to tools relying on more heavily constrained data

    Fame for sale: efficient detection of fake Twitter followers

    Get PDF
    Fake followers\textit{Fake followers} are those Twitter accounts specifically created to inflate the number of followers of a target account. Fake followers are dangerous for the social platform and beyond, since they may alter concepts like popularity and influence in the Twittersphere - hence impacting on economy, politics, and society. In this paper, we contribute along different dimensions. First, we review some of the most relevant existing features and rules (proposed by Academia and Media) for anomalous Twitter accounts detection. Second, we create a baseline dataset of verified human and fake follower accounts. Such baseline dataset is publicly available to the scientific community. Then, we exploit the baseline dataset to train a set of machine-learning classifiers built over the reviewed rules and features. Our results show that most of the rules proposed by Media provide unsatisfactory performance in revealing fake followers, while features proposed in the past by Academia for spam detection provide good results. Building on the most promising features, we revise the classifiers both in terms of reduction of overfitting and cost for gathering the data needed to compute the features. The final result is a novel Class A\textit{Class A} classifier, general enough to thwart overfitting, lightweight thanks to the usage of the less costly features, and still able to correctly classify more than 95% of the accounts of the original training set. We ultimately perform an information fusion-based sensitivity analysis, to assess the global sensitivity of each of the features employed by the classifier. The findings reported in this paper, other than being supported by a thorough experimental methodology and interesting on their own, also pave the way for further investigation on the novel issue of fake Twitter followers

    From complex questionnaire and interviewing data to intelligent Bayesian network models for medical decision support

    Get PDF
    OBJECTIVES: 1) To develop a rigorous and repeatable method for building effective Bayesian network (BN) models for medical decision support from complex, unstructured and incomplete patient questionnaires and interviews that inevitably contain examples of repetitive, redundant and contradictory responses; 2) To exploit expert knowledge in the BN development since further data acquisition is usually not possible; 3) To ensure the BN model can be used for interventional analysis; 4) To demonstrate why using data alone to learn the model structure and parameters is often unsatisfactory even when extensive data is available. METHOD: The method is based on applying a range of recent BN developments targeted at helping experts build BNs given limited data. While most of the components of the method are based on established work, its novelty is that it provides a rigorous consolidated and generalised framework that addresses the whole life-cycle of BN model development. The method is based on two original and recent validated BN models in forensic psychiatry, known as DSVM-MSS and DSVM-P. RESULTS: When employed with the same datasets, the DSVM-MSS demonstrated competitive to superior predictive performance (AUC scores 0.708 and 0.797) against the state-of-the-art (AUC scores ranging from 0.527 to 0.705), and the DSVM-P demonstrated superior predictive performance (cross-validated AUC score of 0.78) against the state-of-the-art (AUC scores ranging from 0.665 to 0.717). More importantly, the resulting models go beyond improving predictive accuracy and into usefulness for risk management purposes through intervention, and enhanced decision support in terms of answering complex clinical questions that are based on unobserved evidence. CONCLUSIONS: This development process is applicable to any application domain which involves large-scale decision analysis based on such complex information, rather than based on data with hard facts, and in conjunction with the incorporation of expert knowledge for decision support via intervention. The novelty extends to challenging the decision scientists to reason about building models based on what information is really required for inference, rather than based on what data is available and hence, forces decision scientists to use available data in a much smarter way

    TEXTUAL DATA MINING FOR NEXT GENERATION INTELLIGENT DECISION MAKING IN INDUSTRIAL ENVIRONMENT: A SURVEY

    Get PDF
    This paper proposes textual data mining as a next generation intelligent decision making technology for sustainable knowledge management solutions in any industrial environment. A detailed survey of applications of Data Mining techniques for exploiting information from different data formats and transforming this information into knowledge is presented in the literature survey. The focus of the survey is to show the power of different data mining techniques for exploiting information from data. The literature surveyed in this paper shows that intelligent decision making is of great importance in many contexts within manufacturing, construction and business generally. Business intelligence tools, which can be interpreted as decision support tools, are of increasing importance to companies for their success within competitive global markets. However, these tools are dependent on the relevancy, accuracy and overall quality of the knowledge on which they are based and which they use. Thus the research work presented in the paper uncover the importance and power of different data mining techniques supported by text mining methods used to exploit information from semi-structured or un-structured data formats. A great source of information is available in these formats and when exploited by combined efforts of data and text mining tools help the decision maker to take effective decision for the enhancement of business of industry and discovery of useful knowledge is made for next generation of intelligent decision making. Thus the survey shows the power of textual data mining as the next generation technology for intelligent decision making in the industrial environment
    • …
    corecore