18,665 research outputs found

    Actionable knowledge discovery : methodologies and frameworks

    Full text link
    University of Technology, Sydney. Faculty of Engineering and Information Technology.Most data mining algorithms and tools stop at the mining and delivery of patterns satisfying expected technical interestingness. There are often many patterns mined but business people either are not interested in them or do not know what follow-up actions to take to support their business decisions. This issue has seriously affected the widespread employment of advanced data mining techniques in greatly promoting enterprise operational quality and productivity. In this thesis, a formal and systematic view of actionable knowledge discovery (AKD for short) has been proposed from the system and microeconomy perspectives. AKD is a closed-loop optimization problem-solving process from problem definition, framework/model design to actionable pattern discovery, and to deliver operationalizable business rules that can be seamlessly associated or integrated with business processes and systems. To support AKD, corresponding methodologies, frameworks and tools have been proposed with case studies in the real world to address critical challenges facing the traditional KDD and. to cater for crucially important factors surrounding real-life AKD. First, a comprehensive survey and retrospection on the existing data mining methodologies, issues and challenges in actionable knowledge discovery are reviewed. Second, a practical data mining methodology: domain driven data mining is addressed. Third, several frameworks have been proposed to support domain drivenactionable knowledge discovery. Fourth, case studies of domain-driven actionable pattern mining in stock markets and social security data are presented to demonstrate the usefulness and potential of the proposed domain driven actionable knowledge discovery. In summary, this thesis explores in detail how domain driven actionable knowledge discovery can be effectively and efficiently applied to the discovery and delivery of knowledge satisfying both technical and business concerns as well as to support smart decision-making in the real world. The issues and techniques addressed in this thesis have potential to promote the research on critical KDD challenges, and contribute to the paradigm shift from data-centered and technical significance-oriented hidden pattern mining to domain-driven and balanced actionable knowledge discovery. The proposed methodologies and frameworks are flexible, general and effective to be expanded and applied to mining real-life complex data for actionable knowledge

    Predictive User Modeling with Actionable Attributes

    Get PDF
    Different machine learning techniques have been proposed and used for modeling individual and group user needs, interests and preferences. In the traditional predictive modeling instances are described by observable variables, called attributes. The goal is to learn a model for predicting the target variable for unseen instances. For example, for marketing purposes a company consider profiling a new user based on her observed web browsing behavior, referral keywords or other relevant information. In many real world applications the values of some attributes are not only observable, but can be actively decided by a decision maker. Furthermore, in some of such applications the decision maker is interested not only to generate accurate predictions, but to maximize the probability of the desired outcome. For example, a direct marketing manager can choose which type of a special offer to send to a client (actionable attribute), hoping that the right choice will result in a positive response with a higher probability. We study how to learn to choose the value of an actionable attribute in order to maximize the probability of a desired outcome in predictive modeling. We emphasize that not all instances are equally sensitive to changes in actions. Accurate choice of an action is critical for those instances, which are on the borderline (e.g. users who do not have a strong opinion one way or the other). We formulate three supervised learning approaches for learning to select the value of an actionable attribute at an instance level. We also introduce a focused training procedure which puts more emphasis on the situations where varying the action is the most likely to take the effect. The proof of concept experimental validation on two real-world case studies in web analytics and e-learning domains highlights the potential of the proposed approaches

    Autoencoders for strategic decision support

    Full text link
    In the majority of executive domains, a notion of normality is involved in most strategic decisions. However, few data-driven tools that support strategic decision-making are available. We introduce and extend the use of autoencoders to provide strategically relevant granular feedback. A first experiment indicates that experts are inconsistent in their decision making, highlighting the need for strategic decision support. Furthermore, using two large industry-provided human resources datasets, the proposed solution is evaluated in terms of ranking accuracy, synergy with human experts, and dimension-level feedback. This three-point scheme is validated using (a) synthetic data, (b) the perspective of data quality, (c) blind expert validation, and (d) transparent expert evaluation. Our study confirms several principal weaknesses of human decision-making and stresses the importance of synergy between a model and humans. Moreover, unsupervised learning and in particular the autoencoder are shown to be valuable tools for strategic decision-making

    A case of using formal concept analysis in combination with emergent self organizing maps for detecting domestic violence.

    Get PDF
    In this paper, we propose a framework for iterative knowledge discovery from unstructured text using Formal Concept Analysis and Emergent Self Organizing Maps. We apply the framework to a real life case study using data from the Amsterdam-Amstelland police. The case zooms in on the problem of distilling concepts for domestic violence from the unstructured text in police reports. Our human-centered framework facilitates the exploration of the data and allows for an efficient incorporation of prior expert knowledge to steer the discovery process. This exploration resulted in the discovery of faulty case labellings, common classification errors made by police officers, confusing situations, missing values in police reports, etc. The framework was also used for iteratively expanding a domain-specific thesaurus. Furthermore, we showed how the presented method was used to develop a highly accurate and comprehensible classification model that automatically assigns a domestic or non-domestic violence label to police reports.Formal concept analysis; Emergent self organizing map; Text mining; Actionable knowledge discovery; Domestic violence;

    Alexandria: Extensible Framework for Rapid Exploration of Social Media

    Full text link
    The Alexandria system under development at IBM Research provides an extensible framework and platform for supporting a variety of big-data analytics and visualizations. The system is currently focused on enabling rapid exploration of text-based social media data. The system provides tools to help with constructing "domain models" (i.e., families of keywords and extractors to enable focus on tweets and other social media documents relevant to a project), to rapidly extract and segment the relevant social media and its authors, to apply further analytics (such as finding trends and anomalous terms), and visualizing the results. The system architecture is centered around a variety of REST-based service APIs to enable flexible orchestration of the system capabilities; these are especially useful to support knowledge-worker driven iterative exploration of social phenomena. The architecture also enables rapid integration of Alexandria capabilities with other social media analytics system, as has been demonstrated through an integration with IBM Research's SystemG. This paper describes a prototypical usage scenario for Alexandria, along with the architecture and key underlying analytics.Comment: 8 page

    Interpretable Predictions of Tree-based Ensembles via Actionable Feature Tweaking

    Full text link
    Machine-learned models are often described as "black boxes". In many real-world applications however, models may have to sacrifice predictive power in favour of human-interpretability. When this is the case, feature engineering becomes a crucial task, which requires significant and time-consuming human effort. Whilst some features are inherently static, representing properties that cannot be influenced (e.g., the age of an individual), others capture characteristics that could be adjusted (e.g., the daily amount of carbohydrates taken). Nonetheless, once a model is learned from the data, each prediction it makes on new instances is irreversible - assuming every instance to be a static point located in the chosen feature space. There are many circumstances however where it is important to understand (i) why a model outputs a certain prediction on a given instance, (ii) which adjustable features of that instance should be modified, and finally (iii) how to alter such a prediction when the mutated instance is input back to the model. In this paper, we present a technique that exploits the internals of a tree-based ensemble classifier to offer recommendations for transforming true negative instances into positively predicted ones. We demonstrate the validity of our approach using an online advertising application. First, we design a Random Forest classifier that effectively separates between two types of ads: low (negative) and high (positive) quality ads (instances). Then, we introduce an algorithm that provides recommendations that aim to transform a low quality ad (negative instance) into a high quality one (positive instance). Finally, we evaluate our approach on a subset of the active inventory of a large ad network, Yahoo Gemini.Comment: 10 pages, KDD 201

    Curbing domestic violence: instantiating C-K theory with formal concept analysis and emergent self organizing maps.

    Get PDF
    In this paper we propose a human-centered process for knowledge discovery from unstructured text that makes use of Formal Concept Analysis and Emergent Self Organizing Maps. The knowledge discovery process is conceptualized and interpreted as successive iterations through the Concept-Knowledge (C-K) theory design square. To illustrate its effectiveness, we report on a real-life case study of using the process at the Amsterdam-Amstelland police in the Netherlands aimed at distilling concepts to identify domestic violence from the unstructured text in actual police reports. The case study allows us to show how the process was not only able to uncover the nature of a phenomenon such as domestic violence, but also enabled analysts to identify many types of anomalies in the practice of policing. We will illustrate how the insights obtained from this exercise resulted in major improvements in the management of domestic violence cases.Formal concept analysis; Emergent self organizing map; C-K theory; Text mining; Actionable knowledge discovery; Domestic violence;

    Community standards for open cell migration data

    Get PDF
    Cell migration research has become a high-content field. However, the quantitative information encapsulated in these complex and high-dimensional datasets is not fully exploited owing to the diversity of experimental protocols and non-standardized output formats. In addition, typically the datasets are not open for reuse. Making the data open and Findable, Accessible, Interoperable, and Reusable (FAIR) will enable meta-analysis, data integration, and data mining. Standardized data formats and controlled vocabularies are essential for building a suitable infrastructure for that purpose but are not available in the cell migration domain. We here present standardization efforts by the Cell Migration Standardisation Organisation (CMSO), an open community-driven organization to facilitate the development of standards for cell migration data. This work will foster the development of improved algorithms and tools and enable secondary analysis of public datasets, ultimately unlocking new knowledge of the complex biological process of cell migration
    • …
    corecore