
    DATA MINING LANGUAGES STANDARDS

    The growth in database size creates many problems, especially when we need to access, use, and analyze data. The data overflow phenomenon in database environments calls for the application of different data mining methods in order to find relevant information in large databases. Many data mining tools have emerged in recent years, and the standardization of data mining languages has become a very important topic. The paper presents the Predictive Model Markup Language (PMML) standard from the Data Mining Group. PMML is a standard language for defining data mining models; it allows users to develop models within one vendor's application and use other vendors' applications to visualize, analyze, evaluate, or otherwise use the models.

    Integrating Data Mining Into Business Intelligence

    Data mining is a broad term often used to describe the process of using database technology, modeling techniques, statistical analysis, and machine learning to analyze large amounts of data in an automated fashion to discover hidden patterns and predictive information in the data. By building highly complex and sophisticated statistical and mathematical models, organizations can gain new insight into their activities. The purpose of this document is to give users background on a few key data mining and business intelligence concepts and on the benefits of integrating business intelligence and data mining. Keywords: business intelligence, platform, data mining.

    You’ve Data Mined. Now What?

    Data-mining technologies are within the grasp of many organizations. Commercially available data-mining packages make it relatively easy for firms to transform their data resources into predictive models. Yet, despite technological advances, the precise manner in which data-mining output should be incorporated into an organization’s decision-making processes remains unclear. This paper attempts to clarify the role of data mining by situating it within the context of Simon’s model of decision making. We use a complex decision problem from the video game development industry to illustrate several practical challenges managers face when using data-mining output as a decision-making input. We then show how some of these challenges can be overcome by incorporating data-mined predictive models into a conventional decision-analytic formulation of the problem.

    An intelligent assistant for exploratory data analysis

    In this paper we present an account of the main features of SNOUT, an intelligent assistant for exploratory data analysis (EDA) of social science survey data that incorporates a range of data mining techniques. EDA has much in common with existing data mining techniques: its main objective is to help an investigator reach an understanding of the important relationships in a data set rather than simply develop predictive models for selected variables. Brief descriptions of a number of novel techniques developed for use in SNOUT are presented. These include heuristic variable level inference and classification, automatic category formation, the use of similarity trees to identify groups of related variables, interactive decision tree construction, and model selection using a genetic algorithm.

    PREDICTIVE DIAGNOSIS THROUGH DATA MINING FOR CARDIOVASCULAR DISEASES

    Cardiovascular diseases (CVDs) are a leading cause of mortality worldwide, and early detection and accurate diagnosis are critical for effective treatment and prevention. Data mining techniques have emerged as powerful tools for analyzing large datasets to extract meaningful patterns and make predictions. This research paper aims to explore the application of data mining in predictive diagnosis for cardiovascular diseases. The study will start by collecting a comprehensive dataset comprising patient information, including demographics, medical history, lifestyle factors, and diagnostic test results. Various data mining techniques, such as classification, clustering, and association rule mining, will be applied to uncover hidden patterns and relationships within the data. Feature selection methods will be employed to identify the most relevant attributes for accurate prediction. The research will investigate different predictive models, including decision trees, support vector machines, and neural networks, to develop a reliable diagnostic system. Model performance will be evaluated using metrics such as accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC-ROC). Additionally, the study will employ cross-validation techniques to ensure the generalizability and robustness of the developed models. The research will explore the integration of advanced techniques, such as deep learning and ensemble methods, to enhance the predictive accuracy of the diagnosis. The use of explainable AI techniques will also be considered to provide interpretable insights into the predictive models' decision-making process. The findings of this research will contribute to the advancement of predictive diagnosis for cardiovascular diseases by leveraging data mining techniques. The developed diagnostic models will assist healthcare professionals in making accurate and timely predictions, leading to improved patient outcomes, personalized treatment plans, and effective preventive measures.
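
    The evaluation metrics named in the abstract above (accuracy, sensitivity, specificity) can be illustrated with a minimal sketch. It assumes binary labels in which 1 marks a positive diagnosis and 0 a negative one; the function name `confusion_metrics` and the toy labels are hypothetical and do not come from the paper.

```python
import math


def confusion_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Assumption for illustration: 1 = disease present, 0 = disease absent.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return NaN rather than raising when a class is absent.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Toy labels, invented purely to exercise the function.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = confusion_metrics(y_true, y_pred)
print(metrics)
```

    Sensitivity and specificity are reported separately because, in a diagnostic setting, a missed case (false negative) and a false alarm (false positive) usually carry different costs that accuracy alone hides.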

    Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning

    Predictive pattern mining is an approach used to construct prediction models when the input is represented by structured data, such as sets, graphs, and sequences. The main idea behind predictive pattern mining is to build a prediction model by considering substructures, such as subsets, subgraphs, and subsequences (referred to as patterns), present in the structured data as features of the model. The primary challenge in predictive pattern mining lies in the exponential growth of the number of patterns with the complexity of the structured data. In this study, we propose the Safe Pattern Pruning (SPP) method to address the explosion of pattern numbers in predictive pattern mining. We also discuss how it can be effectively employed throughout the entire model building process in practical data analysis. To demonstrate the effectiveness of the proposed method, we conduct numerical experiments on regression and classification problems involving sets, graphs, and sequences.

    Causality-based cost-effective action mining

    In many business contexts, the ultimate goal of knowledge discovery is not the knowledge itself, but putting it to use. Models or patterns found by data mining methods often require further post-processing to bring this about. For instance, in churn prediction, data mining may give a model that predicts which customers are likely to end their contract, but companies are not just interested in knowing who is likely to do so; they want to know what they can do to avoid this. The models or patterns have to be transformed into actionable knowledge. Action mining explicitly addresses this. Currently, many action mining methods rely on a predictive model, obtained through data mining, to estimate the effect of certain actions and finally suggest actions with desirable effects. A major problem with this approach is that predictive models do not necessarily reflect a causal relationship between their inputs and outputs. This makes the existing action mining methods less reliable. In this paper, we introduce ICE-CREAM, a novel approach to action mining that explicitly relies on an automatically obtained best estimate of the causal relationships in the data. Experiments confirm that ICE-CREAM performs much better than the current state of the art in action mining.