
    Process mining techniques applied in industry

    Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. In today’s information era, several data-related scientific fields have arisen. Process Mining is relatively new and aims to combine techniques from two separate scientific areas: Business Process Management and Data Science. The main purpose of Process Mining is the discovery, monitoring and improvement of real processes. As a result, in the last few years, Process Mining has grown remarkably, and process insights have become increasingly relevant, in direct proportion to the amount and quality of data that supports the analyses. As a Data Engineer Intern at Nokia, I had the opportunity to be involved in the development phase of two business cases, as part of a team whose main objective is to explore and analyze several business processes within the company using Data Science techniques.

    Customers Behavior Modeling by Semi-Supervised Learning in Customer Relationship Management

    Leveraging growing amounts of data to analyze the customer base, in order to attract and retain the most valuable customers, is a major problem facing companies in this information age. Data mining technologies extract hidden information and knowledge from large volumes of data stored in databases or data warehouses, thereby supporting the corporate decision-making process. CRM uses data mining techniques (one of the elements of CRM) to interact with customers. This study investigates the use of one such technique, semi-supervised learning, for the management and analysis of customer-related data warehouses and information. The idea of semi-supervised learning is to learn not only from the labeled training data, but also to exploit the structural information in additionally available unlabeled data. The proposed semi-supervised method is a model based on a feed-forward neural network trained by a back-propagation algorithm (multi-layer perceptron) in order to predict the category of an unknown (potential) customer. In addition, this technique can be used with Rapid Miner tools for both labeled and unlabeled data.
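    The idea of learning from labeled and unlabeled data together can be illustrated with a minimal self-training (pseudo-labeling) sketch. This is not the method of the study above: a nearest-centroid classifier is swapped in for the multi-layer perceptron so the example stays dependency-free, and the function names, the `min_margin` threshold, and the toy customer points are all hypothetical.

```python
from math import dist


def centroids(points, labels):
    """Return the mean point of each class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(v / counts[y] for v in acc) for y, acc in sums.items()}


def predict(cents, p):
    """Return (label, margin): the nearest centroid's label and the
    distance gap to the second-nearest centroid (a crude confidence)."""
    ranked = sorted((dist(p, c), y) for y, c in cents.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def self_train(labeled, labels, unlabeled, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled points and refit."""
    points, ys, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(points, ys)
        confident, rest = [], []
        for p in pool:
            y, margin = predict(cents, p)
            (confident if margin >= min_margin else rest).append((p, y))
        if not confident:
            break  # nothing left that the model is sure about
        for p, y in confident:
            points.append(p)
            ys.append(y)
        pool = [p for p, _ in rest]
    return centroids(points, ys)


# Hypothetical toy data: two labeled customers, four unlabeled ones.
labeled = [(0.0, 0.0), (10.0, 10.0)]
labels = ["low_value", "high_value"]
unlabeled = [(1.0, 1.0), (9.0, 9.0), (4.0, 4.0), (0.5, 1.5)]

model = self_train(labeled, labels, unlabeled)
new_customer_label, _ = predict(model, (2.0, 2.0))
```

    The unlabeled points pull each class centroid toward where the data actually lies, which is the "structural information" the abstract refers to. A real system would replace the centroid model with the trained network and use its output probability as the confidence score.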

    The WHY in Business Processes: Discovery of Causal Execution Dependencies

    A crucial element in predicting the outcomes of process interventions and making informed decisions about the process is unraveling the genuine relationships between the execution of process activities. Contemporary process discovery algorithms exploit time precedence as their main source of model derivation. Such reliance can sometimes be deceiving from a causal perspective. This calls for faithful new techniques to discover the true execution dependencies among the tasks in the process. To this end, our work offers a systematic approach to unveiling the true causal business process by leveraging an existing causal discovery algorithm over activity timing. In addition, this work delves into a set of conditions under which process mining discovery algorithms generate a model that is incongruent with the causal business process model, and shows how the latter model can be methodologically employed for a sound analysis of the process. Our methodology searches for such discrepancies between the two models in the context of three causal patterns, and derives a new view in which these inconsistencies are annotated over the mined process model. We demonstrate our methodology employing two open process mining algorithms, the IBM Process Mining tool, and the LiNGAM causal discovery technique. We apply it to a synthesized dataset and to two open benchmark datasets. Comment: 20 pages, 19 figures

    Towards simplified insurance application via sparse questionnaire optimization

    © 2017 IEEE. Life insurance application requires in-person meetings with underwriters, tedious paperwork, and an average waiting period of six weeks before an offer can be made. This outdated process has become a barrier to broader consumer adoption, resulting in a large coverage gap. In this work, we aim to close this gap by leveraging data mining techniques to optimize the insurance questionnaire form. Our experiment on 10 years of insurance application data has identified that only ∼2% of all questions are highly relevant to determining the risks of applicants, resulting in a significantly simplified questionnaire.

    PREDICTIVE DIAGNOSIS THROUGH DATA MINING FOR CARDIOVASCULAR DISEASES

    Cardiovascular diseases (CVDs) are a leading cause of mortality worldwide, and early detection and accurate diagnosis are critical for effective treatment and prevention. Data mining techniques have emerged as powerful tools for analyzing large datasets to extract meaningful patterns and make predictions. This research paper aims to explore the application of data mining in predictive diagnosis for cardiovascular diseases. The study will start by collecting a comprehensive dataset comprising patient information, including demographics, medical history, lifestyle factors, and diagnostic test results. Various data mining techniques, such as classification, clustering, and association rule mining, will be applied to uncover hidden patterns and relationships within the data. Feature selection methods will be employed to identify the most relevant attributes for accurate prediction. The research will investigate different predictive models, including decision trees, support vector machines, and neural networks, to develop a reliable diagnostic system. Model performance will be evaluated using metrics such as accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC-ROC). Additionally, the study will employ cross-validation techniques to ensure the generalizability and robustness of the developed models. The research will explore the integration of advanced techniques, such as deep learning and ensemble methods, to enhance the predictive accuracy of the diagnosis. The use of explainable AI techniques will also be considered to provide interpretable insights into the predictive models' decision-making process. The findings of this research will contribute to the advancement of predictive diagnosis for cardiovascular diseases by leveraging data mining techniques. The developed diagnostic models will assist healthcare professionals in making accurate and timely predictions, leading to improved patient outcomes, personalized treatment plans, and effective preventive measures.
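    The evaluation metrics named in this abstract (accuracy, sensitivity, specificity, AUC-ROC) have standard definitions that can be sketched in a few lines. The function names and the toy labels below are hypothetical and not taken from the paper; the AUC is computed with the pairwise-ranking definition, which is simple but quadratic in the number of samples.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (recall on the diseased class) and
    specificity (recall on the healthy class)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def auc_roc(y_true, scores, positive=1):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels: 1 = CVD present, 0 = CVD absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
auc = auc_roc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

    In a diagnostic setting sensitivity and specificity are usually reported together, because accuracy alone can look high on imbalanced data even when most diseased patients are missed.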

    Using Neural Networks for Relation Extraction from Biomedical Literature

    Using different sources of information to support the automated extraction of relations between biomedical concepts contributes to the development of our understanding of biological systems. The primary comprehensive source of these relations is biomedical literature. Several relation extraction approaches have been proposed to identify relations between concepts in biomedical literature, notably those using neural network algorithms. The use of multichannel architectures composed of multiple data representations, as in deep neural networks, is leading to state-of-the-art results. The right combination of data representations can eventually lead to even higher evaluation scores in relation extraction tasks. Thus, biomedical ontologies play a fundamental role by providing semantic and ancestry information about an entity. The incorporation of biomedical ontologies has already been shown to improve on previous state-of-the-art results. Comment: Artificial Neural Networks book (Springer) - Chapter 1