Process mining techniques applied in industry
Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics.
In today's information era, several data-related scientific fields have emerged. Process Mining is relatively new and aims to combine techniques from two separate scientific areas: Business Process Management and Data Science. The main purpose of Process Mining is the discovery, monitoring and improvement of real processes. In the last few years, Process Mining has grown remarkably, and process insights have become more and more relevant, in direct proportion to the amount and quality of the data that supports the analyses.
As a Data Engineer Intern at Nokia, I had the opportunity to be involved in the development phase of two business cases, as part of a team whose main objective is to explore and analyze several business processes within the company using Data Science techniques.
Customers Behavior Modeling by Semi-Supervised Learning in Customer Relationship Management
Leveraging the power of increasing amounts of data to analyze customer base
for attracting and retaining the most valuable customers is a major problem
facing companies in this information age. Data mining technologies extract
hidden information and knowledge from large data stored in databases or data
warehouses, thereby supporting the corporate decision making process. CRM uses
data mining techniques, one of its elements, to interact with customers.
This study investigates the use of a technique, semi-supervised learning, for
the management and analysis of customer-related data warehouses and information.
The idea of semi-supervised learning is to learn not only from the labeled
training data but also to exploit the structural information in additionally
available unlabeled data. The proposed semi-supervised method uses a
feed-forward neural network (multi-layer perceptron) trained by the
backpropagation algorithm to predict the category of an unknown (potential)
customer. In addition, this technique can be used with Rapid Miner tools for
both labeled and unlabeled data.
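The semi-supervised idea described in this abstract (learn from labeled data, then exploit unlabeled data) can be illustrated with a minimal self-training loop. This is only a sketch under stated assumptions: it swaps the abstract's multi-layer perceptron for a nearest-centroid classifier to stay dependency-free, and the names `fit_centroids`, `self_train`, `classify`, the `margin` threshold, and the toy customer data are all hypothetical, not taken from the paper.

```python
import math

def fit_centroids(labeled):
    """Return {label: mean point} for a list of (point, label) pairs."""
    by_label = {}
    for point, label in labeled:
        by_label.setdefault(label, []).append(point)
    return {
        label: tuple(sum(coord) / len(points) for coord in zip(*points))
        for label, points in by_label.items()
    }

def ranked_labels(centroids, point):
    """(distance, label) pairs, nearest class centroid first."""
    return sorted((math.dist(point, c), label) for label, c in centroids.items())

def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """Self-training: pseudo-label confident unlabeled points, then refit.

    A point counts as confident when its nearest centroid is closer than the
    second nearest by at least `margin` (an assumed, hypothetical criterion).
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled)
        confident, uncertain = [], []
        for point in pool:
            ranking = ranked_labels(centroids, point)
            gap = ranking[1][0] - ranking[0][0] if len(ranking) > 1 else math.inf
            if gap >= margin:
                confident.append((point, ranking[0][1]))
            else:
                uncertain.append(point)
        if not confident:  # nothing new to learn from; stop early
            break
        labeled.extend(confident)
        pool = uncertain
    return fit_centroids(labeled), pool

def classify(centroids, point):
    """Predict the label of the nearest class centroid."""
    return ranked_labels(centroids, point)[0][1]

# Toy data: two labeled customers and five unlabeled ones (all invented).
labeled = [((0.0, 0.0), "low_value"), ((10.0, 10.0), "high_value")]
unlabeled = [(1.0, 1.0), (9.0, 9.0), (2.0, 0.0), (8.0, 10.0), (5.0, 5.0)]
centroids, pool = self_train(labeled, unlabeled)
print(classify(centroids, (3.0, 3.0)), pool)
```

In this toy run the ambiguous point midway between the two groups stays unlabeled, which is the usual safeguard in self-training against reinforcing uncertain guesses.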
The WHY in Business Processes: Discovery of Causal Execution Dependencies
A crucial element in predicting the outcomes of process interventions and
making informed decisions about the process is unraveling the genuine
relationships between the execution of process activities. Contemporary process
discovery algorithms exploit time precedence as their main source of model
derivation. Such reliance can sometimes be deceiving from a causal perspective.
This calls for faithful new techniques to discover the true execution
dependencies among the tasks in the process. To this end, our work offers a
systematic approach to the unveiling of the true causal business process by
leveraging an existing causal discovery algorithm over activity timing. In
addition, this work delves into a set of conditions under which process mining
discovery algorithms generate a model that is incongruent with the causal
business process model, and shows how the latter model can be methodologically
employed for a sound analysis of the process. Our methodology searches for such
discrepancies between the two models in the context of three causal patterns,
and derives a new view in which these inconsistencies are annotated over the
mined process model. We demonstrate our methodology employing two open process
mining algorithms, the IBM Process Mining tool, and the LiNGAM causal discovery
technique. We apply it to a synthesized dataset and to two open benchmark data
sets.
Comment: 20 pages, 19 figures
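To make concrete what "time precedence" means as a source of model derivation, the sketch below counts directly-follows pairs in an event log, a relation that many process discovery algorithms build on. It does not implement the causal discovery approach of the abstract above; the function name `directly_follows` and the toy activity names are hypothetical.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often activity `earlier` is immediately followed by `later`."""
    counts = Counter()
    for trace in traces:
        for earlier, later in zip(trace, trace[1:]):
            counts[(earlier, later)] += 1
    return counts

# Toy event log: each trace is the ordered list of activities of one case.
event_log = [
    ["receive", "check", "approve"],
    ["receive", "check", "reject"],
    ["receive", "approve"],
]
dfg = directly_follows(event_log)
for (earlier, later), n in sorted(dfg.items()):
    print(f"{earlier} -> {later}: {n}")
```

These counts record only that one activity happened right before another. They do not show that the first activity caused the second, which is the gap the abstract points to.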
Towards simplified insurance application via sparse questionnaire optimization
© 2017 IEEE. Life insurance application requires in-person meetings with underwriters, tedious paperwork, and an average waiting period of six weeks before an offer can be made. This outdated process has become a barrier to broader consumer adoption, resulting in a large coverage gap. In this work, we aim to close this gap by leveraging data mining techniques to optimize the insurance questionnaire form. Our experiment on 10 years of insurance application data has identified that only ∼2% of all questions show high relevance to determining the risks of applicants, resulting in a significantly simplified questionnaire.
PREDICTIVE DIAGNOSIS THROUGH DATA MINING FOR CARDIOVASCULAR DISEASES
Abstract
Cardiovascular diseases (CVDs) are a leading cause of mortality worldwide, and early detection and accurate diagnosis are critical for effective treatment and prevention. Data mining techniques have emerged as powerful tools for analyzing large datasets to extract meaningful patterns and make predictions. This research paper aims to explore the application of data mining in predictive diagnosis for cardiovascular diseases. The study will start by collecting a comprehensive dataset comprising patient information, including demographics, medical history, lifestyle factors, and diagnostic test results. Various data mining techniques, such as classification, clustering, and association rule mining, will be applied to uncover hidden patterns and relationships within the data. Feature selection methods will be employed to identify the most relevant attributes for accurate prediction. The research will investigate different predictive models, including decision trees, support vector machines, and neural networks, to develop a reliable diagnostic system. Model performance will be evaluated using metrics such as accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC-ROC). Additionally, the study will employ cross-validation techniques to ensure the generalizability and robustness of the developed models. The research will explore the integration of advanced techniques, such as deep learning and ensemble methods, to enhance the predictive accuracy of the diagnosis. The use of explainable AI techniques will also be considered to provide interpretable insights into the predictive models' decision-making process. The findings of this research will contribute to the advancement of predictive diagnosis for cardiovascular diseases by leveraging data mining techniques. 
The developed diagnostic models will assist healthcare professionals in making accurate and timely predictions, leading to improved patient outcomes, personalized treatment plans, and effective preventive measures.
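The evaluation metrics named in this abstract (accuracy, sensitivity, specificity, AUC-ROC) can be computed as in the following minimal sketch. The labels and scores are invented toy values, not results from the study, and the function names `diagnostic_metrics` and `auc_roc` and the 0.5 decision threshold are assumptions made for illustration.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = disease).

    Assumes both classes occur in y_true; otherwise a ratio is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

def auc_roc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair; assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: true labels and model scores for eight hypothetical patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.8, 0.2, 0.1, 0.7, 0.3, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = diagnostic_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is why studies like this one tend to report both.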
Using Neural Networks for Relation Extraction from Biomedical Literature
Using different sources of information to support the automated extraction of
relations between biomedical concepts contributes to the development of our
understanding of biological systems. The primary comprehensive source of these
relations is biomedical literature. Several relation extraction approaches have
been proposed to identify relations between concepts in biomedical literature,
notably using neural network algorithms. The use of multichannel architectures
composed of multiple data representations, as in deep neural networks, is
leading to state-of-the-art results. The right combination of data
representations can eventually lead us to even higher evaluation scores in
relation extraction tasks. Thus, biomedical ontologies play a fundamental role
by providing semantic and ancestry information about an entity. The
incorporation of biomedical ontologies has already been shown to enhance
previous state-of-the-art results.
Comment: Artificial Neural Networks book (Springer) - Chapter 1