2,484 research outputs found

    Mining structured Petri nets for the visualization of process behavior

    Visualization is essential for understanding the models obtained by process mining. Clear and efficient visual representations make the embedded information more accessible and analyzable. This work presents a novel approach for generating process models with structural properties that induce visually friendly layouts. Rather than generating a single model that captures all behaviors, a set of Petri net models is delivered, each one covering a subset of traces of the log. The models are mined by extracting slices of labelled transition systems with specific properties from the complete state space produced by the process logs. In most cases, few Petri nets are sufficient to cover a significant part of the behavior produced by the log. Peer reviewed. Postprint (author's final draft)

    Mining complete, precise and simple process models

    Process discovery algorithms are generally used to discover the underlying process that has been followed to achieve an objective. In general, these algorithms do not take into account any domain knowledge to derive process models, which allows them to be applied in a general manner. However, depending on the selected approach, different kinds of process models can be discovered, as each technique has its strengths and weaknesses, e.g., the expressiveness of the notation used. Hence, it is important to take into account the requirements of the domain when deciding which algorithm to use, as the correct assumptions can lead to richer process models. For instance, among the different application domains of process mining we can identify several fields that share an interesting requirement about the discovered process models. In security audits, discovered processes have to fulfill strict requirements. This means that the process model should reproduce as much behavior as possible; otherwise some violations may go undetected (replay fitness). On the other hand, in order to avoid false positives, process models should reproduce only the recorded behavior (precision). Finally, process models should be easily readable to better detect deviations (simplicity). Another clear example concerns the educational domain: to be of value for both teachers and learners, a discovered learning process should satisfy the aforementioned requirements. That is, to guarantee feasible and correct evaluations, teachers need access to all the activities performed by learners, so the learning process should be able to reproduce as much behavior as possible (replay fitness). Furthermore, the learning process should focus on the recorded behavior seen in the event log (precision), i.e., show only what the students did, and not what they might have done, while being easily interpretable by the teachers (simplicity).
One of the previous requirements is related to the readability of process models: simplicity. In process mining, one of the identified challenges is the appropriate visualization of process models, i.e., presenting the results of process discovery in such a way that people actually gain insights about the process. Process models that are unnecessarily complex can obscure the real behavior of the process rather than provide an intuition of what is really happening in an organization. However, achieving a good level of readability is not always straightforward, for instance, due to the representation used. Among the different approaches that aim to reduce the complexity of a process model, this PhD thesis focuses on two techniques. The first improves the readability of an already discovered process model through the inclusion of duplicate labels. The second is the hierarchization of a process model, i.e., giving the process model a well-known structure. The latter technique, however, requires domain knowledge to be taken into account, as different domains may rely on different requirements when improving the readability of the process model. In other words, in order to improve the interpretability and understandability of a process model, the hierarchization has to be driven by the domain. To sum up, concerning the aim of this PhD thesis, we can identify two main topics of interest. On the one hand, we are interested in retrieving process models that reproduce as much of the behavior recorded in the log as possible, without introducing unseen behavior. On the other hand, we try to reduce the complexity of the mined models in order to improve their readability. Hence, the aim of this PhD thesis is to discover process models considering replay fitness, precision and simplicity, while paying special attention to retrieving highly interpretable process models.
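
The two behavioral criteria named in this abstract, replay fitness and precision, can be illustrated with a deliberately simplified sketch. It assumes the model is given as a finite set of allowed traces (`model_language`, a hypothetical representation) and scores whole traces only; the metrics used in the process mining literature, such as token replay or alignments, are finer-grained, so this is not the thesis's method.

```python
from typing import List, Set, Tuple

Trace = Tuple[str, ...]


def trace_fitness(log: List[Trace], model_language: Set[Trace]) -> float:
    """Fraction of logged traces that the model can replay completely."""
    if not log:
        return 0.0
    return sum(1 for trace in log if trace in model_language) / len(log)


def trace_precision(log: List[Trace], model_language: Set[Trace]) -> float:
    """Fraction of traces allowed by the model that were observed in the log."""
    if not model_language:
        return 0.0
    observed = set(log)
    return sum(1 for trace in model_language if trace in observed) / len(model_language)


# Hypothetical event log: each trace is the activity sequence of one case.
log = [("a", "b", "c"), ("a", "c", "b"), ("a", "b", "c"), ("a", "d")]
# Hypothetical model: it misses ("a", "d") and allows the unseen ("a", "c", "c").
model = {("a", "b", "c"), ("a", "c", "b"), ("a", "c", "c")}

fitness = trace_fitness(log, model)      # 3 of 4 logged traces are replayable
precision = trace_precision(log, model)  # 2 of 3 model traces were observed
```

A model that allowed every possible trace would reach a fitness of 1 but a low precision, which is why the abstract treats the two criteria together.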

    Verslo procesų prognozavimo ir imitavimo taikant sisteminių įvykių žurnalų analizės metodus tyrimas [Research on business process prediction and simulation using system event log analysis methods]

    Business process (BP) analysis is one of the core activities in organisations that lead to improvements and achievement of a competitive edge. BP modelling and simulation are among the most widely applied methods for analysing and improving BPs. The analysis requires modelling the BP and applying analysis techniques to the models to answer queries leading to improvements. The input of the analysis process is BP models. The models can be in the form of BP models using industry-accepted BP modelling languages, mathematical models, simulation models and others. Model creation is the most important part of BP analysis, and it is both a time-consuming and a costly activity. Nowadays most of the data generated in organisations are electronic. Therefore, the re-use of such data can improve the results of the analysis. Thus, the main goal of the thesis is to improve BP analysis and simulation by proposing a method to discover a BP model from an event log and automate simulation model generation. The dissertation consists of an introduction, three main chapters and general conclusions. The first chapter discusses BP analysis methods. In addition, the process mining research area is presented, and the techniques for automated model discovery, model validation and execution prediction are analysed. The second part of the chapter investigates the area of BP simulation. The second chapter of the dissertation presents a novel method which automatically discovers a Bayesian Belief Network from an event log and, furthermore, automatically generates a BP simulation model. The discovery of the Bayesian Belief Network consists of three steps: the discovery of a directed acyclic graph, generation of conditional probability tables and their combination. The BP simulation model is generated from the discovered directed acyclic graph and uses the belief network inferences during the simulation to infer the execution of the BP and to generate activity data during the simulation.
The third chapter presents the experimental research of the proposed network and discusses the validity of the research and experiments. The experiments use selected logs that exhibit a wide array of behaviour. The experiments are performed in order to test the discovery of the graphs, the inference of the current process instance execution probability, the prediction of the future execution of the process instances and the correctness of the simulation. The results of the dissertation were published in 9 scientific publications, 2 of which were in reviewed scientific journals indexed in Clarivate Analytics Science Citation Index.
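
The step "generation of conditional probability tables" from an event log can be pictured with a minimal sketch. It assumes the simplest possible dependency structure, in which each activity is conditioned only on the activity directly preceding it in the same trace; the dissertation's discovered graph may condition on other parents, and the function name and sample log below are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def conditional_table(log: List[List[str]]) -> Dict[str, Dict[str, float]]:
    """Estimate P(next activity | current activity) by relative frequency."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for trace in log:
        # Count every pair of directly consecutive activities in the trace.
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    table: Dict[str, Dict[str, float]] = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        table[current] = {nxt: n / total for nxt, n in followers.items()}
    return table


# Hypothetical event log with four process instances.
log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "approve"],
    ["register", "approve"],
]
cpt = conditional_table(log)
# cpt["register"] holds the estimated distribution over what follows "register".
```

A simulation could then sample the next activity from `cpt[current]` at each step, which is one plausible reading of how inferred probabilities drive the generated activity data.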

    Privacy Preserving Utility Mining: A Survey

    In the big data era, collected data usually contain rich information and hidden knowledge. Utility-oriented pattern mining and analytics have shown a powerful ability to explore these ubiquitous data, which may be collected from various fields and applications, such as market basket analysis, retail, click-stream analysis, medical analysis, and bioinformatics. However, analysis of these data with sensitive private information raises privacy concerns. To achieve a better trade-off between utility maximization and privacy preservation, Privacy-Preserving Utility Mining (PPUM) has become a critical issue in recent years. In this paper, we provide a comprehensive overview of PPUM. We first present the background of utility mining, privacy-preserving data mining and PPUM, then introduce the related preliminaries and problem formulation of PPUM, as well as some key evaluation criteria for PPUM. In particular, we present and discuss the current state-of-the-art PPUM algorithms, as well as their advantages and deficiencies in detail. Finally, we highlight and discuss some technical challenges and open directions for future research on PPUM. Comment: 2018 IEEE International Conference on Big Data, 10 pages
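
As background for the utility notion this survey builds on, the sketch below computes the utility of an itemset in the usual high-utility itemset mining sense: purchased quantity times unit profit, summed over the transactions that contain every item of the itemset. The data, the names and the threshold reading in the closing note are hypothetical illustrations, not taken from the paper.

```python
from typing import Dict, Iterable, List


def itemset_utility(
    itemset: Iterable[str],
    transactions: List[Dict[str, int]],
    unit_profit: Dict[str, int],
) -> int:
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    items = list(itemset)
    total = 0
    for tx in transactions:
        # Only transactions that contain every item of the itemset contribute.
        if all(item in tx for item in items):
            total += sum(tx[item] * unit_profit[item] for item in items)
    return total


# Hypothetical database: each transaction maps an item to its purchased quantity.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]
unit_profit = {"a": 5, "b": 3, "c": 1}

u_ab = itemset_utility(["a", "b"], transactions, unit_profit)
u_a = itemset_utility(["a"], transactions, unit_profit)
```

In a privacy-preserving setting, one would typically perturb the database until the utility of each sensitive itemset drops below a chosen minimum-utility threshold, at the cost of some loss for non-sensitive itemsets.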

    A guided search genetic algorithm using mined rules for optimal affective product design

    Affective design is an important aspect of new product development, especially for consumer products, to achieve a competitive edge in the marketplace. It can help companies to develop new products that can better satisfy the emotional needs of customers. However, product designers usually encounter difficulties in determining the optimal settings of the design attributes for affective design. In this article, a novel guided search genetic algorithm (GA) approach is proposed to determine the optimal design attribute settings for affective design. The optimization model formulated with the proposed approach applies constraints and guided search operators, both derived from mined rules, to guide the GA search and to achieve desirable solutions. A case study on the affective design of mobile phones was conducted to illustrate the proposed approach and validate its effectiveness. Validation tests were conducted, and the results show that the guided search GA approach outperforms the GA approach without the guided search strategy in terms of GA convergence and computational time. In addition, the guided search optimization model is capable of improving the GA so that it generates good solutions for affective design.

    Longterm schedule optimization of an underground mine under geotechnical and ventilation constraints using SOT

    Long-term mine scheduling is complex as well as time- and labour-intensive. Yet in the mainstream of the mining industry, there is no computing program for schedule optimization and, in consequence, schedules are still created manually. The objective of this study was to compare a base case schedule generated with the Enhanced Production Scheduler (EPS®) and an optimized schedule generated with the Schedule Optimization Tool (SOT). The intent of having an optimized schedule is to improve the project value for underground mines. This study shows that SOT generates mine schedules that improve the Net Present Value (NPV) associated with orebody extraction. It does so by systematically and automatically exploring the options to vary the sequence and timing of mine activities, subject to constraints. First, a conventional scheduling method (EPS®) was adopted to identify a schedule of mining activities that satisfied basic sets of constraints, including physical adjacencies of mining activities and operational resource capacity. Additional constraint scenarios explored were geotechnical and ventilation constraints, which negatively affect development rates. Next, the automated SOT procedure was applied to determine whether the schedules could be improved upon. It was demonstrated that SOT permitted the rapid re-assessment of project value when new constraint scenarios were applied. This study showed that the automated schedule optimization added value to the project every time it was applied. In addition, re-optimization and re-evaluation were achieved quickly. Therefore, the tool used in this research produced more optimized schedules than those produced using conventional scheduling methods. Master of Applied Science (MASc) in Natural Resources Engineering
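
Why the sequence and timing of activities affect NPV can be seen from the standard discounting formula, NPV = sum over periods t of CF_t / (1 + r)^t. The sketch below assumes end-of-period cash flows, a 10% discount rate and two invented schedules with the same total cash flow; none of these figures come from the study, and the code does not reproduce how SOT or EPS work.

```python
from typing import Sequence


def npv(cash_flows: Sequence[float], rate: float) -> float:
    """Net present value of end-of-period cash flows, discounted from period 1."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows, start=1))


# Hypothetical schedules: same total cash flow, different timing.
base_schedule = [100.0, 100.0, 300.0]       # high-value ore extracted last
optimized_schedule = [300.0, 100.0, 100.0]  # high-value ore extracted first
rate = 0.10                                 # assumed discount rate per period

base_value = npv(base_schedule, rate)
optimized_value = npv(optimized_schedule, rate)
```

Because later cash flows are discounted more heavily, bringing high-value activities forward raises NPV even though the undiscounted total is unchanged. Constraints such as ventilation capacity limit how far activities can be moved.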

    The application of process mining to care pathway analysis in the NHS

    Background: Prostate cancer is the most common cancer in men in the UK and the sixth-fastest increasing cancer in males. Within England, survival rates are improving; however, they are comparatively poorer than in other countries. Currently, information available on outcomes of care is scant and there is an urgent need for techniques to improve healthcare systems and processes. Aims: To provide prostate cancer pathway analysis by applying concepts of process mining and visualisation and comparing the performance metrics against the standard pathway laid out by national guidelines. Methods: A systematic review was conducted to see how process mining has been used in healthcare. Appropriate datasets for prostate cancer were identified within Imperial College Healthcare NHS Trust London. A process model was constructed by linking and transforming cohort data from six distinct database sources. The cohort dataset was filtered to include patients who had a PSA from 2010-2015, and validated by comparing the medical patient records against a case-note audit. Process mining techniques were applied to the data to analyse performance and conformance of the prostate cancer pathway metrics to national guideline metrics. These techniques were evaluated with stakeholders to ascertain their impact on user experience. Results: The case-note audit revealed a 90% match against patients found in medical records. Application of process mining techniques showed massive heterogeneity compared to the homogeneous path laid out by national guidelines. This also gave insight into bottlenecks and deviations in the pathway. Evaluation with stakeholders showed that the visualisation and technology were well accepted, considered high quality and recommended for use in healthcare decision making. Conclusion: Process mining is a promising technique for gaining insight into complex and flexible healthcare processes.
It can map the patient journey at a local level and audit it against explicit standards of good clinical practice, which will enable us to intervene at the individual and system level to improve care. Open Access

    Process Mining Concepts for Discovering User Behavioral Patterns in Instrumented Software

    Process Mining is a technique for discovering “in-use” processes from traces emitted to event logs. Researchers have recently explored applying this technique to documenting processes discovered in software applications. However, the requirements for emitting events to support Process Mining of software applications have not been well documented. Furthermore, the linking of end-user intentional behavior to software quality as demonstrated in the discovered processes has not been well articulated. After evaluating the literature, this thesis suggested focusing on user goals and actual, in-use processes as an input to an Agile software development life cycle in order to improve software quality. It also provided suggestions for instrumenting software applications to support Process Mining techniques.
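
As a rough illustration of what instrumenting an application for Process Mining involves, the sketch below records the three attributes that process mining tools conventionally need for each event (a case identifier, an activity name and a timestamp) and then groups events into per-case traces. The `Event` class, the function names and the sample activities are assumptions made for this example, not the thesis's recommendations.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    case_id: str         # identifies one user session or process instance
    activity: str        # what the user or system did
    timestamp: datetime  # when it happened


event_log: List[Event] = []


def emit_event(case_id: str, activity: str, timestamp: datetime) -> None:
    """Instrumentation hook: append one event to the in-memory log."""
    event_log.append(Event(case_id, activity, timestamp))


def traces_by_case(events: List[Event]) -> Dict[str, List[str]]:
    """Group events by case and order each case's activities by time."""
    traces: Dict[str, List[str]] = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        traces[event.case_id].append(event.activity)
    return dict(traces)


# Hypothetical instrumented calls from two interleaved user sessions.
t0 = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
emit_event("session-1", "open_document", t0)
emit_event("session-2", "login", t0 + timedelta(minutes=1))
emit_event("session-1", "save", t0 + timedelta(minutes=3))
emit_event("session-1", "edit", t0 + timedelta(minutes=2))

traces = traces_by_case(event_log)
```

The resulting traces are the input a discovery algorithm would consume. Events emitted without a stable case identifier cannot be grouped this way, which is one reason emission requirements matter.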