228 research outputs found

    Acta Polytechnica Hungarica 2006


    Process mining: conformance and extension

    Today's business processes are realized by a complex sequence of tasks that are performed throughout an organization, often involving people from different departments and multiple IT systems. For example, an insurance company has a process to handle insurance claims for its clients, and a hospital has processes to diagnose and treat patients. Because there are many activities performed by different people throughout the organization, there is a lack of transparency about how exactly these processes are executed. However, understanding the process reality (the "as is" process) is the first necessary step to save costs, increase quality, or ensure compliance.

    The field of process mining aims to assist in creating process transparency by automatically analyzing processes based on existing IT data. Most processes are supported by IT systems nowadays. For example, Enterprise Resource Planning (ERP) systems such as SAP log all transaction information, and Customer Relationship Management (CRM) systems are used to keep track of all interactions with customers. Process mining techniques use this low-level log data (so-called event logs) to automatically generate process maps that visualize the process reality from different perspectives. For example, it is possible to automatically create process models that describe the causal dependencies between activities in the process.

    So far, process mining research has mostly focused on the discovery aspect (i.e., the extraction of models from event logs). This dissertation broadens the field of process mining to include the aspects of conformance and extension. Conformance aims at the detection of deviations from documented procedures by comparing the real process (as recorded in the event log) with an existing model that describes the assumed or intended process. Conformance is relevant for two reasons:

    1. Most organizations document their processes in some form. For example, process models are created manually to understand and improve the process, to comply with regulations, or for certification purposes. In the presence of existing models, it is often more important to point out the deviations from these models than to discover completely new ones. Discrepancies emerge because business processes change, or because the models did not accurately reflect the real process in the first place (due to the manual and subjective creation of these models). If the existing models do not correspond to the actual processes, then they have little value.

    2. Automatically discovered process models typically do not completely "fit" the event logs from which they were created. These discrepancies are due to noise and/or limitations of the discovery techniques used. Furthermore, in the context of complex and diverse process environments, the discovered models often need to be simplified to obtain useful insights. Therefore, it is crucial to be able to check how well a discovered process model actually represents the real process. Conformance techniques can be used to quantify the representativeness of a mined model before drawing further conclusions. They thus constitute an important quality measurement for the effective use of process discovery techniques in a practical setting.
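    The core conformance idea can be illustrated with a toy sketch: replay each logged trace against a model and report the fraction of moves the model allows. The thesis develops proper Petri-net based fitness metrics; the deliberately simplified directly-follows "model" and the event names below are invented for illustration only.

```python
# Toy conformance check: replay logged traces against a simple model
# given as allowed directly-follows pairs, and report the fraction of
# moves the model permits. The thesis replays traces on Petri nets;
# this sketch and its event names are only illustrative.
MODEL = {("register", "check"), ("check", "decide"),
         ("decide", "pay"), ("decide", "reject")}

def fitness(trace):
    """Fraction of directly-follows moves in the trace allowed by MODEL."""
    moves = list(zip(trace, trace[1:]))
    if not moves:
        return 1.0
    allowed = sum(1 for move in moves if move in MODEL)
    return allowed / len(moves)

log = [["register", "check", "decide", "pay"],   # conforming trace
       ["register", "decide", "pay"]]            # skips the "check" step
for trace in log:
    print(trace, f"fitness={fitness(trace):.2f}")
```

    The second trace scores 0.50 because the move from "register" directly to "decide" is not permitted by the model, which is exactly the kind of deviation a conformance technique is meant to surface.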
    Once one is confident in the quality of an existing or discovered model, extension aims at the enrichment of such models by the integration of additional characteristics such as time, cost, or resource utilization. By extracting additional information from an event log and projecting it onto an existing model, bottlenecks can be highlighted and correlations with other process perspectives can be identified. Such an integrated view on the process is needed to understand the root causes of potential problems and actually make process improvements. Furthermore, extension techniques can be used to create integrated simulation models from event logs that resemble the real process more closely than manually created simulation models.

    In Part II of this thesis, we provide a comprehensive framework for the conformance checking of process models. First, we identify fitness, precision/generalization, and structure as the relevant conformance dimensions. We develop several Petri-net based approaches to measure conformance in these dimensions and describe five case studies in which we successfully applied these conformance checking techniques to real and artificial examples. Furthermore, we provide a detailed literature review of related conformance measurement approaches (Chapter 4). Then, we study existing model evaluation approaches from the field of data mining. We develop three data mining-inspired evaluation approaches for discovered process models: one based on Cross Validation (CV), one based on the Minimum Description Length (MDL) principle, and one using methods based on Hidden Markov Models (HMMs). We conclude that process model evaluation faces similar yet different challenges compared to traditional data mining; additional challenges emerge from the sequential nature of the data and from the higher-level process models, which include concurrent dynamic behavior (Chapter 5). Finally, we point out current shortcomings and identify general challenges for conformance checking techniques. These challenges relate to the applicability of the conformance metric, the quality of the metric, and the bridging of different process modeling languages. We develop a flexible, language-independent conformance checking approach that provides a starting point to effectively address these challenges (Chapter 6).

    In Part III, we develop a concrete extension approach, provide a general model for process extensions, and apply our approach to the creation of simulation models. First, we develop a Petri-net based decision mining approach that aims at the discovery of decision rules at process choice points based on data attributes in the event log. While we leverage classification techniques from the data mining domain to actually infer the rules, we identify the challenges that relate to the initial formulation of the learning problem from a process perspective. We develop a simple approach to partially overcome these challenges, and we apply it in a case study (Chapter 7). Then, we develop a general model for process extensions to create integrated models covering the process, data, time, and resource perspectives. We develop a concrete representation based on Coloured Petri Nets (CPNs) to implement and deploy this model for simulation purposes (Chapter 8). Finally, we evaluate the quality of automatically discovered simulation models in two case studies and extend our approach to allow for operational decision making by incorporating the current process state as a non-empty starting point in the simulation (Chapter 9). Chapter 10 concludes this thesis with a detailed summary of the contributions and a list of limitations and future challenges.
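    As a rough illustration of the decision mining idea from Chapter 7 (not the thesis's actual Petri-net based implementation), a choice point can be cast as a classification problem over case data attributes and handed to an off-the-shelf decision tree learner. The attribute names, values, and branch labels below are invented:

```python
# Rough illustration of decision mining at a single process choice
# point: learn a rule that predicts which branch a case takes from its
# data attributes. The attributes and branch labels are hypothetical.
from sklearn.tree import DecisionTreeClassifier, export_text

# One row per case: [claim_amount, customer_years]; label = branch taken.
X = [[120, 1], [950, 4], [80, 2], [1500, 7], [60, 1], [2000, 3]]
y = ["quick_check", "full_check", "quick_check",
     "full_check", "quick_check", "full_check"]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["claim_amount", "customer_years"]))
```

    On this toy data the learner recovers a readable rule (roughly, high claim amounts route to the full check), which is the kind of decision rule the approach attaches to a choice point in the model.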
    The work presented in this dissertation is supported and accompanied by concrete implementations, which have been integrated into the ProM and ProMimport frameworks. Appendix A provides a comprehensive overview of the functionality of the developed software. The results of this dissertation have appeared in more than twenty peer-reviewed scientific publications, several of them in high-quality journals.

    Automatic and adaptive preprocessing for the development of predictive models

    In recent years, there has been increasing interest in extracting valuable information from large amounts of data. This information can be useful for making predictions about the future or inferring unknown values. There exists a multitude of predictive models for the most common tasks of classification and regression. However, researchers often assume that data is clean, and far too little attention has been paid to data preprocessing. Despite the fact that there are a number of methods for accomplishing individual preprocessing tasks (e.g., outlier detection or feature selection), the effort of performing comprehensive data preparation and cleaning can take between 60% and 80% of the whole data mining process time. One of the goals of this research is to speed up this process and make it more efficient.

    To this end, an approach for automating the selection and optimisation of multiple preprocessing methods and predictors has been proposed. The combination of multiple data mining methods forming a workflow is known as a Multi-Component Predictive System (MCPS). There are multiple software platforms, such as Weka and RapidMiner, for creating and running MCPSs, including a large variety of preprocessing methods and predictors. There is, however, no common mathematical representation of MCPSs. An objective of this thesis is to establish a common representation framework for MCPSs, which allows workflows to be validated before beginning the implementation phase with any particular platform.

    The validation of workflows becomes even more relevant when considering the automatic generation of MCPSs. In order to automate the composition and optimisation of MCPSs, a search space is defined consisting of a number of preprocessing methods, predictive models, and their hyperparameters. The space is then explored using a Bayesian optimisation strategy within a given time or computational budget. As a result, a parametrised sequence of methods is returned which, after training, forms a complete predictive system. The whole process is data-driven and does not require human intervention once it has been started.

    The generated predictive system can then be used to make predictions in an online scenario. However, it is possible that the nature of the input data changes over time. As a result, predictive models may need to be updated to capture the new characteristics of the data and reduce the loss of predictive performance; similarly, preprocessing methods may have to be adapted as well. A novel hybrid strategy combining Bayesian optimisation and common adaptive techniques is proposed to automatically adapt MCPSs. This approach performs a global adaptation of the MCPS. In some situations, however, it could be costly to update the whole predictive system when only a small adjustment is needed, yet the consequences of adapting a single component can be significant. This thesis therefore also analyses the impact of adapting individual components in an MCPS and proposes an approach to propagate changes through the system.

    This thesis was initiated by a joint research project with a chemical production company, which has provided several datasets exhibiting raw data issues common in the process industry. The final part of this thesis evaluates the feasibility of applying such automatic techniques to building and maintaining predictive models for real chemical production processes.
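    A minimal sketch of the composition-and-optimisation idea, using scikit-learn: preprocessing components and a predictor are chained into one pipeline, and a single search explores their joint hyperparameter space. Plain random search stands in here for the Bayesian optimisation strategy the thesis uses, and all component choices and parameter ranges are illustrative:

```python
# Minimal sketch of tuning a multi-component predictive system
# (preprocessing + predictor) as a single searchable pipeline. Random
# search is a stand-in for Bayesian optimisation; the components and
# parameter ranges are illustrative, not those used in the thesis.
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.impute import SimpleImputer
from sklearn.feature_selection import SelectKBest
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

mcps = Pipeline([
    ("impute", SimpleImputer()),                  # data cleaning component
    ("scale", StandardScaler()),                  # normalisation component
    ("select", SelectKBest()),                    # feature selection component
    ("predict", RandomForestClassifier(random_state=0)),
])

# Joint search space over preprocessing and predictor hyperparameters.
search_space = {
    "impute__strategy": ["mean", "median"],
    "select__k": [5, 10, 15, 20],
    "predict__n_estimators": [50, 100, 200],
    "predict__max_depth": [None, 5, 10],
}

search = RandomizedSearchCV(mcps, search_space, n_iter=20, cv=3,
                            random_state=0)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

    The key design point is that preprocessing choices and predictor hyperparameters are optimised jointly, within one budgeted search, rather than tuned in isolation.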

    Proceedings of The Multi-Agent Logics, Languages, and Organisations Federated Workshops (MALLOW 2010)

    MALLOW-2010 is the third edition of a series initiated in 2007 in Durham and pursued in 2009 in Turin. The objective, as initially stated, is to "provide a venue where: the cost of participation was minimum; participants were able to attend various workshops, so fostering collaboration and cross-fertilization; there was a friendly atmosphere and plenty of time for networking, by maximizing the time participants spent together".
    Full proceedings: http://ceur-ws.org/Vol-627/allproceedings.pdf

    Diagnosis of an EPS module

    Dissertation presented at the Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa to obtain the degree of Master in Electrical and Computer Engineering.

    This thesis addresses and contextualizes the problem of diagnosis in an Evolvable Production System (EPS). An EPS is a complex and lively entity composed of intelligent modules that interact through bio-inspired mechanisms to ensure high system availability and seamless reconfiguration. The current economic situation, together with the increasing demand for high-quality, low-priced customized products, has imposed a shift in the production policies of enterprises. Shop floors have to become more agile and flexible to accommodate the new production paradigms, and rather than selling products, enterprises are establishing a trend of offering services to explore business opportunities. The new production paradigms, enabled by advances in Information Technologies (IT), especially in web-related standards and technologies, as well as by the progressive acceptance of the multi-agent systems (MAS) concept and related technologies, envision collections of modules whose individual and collective function adapts and evolves, ensuring the fitness and adequacy of the shop floor in tackling profitable but volatile business opportunities.

    Despite the richness of these interactions and the effort spent modelling them, their potential to favour fault propagation and interference in such complex environments has been ignored from a diagnostic point of view. With the increase of distributed and autonomous components that interact in the execution of processes, current diagnostic approaches will soon be insufficient. While current system dynamics are complex and to a certain extent unpredictable, the adoption of the next generation of approaches and technologies comes at the cost of yet further increased complexity. Whereas most of the research in such distributed industrial systems is focused on the study and establishment of control structures, the problem of diagnosis has been left relatively unattended. There are, however, significant open challenges in the diagnosis of such modular systems, including understanding fault propagation and ensuring scalability and co-evolution. This work provides an implementation of a state-of-the-art agent-based, interaction-oriented architecture compliant with the EPS paradigm. It supports a newly developed diagnostic algorithm that is able to cope with the challenges of the modern manufacturing paradigm and to provide diagnostic analysis that explores the network dimension of multi-agent systems.

    Research in Business Process Management: A bibliometric analysis

    Business Process Management (BPM) contains several growing subtopics, such as process mining, process flexibility, and process compliance. BPM is also highly relevant for numerous related fields, such as Business Intelligence, ERP systems, and Knowledge Management. The growing number of publications and the variety of topics in BPM make it useful to apply bibliometric methods to this scientific field. With bibliometric methods, topical clusters, essential authors, and the relationships between them can be discovered.

    In this work, the BibTechMon software from the Austrian Institute of Technology is used to perform the bibliometric analyses. As a novelty for work with BibTechMon, data from Google Scholar is used as the basis of the analyses. The nature of Google Scholar data differs significantly from the data of other scientific databases, and these differences affect how the bibliometric analyses can be performed. After assessing these differences, several bibliometric analyses of the BPM field and related fields are carried out. As a result, diverse topical clusters in BPM and its related fields could be discovered, and important authors for each cluster and for the BPM field as a whole were determined.

    In order to evaluate the results of the bibliometric analyses, I conducted an interview on BPM with Professor Reichert, an active researcher in the field. His statements are then compared with the results of the bibliometric analyses, and the degree of agreement between the two is assessed.
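    One elementary building block behind such analyses is a co-occurrence count, from which networks of authors or topics can be clustered. The sketch below counts co-authorship pairs; the publication records are invented for illustration, and BibTechMon's actual analyses are considerably more elaborate:

```python
# Minimal sketch of a basic bibliometric building block: co-authorship
# co-occurrence counts, from which clusters and central authors can be
# derived. The publication records here are invented for illustration.
from collections import Counter
from itertools import combinations

publications = [
    {"authors": ["van der Aalst", "Weijters"]},
    {"authors": ["Reichert", "Dadam"]},
    {"authors": ["van der Aalst", "Reichert"]},
]

cooccurrence = Counter()
for pub in publications:
    for a, b in combinations(sorted(pub["authors"]), 2):
        cooccurrence[(a, b)] += 1  # edge weight in the co-authorship network

# Authors on many strong edges are candidates for "essential authors";
# community detection on this graph yields topical clusters.
for pair, weight in cooccurrence.most_common():
    print(pair, weight)
```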

    Industrial Applications: New Solutions for the New Era

    This book reprints articles from the Special Issue "Industrial Applications: New Solutions for the New Age", published online in the open-access journal Machines (ISSN 2075-1702). The book consists of twelve published articles. The Special Issue belongs to the journal's "Mechatronic and Intelligent Machines" section.