3,807 research outputs found

    Effective Removal of Operational Log Messages: an Application to Model Inference

    Model inference aims to extract accurate models from the execution logs of software systems. In practice, however, logs may contain "noise" that degrades the performance of model inference. One common form of noise arises in system logs that contain not only transactional messages (logging the functional behavior of the system) but also operational messages (recording the operational state of the system, e.g., a periodic heartbeat that tracks memory usage). In low-quality logs, transactional and operational messages are randomly interleaved, leading to the erroneous inclusion of operational behaviors in a system model that ideally should reflect only the functional behavior of the system. It is therefore important to remove operational messages from the logs before inferring models. In this paper, we propose LogCleaner, a novel technique for removing operational log messages. LogCleaner first performs a periodicity analysis to filter out periodic messages; it then performs a dependency analysis to calculate the degree of dependency for all log messages and removes operational messages based on their dependencies. Experimental results on two proprietary and 11 publicly available log datasets show that LogCleaner, on average, accurately removes 98% of the operational messages and preserves 81% of the transactional messages. Furthermore, using logs pre-processed with LogCleaner decreases the execution time of model inference (with a speed-up ranging from 1.5 to 946.7, depending on the characteristics of the system) and significantly improves the accuracy of the inferred models by increasing their ability to accept correct system behaviors (+43.8 pp on average, with pp = percentage points) and to reject incorrect system behaviors (+15.0 pp on average).
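
As a rough illustration of the periodicity step named in the abstract above, the sketch below flags message templates whose inter-arrival gaps are nearly constant and drops them from the log. It is not LogCleaner's actual algorithm: the function names, the coefficient-of-variation threshold `max_cv`, and the minimum occurrence count `min_count` are all assumptions made for this example, and the dependency analysis is not covered.

```python
from collections import defaultdict
from statistics import mean, pstdev


def periodic_templates(events, max_cv=0.1, min_count=4):
    """Return the templates whose inter-arrival gaps are nearly constant.

    events: iterable of (timestamp_seconds, template) pairs.
    max_cv: hypothetical threshold on the coefficient of variation of the gaps.
    min_count: minimum number of occurrences before a template is judged.
    """
    times = defaultdict(list)
    for ts, template in events:
        times[template].append(ts)

    periodic = set()
    for template, stamps in times.items():
        if len(stamps) < min_count:
            continue  # too few occurrences to call it periodic
        stamps.sort()
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        # A low standard deviation relative to the mean gap means regular timing.
        if avg > 0 and pstdev(gaps) / avg <= max_cv:
            periodic.add(template)
    return periodic


def drop_periodic(events, **kwargs):
    """Remove every event whose template was flagged as periodic."""
    events = list(events)
    periodic = periodic_templates(events, **kwargs)
    return [(ts, t) for ts, t in events if t not in periodic]


# Toy log: a heartbeat every 10 seconds interleaved with transactional messages.
log = [
    (0, "heartbeat"), (3, "login"), (10, "heartbeat"), (11, "query"),
    (20, "heartbeat"), (24, "logout"), (30, "heartbeat"),
]
cleaned = drop_periodic(log)
```

In this toy log only the heartbeat is flagged, so `cleaned` keeps the login, query, and logout events in their original order.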

    Recommender System Based on Process Mining

    Repetitive tasks can be automated with Robotic Process Automation (RPA), using scripts that encode fine-grained interactions with software applications on desktops and the web. Several RPA tools let users record desktop activity, including metadata. The recorded processes consist of very fine-grained steps such as clicking buttons, typing text, selecting text, and changing the focus. Automating these processes requires connectors that link them to the appropriate applications. Currently, users choose these connectors manually; they are not linked to processes automatically. In this thesis, we propose a method for recommending the top-k suitable connectors for each process based on event logs. The method uses process discovery to create process models of the training processes, whose connectors are known, and then performs conformance checking between these process models and the test event logs, whose connectors are unknown. We then select the top-k highest conformance checking results and observe that the suitable connector is among the top-3 recommended connectors with 80% accuracy. The solution can be configured by changing the parameters and the methods used for process discovery and conformance checking.
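
The recommendation idea in this abstract can be sketched in a few lines. The example below is a simplified stand-in, not the thesis's method: the "model" of each connector is just the set of directly-follows pairs seen in its training traces, and "conformance" is the fraction of steps in the test traces that the model allows. The connector names, activity labels, and function names are hypothetical.

```python
import heapq


def directly_follows(traces):
    """Collect the set of (a, b) pairs where b immediately follows a."""
    return {(a, b) for trace in traces for a, b in zip(trace, trace[1:])}


def fitness(model_pairs, traces):
    """Fraction of directly-follows steps in `traces` allowed by the model."""
    steps = [(a, b) for trace in traces for a, b in zip(trace, trace[1:])]
    if not steps:
        return 0.0
    return sum(step in model_pairs for step in steps) / len(steps)


def recommend(models, test_traces, k=3):
    """Return the k connector names with the highest fitness, best first."""
    scores = {name: fitness(pairs, test_traces) for name, pairs in models.items()}
    return heapq.nlargest(k, scores, key=scores.get)


# Training traces with known (hypothetical) connectors.
training = {
    "excel": [["open", "select", "type", "save"]],
    "browser": [["open", "click", "type", "click"]],
    "mail": [["open", "click", "send"]],
}
models = {name: directly_follows(traces) for name, traces in training.items()}

# Test log whose connector is unknown.
unknown = [["open", "click", "type", "click"]]
top2 = recommend(models, unknown, k=2)
```

Swapping `directly_follows` and `fitness` for a real discovery algorithm and a real conformance measure would leave `recommend` unchanged, which matches the abstract's point that both stages are configurable.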

    Automated Process Discovery: A Literature Review and a Comparative Evaluation with Domain Experts

    Process mining methods allow analysts to use logs of historical executions of business processes to gain knowledge about the actual performance of these processes. One of the most widely studied process mining operations is automated process discovery. An automated process discovery method takes an event log as input and produces a business process model as output that captures the control-flow relations between the tasks described by the event log. Several automated process discovery methods have been proposed in the past two decades, striking different tradeoffs between scalability, accuracy and complexity of the resulting models. So far, these methods have been evaluated in an ad hoc manner, with different authors employing different datasets, experimental setups, evaluation measures and baselines, often leading to incomparable conclusions and sometimes to unreproducible results due to the use of non-publicly available datasets. In this setting, this thesis provides a systematic review of automated process discovery methods and a systematic comparative evaluation of existing implementations of these methods, conducted with domain experts using four quality metrics and a real-life event log extracted from an international software engineering company. The review and evaluation results highlight gaps and unexplored tradeoffs in the field in the context of four business process model quality metrics. The results of this master's thesis allow researchers to address the shortcomings of automated process discovery methods, and also answer the question of how usable process discovery techniques are in industry.
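
To make "an event log as input, a control-flow model as output" concrete, the sketch below builds a directly-follows graph, one of the simplest discovery outputs. It is offered only as an illustration and is not one of the methods evaluated in the thesis; the activity names and the `min_count` filter are assumptions.

```python
from collections import Counter


def discover_dfg(event_log):
    """Count how often activity b directly follows activity a across traces."""
    dfg = Counter()
    for trace in event_log:
        dfg.update(zip(trace, trace[1:]))
    return dfg


def filter_dfg(dfg, min_count):
    """Keep only edges observed at least `min_count` times."""
    return {edge: n for edge, n in dfg.items() if n >= min_count}


# Each trace is the ordered list of activities for one case.
event_log = [
    ["register", "review", "approve"],
    ["register", "review", "reject"],
    ["register", "review", "approve"],
]
dfg = discover_dfg(event_log)
frequent = filter_dfg(dfg, min_count=2)
```

Raising `min_count` yields a simpler graph that explains fewer of the observed traces, a small-scale version of the accuracy-versus-complexity tradeoff the abstract mentions.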

    Turning Logs into Lumber: Preprocessing Tasks in Process Mining

    Event logs are invaluable for conducting process mining projects, offering insights into process improvement and data-driven decision-making. However, data quality issues affect the correctness and trustworthiness of these insights, making preprocessing tasks a necessity. Despite the recognized importance, the execution of preprocessing tasks remains ad-hoc, lacking support. This paper presents a systematic literature review that establishes a comprehensive repository of preprocessing tasks and their usage in case studies. We identify six high-level and 20 low-level preprocessing tasks in case studies. Log filtering, transformation, and abstraction are commonly used, while log enriching, integration, and reduction are less frequent. These results can be considered a first step in contributing to more structured, transparent event log preprocessing, enhancing process mining reliability. Comment: Accepted by EdbA'23 workshop, co-located with ICPM 202

    Health intelligence: Discovering the process model using process mining by constructing Start-to-End patient journeys

    Archived with the publisher's permission. Copyright © 2014, Australian Computer Society, Inc. This paper appeared at the Australasian Workshop on Health Informatics and Knowledge Management (HIKM 2014), Auckland, New Zealand. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 153. J. Warren and K. Gray, Eds. Reproduction for academic, not-for-profit purposes permitted provided this text is included. Australian public hospitals are continually engaged in process improvement activities to improve patient care and hospital efficiency as the demand for service intensifies. As a consequence, many initiatives within the health sector focus on gaining insight into the underlying health processes, which are assessed for compliance with specified Key Performance Indicators (KPIs). Process mining is classified as a Business Intelligence (BI) tool. The aim of process mining activities is to gain insight into the underlying process or processes. The fundamental element needed for process mining is a historical event log of a process. Generally, these event logs are easily sourced from Process Aware Information Systems (PAIS). Simulation is widely used by hospitals as a tool to study the complex hospital setting and for prediction. Generally, simulation models are constructed by hand. This paper presents a novel way of deriving event logs from health data in the absence of PAIS. The constructed event log is then used as input for process mining activities, taking advantage of existing process mining algorithms to aid the discovery of knowledge about the underlying processes, which leads to Health Intelligence (HI). One such output of process mining, presented in this paper, is the discovery of a process model for simulation, using an event log derived by constructing start-to-end patient journeys. The study was undertaken using data from Flinders Medical Centre to gain insight into patient journeys from the point of admission to the Emergency Department (ED) until the patient is discharged from the hospital.
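
One plausible reading of "constructing start-to-end patient journeys" is grouping timestamped records by patient, ordering them, and keeping only complete journeys. The sketch below shows that reading only; the record layout, activity labels, and function name are hypothetical and are not taken from the paper or from the Flinders Medical Centre data.

```python
from collections import defaultdict


def build_journeys(records, start="ED admission", end="Discharge"):
    """Group (patient_id, timestamp, activity) records into ordered journeys.

    Only journeys that begin with `start` and finish with `end` are kept,
    so every returned trace is a complete start-to-end journey.
    """
    by_patient = defaultdict(list)
    for patient_id, timestamp, activity in records:
        by_patient[patient_id].append((timestamp, activity))

    journeys = {}
    for patient_id, events in by_patient.items():
        events.sort()  # ISO 8601 timestamps sort correctly as strings
        activities = [activity for _, activity in events]
        if activities[0] == start and activities[-1] == end:
            journeys[patient_id] = activities
    return journeys


# Records as they might arrive from separate, unordered source tables.
records = [
    ("p1", "2014-01-01T09:00", "ED admission"),
    ("p2", "2014-01-01T09:30", "ED admission"),
    ("p1", "2014-01-01T11:00", "Ward transfer"),
    ("p1", "2014-01-03T15:00", "Discharge"),
    ("p2", "2014-01-01T12:00", "Triage"),
]
journeys = build_journeys(records)
```

Patient `p2` has no discharge record, so only `p1` yields a trace that could be fed to a process discovery algorithm.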

    AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments

    This report considers the application of Artificial Intelligence (AI) techniques to the problem of misuse detection and misuse localisation within telecommunications environments. A broad survey of techniques is provided that covers, inter alia, rule based systems, model-based systems, case based reasoning, pattern matching, clustering and feature extraction, artificial neural networks, genetic algorithms, artificial immune systems, agent based systems, data mining and a variety of hybrid approaches. The report then considers the central issue of event correlation, which is at the heart of many misuse detection and localisation systems. The notion of being able to infer misuse by the correlation of individual temporally distributed events within a multiple data stream environment is explored, and a range of techniques is examined, covering model based approaches, 'programmed' AI and machine learning paradigms. It is found that, in general, correlation is best achieved via rule based approaches, but that these suffer from a number of drawbacks, such as the difficulty of developing and maintaining an appropriate knowledge base, and the lack of ability to generalise from known misuses to new unseen misuses. Two distinct approaches are evident. One attempts to encode knowledge of known misuses, typically within rules, and use this to screen events. This approach cannot generally detect misuses for which it has not been programmed, i.e. it is prone to issuing false negatives. The other attempts to 'learn' the features of event patterns that constitute normal behaviour, and, by observing patterns that do not match expected behaviour, detect when a misuse has occurred. This approach is prone to issuing false positives, i.e. inferring misuse from innocent patterns of behaviour that the system was not trained to recognise. Contemporary approaches are seen to favour hybridisation, often combining detection or localisation mechanisms for both abnormal and normal behaviour, the former to capture known cases of misuse, the latter to capture unknown cases. In some systems, these mechanisms even work together to update each other to increase detection rates and lower false positive rates. It is concluded that hybridisation offers the most promising future direction, but that a rule or state based component is likely to remain, being the most natural approach to the correlation of complex events. The challenge, then, is to mitigate the weaknesses of canonical programmed systems such that learning, generalisation and adaptation are more readily facilitated.
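
The contrast between the two approaches, and the hybrid that combines them, can be shown with a toy classifier. This is only a sketch under stated assumptions: events are plain strings, "rules" are event pairs that signal known misuse, and "normal behaviour" is the set of event pairs seen in known-good training sequences. None of the names or signatures below come from the report.

```python
def learn_normal(training_sequences, n=2):
    """Collect every n-gram of events seen in known-good sequences."""
    return {
        tuple(seq[i:i + n])
        for seq in training_sequences
        for i in range(len(seq) - n + 1)
    }


def classify(sequence, misuse_rules, normal_ngrams, n=2):
    """Label a sequence 'known misuse', 'anomalous', or 'normal'."""
    grams = [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]
    # Rule-based component: catches only what it was programmed for.
    if any(g in misuse_rules for g in grams):
        return "known misuse"
    # Learned component: flags anything never seen in training.
    if any(g not in normal_ngrams for g in grams):
        return "anomalous"
    return "normal"


normal = learn_normal([["dial", "connect", "talk", "hangup"]])
rules = {("dial", "dial")}  # hypothetical signature: rapid redialling

labels = [
    classify(["dial", "connect", "talk", "hangup"], rules, normal),
    classify(["dial", "dial", "connect"], rules, normal),
    classify(["dial", "connect", "hangup"], rules, normal),
]
```

The third sequence may be perfectly innocent yet is labelled anomalous because `connect` followed by `hangup` was never seen in training, which is the false-positive risk the abstract attributes to the learning approach.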