Effective Removal of Operational Log Messages: an Application to Model Inference
Model inference aims to extract accurate models from the execution logs of
software systems. However, in reality, logs may contain some "noise" that could
deteriorate the performance of model inference. One form of noise can commonly
be found in system logs that contain not only transactional messages---logging
the functional behavior of the system---but also operational
messages---recording the operational state of the system (e.g., a periodic
heartbeat to keep track of the memory usage). In low-quality logs,
transactional and operational messages are randomly interleaved, leading to the
erroneous inclusion of operational behaviors into a system model that ideally
should only reflect the functional behavior of the system. It is therefore
important to remove operational messages in the logs before inferring models.
In this paper, we propose LogCleaner, a novel technique for removing
operational log messages. LogCleaner first performs a periodicity analysis to
filter out periodic messages, and then it performs a dependency analysis to
calculate the degree of dependency for all log messages and to remove
operational messages based on their dependencies. The experimental results on
two proprietary and 11 publicly available log datasets show that LogCleaner, on
average, can accurately remove 98% of the operational messages and preserve 81%
of the transactional messages. Furthermore, using logs pre-processed with
LogCleaner decreases the execution time of model inference (with a speed-up
ranging from 1.5 to 946.7 depending on the characteristics of the system) and
significantly improves the accuracy of the inferred models, by increasing their
ability to accept correct system behaviors (+43.8 pp on average, with
pp=percentage points) and to reject incorrect system behaviors (+15.0 pp on
average).
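To make the periodicity step concrete, here is a minimal sketch of one way periodic messages could be flagged. It is not LogCleaner's actual algorithm, which the abstract does not specify. It assumes each log entry is a (timestamp, template) pair and treats a template as periodic when its inter-arrival times are nearly constant. The function name `periodic_templates` and the thresholds `max_cv` and `min_occurrences` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def periodic_templates(events, max_cv=0.1, min_occurrences=3):
    """Return templates whose inter-arrival times are nearly constant.

    events: iterable of (timestamp_seconds, template) pairs.
    max_cv: assumed cut-off on the coefficient of variation of the gaps.
    """
    times = defaultdict(list)
    for ts, template in events:
        times[template].append(ts)

    periodic = set()
    for template, stamps in times.items():
        if len(stamps) < min_occurrences:
            continue  # too few occurrences to judge periodicity
        stamps.sort()
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        # A low spread relative to the mean gap suggests a periodic message.
        if avg > 0 and pstdev(gaps) / avg <= max_cv:
            periodic.add(template)
    return periodic


# Illustrative log: a heartbeat every 10 s interleaved with transactions.
log = [
    (0, "heartbeat mem"), (3, "login"), (10, "heartbeat mem"),
    (11, "query"), (20, "heartbeat mem"), (26, "logout"), (30, "heartbeat mem"),
]
noise = periodic_templates(log)
cleaned = [(ts, t) for ts, t in log if t not in noise]
```

A filter like this only covers strictly periodic messages; the dependency analysis described in the abstract would be needed for operational messages that are not periodic.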
Recommender System Based on Process Mining
Automation of repetitive tasks can be achieved with Robotic Process Automation (RPA) using scripts that encode fine-grained interactions with software applications on desktops and the web. Automating these processes can be achieved through several applications. It is possible for users to record desktop activity, including metadata, with these tools. The very fine-grained steps in the processes contain details about very small steps that the user takes. Several steps are involved in this process, including clicking on buttons, typing text, selecting the text, and changing the focus. Automating these processes requires connectors connecting them to the appropriate applications. Currently, users choose these connectors manually rather than automatically being linked to processes.
In this thesis, we propose a method for recommending the top-k suitable connectors for each process based on event logs. The method uses process discovery to build process models from the training processes, whose connectors are known, and then computes conformance between those models and the test event logs, whose connectors are unknown. We then select the top-k conformance values and observe that the suitable connector is among the top-3 recommended connectors with 80% accuracy. The solution is configurable: the parameters and the methods used for process discovery and conformance checking can be changed.
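The final selection step can be sketched as follows, assuming the conformance (fitness) score of the test event log against each connector's process model has already been computed. The connector names and score values are made up for illustration, and `recommend_connectors` is a hypothetical helper, not code from the thesis.

```python
import heapq


def recommend_connectors(fitness_by_connector, k=3):
    """Return the k connectors with the highest conformance score."""
    return heapq.nlargest(k, fitness_by_connector, key=fitness_by_connector.get)


# Hypothetical fitness of one test log against each connector's model.
scores = {"excel": 0.91, "browser": 0.64, "email": 0.78, "sap": 0.35}
top3 = recommend_connectors(scores, k=3)
```

Changing the discovery or conformance method only changes how `scores` is produced; the ranking step stays the same.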
Automated Process Discovery: A Literature Review and a Comparative Evaluation with Domain Experts
Process mining methods allow analysts to use logs of historical executions of business processes to gain knowledge about the actual performance of these processes. One of the most widely studied process mining operations is automated process discovery. An automated process discovery method takes an event log as input and produces a business process model that captures the control-flow relations between the tasks described by the event log. Several automated process discovery methods have been proposed in the past two decades, striking different tradeoffs between scalability, accuracy and complexity of the resulting models. So far, these methods have been evaluated in an ad hoc manner, with different authors employing different datasets, experimental setups, evaluation measures and baselines, often leading to incomparable conclusions and sometimes unreproducible results due to the use of non-publicly available datasets. In this setting, this thesis provides a systematic review of automated process discovery methods and a systematic comparative evaluation of existing implementations of these methods with domain experts, using a real-life event log extracted from an international software engineering company and four quality metrics. The review and evaluation results highlight gaps and unexplored tradeoffs in the field in the context of four business process model quality metrics. The results of this master's thesis allow researchers to address the shortcomings of automated process discovery methods and also answer the question of how usable process discovery techniques are in industry.
Turning Logs into Lumber: Preprocessing Tasks in Process Mining
Event logs are invaluable for conducting process mining projects, offering
insights into process improvement and data-driven decision-making. However,
data quality issues affect the correctness and trustworthiness of these
insights, making preprocessing tasks a necessity. Despite the recognized
importance, the execution of preprocessing tasks remains ad-hoc, lacking
support. This paper presents a systematic literature review that establishes a
comprehensive repository of preprocessing tasks and their usage in case
studies. We identify six high-level and 20 low-level preprocessing tasks in
case studies. Log filtering, transformation, and abstraction are commonly used,
while log enriching, integration, and reduction are less frequent. These
results can be considered a first step in contributing to more structured,
transparent event log preprocessing, enhancing process mining reliability.
Comment: Accepted by EdbA'23 workshop, co-located with ICPM 202
Health intelligence: Discovering the process model using process mining by constructing Start-to-End patient journeys
Archived with the publisher's permission. Copyright © 2014, Australian Computer Society, Inc.
This paper appeared at the Australasian Workshop on
Health Informatics and Knowledge Management (HIKM
2014), Auckland, New Zealand. Conferences in Research
and Practice in Information Technology (CRPIT), Vol.
153. J. Warren and K. Gray, Eds. Reproduction for
academic, not-for-profit purposes permitted provided this
text is included.
Australian Public Hospitals are continually engaged in
various process improvement activities to improve patient
care and to improve hospital efficiency as the demand for
service intensifies. As a consequence there are many
initiatives within the health sector focusing on gaining
insight into the underlying health processes which are
assessed for compliance with specified Key Performance
Indicators (KPIs). Process Mining is classified as a
Business Intelligence (BI) tool. The aim of process
mining activities is to gain insight into the underlying
process or processes. The fundamental element needed
for process mining is a historical event log of a process.
Generally, these event logs are easily sourced from
Process Aware Information Systems (PAIS). Simulation
is widely used by hospitals as a tool to study the complex
hospital setting and for prediction. Generally, simulation
models are constructed by ‘hand’. This paper presents a
novel way of deriving event logs for health data in the
absence of PAIS. The constructed event log is then used
as an input for process mining activities taking advantage
of existing process mining algorithms aiding the
discovery of knowledge of the underlying processes
which leads to Health Intelligence (HI). One such output
of process mining activity, presented in this paper, is the
discovery of a process model for simulation using the
derived event log as an input for process mining by
constructing start-to-end patient journeys. The study was
undertaken using data from Flinders Medical Centre to
gain insight into patient journeys from the point of
admission to the Emergency Department (ED) until the
patient is discharged from the hospital.
AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments
This report considers the application of Artificial Intelligence (AI) techniques to
the problem of misuse detection and misuse localisation within telecommunications
environments. A broad survey of techniques is provided, that covers inter alia
rule based systems, model-based systems, case based reasoning, pattern matching,
clustering and feature extraction, artificial neural networks, genetic algorithms, artificial
immune systems, agent based systems, data mining and a variety of hybrid
approaches. The report then considers the central issue of event correlation, that
is at the heart of many misuse detection and localisation systems. The notion of
being able to infer misuse by the correlation of individual temporally distributed
events within a multiple data stream environment is explored, along with a range of techniques
covering model based approaches, `programmed' AI and machine learning
paradigms. It is found that, in general, correlation is best achieved via rule based approaches,
but that these suffer from a number of drawbacks, such as the difficulty of
developing and maintaining an appropriate knowledge base, and the lack of ability
to generalise from known misuses to new unseen misuses. Two distinct approaches
are evident. One attempts to encode knowledge of known misuses, typically within
rules, and use this to screen events. This approach cannot generally detect misuses
for which it has not been programmed, i.e. it is prone to issuing false negatives.
The other attempts to `learn' the features of event patterns that constitute normal
behaviour, and, by observing patterns that do not match expected behaviour, detect
when a misuse has occurred. This approach is prone to issuing false positives,
i.e. inferring misuse from innocent patterns of behaviour that the system was not
trained to recognise. Contemporary approaches are seen to favour hybridisation,
often combining detection or localisation mechanisms for both abnormal and normal
behaviour, the former to capture known cases of misuse, the latter to capture
unknown cases. In some systems, these mechanisms even work together to update
each other to increase detection rates and lower false positive rates. It is concluded
that hybridisation offers the most promising future direction, but that a rule or state
based component is likely to remain, being the most natural approach to the correlation
of complex events. The challenge, then, is to mitigate the weaknesses of
canonical programmed systems such that learning, generalisation and adaptation
are more readily facilitated.