
    Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis (Extended)

    This extended paper presents 1) a novel hierarchy and recursion extension to the process tree model; and 2) the first recursion-aware process model discovery technique that leverages the hierarchical information in event logs typically available for software systems. This technique allows us to analyze the operational processes of software systems under real-life conditions at multiple levels of granularity. The work can be positioned in between reverse engineering and process mining. An implementation of the proposed approach is available as a ProM plugin. Experimental results based on real-life (software) event logs demonstrate the feasibility and usefulness of the approach and show the huge potential to speed up discovery by exploiting the available hierarchy.
    Comment: Extended version (14 pages total) of the paper Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis. This Technical Report version includes the guarantee proofs for the proposed discovery algorithm.
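
    As a concrete illustration of the recursion extension, the following Python sketch models a process tree whose leaves are activities and which, in addition to the classic operator nodes, has a Call node that references a named sub-model and may do so recursively. This is our own illustration, not the paper's formalism: the class names, the restriction to 'seq'/'xor' operators, and the depth-bounded play-out are all simplifying assumptions.

        import random
        from dataclasses import dataclass
        from typing import Dict, List

        @dataclass
        class Activity:                  # leaf node: one observable event
            label: str

        @dataclass
        class Operator:                  # classic process tree operator node
            op: str                      # 'seq' or 'xor' (a subset of the full operator set)
            children: List[object]

        @dataclass
        class Call:                      # recursion extension: invoke a named sub-model
            name: str

        def play_out(node, defs: Dict[str, object], depth: int = 3):
            """Generate one trace; 'depth' bounds how far recursive Calls unfold."""
            if isinstance(node, Activity):
                return [node.label]
            if isinstance(node, Call):
                return play_out(defs[node.name], defs, depth - 1) if depth > 0 else []
            if node.op == 'seq':
                return [e for child in node.children for e in play_out(child, defs, depth)]
            if node.op == 'xor':
                return play_out(random.choice(node.children), defs, depth)
            raise ValueError(f'unknown operator {node.op}')

        # A recursive 'method' f: enter, then either recurse or hit the base case, then exit,
        # mimicking the nested call structure visible in software event logs.
        defs = {'f': Operator('seq', [Activity('enter'),
                                      Operator('xor', [Call('f'), Activity('base')]),
                                      Activity('exit')])}
        print(play_out(Call('f'), defs))   # e.g. ['enter', 'enter', 'base', 'exit', 'exit']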

    Data mining based cyber-attack detection


    A Method to Improve the Early Stages of the Robotic Process Automation Lifecycle

    The robotic automation of processes is of much interest to organizations. A common use case is to automate the repetitive manual tasks (or processes) that are currently carried out by back-office staff through some information system (IS). The lifecycle of any Robotic Process Automation (RPA) project starts with the analysis of the process to automate. This is a very time-consuming phase, which in practical settings often relies on the study of process documentation. Such documentation is typically incomplete or inaccurate, e.g., some documented cases never occur, occurring cases are not documented, or documented cases differ from reality. Deploying robots in a production environment that were designed on such a shaky basis entails a high risk. This paper describes and evaluates a new proposal for the early stages of an RPA project: the analysis of a process and its subsequent design. The idea is to leverage the knowledge of the back-office staff, starting by monitoring them in a non-invasive manner through a screen-mouse-key logger, i.e., a sequence of images, mouse actions, and key actions is stored along with their timestamps. The log obtained in this way is transformed into a UI log through image-analysis techniques (e.g., fingerprinting or OCR) and then into a process model by the use of process discovery algorithms. We evaluated this method on two real-life industrial cases. The evaluation shows clear and substantial benefits in terms of accuracy and speed. This paper presents the method, along with a number of limitations that need to be addressed before it can be applied in wider contexts.
    Funding: Ministerio de Economía y Competitividad TIN2016-76956-C3-2-
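
    To illustrate the kind of transformation involved, the sketch below (our own simplification: the field names, the pre-applied OCR step, and the keystroke-grouping heuristic are all assumptions, not the authors' method) turns a raw screen-mouse-key log into a coarse UI log by keeping clicks, labeled with the OCR'd widget text, and collapsing runs of keystrokes into single 'type' actions.

        from dataclasses import dataclass
        from typing import List

        @dataclass
        class RawEvent:
            ts: float        # timestamp
            kind: str        # 'click' or 'key'
            payload: str     # the key pressed, or the OCR'd label of the clicked widget

        def to_ui_log(raw: List[RawEvent]) -> List[dict]:
            """Collapse runs of keystrokes into a single 'type' action; keep clicks as-is."""
            ui_log, buf, start = [], [], None
            for ev in raw:
                if ev.kind == 'key':
                    start = ev.ts if start is None else start
                    buf.append(ev.payload)
                else:
                    if buf:
                        ui_log.append({'ts': start, 'action': 'type', 'value': ''.join(buf)})
                        buf, start = [], None
                    ui_log.append({'ts': ev.ts, 'action': 'click', 'target': ev.payload})
            if buf:
                ui_log.append({'ts': start, 'action': 'type', 'value': ''.join(buf)})
            return ui_log

        raw = [RawEvent(0.0, 'click', 'Name field'),   # label recovered by OCR/fingerprinting
               RawEvent(0.4, 'key', 'J'), RawEvent(0.6, 'key', 'o'),
               RawEvent(1.9, 'click', 'Submit')]
        print(to_ui_log(raw))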

    Closing the loop of SIEM analysis to Secure Critical Infrastructures

    Critical Infrastructure Protection is one of the main challenges of recent years. Security Information and Event Management (SIEM) systems are widely used for coping with this challenge. However, they currently present several limitations that have to be overcome. In this paper we propose an enhanced SIEM system in which we have introduced novel components to i) enable multiple-layer data analysis; and ii) resolve conflicts among security policies and discover unauthorized data paths, in such a way as to be able to reconfigure network devices. Furthermore, the system is enriched by a Resilient Event Storage that ensures the integrity and unforgeability of stored events.
    Comment: EDCC-2014, BIG4CIP-2014, Security Information and Event Management, Decision Support System, Hydroelectric Dam
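
    One plausible reading of the policy-conflict component is pairwise checking for rules that can match the same traffic with opposite actions. The toy sketch below is our own illustration with hypothetical rule fields, not the paper's component; it flags such a pair of firewall-style rules.

        from dataclasses import dataclass
        from ipaddress import ip_network

        @dataclass
        class Rule:
            action: str      # 'allow' or 'deny'
            src: str         # source CIDR
            dst: str         # destination CIDR
            port: int

        def conflicts(a: Rule, b: Rule) -> bool:
            """Two rules conflict if some packet matches both with opposite actions."""
            return (a.action != b.action
                    and a.port == b.port
                    and ip_network(a.src).overlaps(ip_network(b.src))
                    and ip_network(a.dst).overlaps(ip_network(b.dst)))

        rules = [Rule('allow', '10.0.0.0/24', '10.1.0.0/24', 502),   # e.g. Modbus traffic
                 Rule('deny',  '10.0.0.0/16', '10.1.0.0/24', 502)]
        for i, r1 in enumerate(rules):
            for r2 in rules[i + 1:]:
                if conflicts(r1, r2):
                    print('policy conflict:', r1, '<->', r2)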

    Integrating E-Commerce and Data Mining: Architecture and Challenges

    We show that the e-commerce domain can provide all the right ingredients for successful data mining and claim that it is a killer domain for data mining. We describe an integrated architecture, based on our experience at Blue Martini Software, for supporting this integration. The architecture can dramatically reduce the pre-processing, cleaning, and data understanding effort often documented to take 80% of the time in knowledge discovery projects. We emphasize the need for data collection at the application server layer (not the web server) in order to support logging of data and metadata that is essential to the discovery process. We describe the data transformation bridges required from the transaction processing systems and customer event streams (e.g., clickstreams) to the data warehouse. We detail the mining workbench, which needs to provide multiple views of the data through reporting, data mining algorithms, visualization, and OLAP. We conclude with a set of challenges.
    Comment: KDD workshop: WebKDD 2000
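
    To make the application-server-layer logging argument concrete, here is a minimal sketch of what such instrumentation might look like: a handler decorator that emits each business-level action together with the session metadata only the application tier knows. The decorator, field names, and event types are illustrative assumptions, not Blue Martini's actual API.

        import json, time
        from functools import wraps

        def log_event(event_type, **metadata):
            """Emit one application-layer event, with business metadata attached."""
            record = {'ts': time.time(), 'type': event_type, **metadata}
            print(json.dumps(record))    # in practice: send to the warehouse bridge

        def logged(event_type):
            """Instrument an application-server handler so every call is logged."""
            def wrap(handler):
                @wraps(handler)
                def inner(session, *args, **kwargs):
                    log_event(event_type, session_id=session['id'],
                              user=session.get('user'), args=kwargs)
                    return handler(session, *args, **kwargs)
                return inner
            return wrap

        @logged('add_to_cart')
        def add_to_cart(session, **kwargs):
            session.setdefault('cart', []).append((kwargs['product_id'], kwargs['quantity']))

        add_to_cart({'id': 's42', 'user': 'alice'}, product_id='sku-1', quantity=2)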

    Automatic event log abstraction to support forensic investigation

    Abstraction of event logs is the creation of a template that contains the most common words representing all members of a group of event log entries. Abstraction helps forensic investigators obtain an overall view of the main events in a log file. Existing log abstraction methods require user input parameters; this manual input is time-consuming due to the need to identify the best parameters, especially when a log file is large. We propose an automatic method that facilitates event log abstraction and avoids the need for the user to manually identify suitable parameters. We model event logs as a graph and propose a new graph clustering approach to group log entries; the abstraction is then extracted from each cluster. Experimental results show that the proposed method achieves superior performance compared to existing approaches, with an F-measure of 95.35%.
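
    As a rough illustration of the template idea, the sketch below groups similar log entries and keeps only the tokens shared by every member of a group, masking the rest. Note that the paper's own approach clusters a graph of log entries without user parameters, whereas this stand-in uses a simple greedy token-overlap heuristic with a hypothetical threshold, which is exactly the kind of manual parameter the paper aims to eliminate.

        def cluster(lines, threshold=0.6):
            """Greedily group entries with the same token count and enough shared tokens.
            (The paper clusters a graph of log entries instead; this is a stand-in.)"""
            clusters = []
            for line in lines:
                toks = line.split()
                for group in clusters:
                    ref = group[0].split()
                    if len(ref) == len(toks) and \
                       sum(a == b for a, b in zip(ref, toks)) / len(toks) >= threshold:
                        group.append(line)
                        break
                else:
                    clusters.append([line])
            return clusters

        def template(group):
            """Keep tokens shared by every entry in the group; mask the rest with <*>."""
            rows = [line.split() for line in group]
            return ' '.join(tok if all(r[i] == tok for r in rows) else '<*>'
                            for i, tok in enumerate(rows[0]))

        logs = ['sshd accepted password for root from 10.0.0.5',
                'sshd accepted password for alice from 10.0.0.9',
                'kernel: disk sda1 mounted']
        for group in cluster(logs):
            print(template(group))
        # -> sshd accepted password for <*> from <*>
        # -> kernel: disk sda1 mounted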

    Data-Flow Modeling: A Survey of Issues and Approaches

    This paper presents a survey of previous research on modeling the data-flow perspective of business processes. Current research on modeling and analyzing business process models focuses on control-flow modeling (i.e., the activities of the process), and very little attention is paid to the data-flow perspective. But data is essential in a process: to execute a workflow, its tasks need data, and without data, or without data available on time, the control flow cannot be executed. For some time, various researchers have tried to investigate the data-flow perspective of process models or to combine control and data flow in one model. This paper surveys those approaches. We conclude that there is no model that shows a clear data-flow perspective focusing on how data changes during process execution. The literature offers some related approaches, ranging from data modeling using elements from the relational database domain, through process model verification, to elements related to Web Services.

    Data-Aware Declarative Process Mining with SAT

    Process Mining is a family of techniques for analyzing business process execution data recorded in event logs. Process models can be obtained as the output of automated process discovery techniques or used as the input of techniques for conformance checking or model enhancement. In Declarative Process Mining, process models are represented as sets of temporal constraints (instead of procedural descriptions where all control-flow details are explicitly modeled). An open research direction in Declarative Process Mining is whether multi-perspective specifications can be supported, i.e., specifications that describe the process behavior not only from the control-flow point of view but also from other perspectives such as data or time. In this paper, we address this question by considering SAT (the Propositional Satisfiability Problem) as a solving technology for a number of classical problems in Declarative Process Mining, namely log generation, conformance checking, and temporal query checking. To do so, we first express each problem as a suitable first-order (FO) theory whose bounded models represent solutions to the problem, and then find a bounded model of this theory by compilation into SAT.
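
    The paper compiles FO theories to SAT for log generation, conformance checking, and query checking. Purely as a toy illustration of the "bounded model = solution" idea, the following self-contained sketch (our own construction, not the authors' encoding) encodes the Declare constraint response(a, b) over traces of a fixed length as propositional clauses and extracts a satisfying trace, i.e., a generated log entry, with a naive splitting solver.

        from itertools import count

        ACTS, N = ('a', 'b', 'c'), 4             # activity alphabet and trace-length bound
        vid, ids = {}, count(1)

        def v(i, act):
            """Propositional variable: 'activity act occurs at trace position i'."""
            return vid.setdefault((i, act), next(ids))

        clauses = []
        for i in range(N):                       # exactly one activity per position
            clauses.append([v(i, a) for a in ACTS])
            clauses += [[-v(i, a), -v(i, b)] for a in ACTS for b in ACTS if a < b]
        for i in range(N):                       # response(a, b): every a is eventually followed by a b
            clauses.append([-v(i, 'a')] + [v(j, 'b') for j in range(i + 1, N)])
        clauses.append([v(i, 'a') for i in range(N)])   # and the trace must contain an a

        def solve(cnf, assign=()):
            """Naive splitting SAT solver; returns the set of literals made true."""
            if any(len(c) == 0 for c in cnf):
                return None                      # conflict: an empty clause
            if not cnf:
                return set(assign)               # all clauses satisfied: a bounded model
            lit = cnf[0][0]
            for choice in (lit, -lit):
                reduced = [[l for l in c if l != -choice] for c in cnf if choice not in c]
                model = solve(reduced, assign + (choice,))
                if model is not None:
                    return model
            return None

        model = solve(clauses)
        where = {n: key for key, n in vid.items()}
        trace = [None] * N
        for lit in model:
            if lit > 0:
                i, act = where[lit]
                trace[i] = act
        print(trace)                             # one trace satisfying the constraints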