45,543 research outputs found

    Alignment-based trace clustering

    Get PDF
    A novel method to cluster event log traces is presented in this paper. In contrast to the approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the clusters detected, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force to compute the clusters, instead of current model agnostic approaches, e.g., which group log traces merely on their vector-space similarity. We believe alignment-based trace clustering provides results more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs or concept drift, they can be more robust for dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations to the clusters computed. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider that uses an open-source solver.Peer ReviewedPostprint (author's final draft

    Generalized alignment-based trace clustering of process behavior

    Get PDF
    Process mining techniques use event logs containing real process executions in order to mine, align and extend process models. The partition of an event log into trace variants facilitates the understanding and analysis of traces, so it is a common pre-processing in process mining environments. Trace clustering automates this partition; traditionally it has been applied without taking into consideration the availability of a process model. In this paper we extend our previous work on process model based trace clustering, by allowing cluster centroids to have a complex structure, that can range from a partial order, down to a subnet of the initial process model. This way, the new clustering framework presented in this paper is able to cluster together traces that are distant only due to concurrency or loop constructs in process models. We show the complexity analysis of the different instantiations of the trace clustering framework, and have implemented it in a prototype tool that has been tested on different datasets.Peer ReviewedPostprint (author's final draft

    Generalized Alignment-Based Trace Clustering of Process Behavior

    Get PDF
    International audienceProcess mining techniques use event logs containing real process executions in order to mine, align and extend process models. The partition of an event log into trace variants facilitates the understanding and analysis of traces, so it is a common pre-processing in process mining environments. Trace clustering automates this partition; traditionally it has been applied without taking into consideration the availability of a process model. In this paper we extend our previous work on process model based trace clustering, by allowing cluster centroids to have a complex structure, that can range from a partial order, down to a sub-net of the initial process model. This way, the new clustering framework presented in this paper is able to cluster together traces that are distant only due to concurrency or loop constructs in process models. We show the complexity analysis of the different instantiations of the trace clustering framework, and have implemented it in a prototype tool that has been tested on different datasets

    Opaque Service Virtualisation: A Practical Tool for Emulating Endpoint Systems

    Full text link
    Large enterprise software systems make many complex interactions with other services in their environment. Developing and testing for production-like conditions is therefore a very challenging task. Current approaches include emulation of dependent services using either explicit modelling or record-and-replay approaches. Models require deep knowledge of the target services while record-and-replay is limited in accuracy. Both face developmental and scaling issues. We present a new technique that improves the accuracy of record-and-replay approaches, without requiring prior knowledge of the service protocols. The approach uses Multiple Sequence Alignment to derive message prototypes from recorded system interactions and a scheme to match incoming request messages against prototypes to generate response messages. We use a modified Needleman-Wunsch algorithm for distance calculation during message matching. Our approach has shown greater than 99% accuracy for four evaluated enterprise system messaging protocols. The approach has been successfully integrated into the CA Service Virtualization commercial product to complement its existing techniques.Comment: In Proceedings of the 38th International Conference on Software Engineering Companion (pp. 202-211). arXiv admin note: text overlap with arXiv:1510.0142
    corecore