45,543 research outputs found
Alignment-based trace clustering
A novel method to cluster event log traces is presented in this paper. In contrast to the approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the clusters detected, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force to compute the clusters, instead of current model agnostic approaches, e.g., which group log traces merely on their vector-space similarity. We believe alignment-based trace clustering provides results more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs or concept drift, they can be more robust for dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations to the clusters computed. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider that uses an open-source solver.Peer ReviewedPostprint (author's final draft
Generalized alignment-based trace clustering of process behavior
Process mining techniques use event logs containing real process executions in order to mine, align and extend process models. The partition of an event log into trace variants facilitates the understanding and analysis of traces, so it is a common pre-processing in process mining environments. Trace clustering automates this partition; traditionally it has been applied without taking into consideration the availability of a process model. In this paper we extend our previous work on process model based trace clustering, by allowing cluster centroids to have a complex structure, that can range from a partial order, down to a subnet of the initial process model. This way, the new clustering framework presented in this paper is able to cluster together traces that are distant only due to concurrency or loop constructs in process models. We show the complexity analysis of the different instantiations of the trace clustering framework, and have implemented it in a prototype tool that has been tested on different datasets.Peer ReviewedPostprint (author's final draft
Generalized Alignment-Based Trace Clustering of Process Behavior
International audienceProcess mining techniques use event logs containing real process executions in order to mine, align and extend process models. The partition of an event log into trace variants facilitates the understanding and analysis of traces, so it is a common pre-processing in process mining environments. Trace clustering automates this partition; traditionally it has been applied without taking into consideration the availability of a process model. In this paper we extend our previous work on process model based trace clustering, by allowing cluster centroids to have a complex structure, that can range from a partial order, down to a sub-net of the initial process model. This way, the new clustering framework presented in this paper is able to cluster together traces that are distant only due to concurrency or loop constructs in process models. We show the complexity analysis of the different instantiations of the trace clustering framework, and have implemented it in a prototype tool that has been tested on different datasets
Recommended from our members
ManiNetCluster: a novel manifold learning approach to reveal the functional links between gene networks.
BACKGROUND:The coordination of genomic functions is a critical and complex process across biological systems such as phenotypes or states (e.g., time, disease, organism, environmental perturbation). Understanding how the complexity of genomic function relates to these states remains a challenge. To address this, we have developed a novel computational method, ManiNetCluster, which simultaneously aligns and clusters gene networks (e.g., co-expression) to systematically reveal the links of genomic function between different conditions. Specifically, ManiNetCluster employs manifold learning to uncover and match local and non-linear structures among networks, and identifies cross-network functional links. RESULTS:We demonstrated that ManiNetCluster better aligns the orthologous genes from their developmental expression profiles across model organisms than state-of-the-art methods (p-value <2.2×10-16). This indicates the potential non-linear interactions of evolutionarily conserved genes across species in development. Furthermore, we applied ManiNetCluster to time series transcriptome data measured in the green alga Chlamydomonas reinhardtii to discover the genomic functions linking various metabolic processes between the light and dark periods of a diurnally cycling culture. We identified a number of genes putatively regulating processes across each lighting regime. CONCLUSIONS:ManiNetCluster provides a novel computational tool to uncover the genes linking various functions from different networks, providing new insight on how gene functions coordinate across different conditions. ManiNetCluster is publicly available as an R package at https://github.com/daifengwanglab/ManiNetCluster
Opaque Service Virtualisation: A Practical Tool for Emulating Endpoint Systems
Large enterprise software systems make many complex interactions with other
services in their environment. Developing and testing for production-like
conditions is therefore a very challenging task. Current approaches include
emulation of dependent services using either explicit modelling or
record-and-replay approaches. Models require deep knowledge of the target
services while record-and-replay is limited in accuracy. Both face
developmental and scaling issues. We present a new technique that improves the
accuracy of record-and-replay approaches, without requiring prior knowledge of
the service protocols. The approach uses Multiple Sequence Alignment to derive
message prototypes from recorded system interactions and a scheme to match
incoming request messages against prototypes to generate response messages. We
use a modified Needleman-Wunsch algorithm for distance calculation during
message matching. Our approach has shown greater than 99% accuracy for four
evaluated enterprise system messaging protocols. The approach has been
successfully integrated into the CA Service Virtualization commercial product
to complement its existing techniques.Comment: In Proceedings of the 38th International Conference on Software
Engineering Companion (pp. 202-211). arXiv admin note: text overlap with
arXiv:1510.0142
- …