15,727 research outputs found

    Unfolding-Based Process Discovery

    Get PDF
    This paper presents a novel technique for process discovery. In contrast to the current trend, which only considers an event log for discovering a process model, we assume two additional inputs: an independence relation on the set of logged activities, and a collection of negative traces. After deriving an intermediate net unfolding from them, we perform a controlled folding giving rise to a Petri net which contains both the input log and all independence-equivalent traces arising from it. Remarkably, the derived Petri net cannot execute any trace from the negative collection. The entire chain of transformations is fully automated. A tool has been developed and experimental results are provided that witness the significance of the contribution of this paper.Comment: This is the unabridged version of a paper with the same title appearead at the proceedings of ATVA 201

    From Frequency to Meaning: Vector Space Models of Semantics

    Full text link
    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field

    Theoretical Interpretations and Applications of Radial Basis Function Networks

    Get PDF
    Medical applications usually used Radial Basis Function Networks just as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several way: Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, Instanced-Based Learners. A survey of their interpretations and of their corresponding learning algorithms is provided as well as a brief survey on dynamic learning algorithms. RBFNs' interpretations can suggest applications that are particularly interesting in medical domains

    Stacked Generalizations in Imbalanced Fraud Data Sets using Resampling Methods

    Full text link
    This study uses stacked generalization, which is a two-step process of combining machine learning methods, called meta or super learners, for improving the performance of algorithms in step one (by minimizing the error rate of each individual algorithm to reduce its bias in the learning set) and then in step two inputting the results into the meta learner with its stacked blended output (demonstrating improved performance with the weakest algorithms learning better). The method is essentially an enhanced cross-validation strategy. Although the process uses great computational resources, the resulting performance metrics on resampled fraud data show that increased system cost can be justified. A fundamental key to fraud data is that it is inherently not systematic and, as of yet, the optimal resampling methodology has not been identified. Building a test harness that accounts for all permutations of algorithm sample set pairs demonstrates that the complex, intrinsic data structures are all thoroughly tested. Using a comparative analysis on fraud data that applies stacked generalizations provides useful insight needed to find the optimal mathematical formula to be used for imbalanced fraud data sets.Comment: 19 pages, 3 figures, 8 table

    Neural networks in geophysical applications

    Get PDF
    Neural networks are increasingly popular in geophysics. Because they are universal approximators, these tools can approximate any continuous function with an arbitrary precision. Hence, they may yield important contributions to finding solutions to a variety of geophysical applications. However, knowledge of many methods and techniques recently developed to increase the performance and to facilitate the use of neural networks does not seem to be widespread in the geophysical community. Therefore, the power of these tools has not yet been explored to their full extent. In this paper, techniques are described for faster training, better overall performance, i.e., generalization,and the automatic estimation of network size and architecture

    Incorporating negative information to process discovery of complex systems

    Get PDF
    The discovery of a formal process model from event logs describing real process executions is a challenging problem that has been studied from several angles. Most of the contributions consider the extraction of a model as a one-class supervised learning problem where only a set of process instances is available. Moreover, the majority of techniques cannot generate complex models, a crucial feature in some areas like manufacturing. In this paper we present a fresh look at process discovery where undesired process behaviors can also be taken into account. This feature may be crucial for deriving process models which are less complex, fitting and precise, but also good on generalizing the right behavior underlying an event log. The technique is based on the theory of convex polyhedra and satisfiability modulo theory (SMT) and can be combined with other process discovery approach as a post processing step to further simplify complex models. We show in detail how to apply the proposed technique in combination with a recent method that uses numerical abstract domains. Experiments performed in a new prototype implementation show the effectiveness of the technique and the ability to be combined with other discovery techniques.Peer ReviewedPostprint (author's final draft

    Conformance checking: A state-of-the-art literature review

    Full text link
    Conformance checking is a set of process mining functions that compare process instances with a given process model. It identifies deviations between the process instances' actual behaviour ("as-is") and its modelled behaviour ("to-be"). Especially in the context of analyzing compliance in organizations, it is currently gaining momentum -- e.g. for auditors. Researchers have proposed a variety of conformance checking techniques that are geared towards certain process model notations or specific applications such as process model evaluation. This article reviews a set of conformance checking techniques described in 37 scholarly publications. It classifies the techniques along the dimensions "modelling language", "algorithm type", "quality metric", and "perspective" using a concept matrix so that the techniques can be better accessed by practitioners and researchers. The matrix highlights the dimensions where extant research concentrates and where blind spots exist. For instance, process miners use declarative process modelling languages often, but applications in conformance checking are rare. Likewise, process mining can investigate process roles or process metrics such as duration, but conformance checking techniques narrow on analyzing control-flow. Future research may construct techniques that support these neglected approaches to conformance checking

    Multivariate Spatiotemporal Hawkes Processes and Network Reconstruction

    Full text link
    There is often latent network structure in spatial and temporal data and the tools of network analysis can yield fascinating insights into such data. In this paper, we develop a nonparametric method for network reconstruction from spatiotemporal data sets using multivariate Hawkes processes. In contrast to prior work on network reconstruction with point-process models, which has often focused on exclusively temporal information, our approach uses both temporal and spatial information and does not assume a specific parametric form of network dynamics. This leads to an effective way of recovering an underlying network. We illustrate our approach using both synthetic networks and networks constructed from real-world data sets (a location-based social media network, a narrative of crime events, and violent gang crimes). Our results demonstrate that, in comparison to using only temporal data, our spatiotemporal approach yields improved network reconstruction, providing a basis for meaningful subsequent analysis --- such as community structure and motif analysis --- of the reconstructed networks
    corecore