7,088 research outputs found

    Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis (Extended)

    Get PDF
    This extended paper presents 1) a novel hierarchy and recursion extension to the process tree model; and 2) the first, recursion aware process model discovery technique that leverages hierarchical information in event logs, typically available for software systems. This technique allows us to analyze the operational processes of software systems under real-life conditions at multiple levels of granularity. The work can be positioned in-between reverse engineering and process mining. An implementation of the proposed approach is available as a ProM plugin. Experimental results based on real-life (software) event logs demonstrate the feasibility and usefulness of the approach and show the huge potential to speed up discovery by exploiting the available hierarchy.Comment: Extended version (14 pages total) of the paper Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis. This Technical Report version includes the guarantee proofs for the proposed discovery algorithm

    A Multi-Gene Genetic Programming Application for Predicting Students Failure at School

    Full text link
    Several efforts to predict student failure rate (SFR) at school accurately still remains a core problem area faced by many in the educational sector. The procedure for forecasting SFR are rigid and most often times require data scaling or conversion into binary form such as is the case of the logistic model which may lead to lose of information and effect size attenuation. Also, the high number of factors, incomplete and unbalanced dataset, and black boxing issues as in Artificial Neural Networks and Fuzzy logic systems exposes the need for more efficient tools. Currently the application of Genetic Programming (GP) holds great promises and has produced tremendous positive results in different sectors. In this regard, this study developed GPSFARPS, a software application to provide a robust solution to the prediction of SFR using an evolutionary algorithm known as multi-gene genetic programming. The approach is validated by feeding a testing data set to the evolved GP models. Result obtained from GPSFARPS simulations show its unique ability to evolve a suitable failure rate expression with a fast convergence at 30 generations from a maximum specified generation of 500. The multi-gene system was also able to minimize the evolved model expression and accurately predict student failure rate using a subset of the original expressionComment: 14 pages, 9 figures, Journal paper. arXiv admin note: text overlap with arXiv:1403.0623 by other author

    Artifact Lifecycle Discovery

    Get PDF
    Artifact-centric modeling is a promising approach for modeling business processes based on the so-called business artifacts - key entities driving the company's operations and whose lifecycles define the overall business process. While artifact-centric modeling shows significant advantages, the overwhelming majority of existing process mining methods cannot be applied (directly) as they are tailored to discover monolithic process models. This paper addresses the problem by proposing a chain of methods that can be applied to discover artifact lifecycle models in Guard-Stage-Milestone notation. We decompose the problem in such a way that a wide range of existing (non-artifact-centric) process discovery and analysis methods can be reused in a flexible manner. The methods presented in this paper are implemented as software plug-ins for ProM, a generic open-source framework and architecture for implementing process mining tools

    A new approach for discovering business process models from event logs.

    Get PDF
    Process mining is the automated acquisition of process models from the event logs of information systems. Although process mining has many useful applications, not all inherent difficulties have been sufficiently solved. A first difficulty is that process mining is often limited to a setting of non-supervised learnings since negative information is often not available. Moreover, state transitions in processes are often dependent on the traversed path, which limits the appropriateness of search techniques based on local information in the event log. Another difficulty is that case data and resource properties that can also influence state transitions are time-varying properties, such that they cannot be considered ascross-sectional.This article investigates the use of first-order, ILP classification learners for process mining and describes techniques for dealing with each of the above mentioned difficulties. To make process mining a supervised learning task, we propose to include negative events in the event log. When event logs contain no negative information, a technique is described to add artificial negative examples to a process log. To capture history-dependent behavior the article proposes to take advantage of the multi-relational nature of ILP classification learners. Multi-relational process mining allows to search for patterns among multiple event rows in the event log, effectively basing its search on global information. To deal with time-varying case data and resource properties, a closed-world version of the Event Calculus has to be added as background knowledge, transforming the event log effectively in a temporal database. First experiments on synthetic event logs show that first-order classification learners are capable of predicting the behavior with high accuracy, even under conditions of noise.Credit; Credit scoring; Models; Model; Applications; Performance; Space; Decision; Yield; Real life; Risk; Evaluation; Rules; Neural networks; Networks; Classification; Research; Business; Processes; Event; Information; Information systems; Systems; Learning; Data; Behavior; Patterns; IT; Event calculus; Knowledge; Database; Noise;

    A Game-theoretic Machine Learning Approach for Revenue Maximization in Sponsored Search

    Full text link
    Sponsored search is an important monetization channel for search engines, in which an auction mechanism is used to select the ads shown to users and determine the prices charged from advertisers. There have been several pieces of work in the literature that investigate how to design an auction mechanism in order to optimize the revenue of the search engine. However, due to some unrealistic assumptions used, the practical values of these studies are not very clear. In this paper, we propose a novel \emph{game-theoretic machine learning} approach, which naturally combines machine learning and game theory, and learns the auction mechanism using a bilevel optimization framework. In particular, we first learn a Markov model from historical data to describe how advertisers change their bids in response to an auction mechanism, and then for any given auction mechanism, we use the learnt model to predict its corresponding future bid sequences. Next we learn the auction mechanism through empirical revenue maximization on the predicted bid sequences. We show that the empirical revenue will converge when the prediction period approaches infinity, and a Genetic Programming algorithm can effectively optimize this empirical revenue. Our experiments indicate that the proposed approach is able to produce a much more effective auction mechanism than several baselines.Comment: Twenty-third International Conference on Artificial Intelligence (IJCAI 2013

    AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments

    Get PDF
    This report considers the application of Articial Intelligence (AI) techniques to the problem of misuse detection and misuse localisation within telecommunications environments. A broad survey of techniques is provided, that covers inter alia rule based systems, model-based systems, case based reasoning, pattern matching, clustering and feature extraction, articial neural networks, genetic algorithms, arti cial immune systems, agent based systems, data mining and a variety of hybrid approaches. The report then considers the central issue of event correlation, that is at the heart of many misuse detection and localisation systems. The notion of being able to infer misuse by the correlation of individual temporally distributed events within a multiple data stream environment is explored, and a range of techniques, covering model based approaches, `programmed' AI and machine learning paradigms. It is found that, in general, correlation is best achieved via rule based approaches, but that these suffer from a number of drawbacks, such as the difculty of developing and maintaining an appropriate knowledge base, and the lack of ability to generalise from known misuses to new unseen misuses. Two distinct approaches are evident. One attempts to encode knowledge of known misuses, typically within rules, and use this to screen events. This approach cannot generally detect misuses for which it has not been programmed, i.e. it is prone to issuing false negatives. The other attempts to `learn' the features of event patterns that constitute normal behaviour, and, by observing patterns that do not match expected behaviour, detect when a misuse has occurred. This approach is prone to issuing false positives, i.e. inferring misuse from innocent patterns of behaviour that the system was not trained to recognise. Contemporary approaches are seen to favour hybridisation, often combining detection or localisation mechanisms for both abnormal and normal behaviour, the former to capture known cases of misuse, the latter to capture unknown cases. In some systems, these mechanisms even work together to update each other to increase detection rates and lower false positive rates. It is concluded that hybridisation offers the most promising future direction, but that a rule or state based component is likely to remain, being the most natural approach to the correlation of complex events. The challenge, then, is to mitigate the weaknesses of canonical programmed systems such that learning, generalisation and adaptation are more readily facilitated
    corecore