
    Evolving temporal association rules with genetic algorithms

    A novel framework for mining temporal association rules by discovering itemsets with a genetic algorithm is introduced. Metaheuristics have already been applied to association rule mining; we show the efficacy of extending them to another variant, temporal association rule mining. Our framework enhances existing temporal association rule mining methods by employing a genetic algorithm to search the rule space and the temporal space simultaneously. A methodology for validating the ability of the proposed framework isolates target temporal itemsets in synthetic datasets. The Iterative Rule Learning method successfully discovers these targets in datasets with varying levels of difficulty.
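As a rough illustration of what searching the rule space and the temporal space together can involve, the sketch below scores one candidate (itemset, time window) pair by its support inside the window. This is only an assumption about how a fitness term could look: the abstract does not give the fitness function, and the names `temporal_support`, `Transaction`, `start`, and `end` are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: a transaction is (timestamp, set of items).
Transaction = Tuple[int, FrozenSet[str]]


def temporal_support(
    transactions: List[Transaction],
    itemset: FrozenSet[str],
    start: int,
    end: int,
) -> float:
    """Fraction of transactions in [start, end] that contain every item of itemset."""
    # Keep only the transactions whose timestamp falls inside the window.
    in_window = [items for t, items in transactions if start <= t <= end]
    if not in_window:
        return 0.0
    # A transaction supports the itemset when the itemset is a subset of it.
    hits = sum(1 for items in in_window if itemset <= items)
    return hits / len(in_window)


if __name__ == "__main__":
    data: List[Transaction] = [
        (1, frozenset({"a", "b"})),
        (2, frozenset({"a"})),
        (3, frozenset({"a", "b"})),
        (4, frozenset({"c"})),
    ]
    print(temporal_support(data, frozenset({"a", "b"}), 1, 3))
```

A genetic algorithm could encode both the items and the window bounds in one chromosome and use a score of this kind as part of its fitness; that encoding is likewise an assumption, not the authors' stated design.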

    Using patterns position distribution for software failure detection

    Pattern-based software failure detection has been an important research topic in recent years. In this method, a set of patterns is extracted from program execution traces and represented as features, while their occurrence frequencies are treated as the corresponding feature values. This conventional method is limited, however, because it ignores the patterns' position information, which is important for classifying program traces. Patterns that occur at different positions in a trace are likely to carry different meanings. In this paper, we present a novel approach that uses a pattern's position distribution as features to detect software failure. Comparative experiments on both artificial and real datasets show the effectiveness of this method.
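To make the idea of a position distribution concrete, here is a minimal sketch that turns the places where a pattern occurs in a trace into a normalized histogram. It assumes contiguous pattern matching and equal-width position bins; the abstract does not specify either, and the function names and the `bins` parameter are hypothetical.

```python
from typing import List, Sequence


def occurrence_positions(trace: Sequence[str], pattern: Sequence[str]) -> List[int]:
    """Start indices of contiguous occurrences of pattern in trace."""
    m = len(pattern)
    if m == 0 or m > len(trace):
        return []
    return [
        i
        for i in range(len(trace) - m + 1)
        if list(trace[i:i + m]) == list(pattern)
    ]


def position_distribution(
    trace: Sequence[str], pattern: Sequence[str], bins: int = 4
) -> List[float]:
    """Normalized histogram of the relative positions at which pattern occurs."""
    hist = [0.0] * bins
    starts = occurrence_positions(trace, pattern)
    if not starts:
        return hist  # pattern absent: all-zero feature vector
    last_start = len(trace) - len(pattern)  # largest possible start index
    for i in starts:
        rel = i / last_start if last_start > 0 else 0.0  # relative position in [0, 1]
        b = min(int(rel * bins), bins - 1)  # clamp rel == 1.0 into the last bin
        hist[b] += 1.0
    return [count / len(starts) for count in hist]


if __name__ == "__main__":
    trace = ["open", "read", "close", "open", "read", "write", "open", "read"]
    print(position_distribution(trace, ["open", "read"], bins=4))
```

A frequency-only feature would record just the three occurrences of the pattern in this trace; the histogram also records that they are spread across the beginning, middle, and end.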

    Discovering Exclusive Patterns in Frequent Sequences

    This paper presents a new concept for pattern discovery in frequent sequences with potentially interesting applications. Based on data mining, the approach aims to discover exclusive sequential patterns (ESP) by checking the relative exclusion of patterns across data sequences. ESP mining post-processes sequential patterns and augments existing work on structural relation pattern mining. A three-phase ESP mining method is proposed together with its component algorithms, and a running worked example explains the process. Experiments on real-world and synthetic datasets showcase the results of ESP mining and demonstrate its effectiveness, illuminating the theories developed. An outline case study in workflow modelling gives some insight into future applicability.

    Temporal and Spatial Data Mining with Second-Order Hidden Models

    In the course of designing a knowledge discovery system, we have developed stochastic models based on high-order hidden Markov models. These models are capable of mapping sequences of data onto a Markov chain in which the transitions between states depend on the n previous states, where n is the order of the model. We study how to extract information from spatial and temporal data by means of unsupervised classification. To this end, we use a French national land-use database, named Teruti, which describes the land use of a region in both the spatial and temporal domains. Land-use categories (wheat, corn, forest, ...) are logged every year at sites regularly spaced across the region. They constitute a temporal sequence of images in which we look for spatial and temporal dependencies. The temporal segmentation of the data is done by means of a second-order hidden Markov model (HMM2), which appears to be very good at locating stationary segments, as shown in our previous work in speech recognition. The spatial classification is performed by defining a fractal scan of the images with the help of a Hilbert-Peano curve, which introduces a total order on the sites while preserving the neighborhood relation between them. We show that the HMM2 performs a classification that is meaningful to agronomists. Spatial and temporal classification may be achieved simultaneously by means of a two-level HMM2 that measures the a posteriori probability of mapping a temporal sequence of images onto a set of hidden classes.
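The fractal scan can be illustrated with the standard Hilbert curve index-to-coordinate conversion, sketched below for a square grid whose side is a power of two. This is a generic textbook construction, not necessarily the exact Hilbert-Peano variant the authors used; the function names `hilbert_d2xy` and `hilbert_scan` are hypothetical.

```python
from typing import List, Tuple


def hilbert_d2xy(n: int, d: int) -> Tuple[int, int]:
    """Map index d along a Hilbert curve to (x, y) on an n x n grid (n a power of 2)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(n: int) -> List[Tuple[int, int]]:
    """Visit every site of an n x n grid in Hilbert order (a total order on sites)."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


if __name__ == "__main__":
    order = hilbert_scan(4)
    print(order)
    # Consecutive sites in the scan are always grid neighbours.
    print(all(abs(x1 - x2) + abs(y1 - y2) == 1
              for (x1, y1), (x2, y2) in zip(order, order[1:])))
```

Flattening each yearly image in this order gives a one-dimensional sequence in which sites that are adjacent in the sequence are also adjacent on the map, which is the property the abstract relies on.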

    Prediction of peptides binding to MHC class I alleles by partial periodic pattern mining

    MHC (Major Histocompatibility Complex) is a key player in the immune response of an organism. It is important to be able to predict which antigenic peptides will bind to a specific MHC allele and which will not, as this creates possibilities for controlling the immune response and for applications in immunotherapy. However, a problem for MHC class I is the presence of bulges and loops in the peptides, which change their total length. Most machine learning methods in use today require the sequences to be of the same length in order to mine the binding motifs successfully. We propose using time-based data mining methods for motif mining so that motifs can be mined independently of position. In addition, information from both binding and non-binding peptides is used, in contrast to other methods, which rely only on binding peptides. The prediction results range from 60% to 95% for the tested alleles.

    Clear Visual Separation of Temporal Event Sequences

    Extracting and visualizing informative insights from temporal event sequences becomes increasingly difficult as data volume and variety increase. Besides dealing with high event type cardinality and many distinct sequences, it can be difficult to tell whether it is appropriate to combine multiple events into one or to utilize additional information about event attributes. Existing approaches often make use of frequent sequential patterns extracted from the dataset; however, these patterns are limited in terms of interpretability and utility. In addition, it is difficult to assess the role of absolute and relative time when using pattern mining techniques. In this paper, we present methods that address these challenges by automatically learning composite events, which enable better aggregation of multiple event sequences. By leveraging event sequence outcomes, we present appropriate linked visualizations that allow domain experts to identify critical flows, to assess validity, and to understand the role of time. Furthermore, we explore information gain and visual complexity metrics to identify the most relevant visual patterns. We compare composite event learning with two approaches for extracting event patterns using real-world company event data from an ongoing project with the Danish Business Authority. Comment: In Proceedings of the 3rd IEEE Symposium on Visualization in Data Science (VDS), 201