Periodic Pattern Mining: Algorithms and Applications
Owing to its large number of applications, periodic pattern mining has been extensively studied for over a decade. A periodic pattern is a pattern that repeats itself with a specific period in a given sequence. Periodic patterns can be mined from datasets such as biological sequences, continuous and discrete time series data, spatiotemporal data, and social networks. Periodic patterns are classified according to different criteria. Based on frequency of occurrence, they are categorized as frequent periodic patterns and statistically significant patterns. Frequent periodic patterns are in turn classified as perfect and imperfect periodic patterns, full and partial periodic patterns, synchronous and asynchronous periodic patterns, dense periodic patterns, and approximate periodic patterns. This paper presents a survey of state-of-the-art research on periodic pattern mining algorithms and their application areas. The merits and demerits of these algorithms are discussed. The paper also presents a brief overview of algorithms that can be applied to specific types of datasets, such as spatiotemporal data and social networks.
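To make the definition of a periodic pattern concrete, the following minimal Python sketch checks which positions within a fixed period carry the same symbol across a sequence. It is only an illustration and not an algorithm from the survey: the function name `periodic_symbols` and the `min_conf` threshold are hypothetical. Setting `min_conf=1.0` loosely corresponds to a perfect pattern, while lower values tolerate misses in the spirit of imperfect patterns.

```python
from collections import Counter


def periodic_symbols(sequence, period, min_conf=1.0):
    """Return {offset: (symbol, confidence)} for offsets that repeat with `period`.

    Hypothetical helper for illustration: confidence is the fraction of
    complete period segments in which the most common symbol appears at
    that offset.
    """
    n_segments = len(sequence) // period
    if period <= 0 or n_segments == 0:
        return {}
    result = {}
    for offset in range(period):
        # Symbols observed at this offset in every complete period segment.
        column = [sequence[k * period + offset] for k in range(n_segments)]
        symbol, count = Counter(column).most_common(1)[0]
        confidence = count / n_segments
        if confidence >= min_conf:
            result[offset] = (symbol, confidence)
    return result
```

For the sequence "abcabdabc" with period 3, offsets 0 and 1 repeat in every segment, while offset 2 holds "c" in only two of three segments and is reported only when `min_conf` is lowered. A pattern covering only some offsets is, in the survey's terminology, closer to a partial than a full periodic pattern.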
Evolving temporal association rules with genetic algorithms
A novel framework for mining temporal association rules by discovering itemsets with a genetic algorithm is introduced. Metaheuristics have been applied to association rule mining; we show the efficacy of extending this to another variant, temporal association rule mining. Our framework is an enhancement to existing temporal association rule mining methods as it employs a genetic algorithm to simultaneously search the rule space and temporal space. A methodology for validating the ability of the proposed framework isolates target temporal itemsets in synthetic datasets. The Iterative Rule Learning method successfully discovers these targets in datasets with varying levels of difficulty.
Evolving temporal fuzzy association rules from quantitative data with a multi-objective evolutionary algorithm
A novel method for mining association rules that are both quantitative and temporal using a multi-objective evolutionary algorithm is presented. This method successfully identifies numerous temporal association rules that occur more frequently in areas of a dataset with specific quantitative values represented with fuzzy sets. The novelty of this research lies in exploring the composition of quantitative and temporal fuzzy association rules and the approach of using a hybridisation of a multi-objective evolutionary algorithm with fuzzy sets. Results show the ability of a multi-objective evolutionary algorithm (NSGA-II) to evolve multiple target itemsets that have been augmented into synthetic datasets.
A Better Alternative to Piecewise Linear Time Series Segmentation
Time series are difficult to monitor, summarize and predict. Segmentation
organizes time series into few intervals having uniform characteristics
(flatness, linearity, modality, monotonicity and so on). For scalability, we
require fast linear time algorithms. The popular piecewise linear model can
determine where the data goes up or down and at what rate. Unfortunately, when
the data does not follow a linear model, the computation of the local slope
creates overfitting. We propose an adaptive time series model where the
polynomial degree of each interval varies (constant, linear and so on). Given a
number of regressors, the cost of each interval is its polynomial degree:
constant intervals cost 1 regressor, linear intervals cost 2 regressors, and so
on. Our goal is to minimize the Euclidean (l_2) error for a given model
complexity. Experimentally, we investigate the model where intervals can be
either constant or linear. Over synthetic random walks, historical stock market
prices, and electrocardiograms, the adaptive model provides a more accurate
segmentation than the piecewise linear model without increasing the
cross-validation error or the running time, while providing a richer vocabulary
to applications. Implementation issues, such as numerical stability and
real-world performance, are discussed. Comment: to appear in SIAM Data Mining 200
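The cost model in this abstract (a constant interval costs 1 regressor, a linear interval costs 2, and the goal is minimal squared l_2 error within a regressor budget) can be illustrated with a small Python sketch. This is an exhaustive dynamic program written for clarity, not the fast algorithm the authors propose; the function names `fit_error` and `adaptive_segmentation` are hypothetical.

```python
def fit_error(y, degree):
    """Squared l_2 error of the best constant (degree 0) or linear (degree 1) fit."""
    n = len(y)
    mean_y = sum(y) / n
    sst = sum((v - mean_y) ** 2 for v in y)
    if degree == 0 or n < 2:
        return sst
    mean_x = (n - 1) / 2
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(y))
    # Residual of the least-squares line is SST - Sxy^2 / Sxx.
    return max(sst - sxy * sxy / sxx, 0.0)


def adaptive_segmentation(y, budget):
    """Return (error, segments) using at most `budget` regressors.

    Each segment is (start, end_exclusive, degree). Brute-force dynamic
    program, intended only for short series.
    """
    n = len(y)
    inf = float("inf")
    # best[b][j]: minimal error covering y[:j] with exactly b regressors.
    best = [[inf] * (n + 1) for _ in range(budget + 1)]
    back = [[None] * (n + 1) for _ in range(budget + 1)]
    best[0][0] = 0.0
    for b in range(1, budget + 1):
        for j in range(1, n + 1):
            for i in range(j):
                for degree in (0, 1):
                    cost = degree + 1  # constant = 1 regressor, linear = 2
                    if cost > b or best[b - cost][i] == inf:
                        continue
                    err = best[b - cost][i] + fit_error(y[i:j], degree)
                    if err < best[b][j]:
                        best[b][j] = err
                        back[b][j] = (i, degree)
    b = min(range(budget + 1), key=lambda k: best[k][n])
    if best[b][n] == inf:
        raise ValueError("budget too small to cover the series")
    error, segments, j = best[b][n], [], n
    while j > 0:
        i, degree = back[b][j]
        segments.append((i, j, degree))
        b -= degree + 1
        j = i
    return error, segments[::-1]
```

For example, `adaptive_segmentation([5, 5, 5, 5, 1, 2, 3, 4], 3)` fits a flat interval followed by a linear one, spending 1 + 2 regressors. With a budget of 2, the same series is better served by two constant intervals than by a single line, which is the kind of choice a purely piecewise linear model cannot make.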
Temporal fuzzy association rule mining with 2-tuple linguistic representation
This paper reports on an approach that contributes towards the problem of discovering fuzzy association rules that exhibit a temporal pattern. The novel application of the 2-tuple linguistic representation identifies fuzzy association rules in a temporal context, whilst maintaining the interpretability of linguistic terms. Iterative Rule Learning (IRL) with a Genetic Algorithm (GA) simultaneously induces rules and tunes the membership functions. The discovered rules were compared with those from a traditional method of discovering fuzzy association rules, and results demonstrate how the traditional method can lose information because rules occur at the intersection of membership function boundaries. New information can be mined from the proposed approach by improving upon rules discovered with the traditional method and by discovering new rules.
DRSP : Dimension Reduction For Similarity Matching And Pruning Of Time Series Data Streams
Similarity matching and joining of time series data streams have gained
considerable relevance given today's large volumes of streaming data. This
process finds wide-scale application in areas such as location tracking, sensor
networks, object positioning and monitoring. However, as the size of the
data stream increases, the cost involved to retain all the data in order to aid
the process of similarity matching also increases. We develop a novel framework
to address the following objectives. First, dimension reduction is
performed in the preprocessing stage, where large stream data is segmented and
reduced into a compact representation such that it retains all the crucial
information by a technique called Multi-level Segment Means (MSM). This reduces
the space complexity associated with the storage of large time-series data
streams. Second, the framework incorporates an effective similarity matching
technique to analyze whether new data objects are symmetric to the existing
data stream. Finally, a pruning technique filters out the pseudo data object
pairs and joins only the relevant pairs. The computational cost for MSM is O(l*ni) and
the cost for pruning is O(DRF*wsize*d), where DRF is the Dimension Reduction
Factor. We have performed exhaustive experimental trials to show that the
proposed framework is both efficient and competent in comparison with earlier
works. Comment: 20 pages, 8 figures, 6 Table
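The abstract does not spell out how Multi-level Segment Means is computed, so the following Python sketch only illustrates the general idea of reducing a window to per-segment averages at several resolutions. It assumes, as a labeled guess, that each level doubles the number of equal-length segments; the function names are hypothetical and not taken from the paper.

```python
def segment_means(series, n_segments):
    """Mean of each of `n_segments` equal-length, non-overlapping segments."""
    if n_segments <= 0 or len(series) % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")
    size = len(series) // n_segments
    return [sum(series[k * size:(k + 1) * size]) / size
            for k in range(n_segments)]


def multilevel_segment_means(series, levels):
    """Coarse-to-fine list of segment means.

    Assumption for illustration: level 0 has 1 segment, level 1 has 2,
    level 2 has 4, and so on.
    """
    return [segment_means(series, 2 ** level) for level in range(levels)]
```

Applied to a window of 8 values with 3 levels, this yields 1, 2 and 4 means respectively. Coarse levels are cheap to store and compare, which is what would allow dissimilar pairs to be discarded before finer levels are examined.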
Modeling Individual Cyclic Variation in Human Behavior
Cycles are fundamental to human health and behavior. However, modeling cycles
in time series data is challenging because in most cases the cycles are not
labeled or directly observed and need to be inferred from multidimensional
measurements taken over time. Here, we present CyHMMs, a cyclic hidden Markov
model method for detecting and modeling cycles in a collection of
multidimensional heterogeneous time series data. In contrast to previous cycle
modeling methods, CyHMMs deal with a number of challenges encountered in
modeling real-world cycles: they can model multivariate data with discrete and
continuous dimensions; they explicitly model and are robust to missing data;
and they can share information across individuals to model variation both
within and between individual time series. Experiments on synthetic and
real-world health-tracking data demonstrate that CyHMMs infer cycle lengths
more accurately than existing methods, with 58% lower error on simulated data
and 63% lower error on real-world data compared to the best-performing
baseline. CyHMMs can also perform functions which baselines cannot: they can
model the progression of individual features/symptoms over the course of the
cycle, identify the most variable features, and cluster individual time series
into groups with distinct characteristics. Applying CyHMMs to two real-world
health-tracking datasets -- of menstrual cycle symptoms and physical activity
tracking data -- yields important insights including which symptoms to expect
at each point during the cycle. We also find that people fall into several
groups with distinct cycle patterns, and that these groups differ along
dimensions not provided to the model. For example, by modeling missing data in
the menstrual cycles dataset, we are able to discover a medically relevant
group of birth control users even though information on birth control is not
given to the model. Comment: Accepted at WWW 201
Visual Analysis of Spatio-Temporal Event Predictions: Investigating the Spread Dynamics of Invasive Species
Invasive species are a major cause of ecological damage and commercial
losses. A current problem spreading in North America and Europe is the vinegar
fly Drosophila suzukii. Unlike other Drosophila, it infests non-rotting and
healthy fruits and is therefore of concern to fruit growers, such as vintners.
Consequently, large amounts of data about infestations have been collected in
recent years. However, there is a lack of interactive methods to investigate
this data. We employ ensemble-based classification to predict areas susceptible
to infestation by D. suzukii and bring them into a spatio-temporal context
using maps and glyph-based visualizations. Following the information-seeking
mantra, we provide a visual analysis system Drosophigator for spatio-temporal
event prediction, enabling the investigation of the spread dynamics of invasive
species. We demonstrate the usefulness of this approach in two use cases
- …