26,541 research outputs found
Periodic Pattern Mining a Algorithms and Applications
Owing to a large number of applications periodic pattern mining has been extensively studied for over a decade Periodic pattern is a pattern that repeats itself with a specific period in a give sequence Periodic patterns can be mined from datasets like biological sequences continuous and discrete time series data spatiotemporal data and social networks Periodic patterns are classified based on different criteria Periodic patterns are categorized as frequent periodic patterns and statistically significant patterns based on the frequency of occurrence Frequent periodic patterns are in turn classified as perfect and imperfect periodic patterns full and partial periodic patterns synchronous and asynchronous periodic patterns dense periodic patterns approximate periodic patterns This paper presents a survey of the state of art research on periodic pattern mining algorithms and their application areas A discussion of merits and demerits of these algorithms was given The paper also presents a brief overview of algorithms that can be applied for specific types of datasets like spatiotemporal data and social network
Multiple Uncertainties in Time-Variant Cosmological Particle Data
Though the mediums for visualization are limited, the potential dimensions of a dataset are not. In many areas of scientific study, understanding the correlations between those dimensions and their uncertainties is pivotal to mining useful information from a dataset. Obtaining this insight can necessitate visualizing the many relationships among temporal, spatial, and other dimensionalities of data and its uncertainties. We utilize multiple views for interactive dataset exploration and selection of important features, and we apply those techniques to the unique challenges of cosmological particle datasets. We show how interactivity and incorporation of multiple visualization techniques help overcome the problem of limited visualization dimensions and allow many types of uncertainty to be seen in correlation with other variables
Compressive Mining: Fast and Optimal Data Mining in the Compressed Domain
Real-world data typically contain repeated and periodic patterns. This
suggests that they can be effectively represented and compressed using only a
few coefficients of an appropriate basis (e.g., Fourier, Wavelets, etc.).
However, distance estimation when the data are represented using different sets
of coefficients is still a largely unexplored area. This work studies the
optimization problems related to obtaining the \emph{tightest} lower/upper
bound on Euclidean distances when each data object is potentially compressed
using a different set of orthonormal coefficients. Our technique leads to
tighter distance estimates, which translates into more accurate search,
learning and mining operations \textit{directly} in the compressed domain.
We formulate the problem of estimating lower/upper distance bounds as an
optimization problem. We establish the properties of optimal solutions, and
leverage the theoretical analysis to develop a fast algorithm to obtain an
\emph{exact} solution to the problem. The suggested solution provides the
tightest estimation of the -norm or the correlation. We show that typical
data-analysis operations, such as k-NN search or k-Means clustering, can
operate more accurately using the proposed compression and distance
reconstruction technique. We compare it with many other prevalent compression
and reconstruction techniques, including random projections and PCA-based
techniques. We highlight a surprising result, namely that when the data are
highly sparse in some basis, our technique may even outperform PCA-based
compression.
The contributions of this work are generic as our methodology is applicable
to any sequential or high-dimensional data as well as to any orthogonal data
transformation used for the underlying data compression scheme.Comment: 25 pages, 20 figures, accepted in VLD
- …