Sequential Pattern Mining with Multidimensional Interval Items
In real-world sequential pattern mining scenarios, the interval between two itemsets carries important information. Existing algorithms can mine frequent subsequences effectively, but they ignore this interval information. This paper aims to mine sequential patterns with multidimensional interval items in sequence databases. To this end, it defines and specifies the interval event problem in the sequential pattern mining task and proposes an interval event items framework to handle multidimensional interval event items. It then introduces the MII-Prefixspan algorithm, which adds the processing of interval event items to the mining process. With these methods, the mined sequence patterns yield richer information that is better aligned with actual needs. The scheme is applied to a real website behaviour analysis task to obtain more valuable information for web optimization, and it provides more valuable sequence pattern information for practical problems. This work also opens a new pathway toward more efficient sequential pattern mining.
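The idea of treating intervals as items can be illustrated with a minimal Python sketch. It is not the MII-Prefixspan algorithm from the paper: it assumes single-item events with timestamps, a hypothetical three-way bucketing of gaps (`short`, `medium`, `long`), and a plain PrefixSpan run over the resulting sequences. All function names and thresholds are illustrative assumptions.

```python
from collections import defaultdict


def bucket(gap, edges=(60, 3600)):
    """Map a time gap in seconds to a coarse label (hypothetical thresholds)."""
    if gap <= edges[0]:
        return "short"
    if gap <= edges[1]:
        return "medium"
    return "long"


def with_interval_items(events):
    """Interleave a gap label between consecutive (timestamp, item) events."""
    out = []
    for i, (t, item) in enumerate(events):
        if i > 0:
            out.append("gap:" + bucket(t - events[i - 1][0]))
        out.append(item)
    return out


def prefixspan(sequences, min_support):
    """Plain PrefixSpan over sequences of single items.

    Returns {pattern tuple: number of sequences containing it as a subsequence}.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, position where the suffix starts)
        counts = defaultdict(set)
        for sid, start in projected:
            for item in set(sequences[sid][start:]):
                counts[item].add(sid)
        for item, sids in counts.items():
            if len(sids) < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = len(sids)
            new_projected = []
            for sid, start in projected:
                try:
                    pos = sequences[sid].index(item, start)
                except ValueError:
                    continue  # item does not occur in this suffix
                new_projected.append((sid, pos + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Made-up click logs: (seconds since session start, page)
logs = [
    [(0, "home"), (30, "search"), (5000, "buy")],
    [(0, "home"), (20, "search"), (9000, "buy")],
    [(0, "home"), (4000, "search")],
]
patterns = prefixspan([with_interval_items(log) for log in logs], min_support=2)
```

Because gap labels are mined as ordinary items, a pattern such as `("home", "gap:short", "search")` only guarantees that a short gap occurs somewhere between the two pages, not that it separates them directly. A faithful treatment of interval events would need the constraints defined in the paper.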
A Regularized Graph Layout Framework for Dynamic Network Visualization
Many real-world networks, including social and information networks, are
dynamic structures that evolve over time. Such dynamic networks are typically
visualized using a sequence of static graph layouts. In addition to providing a
visual representation of the network structure at each time step, the sequence
should preserve the mental map between layouts of consecutive time steps to
allow a human to interpret the temporal evolution of the network. In this
paper, we propose a framework for dynamic network visualization in the on-line
setting where only present and past graph snapshots are available to create the
present layout. The proposed framework creates regularized graph layouts by
augmenting the cost function of a static graph layout algorithm with a grouping
penalty, which discourages nodes from deviating too far from other nodes
belonging to the same group, and a temporal penalty, which discourages large
node movements between consecutive time steps. The penalties increase the
stability of the layout sequence, thus preserving the mental map. We introduce
two dynamic layout algorithms within the proposed framework, namely dynamic
multidimensional scaling (DMDS) and dynamic graph Laplacian layout (DGLL). We
apply these algorithms to several data sets to illustrate the importance of
both grouping and temporal regularization for producing interpretable
visualizations of dynamic networks.
Comment: To appear in Data Mining and Knowledge Discovery; supporting material
(animations and MATLAB toolbox) available at
http://tbayes.eecs.umich.edu/xukevin/visualization_dmkd_201
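The regularized cost described in this abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' DMDS formulation: it assumes an unweighted MDS stress term, a grouping penalty measured as squared distance to the group centroid, and a temporal penalty measured as squared displacement from the previous layout. The names `regularized_cost`, `alpha`, and `beta` are hypothetical.

```python
import math


def regularized_cost(X, D, groups, X_prev, alpha=1.0, beta=1.0):
    """Static layout cost plus grouping and temporal penalties (assumed forms).

    X, X_prev: lists of node positions at the current and previous time step
    D: matrix of target distances between nodes
    groups: group label for each node
    """
    n = len(X)
    # Static term: MDS stress between layout distances and target distances
    stress = sum(
        (math.dist(X[i], X[j]) - D[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    # Grouping penalty: squared distance of each node to its group centroid
    members = {}
    for pos, g in zip(X, groups):
        members.setdefault(g, []).append(pos)
    grouping = 0.0
    for pts in members.values():
        centroid = [sum(c) / len(pts) for c in zip(*pts)]
        grouping += sum(math.dist(p, centroid) ** 2 for p in pts)
    # Temporal penalty: squared movement since the previous time step
    temporal = sum(math.dist(p, q) ** 2 for p, q in zip(X, X_prev))
    return stress + alpha * grouping + beta * temporal


# Toy example: three nodes, two groups, target distances equal to the layout
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
D = [[math.dist(a, b) for b in X] for a in X]
groups = [0, 0, 1]
cost_static = regularized_cost(X, D, groups, X_prev=X)
```

Raising `alpha` pulls group members together, and raising `beta` keeps nodes close to their earlier positions, which is the trade-off between layout quality and stability that the abstract describes.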
Pattern Mining and Sense-Making Support for Enhancing the User Experience
While data mining techniques such as frequent itemset and sequence mining are well established as powerful pattern discovery tools in domains from science and medicine to business, a drawback is the lack of support for interactive exploration of the large numbers of patterns generated with diverse parameter settings and of the relationships among the mined patterns. To enhance the user experience, real-time query turnaround times and improved support for interactive mining are desired. There is also an increasing interest in applying data mining solutions to mobile data. Patterns mined over mobile data may enable context-aware applications ranging from automating frequently repeated tasks to providing personalized recommendations. Overall, this dissertation addresses three problems that limit the utility of data mining, namely, (a) lack of interactive exploration tools for mined patterns, (b) insufficient support for mining localized patterns, and (c) high computational mining requirements that prohibit mining of patterns on smaller compute units such as a smartphone.
This dissertation develops interactive frameworks for the guided exploration of mined patterns and their relationships. Contributions include the PARAS preprocessing and indexing framework, which enables analysts to gain key insights into rule relationships in a parameter space view through compact storage of rules that allows query-time reconstruction of complete rulesets. Contributions also include the visual rule exploration framework FIRE, which presents an interactive dual view of the parameter space and the rule space that together enable enhanced sense-making of rule relationships. This dissertation also supports the online mining of localized association rules computed on data subsets by selectively deploying alternative execution strategies that leverage a multidimensional itemset-based data partitioning index. Finally, we designed OLAPH, an on-device context-aware service that learns phone usage patterns over mobile context data such as app usage, location, call and SMS logs to provide device intelligence. Concepts introduced for modeling mobile data as sequences include compressing context logs to intervaled context events, adding generalized time features, and identifying meaningful sequences via filter expressions.
On mining complex sequential data by means of FCA and pattern structures
Nowadays, data sets come in very complex and heterogeneous forms.
Mining of such data collections is essential to support many real-world
applications ranging from healthcare to marketing. In this work, we focus on
the analysis of "complex" sequential data by means of interesting sequential
patterns. We approach the problem using the elegant mathematical framework of
Formal Concept Analysis (FCA) and its extension based on "pattern structures".
Pattern structures are used for mining complex data (such as sequences or
graphs) and are based on a subsumption operation, which in our case is defined
with respect to the partial order on sequences. We show how pattern structures,
along with projections (i.e., a data reduction of sequential structures), are
able to enumerate more meaningful patterns and increase the computing
efficiency of the approach. Finally, we show the applicability of the presented
method for discovering and analyzing interesting patient patterns from a French
healthcare data set on cancer. The quantitative and qualitative results (with
annotations and analysis from a physician) are reported in this use case which
is the main motivation for this work.
Keywords: data mining; formal concept analysis; pattern structures;
projections; sequences; sequential data.
Comment: An accepted publication in International Journal of General Systems.
The paper is created in the wake of the conference on Concept Lattice and
their Applications (CLA'2013). 27 pages, 9 figures, 3 table
Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare
In recent years, time-series mining has become a challenging issue for
researchers. An important application lies in monitoring, which
requires analyzing large sets of time-series to learn usual patterns. Any
deviation from this learned profile is then considered an unexpected
situation. Moreover, complex applications may involve the temporal study of
several heterogeneous parameters. In this paper, we propose a method for mining
heterogeneous multivariate time-series for learning meaningful patterns. The
proposed approach allows for mixed time-series -- containing both pattern and
non-pattern data -- as well as for imprecise matches, outliers, stretching and
global translation of pattern instances in time. We present the early results
of our approach in the context of monitoring the health status of a person at
home. The purpose is to build a behavioral profile of a person by analyzing the
time variations of several quantitative or qualitative parameters recorded
through a set of sensors installed in the home.
CloSpan Sequential Pattern Mining for Books Recommendation System in Petra Christian University Library
Petra Christian University (PCU) Library uses a website for its book search system. To further improve the service, an automatic system is needed that can recommend books, or the correlations among books, that are often borrowed at the same time or sequentially by prospective borrowers. The lending sequential patterns are explored with the CloSpan sequential pattern mining algorithm. The output generated by this application consists of closed sequential pattern rules and the tree of sequential patterns, which can be used as a reference to establish a list of recommended related books. The test results show that the more data and the smaller the minimum support, the longer the process takes and the more patterns are produced. Questionnaires distributed to employees and users of the library indicate that the system produces correct and useful recommendations.