GTRACE-RS: Efficient Graph Sequence Mining using Reverse Search
The mining of frequent subgraphs from labeled graph data has been studied
extensively. Furthermore, much attention has recently been paid to frequent
pattern mining from graph sequences. A method, called GTRACE, has been proposed
to mine frequent patterns from graph sequences under the assumption that
changes in graphs are gradual. Although GTRACE mines the frequent patterns
efficiently, it still needs substantial computation time to mine the patterns
from graph sequences containing large graphs and long sequences. In this paper,
we propose a new version of GTRACE that enables efficient mining of frequent
patterns based on the principle of a reverse search. The underlying concept of
the reverse search is a general scheme for designing efficient algorithms for
hard enumeration problems. Our performance study shows that the proposed method
is efficient and scalable for mining both long and large graph sequence
patterns and is several orders of magnitude faster than the original GTRACE.
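
As a rough illustration of the reverse-search principle mentioned in the abstract above, the following minimal Python sketch enumerates frequent itemsets rather than graph-sequence patterns. It is not the GTRACE-RS algorithm. The function names `support`, `parent` and `reverse_search`, and the parent rule "drop the largest item", are assumptions made for this example. The idea is that every pattern has exactly one parent, so the patterns form a tree rooted at the empty pattern, and walking that tree visits each pattern once without storing or comparing previously found patterns.

```python
from typing import FrozenSet, Iterable, Iterator, List, Set


def support(itemset: FrozenSet[int], transactions: List[Set[int]]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def parent(itemset: FrozenSet[int]) -> FrozenSet[int]:
    """Hypothetical parent rule: remove the largest item (root = empty set)."""
    return itemset - {max(itemset)}


def reverse_search(transactions: Iterable[Set[int]],
                   min_support: int) -> Iterator[FrozenSet[int]]:
    """Enumerate frequent itemsets by traversing the tree induced by `parent`."""
    transactions = list(transactions)
    items = sorted(set().union(*transactions))
    stack: List[FrozenSet[int]] = [frozenset()]
    while stack:
        current = stack.pop()
        if current:
            yield current
        for item in items:
            if item in current:
                continue
            child = current | {item}
            # Reverse-search test: keep the candidate only if `current`
            # is its unique parent, so no pattern is reached twice.
            if parent(child) != current:
                continue
            # Support is anti-monotone, so infrequent branches can be cut.
            if support(child, transactions) >= min_support:
                stack.append(child)
```

Calling `list(reverse_search([{1, 2, 3}, {1, 2}, {2, 3}, {1, 3}], 2))` yields each frequent itemset exactly once. Carrying the same scheme over to graph sequences requires a pattern-specific parent rule, which this sketch does not attempt.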
Mining (maximal) span-cores from temporal networks
When analyzing temporal networks, a fundamental task is the identification of
dense structures (i.e., groups of vertices that exhibit a large number of
links), together with their temporal span (i.e., the period of time for which
the high density holds). We tackle this task by introducing a notion of
temporal core decomposition where each core is associated with its span: we
call such cores span-cores.
As the total number of time intervals is quadratic in the size of the
temporal domain $T$ under analysis, the total number of span-cores is quadratic
in $|T|$ as well. Our first contribution is an algorithm that, by exploiting
containment properties among span-cores, computes all the span-cores
efficiently. Then, we focus on the problem of finding only the maximal
span-cores, i.e., span-cores that are not dominated by any other span-core by
both the coreness property and the span. We devise a very efficient algorithm
that exploits theoretical findings on the maximality condition to directly
compute the maximal ones without computing all span-cores.
Experimentation on several real-world temporal networks confirms the
efficiency and scalability of our methods. Applications on temporal networks,
gathered by a proximity-sensing infrastructure recording face-to-face
interactions in schools, highlight the relevance of the notion of (maximal)
span-core in analyzing social dynamics and detecting/correcting anomalies in
the data.
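
To make the notions in the abstract above concrete, here is a minimal brute-force Python sketch. It is a naive baseline, not the efficient algorithms the abstract describes. It assumes that an edge counts for an interval only if it is present at every timestamp of that interval, that a span-core of coreness k is the k-core of the resulting graph, and that each snapshot is a set of `(u, v)` tuples with `u < v`. The function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]
Key = Tuple[int, Tuple[int, int]]  # (coreness k, (start, end))


def k_core(edges: Set[Edge], k: int) -> Set[int]:
    """Vertices of the k-core: repeatedly peel vertices of degree < k."""
    adj: Dict[int, Set[int]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    changed = True
    while changed:
        changed = False
        for u in [u for u, nbrs in adj.items() if len(nbrs) < k]:
            for v in adj.pop(u):
                if v in adj:
                    adj[v].discard(u)
            changed = True
    return set(adj)


def span_cores(snapshots: List[Set[Edge]]) -> Dict[Key, Set[int]]:
    """All non-empty span-cores (k >= 1), trying every interval in turn."""
    cores: Dict[Key, Set[int]] = {}
    n = len(snapshots)
    for start in range(n):
        common = set(snapshots[start])
        for end in range(start, n):
            common &= snapshots[end]  # edges present in the whole interval
            if not common:
                break  # longer intervals can only have fewer edges
            k = 1
            while True:
                core = k_core(common, k)
                if not core:
                    break
                cores[(k, (start, end))] = core
                k += 1
    return cores


def maximal_span_cores(cores: Dict[Key, Set[int]]) -> Dict[Key, Set[int]]:
    """Keep span-cores not dominated in both coreness and span."""
    def dominates(a: Key, b: Key) -> bool:
        (ka, (sa, ea)), (kb, (sb, eb)) = a, b
        return a != b and ka >= kb and sa <= sb and eb <= ea

    return {key: core for key, core in cores.items()
            if not any(dominates(other, key) for other in cores)}
```

The sketch visits all quadratically many intervals and recomputes each core from scratch, which is exactly the cost that exploiting containment and maximality properties is meant to avoid.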
Span-core Decomposition for Temporal Networks: Algorithms and Applications
When analyzing temporal networks, a fundamental task is the identification of
dense structures (i.e., groups of vertices that exhibit a large number of
links), together with their temporal span (i.e., the period of time for which
the high density holds). In this paper we tackle this task by introducing a
notion of temporal core decomposition where each core is associated with two
quantities, its coreness, which quantifies how densely it is connected, and its
span, which is a temporal interval: we call such cores \emph{span-cores}.
For a temporal network defined on a discrete temporal domain $T$, the total
number of time intervals included in $T$ is quadratic in $|T|$, so that the
total number of span-cores is potentially quadratic in $|T|$ as well. Our first
main contribution is an algorithm that, by exploiting containment properties
among span-cores, computes all the span-cores efficiently. Then, we focus on
the problem of finding only the \emph{maximal span-cores}, i.e., span-cores
that are not dominated by any other span-core by both their coreness property
and their span. We devise a very efficient algorithm that exploits theoretical
findings on the maximality condition to directly extract the maximal ones
without computing all span-cores.
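
The quadratic count of intervals stated above can be checked with a short calculation. Writing $T$ for the discrete temporal domain (the symbols $t_s$ and $t_e$ for interval endpoints are notation assumed here), an interval is fixed by choosing two distinct endpoints or a single timestamp:

```latex
\[
  \bigl|\{\, [t_s, t_e] : t_s, t_e \in T,\ t_s \le t_e \,\}\bigr|
  \;=\; \binom{|T|}{2} + |T|
  \;=\; \frac{|T|\,(|T| + 1)}{2}
  \;=\; O\bigl(|T|^{2}\bigr).
\]
```

For example, a domain with $|T| = 4$ timestamps contains $4 \cdot 5 / 2 = 10$ intervals.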
Finally, as a third contribution, we introduce the problem of \emph{temporal
community search}, where a set of query vertices is given as input, and the
goal is to find a set of densely-connected subgraphs containing the query
vertices and covering the whole underlying temporal domain $T$. We derive a
connection between this problem and the problem of finding (maximal)
span-cores. Based on this connection, we show how temporal community search can
be solved in polynomial-time via dynamic programming, and how the maximal
span-cores can be profitably exploited to significantly speed up the basic
algorithm.
Comment: ACM Transactions on Knowledge Discovery from Data (TKDD), 2020. arXiv
admin note: substantial text overlap with arXiv:1808.0937
A framework for dynamic heterogeneous information networks change discovery based on knowledge engineering and data mining methods
Information networks are collections of data structures used to model interactions in social and living phenomena. They can be homogeneous or heterogeneous, and static or dynamic, depending on the type and nature of the relations between the network entities. Static, homogeneous and heterogeneous networks have been widely studied in data mining, but there has recently been renewed interest in the analysis of dynamic heterogeneous information networks (DHIN) because rich temporal, structural and semantic information is hidden in this kind of network. The heterogeneity and dynamicity of real-time networks offer many opportunities as well as many challenges for data mining. Substantial research has been undertaken on the exploration of entities and the identification of their links in heterogeneous networks. However, work on the formal construction and change mining of heterogeneous information networks is still in its infancy because of their complex structure and rich semantics. Researchers have previously used cluster-based methods and frequent pattern-mining techniques for change discovery in dynamic heterogeneous networks. These methods work only on small datasets, provide only structural change discovery, and do not support fast, parallel processing of big data. A further problem is that cluster-based approaches capture the structural changes, whereas pattern-mining approaches capture the semantic characteristics of changes in a dynamic network.
Another interesting but challenging problem that past studies have not considered is the extraction of knowledge from these semantically richer networks under user-specific constraints. This study aims to develop a new change mining system, ChaMining, to investigate dynamic heterogeneous network data, using knowledge engineering with semantic web technologies and data mining to overcome the problems of previous techniques. The system and approach are important in academia as well as in real-life applications, where they support decision-making based on temporal network data patterns. This research has designed a novel framework, “ChaMining”, to (i) find relational patterns in dynamic networks locally and globally by employing domain ontologies; (ii) extract knowledge from these semantically richer networks based on user-specific (meta-path) constraints; (iii) cluster the relational data patterns based on structural properties of nodes in the dynamic network; and (iv) develop a hybrid approach using knowledge engineering, temporal rule mining and clustering to detect changes in dynamic heterogeneous networks. The evidence presented in this research shows that the proposed framework and methods work very efficiently on benchmark big dynamic heterogeneous datasets. The empirical results can contribute to a better understanding of the rich semantics of DHIN and of how to mine them using the proposed hybrid approach. The proposed framework has been evaluated against six previous dynamic change detection algorithms or frameworks, and it performs very well in detecting microscopic as well as macroscopic human-understandable changes. The number of change patterns extracted by this approach was higher than with the previous approaches, which helps to reduce information loss.