Search CORE

146,846 research outputs found

Clones in Graphs

Author: A Gély
B Ganter
D Borchmann
David Lusseau
OL Mangasarian
P Gleiser
R Medina
R Wille
RR Faulkner
S Wasserman
T Opsahl
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/07/2018

Finding structural similarities in graph data, like social networks, is a far-ranging task in data mining and knowledge discovery. A (conceptually) simple reduction would be to compute the automorphism group of a graph. However, this approach is ineffective in data mining since real-world data does not exhibit enough structural regularity. Here we step in with a novel approach based on mappings that preserve the maximal cliques. For this we exploit the well-known correspondence between bipartite graphs and the data structure formal context (G,M,I) from Formal Concept Analysis. From there we utilize the notion of clone items. The investigation of these is still an open problem to which we add new insights with this work. Furthermore, we produce a substantial experimental investigation of real-world data. We conclude with demonstrating the generalization of clone items to permutations. Comment: 11 pages, 2 figures, 1 tabl

arXiv.org e-Print Archive

Crossref

GCG: Mining Maximal Complete Graph Patterns from Large Spatial Data

Author: Al-Naymat Ghazi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/12/2013

Recent research on pattern discovery has progressed from mining frequent patterns and sequences to mining structured patterns, such as trees and graphs. Graphs, as a general data structure, can model complex relations among data, with wide applications in web exploration and social networks. However, mining large graph patterns is a challenge because of the large number of subgraphs. In this paper, we aim to mine only frequent complete graph patterns. A graph g in a database is complete if every pair of distinct vertices is connected by a unique edge. Grid Complete Graph (GCG) is a mining algorithm developed to explore interesting pruning techniques for extracting maximal complete graphs from the large spatial datasets in the Sloan Digital Sky Survey (SDSS) data. Using a divide-and-conquer strategy, GCG shows high efficiency, especially in the presence of a large number of patterns. In this paper, we describe GCG, which can mine not only simple co-location spatial patterns but also complex ones. To the best of our knowledge, this is the first algorithm to exploit the extraction of maximal complete graphs in the process of mining complex co-location patterns in large spatial datasets. Comment: 1
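The completeness condition stated in this abstract (every pair of distinct vertices joined by an edge) can be illustrated with a minimal Python sketch. The function name `is_complete` and the toy vertex and edge lists are assumptions made for illustration only; this is not the GCG algorithm, which the abstract does not specify in detail.

```python
from itertools import combinations


def is_complete(vertices, edges):
    """Return True if every pair of distinct vertices is joined by an edge."""
    # Store edges as unordered pairs so (a, b) and (b, a) count as the same edge.
    edge_set = {frozenset(edge) for edge in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(vertices, 2))


# Hypothetical toy inputs: a triangle (complete) and a 3-vertex path (not complete).
triangle = is_complete(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
path = is_complete(["a", "b", "c"], [("a", "b"), ("b", "c")])
print(triangle, path)
```

The check is quadratic in the number of vertices, which is fine for small candidate patterns but says nothing about how a miner would prune the search space.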

arXiv.org e-Print Archive

Crossref

A Regularized Graph Layout Framework for Dynamic Network Visualization

Author: AE Hoerl
Alfred O. Hero
AY Ng
DM Witten
G Di Battista
G Kossinets
H Lütkepohl
I Borg
I Herman
J Branke
J Leskovec
J Moody
JA Lee
K Misue
Kevin S. Xu
KM Hall
L Leydesdorff
LN Trefethen
M Belkin
Mark Kliger
MS Bazaraa
N Eagle
P Eades
PJ Mucha
PW Holland
R Tibshirani
RH Byrd
S Bender-deMoll
T Kamada
TM Newcomb
TMJ Fruchterman
U Brandes
Y Chi
Y Frishman
Y Koren
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/02/2013

Many real-world networks, including social and information networks, are dynamic structures that evolve over time. Such dynamic networks are typically visualized using a sequence of static graph layouts. In addition to providing a visual representation of the network structure at each time step, the sequence should preserve the mental map between layouts of consecutive time steps to allow a human to interpret the temporal evolution of the network. In this paper, we propose a framework for dynamic network visualization in the on-line setting where only present and past graph snapshots are available to create the present layout. The proposed framework creates regularized graph layouts by augmenting the cost function of a static graph layout algorithm with a grouping penalty, which discourages nodes from deviating too far from other nodes belonging to the same group, and a temporal penalty, which discourages large node movements between consecutive time steps. The penalties increase the stability of the layout sequence, thus preserving the mental map. We introduce two dynamic layout algorithms within the proposed framework, namely dynamic multidimensional scaling (DMDS) and dynamic graph Laplacian layout (DGLL). We apply these algorithms to several data sets to illustrate the importance of both grouping and temporal regularization for producing interpretable visualizations of dynamic networks. Comment: To appear in Data Mining and Knowledge Discovery, supporting material (animations and MATLAB toolbox) available at http://tbayes.eecs.umich.edu/xukevin/visualization_dmkd_201
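The augmented cost function described in this abstract can be written schematically as follows. All symbols here are assumptions introduced for illustration: $X_t$ is the layout at time step $t$ with node positions $x_i[t]$, $g(i)$ is the group of node $i$, $c_{g(i)}[t]$ is a representative position for that group, and $\alpha, \beta \ge 0$ are regularization weights. The squared-distance form of the two penalties is one plausible choice, not necessarily the one used in the paper.

```latex
% Schematic regularized layout cost (illustrative notation, not taken from the paper)
\[
  C_{\mathrm{reg}}(X_t)
  = C_{\mathrm{static}}(X_t)
  + \alpha \sum_{i} \bigl\lVert x_i[t] - c_{g(i)}[t] \bigr\rVert^2
  + \beta \sum_{i} \bigl\lVert x_i[t] - x_i[t-1] \bigr\rVert^2
\]
```

Setting $\alpha = \beta = 0$ recovers the static layout cost, while larger $\beta$ trades layout quality at time $t$ for stability between consecutive time steps.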

arXiv.org e-Print Archive

Crossref

Edge-based mining of frequent subgraphs from graph streams

Author: Cuzzocrea Alfredo
Han Zhao
Jiang Fan
Leung Carson K.
Zhang Hao
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

In the current era of Big data, high volumes of valuable data can be generated at a high velocity from a wide variety of data sources in various real-life applications ranging from sensor networks to social networks, from bio-informatics to chemical informatics. In addition, Big data are also available in business, education, engineering, finance, healthcare, scientific, telecommunication, and transportation domains. A collection of these data can be viewed as a big dynamic graph structure. Embedded in them is implicit, previously unknown, and potentially useful knowledge. Consequently, efficient knowledge discovery algorithms for mining frequent subgraphs from these dynamic streaming graph-structured data are in demand. On the one hand, some existing algorithms discover collections of frequently co-occurring edges, which may be disjoint. On the other hand, some other existing algorithms discover frequent subgraphs but require very large memory space. With high volumes of Big data, available memory space may be limited. To discover collections of frequently co-occurring connected edges, we present in this paper two efficient algorithms that require small memory space. Evaluation results show the efficiency of our edge-based algorithms in mining frequent subgraphs from graph streams.

Archivio istituzionale della ricerca - Università di Trieste

Elsevier - Publisher Connector

Graph search and beyond: SIGIR 2015 workshop summary

Author: Alonso O.
Hearst M.A.
Kamps J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Modern Web data is highly structured in terms of entities and relations from large knowledge resources, geo-temporal references and social network structure, resulting in a massive multidimensional graph. This graph essentially unifies both the searcher and the information resources that played a fundamentally different role in traditional IR, and "Graph Search" offers major new ways to access relevant information. Graph search affects both query formulation (complex queries about entities and relations building on the searcher's context) and result exploration and discovery (slicing and dicing the information using the graph structure) in a completely personalized way. This new graph-based approach introduces great opportunities, but also great challenges, in terms of data quality and data integration, user interface design, and privacy. We view the notion of "graph search" as searching information from your personal point of view (you are the query) over a highly structured and curated information space. This goes beyond the traditional two-term queries and ten blue links results that users are familiar with, requiring a highly interactive session covering both query formulation and result exploration. The workshop attracted a range of researchers working on this and related topics, and made concrete progress working together on one of the greatest challenges in the years to come.

International Migration, Integration and Social Cohesion online publications

UvA-DARE

A Visual Query Language for Relational Knowledge Discovery

Author: Blau H.
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2001

QGRAPH is a visual query language for knowledge discovery in relational data. Using QGRAPH, a user can query and update relational data in ways that support data exploration, data transformation, and sampling. When combined with modeling algorithms, such as those developed in inductive logic programming and relational learning, the language assists analysis of relational data, such as data drawn from the Web, chemical structure-activity relationships, and social networks. Several features distinguish QGRAPH from other query languages such as SQL and Datalog. It is a visual language, so its queries are annotated graphs that reflect potential structures within a database. QGRAPH treats objects, links, and attributes as first-class entities, so its queries can dynamically alter a data schema by adding and deleting those entities. Finally, the language provides grouping and counting constructs that facilitate calculation of attributes that can capture features of local graph structure. We describe the language in detail, discuss key aspects of the underlying data model and implementation, and discuss several uses of QGRAPH for knowledge discovery.

ScholarWorks@UMass Amherst

Estimating Properties of Social Networks via Random Walk considering Private Nodes

Author: Dey Ratan
Nakajima Kazuki
Takac Lubos
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/07/2020

Accurately analyzing graph properties of social networks is a challenging task because of access limitations to the graph data. To address this challenge, several algorithms to obtain unbiased estimates of properties from few samples via a random walk have been studied. However, existing algorithms do not consider private nodes that hide their neighbors in real social networks, leading to some practical problems. Here we design random walk-based algorithms to accurately estimate properties without any problems caused by private nodes. First, we design a random walk-based sampling algorithm that comprises the neighbor selection to obtain samples having the Markov property and the calculation of weights for each sample to correct the sampling bias. Further, for two graph property estimators, we propose weighting methods to reduce not only the sampling bias but also estimation errors due to private nodes. The proposed algorithms improve the estimation accuracy of the existing algorithms by up to 92.6% on real-world datasets. Comment: 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2020
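The idea of weighting random-walk samples to correct sampling bias can be sketched with the standard re-weighted random walk estimator of average degree. This is a generic textbook technique, not the paper's algorithm: it ignores private nodes entirely, and the function names, the toy adjacency dictionary `toy_graph`, and the seed are assumptions made for illustration. A simple random walk visits nodes in proportion to their degree, so each sample is weighted by the inverse of its degree.

```python
import random


def random_walk(adj, start, steps, rng):
    """Collect `steps` node samples from a simple random walk on `adj`."""
    node = start
    samples = []
    for _ in range(steps):
        # Move to a uniformly chosen neighbor of the current node.
        node = rng.choice(adj[node])
        samples.append(node)
    return samples


def estimate_average_degree(adj, samples):
    """Re-weighted estimate: sum(w * deg) / sum(w) with w = 1 / deg."""
    # With w = 1/deg, the numerator sum(w * deg) is just the sample count.
    inverse_degrees = [1.0 / len(adj[node]) for node in samples]
    return len(samples) / sum(inverse_degrees)


# Hypothetical toy graph: a triangle 0-1-2 plus a pendant node 3 attached to 0.
toy_graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
rng = random.Random(0)
walk = random_walk(toy_graph, start=0, steps=20000, rng=rng)
estimate = estimate_average_degree(toy_graph, walk)
true_average = sum(len(nbrs) for nbrs in toy_graph.values()) / len(toy_graph)
print(estimate, true_average)
```

An unweighted mean of the sampled degrees would overestimate the average because high-degree nodes are visited more often; the inverse-degree weights cancel that bias for a connected, non-bipartite graph.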

arXiv.org e-Print Archive

Crossref