42 research outputs found

Using similarity of graphs in evaluation of designs

Author: Strug Barbara
Publication date: 01/01/2015

This paper deals with evaluating designs on the basis of their internal structures, represented as graphs. A set of graphs representing solutions of similar design tasks is searched for frequently occurring subgraphs. The quality of new solutions is then evaluated on the basis of the search results. Moreover, the common subgraphs found are considered to be design patterns characterizing the solutions of a given design task. The paper presents the generic concept of such an approach and illustrates it with a small example of floor layout design.

Jagiellonian University Repository

When Hashes Met Wedges: A Distributed Algorithm for Finding High Similarity Vectors

Author: Andoni A.
Davis T.
Gionis A.
Goel A.
Shrivastava A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/03/2017

Finding similar user pairs is a fundamental task in social networks, with numerous applications in ranking and personalization tasks such as link prediction and tie strength detection. A common manifestation of user similarity is based upon network structure: each user is represented by a vector that represents the user's network connections, where pairwise cosine similarity among these vectors defines user similarity. The predominant task for user similarity applications is to discover all similar pairs that have a pairwise cosine similarity value larger than a given threshold $\tau$. In contrast to previous work where $\tau$ is assumed to be quite close to 1, we focus on recommendation applications where $\tau$ is small, but still meaningful. The all pairs cosine similarity problem is computationally challenging on networks with billions of edges, and especially so for settings with small $\tau$. To the best of our knowledge, there is no practical solution for computing all user pairs with, say, $\tau = 0.2$ on large social networks, even using the power of distributed algorithms. Our work directly addresses this challenge by introducing a new algorithm, WHIMP, that solves this problem efficiently in the MapReduce model. The key insight in WHIMP is to combine the "wedge-sampling" approach of Cohen-Lewis for approximate matrix multiplication with the SimHash random projection techniques of Charikar. We provide a theoretical analysis of WHIMP, proving that it has near optimal communication costs while maintaining computation cost comparable with the state of the art. We also empirically demonstrate WHIMP's scalability by computing all highly similar pairs on four massive data sets, and show that it accurately finds high similarity pairs. In particular, we note that WHIMP successfully processes the entire Twitter network, which has tens of billions of edges.
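As a rough illustration of the SimHash ingredient mentioned in this abstract, the sketch below estimates pairwise cosine similarity from random-hyperplane sign bits and keeps pairs above a threshold. It is not WHIMP: there is no wedge sampling and no MapReduce, and the user names, toy vectors, bit count, and helper names (`simhash`, `estimate_cosine`) are all assumptions made for the example.

```python
import math
import random


def simhash(vec, planes):
    """Return one sign bit per random hyperplane for a dense vector."""
    return [
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    ]


def estimate_cosine(sig_a, sig_b):
    """Estimate cosine similarity from the fraction of differing bits.

    For random hyperplanes, two vectors at angle theta disagree on a bit
    with probability theta / pi, so cos(pi * fraction) estimates the cosine.
    """
    differing = sum(a != b for a, b in zip(sig_a, sig_b))
    return math.cos(math.pi * differing / len(sig_a))


rng = random.Random(0)  # fixed seed so the run is repeatable
dim, num_bits = 8, 4096
planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

# Hypothetical adjacency vectors: entry j is 1 if the user follows account j.
users = {
    "alice": [1, 1, 1, 0, 0, 1, 0, 0],
    "bob":   [1, 1, 0, 0, 0, 1, 1, 0],
    "carol": [0, 0, 0, 1, 1, 0, 0, 1],
}
signatures = {name: simhash(vec, planes) for name, vec in users.items()}

tau = 0.2  # similarity threshold, matching the example value in the abstract
names = sorted(users)
similar_pairs = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if estimate_cosine(signatures[a], signatures[b]) > tau
]
print(similar_pairs)
```

In this toy data the exact cosine similarity of alice and bob is 3 / (2 * 2) = 0.75, while carol shares no connections with either of them, so only the first pair should clear the threshold. The sketch still compares every pair of signatures, which is the quadratic cost a distributed method has to avoid on real networks.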

arXiv.org e-Print Archive

Crossref

GraphFind: enhancing graph searching by low support data mining techniques

Author: A Ferro
Alfredo Ferro
Alfredo Pulvirenti
BT Messmer
D Shasha
Dennis Shasha
DJ Cook
Dmitry Skripin
E Cohen
J Cheng
L Cordella
Misael Mongiovì
P Foggia
R Giugno
R Sharan
Rosalba Giugno
S Kumar
S Zhang
X Yan
Y Tian
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: Biomedical and chemical databases are large and rapidly growing in size. Graphs naturally model such kinds of data. To fully exploit the wealth of information in these graph databases, a key role is played by systems that search for all exact or approximate occurrences of a query graph. To deal efficiently with graph searching, advanced methods for indexing, representation and matching of graphs have been proposed. Results: This paper presents GraphFind. The system implements efficient graph searching algorithms together with advanced filtering techniques that allow approximate search. It allows users to select candidate subgraphs rather than entire graphs. It implements an effective data storage based also on low-support data mining. Conclusions: GraphFind is compared with Frowns, GraphGrep and gIndex. Experiments show that GraphFind outperforms the compared systems on a very large collection of small graphs. The proposed low-support mining technique, which applies to any searching system, also allows a significant index space reduction.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Catalogo dei prodotti della ricerca

Efficient Subgraph Matching on Billion Node Graphs

Author: Li Jianzhong
Shao Bin
Sun Zhao
Wang Haixun
Wang Hongzhi
Publication date: 01/01/2012

The ability to handle large scale graph data is crucial to an increasing number of applications. Much work has been dedicated to supporting basic graph operations such as subgraph matching, reachability, regular expression matching, etc. In many cases, graph indices are employed to speed up query processing. Typically, most indices require either super-linear indexing time or super-linear indexing space. Unfortunately, for very large graphs, super-linear approaches are almost always infeasible. In this paper, we study the problem of subgraph matching on billion-node graphs. We present a novel algorithm that supports efficient subgraph matching for graphs deployed on a distributed memory store. Instead of relying on super-linear indices, we use efficient graph exploration and massive parallel computing for query processing. Our experimental results demonstrate the feasibility of performing subgraph matching on web-scale graph data. Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX

Comparing and Fusing Terrain Network Information

Author: Gaume Bruno
Navarro Emmanuel
Prade Henri
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Terrain networks (or complex networks) are a type of relational information that is encountered in many fields. In order to properly answer questions pertaining to the comparison or to the merging of such networks, a method that takes into account the underlying structure of graphs is proposed. The effectiveness of the method is illustrated using real linguistic data networks and artificial networks, in particular

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

OPUS - University of Technology Sydney

HAL Descartes

Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases

Author: Chen Lei
Wang Guoren
Wang Haixun
Yuan Ye
Publication date: 01/01/2012

Many studies have sought efficient solutions for subgraph similarity search over certain (deterministic) graphs, owing to its wide application in many fields, including bioinformatics, social network analysis, and Resource Description Framework (RDF) data management. All these works assume that the underlying data are certain. However, in reality, graphs are often noisy and uncertain due to various factors, such as errors in data extraction, inconsistencies in data integration, and privacy-preserving purposes. Therefore, in this paper, we study subgraph similarity search on large probabilistic graph databases. Unlike previous works, which assume that edges in an uncertain graph are independent of each other, we study uncertain graphs whose edge occurrences are correlated. We formally prove that subgraph similarity search over probabilistic graphs is #P-complete; we therefore employ a filter-and-verify framework to speed up the search. In the filtering phase, we develop tight lower and upper bounds of subgraph similarity probability based on a probabilistic matrix index, PMI. PMI is composed of discriminative subgraph features associated with tight lower and upper bounds of subgraph isomorphism probability. Based on PMI, we can filter out a large number of probabilistic graphs and maximize the pruning capability. During the verification phase, we develop an efficient sampling algorithm to validate the remaining candidates. The efficiency of our proposed solutions has been verified through extensive experiments. Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX

Hong Kong University of Science and Technology Institutional Repository

An Approach for Keyword Searching in Uncertain Graph Data

Author: Nikita B Zambare
Snehalata S Dongre
Publication date: 24/04/2020

ABSTRACT: Keyword searching is generally used for retrieving relevant data from a database: for an input query, the related data is retrieved. However, searching for keywords on an uncertain graph is a tedious task. This paper introduces a keyword searching technique over uncertain graphs. A keyword routing method is used to route the keywords to the relevant sources. The approach includes two methods. The keyword relationship graph deduces the relationship between keywords and the elements mentioning them. The scoring mechanism computes the score of keywords at each level, which reduces ambiguity. The result includes the subtree of the entire graph that contains all keywords of the input query with high scores and, in addition, retrieves the most relevant data. Effective results are derived from the employed method.

CiteSeerX