Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases
Many studies have been conducted on seeking efficient solutions for
subgraph similarity search over certain (deterministic) graphs due to its wide
application in many fields, including bioinformatics, social network analysis,
and Resource Description Framework (RDF) data management. All these works
assume that the underlying data are certain. However, in reality, graphs are
often noisy and uncertain due to various factors, such as errors in data
extraction, inconsistencies in data integration, and privacy preserving
purposes. Therefore, in this paper, we study subgraph similarity search on
large probabilistic graph databases. Different from previous works assuming
that edges in an uncertain graph are independent of each other, we study the
uncertain graphs where edges' occurrences are correlated. We formally prove
that subgraph similarity search over probabilistic graphs is #P-complete, thus,
we employ a filter-and-verify framework to speed up the search. In the
filtering phase, we develop tight lower and upper bounds of subgraph similarity
probability based on a probabilistic matrix index, PMI. PMI is composed of
discriminative subgraph features associated with tight lower and upper bounds
of subgraph isomorphism probability. Based on PMI, we can filter out a large
number of probabilistic graphs and maximize the pruning capability. During the
verification phase, we develop an efficient sampling algorithm to validate the
remaining candidates. The efficiency of our proposed solutions has been
verified through extensive experiments. Comment: VLDB201
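The verification phase can be illustrated with a minimal Monte Carlo sketch. Note the simplifying assumptions: edges occur independently (the paper explicitly handles correlated edges), and the query is reduced to an edge set; the function name and signature are illustrative, not the paper's API.

```python
import random

def containment_probability(edge_probs, query_edges, n_samples=20000, seed=7):
    """Monte Carlo estimate of the probability that every query edge appears
    in a possible world sampled from an uncertain graph.

    Assumes independent edge occurrences, which is a simplification of the
    correlated model studied in the paper."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Sample one possible world: keep each edge with its probability.
        world = {e for e, p in edge_probs.items() if rng.random() < p}
        if query_edges <= world:
            hits += 1
    return hits / n_samples
```

In a filter-and-verify pipeline, candidates whose PMI-derived upper bound falls below the probability threshold are pruned first, so this (comparatively expensive) sampling step only runs on the surviving candidates.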
Energy-Efficient β-Approximate Skyline Query Processing in Wireless Sensor Networks
As the first priority of query processing in wireless sensor networks is to save the limited energy of sensor nodes, and as in many sensing applications a part of the skyline result is enough for the user's requirement, computing the exact skyline is relatively energy-inefficient. Therefore, a new approximate skyline query, the β-approximate skyline query, which is subject to a
guaranteed error bound, is proposed in this paper. With the objective of reducing the communication cost of evaluating
β-approximate skyline queries, we also propose an energy-efficient processing algorithm using mapping and filtering
strategies, named Actual Approximate Skyline (AAS). Moreover, an extended algorithm named Hypothetical Approximate Skyline (HAS), which replaces the real tuples with hypothetical ones, is proposed to further reduce the communication cost. Extensive experiments on synthetic data demonstrate the efficiency and effectiveness of our proposed approaches under various experimental settings.
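The skyline notion underlying the abstract can be sketched concretely. The β-approximation check below uses one plausible reading of the error bound (every exact skyline point is covered, within β per dimension, by some reported point) under a minimization convention; the paper's formal definition may differ, and all names are illustrative.

```python
def dominates(p, q):
    """p dominates q (minimization): p is <= q in every dimension and
    strictly better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    """Exact skyline: the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def is_beta_approximate(exact, reported, beta):
    """Illustrative guarantee check: every exact skyline point must be
    covered, within beta in each dimension, by some reported point."""
    return all(
        any(all(r_i <= e_i + beta for r_i, e_i in zip(r, pt)) for r in reported)
        for pt in exact
    )
```

A smaller reported set that still passes this check is exactly what lets AAS/HAS transmit fewer tuples, which is where the communication savings come from.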
FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks
Knowledge distillation (KD) has demonstrated its effectiveness to boost the
performance of graph neural networks (GNNs), where its goal is to distill
knowledge from a deeper teacher GNN into a shallower student GNN. However, it
is actually difficult to train a satisfactory teacher GNN due to the well-known
over-parameterization and over-smoothing issues, leading to invalid knowledge
transfer in practical applications. In this paper, we propose the first
Free-direction Knowledge Distillation framework via Reinforcement learning for
GNNs, called FreeKD, which no longer requires a deeper,
well-optimized teacher GNN. The core idea of our work is to collaboratively
build two shallower GNNs in an effort to exchange knowledge between them via
reinforcement learning in a hierarchical way. As we observe that a typical
GNN model often performs better at some nodes and worse at others during
training, we devise a dynamic and free-direction knowledge transfer strategy
that consists of two levels of actions: 1) node-level action determines the
directions of knowledge transfer between the corresponding nodes of two
networks; and 2) structure-level action determines which of the local
structures generated by the node-level actions should be propagated. In essence,
our FreeKD is a general and principled framework which can be naturally
compatible with GNNs of different architectures. Extensive experiments on five
benchmark datasets demonstrate that our FreeKD outperforms the two base GNNs by a large
margin, and show its efficacy across various GNNs. More surprisingly, our FreeKD
achieves comparable or even better performance than traditional KD algorithms that
distill knowledge from a deeper and stronger teacher GNN. Comment: Accepted to KDD 202
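The node-level action can be sketched in miniature. In this toy version, a simple per-node confidence comparison stands in for FreeKD's learned RL policy, and a soft-label blend stands in for the distillation loss; all names and the update rule are illustrative assumptions, not the paper's implementation.

```python
def node_level_directions(conf_a, conf_b):
    """Node-level action: for each node, knowledge flows from the currently
    better-performing network to the other. Here 'better' is just higher
    confidence; FreeKD learns this decision via reinforcement learning."""
    return ["a->b" if ca >= cb else "b->a" for ca, cb in zip(conf_a, conf_b)]

def distill_step(probs_a, probs_b, directions, rate=0.5):
    """Move each student node's class distribution toward its per-node
    teacher's soft labels: a crude stand-in for a KL distillation loss."""
    new_a, new_b = [], []
    for pa, pb, d in zip(probs_a, probs_b, directions):
        if d == "a->b":
            pb = [x + rate * (y - x) for x, y in zip(pb, pa)]
        else:
            pa = [x + rate * (y - x) for x, y in zip(pa, pb)]
        new_a.append(pa)
        new_b.append(pb)
    return new_a, new_b
```

Because the direction is chosen per node, each network teaches the other where it happens to be stronger, which is why no globally stronger teacher is required.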
Learning to Generate Parameters of ConvNets for Unseen Image Data
Typical Convolutional Neural Networks (ConvNets) depend heavily on large
amounts of image data and resort to an iterative optimization algorithm (e.g.,
SGD or Adam) to learn network parameters, which makes training very time- and
resource-intensive. In this paper, we propose a new training paradigm and
formulate the parameter learning of ConvNets as a prediction task: given a
ConvNet architecture, we observe that there exist correlations between image
datasets and their corresponding optimal network parameters, and explore whether we
can learn a hyper-mapping between them to capture the relations, such that we
can directly predict the parameters of the network for an image dataset never
seen during the training phase. To do this, we put forward a new hypernetwork
based model, called PudNet, which intends to learn a mapping between datasets
and their corresponding network parameters, and then predicts parameters for
unseen data with only a single forward propagation. Moreover, our model
benefits from a series of adaptive hyper recurrent units sharing weights to
capture the dependencies of parameters among different network layers.
Extensive experiments demonstrate that our proposed method achieves good
efficacy for unseen image datasets in two kinds of settings: intra-dataset
prediction and inter-dataset prediction. Our PudNet can also scale up well to
large-scale datasets, e.g., ImageNet-1K. It takes 8967 GPU seconds to train
ResNet-18 on ImageNet-1K using GC from scratch to obtain a top-5 accuracy
of 44.65%. In contrast, our PudNet costs only 3.89 GPU seconds to predict the
network parameters of ResNet-18 while achieving comparable performance (44.92%),
more than 2,300 times faster than the traditional training paradigm.
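PudNet's core idea, predicting network parameters from a dataset in one forward pass instead of running SGD, can be shown in miniature. The descriptor and the fixed linear "hypernetwork" matrix below are toy stand-ins (PudNet learns its dataset encoding and uses adaptive hyper recurrent units); every name here is a hypothetical illustration.

```python
def dataset_descriptor(samples):
    """Summarize a dataset by per-dimension feature means: a toy stand-in
    for PudNet's learned dataset encoding."""
    n, d = len(samples), len(samples[0])
    return [sum(x[i] for x in samples) / n for i in range(d)]

def predict_parameters(descriptor, hypernet):
    """One forward pass of a linear 'hypernetwork': weights = H @ descriptor.
    No iterative SGD/Adam loop over the target network is involved."""
    return [sum(w * c for w, c in zip(row, descriptor)) for row in hypernet]
```

The speedup reported in the abstract comes precisely from this substitution: a single forward pass through the hypernetwork replaces thousands of optimizer steps on the target network.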
Scalable Algorithms for Laplacian Pseudo-inverse Computation
The pseudo-inverse of a graph Laplacian matrix, denoted as L†, finds
extensive application in various graph analysis tasks. Notable examples include
the calculation of electrical closeness centrality, determination of Kemeny's
constant, and evaluation of resistance distance. However, existing algorithms
for computing L† are often computationally expensive when dealing with
large graphs. To overcome this challenge, we propose novel solutions for
approximating L† by establishing a connection with the inverse of a
Laplacian submatrix L_v. This submatrix is obtained by removing the v-th
row and column from the original Laplacian matrix L. The key advantage of
this connection is that (L_v)^{-1} exhibits various interesting combinatorial
interpretations. We present two innovative interpretations of (L_v)^{-1} based
on spanning trees and loop-erased random walks, which allow us to develop
efficient sampling algorithms. Building upon these new theoretical insights, we
propose two novel algorithms for efficiently approximating both electrical
closeness centrality and Kemeny's constant. We extensively evaluate the
performance of our algorithms on five real-life datasets. The results
demonstrate that our novel approaches significantly outperform the
state-of-the-art methods by several orders of magnitude in terms of both
running time and estimation errors for these two graph analysis tasks. To
further illustrate the effectiveness of electrical closeness centrality and
Kemeny's constant, we present two case studies that showcase the practical
applications of these metrics.
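For intuition, the quantities involved can be checked on a toy graph with the standard identity L† = (L + J/n)^{-1} − J/n for a connected graph, where J is the all-ones matrix. This dense computation is exactly what does not scale, which is what motivates the paper's submatrix and sampling algorithms; the sketch is only for verifying small examples.

```python
def mat_inverse(a):
    """Invert a small dense matrix by Gauss-Jordan elimination with partial
    pivoting. Fine for toy examples; not a scalable method."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def laplacian_pseudoinverse(L):
    """For a connected graph: L_dagger = (L + J/n)^{-1} - J/n, with J the
    all-ones matrix. Resistance distance then follows as
    r(u, v) = Ld[u][u] + Ld[v][v] - 2 * Ld[u][v]."""
    n = len(L)
    M = [[L[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    Minv = mat_inverse(M)
    return [[Minv[i][j] - 1.0 / n for j in range(n)] for i in range(n)]
```

On the 3-node path graph (edges 0-1 and 1-2), the resulting L† gives resistance distance r(0, 2) = 2, matching the series-resistance intuition of two unit-resistance edges.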
Robust Knowledge Adaptation for Dynamic Graph Neural Networks
Graph-structured data often possess dynamic characteristics, e.g., the
addition of links and nodes, in many real-world applications. Recent years have
witnessed increasing attention paid to dynamic graph neural networks for
modelling such graph data, where almost all existing approaches assume that
when a new link is built, the embeddings of the neighbor nodes should be
updated by learning the temporal dynamics to propagate new information.
However, such approaches suffer from the limitation that if the node introduced
by a new connection contains noisy information, propagating its knowledge to
other nodes is unreliable and may even lead to model collapse. In
this paper, we propose AdaNet: a robust knowledge Adaptation framework via
reinforcement learning for dynamic graph neural Networks. In contrast to
previous approaches, which immediately update the embeddings of the neighbor nodes
once a new link is added, AdaNet attempts to adaptively determine which nodes
should be updated when a new link is involved. Considering that the
decision of whether to update the embedding of one neighbor node will have a great
impact on other neighbor nodes, we formulate the selection of nodes to update
as a sequential decision problem and address it via reinforcement
learning. In this way, we can adaptively propagate knowledge to other nodes
for learning robust node embedding representations. To the best of our
knowledge, our approach constitutes the first attempt to explore robust
knowledge adaptation via reinforcement learning for dynamic graph neural
networks. Extensive experiments on three benchmark datasets demonstrate that
AdaNet achieves state-of-the-art performance. In addition, we perform
experiments by adding different degrees of noise into the dataset,
quantitatively and qualitatively illustrating the robustness of AdaNet. Comment: 14 pages, 6 figures
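The sequential-decision framing can be sketched as follows. A greedy score threshold stands in for AdaNet's learned RL policy, and the small penalty on already-chosen nodes is only there to show why earlier decisions affect later ones; every name and number here is a hypothetical illustration, not the paper's method.

```python
def select_nodes_to_update(neighbors, noise_score, relevance):
    """Sequentially decide, neighbor by neighbor, whether to propagate the
    new link's information. A greedy reward threshold stands in for AdaNet's
    learned RL policy; noisy nodes are skipped so their information cannot
    corrupt the embeddings of other nodes."""
    chosen = []
    for node in neighbors:
        # Earlier choices influence later ones (the sequential aspect):
        # here modeled as a small diminishing-returns penalty.
        reward = relevance[node] - noise_score[node] - 0.05 * len(chosen)
        if reward > 0:
            chosen.append(node)
    return chosen
```

A learned policy replaces this hand-set scoring in AdaNet, trading the fixed threshold for rewards estimated from the downstream embedding quality.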