Search CORE

156 research outputs found

Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning

Author: Gong Chen
Pan Shirui
Wan Sheng
Yang Jian
Publication venue
Publication date: 18/09/2020
Field of study

Graph-based Semi-Supervised Learning (SSL) aims to transfer the labels of a handful of labeled data to the remaining massive unlabeled data via a graph. As one of the most popular graph-based SSL approaches, the recently proposed Graph Convolutional Networks (GCNs) have gained remarkable progress by combining the sound expressiveness of neural networks with graph structure. Nevertheless, the existing graph-based methods do not directly address the core problem of SSL, i.e., the shortage of supervision, and thus their performances are still very limited. To accommodate this issue, a novel GCN-based SSL algorithm is presented in this paper to enrich the supervision signals by utilizing both data similarities and graph structure. Firstly, by designing a semi-supervised contrastive loss, improved node representations can be generated via maximizing the agreement between different views of the same data or the data from the same class. Therefore, the rich unlabeled data and the scarce yet valuable labeled data can jointly provide abundant supervision information for learning discriminative node representations, which helps improve the subsequent classification result. Secondly, the underlying determinative relationship between the data features and input graph topology is extracted as supplementary supervision signals for SSL via using a graph generative loss related to the input features. Intensive experimental results on a variety of real-world datasets firmly verify the effectiveness of our algorithm compared with other state-of-the-art methods

arXiv.org e-Print Archive

Monash University Research Portal

Association for the Advancement of Artificial Intelligence: AAAI Publications

PMFL: Partial Meta-Federated Learning for heterogeneous tasks and its applications on real-world medical records

Author: Chen Ziwei
Liu Dianbo
Zhang Shirui
Zhang Tianyi
Publication venue
Publication date: 08/03/2024
Field of study

Federated machine learning is a versatile and flexible tool to utilize distributed data from different sources, especially when communication technology develops rapidly and an unprecedented amount of data could be collected on mobile devices nowadays. Federated learning method exploits not only the data but the computational power of all devices in the network to achieve more efficient model training. Nevertheless, while most traditional federated learning methods work well for homogeneous data and tasks, adapting the method to a different heterogeneous data and task distribution is challenging. This limitation has constrained the applications of federated learning in real-world contexts, especially in healthcare settings. Inspired by the fundamental idea of meta-learning, in this study we propose a new algorithm, which is an integration of federated learning and meta-learning, to tackle this issue. In addition, owing to the advantage of transfer learning for model generalization, we further improve our algorithm by introducing partial parameter sharing. We name this method partial meta-federated learning (PMFL). Finally, we apply the algorithms to two medical datasets. We show that our algorithm could obtain the fastest training speed and achieve the best performance when dealing with heterogeneous medical datasets.Comment: 11 pages, 7 figure

arXiv.org e-Print Archive

Unifying Large Language Models and Knowledge Graphs: A Roadmap

Author: Chen Chen
Luo Linhao
Pan Shirui
Wang Jiapu
Wang Yufei
Wu Xindong
Publication venue
Publication date: 20/06/2023
Field of study

Large language models (LLMs), such as ChatGPT and GPT4, are making new waves in the field of natural language processing and artificial intelligence, due to their emergent ability and generalizability. However, LLMs are black-box models, which often fall short of capturing and accessing factual knowledge. In contrast, Knowledge Graphs (KGs), Wikipedia and Huapu for example, are structured knowledge models that explicitly store rich factual knowledge. KGs can enhance LLMs by providing external knowledge for inference and interpretability. Meanwhile, KGs are difficult to construct and evolving by nature, which challenges the existing methods in KGs to generate new facts and represent unseen knowledge. Therefore, it is complementary to unify LLMs and KGs together and simultaneously leverage their advantages. In this article, we present a forward-looking roadmap for the unification of LLMs and KGs. Our roadmap consists of three general frameworks, namely, 1) KG-enhanced LLMs, which incorporate KGs during the pre-training and inference phases of LLMs, or for the purpose of enhancing understanding of the knowledge learned by LLMs; 2) LLM-augmented KGs, that leverage LLMs for different KG tasks such as embedding, completion, construction, graph-to-text generation, and question answering; and 3) Synergized LLMs + KGs, in which LLMs and KGs play equal roles and work in a mutually beneficial way to enhance both LLMs and KGs for bidirectional reasoning driven by both data and knowledge. We review and summarize existing efforts within these three frameworks in our roadmap and pinpoint their future research directions.Comment: 29 pages, 25 figure

arXiv.org e-Print Archive

GNNEvaluator: Evaluating GNN Performance On Unseen Graphs Without Labels

Author: Chen Chunyang
Molaei Soheila
Pan Shirui
Zhang Miao
Zheng Xin
Zhou Chuan
Publication venue
Publication date: 26/10/2023
Field of study

Evaluating the performance of graph neural networks (GNNs) is an essential task for practical GNN model deployment and serving, as deployed GNNs face significant performance uncertainty when inferring on unseen and unlabeled test graphs, due to mismatched training-test graph distributions. In this paper, we study a new problem, GNN model evaluation, that aims to assess the performance of a specific GNN model trained on labeled and observed graphs, by precisely estimating its performance (e.g., node classification accuracy) on unseen graphs without labels. Concretely, we propose a two-stage GNN model evaluation framework, including (1) DiscGraph set construction and (2) GNNEvaluator training and inference. The DiscGraph set captures wide-range and diverse graph data distribution discrepancies through a discrepancy measurement function, which exploits the outputs of GNNs related to latent node embeddings and node class predictions. Under the effective training supervision from the DiscGraph set, GNNEvaluator learns to precisely estimate node classification accuracy of the to-be-evaluated GNN model and makes an accurate inference for evaluating GNN model performance. Extensive experiments on real-world unseen and unlabeled test graphs demonstrate the effectiveness of our proposed method for GNN model evaluation.Comment: Accepted by NeurIPS 202

arXiv.org e-Print Archive

Expressive probabilistic sampling in recurrent neural networks

Author: Chen Shirui
Jiang Linxing Preston
Rao Rajesh P. N.
Shea-Brown Eric
Publication venue
Publication date: 14/11/2023
Field of study

In sampling-based Bayesian models of brain function, neural activities are assumed to be samples from probability distributions that the brain uses for probabilistic computation. However, a comprehensive understanding of how mechanistic models of neural dynamics can sample from arbitrary distributions is still lacking. We use tools from functional analysis and stochastic differential equations to explore the minimum architectural requirements for

\textit{recurrent}

neural circuits to sample from complex distributions. We first consider the traditional sampling model consisting of a network of neurons whose outputs directly represent the samples (sampler-only network). We argue that synaptic current and firing-rate dynamics in the traditional model have limited capacity to sample from a complex probability distribution. We show that the firing rate dynamics of a recurrent neural circuit with a separate set of output units can sample from an arbitrary probability distribution. We call such circuits reservoir-sampler networks (RSNs). We propose an efficient training procedure based on denoising score matching that finds recurrent and output weights such that the RSN implements Langevin sampling. We empirically demonstrate our model's ability to sample from several complex data distributions using the proposed neural dynamics and discuss its applicability to developing the next generation of sampling-based brain models

arXiv.org e-Print Archive

Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs

Author: Chen Siheng
Jin Ming
Li Yuan-Fang
Pan Shirui
Yang Bin
Zheng Yu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/11/2022
Field of study

Multivariate time series forecasting has long received significant attention in real-world applications, such as energy consumption and traffic prediction. While recent methods demonstrate good forecasting abilities, they have three fundamental limitations. (i) Discrete neural architectures: Interlacing individually parameterized spatial and temporal blocks to encode rich underlying patterns leads to discontinuous latent state trajectories and higher forecasting numerical errors. (ii) High complexity: Discrete approaches complicate models with dedicated designs and redundant parameters, leading to higher computational and memory overheads. (iii) Reliance on graph priors: Relying on predefined static graph structures limits their effectiveness and practicability in real-world applications. In this paper, we address all the above limitations by proposing a continuous model to forecast

\textbf{M}

ultivariate

\textbf{T}

ime series with dynamic

\textbf{G}

raph neural

\textbf{O}

rdinary

\textbf{D}

ifferential

\textbf{E}

quations (

\texttt{MTGODE}

). Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures. Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing, allowing deeper graph propagation and fine-grained temporal information aggregation to characterize stable and precise latent spatial-temporal dynamics. Our experiments demonstrate the superiorities of

\texttt{MTGODE}

from various perspectives on five time series benchmark datasets.Comment: 14 pages, 6 figures, 5 table

arXiv.org e-Print Archive