Integrating Graphs with Large Language Models: Methods and Prospects
Large language models (LLMs) such as GPT-4 have emerged as frontrunners,
showcasing unparalleled prowess in diverse applications, including answering
queries, code generation, and more. In parallel, graph-structured data, an
intrinsic data type, is pervasive in real-world scenarios. Merging the
capabilities of LLMs with graph-structured data has been a topic of keen
interest. This paper bifurcates such integrations into two predominant
categories. The first leverages LLMs for graph learning, where LLMs can not
only augment existing graph algorithms but also stand as prediction models for
various graph tasks. Conversely, the second category underscores the pivotal
role of graphs in advancing LLMs. Mirroring human cognition, such approaches
solve complex tasks by adopting graphs for either reasoning or multi-agent
collaboration. Integrating LLMs with these structures can significantly boost
their performance on various complicated tasks. We also discuss and propose
open questions on integrating LLMs with graph-structured data as future
directions for the field.
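To make the first category concrete, here is a minimal sketch of using an LLM as a prediction model for a graph task: a small graph is serialized into a text prompt for node classification. The `complete` function and the prompt format are illustrative placeholders, not methods from the paper.

```python
# A minimal sketch of an LLM acting as a prediction model for a graph task
# (node classification). `complete` stands in for any chat-completion API
# and is a placeholder, not an API described in the paper.
import networkx as nx

def graph_to_prompt(g: nx.Graph, target: str) -> str:
    """Serialize a labeled graph into plain text the LLM can reason over."""
    lines = ["You are given a citation graph. Edges denote citations."]
    for u, v in g.edges():
        lines.append(f"Paper {u} cites paper {v}.")
    for n, data in g.nodes(data=True):
        if "label" in data:
            lines.append(f"Paper {n} has topic: {data['label']}.")
    lines.append(f"What is the most likely topic of paper {target}? "
                 "Answer with one word.")
    return "\n".join(lines)

def complete(prompt: str) -> str:
    # Placeholder: plug in your LLM client here.
    raise NotImplementedError

g = nx.Graph()
g.add_edges_from([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
g.nodes["A"]["label"] = "graphs"
g.nodes["B"]["label"] = "graphs"
prompt = graph_to_prompt(g, target="C")
# prediction = complete(prompt)
```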
PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection
Node-level graph anomaly detection (GAD) plays a critical role in identifying
anomalous nodes from graph-structured data in various domains such as medicine,
social networks, and e-commerce. However, challenges have arisen due to the
diversity of anomalies and the dearth of labeled data. Existing methodologies -
reconstruction-based and contrastive learning - while effective, often suffer
from efficiency issues, stemming from their complex objectives and elaborate
modules. To improve the efficiency of GAD, we introduce a simple method termed
PREprocessing and Matching (PREM for short). Our approach streamlines GAD,
reducing time and memory consumption while maintaining powerful anomaly
detection capabilities. Comprising two modules - a pre-processing module and an
ego-neighbor matching module - PREM eliminates the necessity for
message-passing propagation during training, and employs a simple contrastive
loss, leading to considerable reductions in training time and memory usage.
Moreover, through rigorous evaluations on five real-world datasets, our method
demonstrated robustness and effectiveness. Notably, when validated on the ACM
dataset, PREM achieved a 5% improvement in AUC, a 9-fold increase in training
speed, and a sharp reduction in memory usage compared to the most efficient
baseline.
Comment: Accepted by the IEEE International Conference on Data Mining 2023
(ICDM 2023)
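The abstract does not spell out PREM's objective, but its two-module idea can be sketched as follows: neighborhoods are mean-aggregated once during preprocessing (so training needs no message passing), and a lightweight ego-neighbor matcher is trained with a simple contrastive loss. The module names and the exact loss form below are assumptions drawn from the abstract, not the authors' code.

```python
# Sketch of PREM's two stages: (1) a one-off preprocessing step aggregates
# each node's neighborhood, removing message passing from training;
# (2) an ego-neighbor matching module is trained with a simple contrastive
# loss. Details are assumptions, not the released implementation.
import torch
import torch.nn.functional as F

def preprocess(adj: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Mean-aggregate neighbor features once, before training."""
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
    return (adj @ x) / deg

class EgoNeighborMatcher(torch.nn.Module):
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.ego_proj = torch.nn.Linear(in_dim, hid_dim)
        self.nbr_proj = torch.nn.Linear(in_dim, hid_dim)

    def forward(self, x_ego, x_nbr):
        return self.ego_proj(x_ego), self.nbr_proj(x_nbr)

def contrastive_loss(z_ego, z_nbr):
    """Pull each node toward its own neighborhood summary, push away others."""
    z_ego, z_nbr = F.normalize(z_ego, dim=1), F.normalize(z_nbr, dim=1)
    logits = z_ego @ z_nbr.t()            # (N, N) similarity matrix
    labels = torch.arange(z_ego.size(0))  # positives on the diagonal
    return F.cross_entropy(logits, labels)

def anomaly_score(z_ego, z_nbr):
    """Low ego-neighbor agreement at test time = more anomalous."""
    return 1 - F.cosine_similarity(z_ego, z_nbr, dim=1)
```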
SIDE: Self-supervised Intermediate Domain Exploration for Source-free Domain Adaptation
Domain adaptation aims to alleviate the domain shift when transferring the
knowledge learned from the source domain to the target domain. Due to privacy
issues, source-free domain adaptation (SFDA), where source data is unavailable
during adaptation, has recently attracted great interest yet remains
challenging. Existing
SFDA methods focus on either self-supervised learning of target samples or
reconstruction of virtual source data. The former overlooks the transferable
knowledge in the source model, whilst the latter introduces even more
uncertainty. To address the above issues, this paper proposes self-supervised
intermediate domain exploration (SIDE) that effectively bridges the domain gap
with an intermediate domain, where samples are cyclically filtered out in a
self-supervised fashion. First, we propose cycle intermediate domain filtering
(CIDF) to cyclically select intermediate samples with similar distributions
over source and target domains. Second, with the aid of those intermediate
samples, an inter-domain gap transition (IDGT) module is developed to mitigate
possible distribution mismatches between the source and target data. Finally,
we introduce cross-view consistency learning (CVCL) to maintain the intrinsic
class discriminability whilst adapting the model to the target domain.
Extensive experiments on three popular benchmarks, i.e., Office-31,
Office-Home and VisDA-C, show that our proposed SIDE achieves competitive
performance against state-of-the-art methods.
Comment: code at https://github.com/se111/SID
A Dual-Stream Neural Network Explains the Functional Segregation of Dorsal and Ventral Visual Pathways in Human Brains
The human visual system uses two parallel pathways for spatial processing and
object recognition. In contrast, computer vision systems tend to use a single
feedforward pathway, rendering them less robust, adaptive, or efficient than
human vision. To bridge this gap, we developed a dual-stream vision model
inspired by the human eyes and brain. At the input level, the model samples two
complementary visual patterns to mimic how the human eyes use magnocellular and
parvocellular retinal ganglion cells to separate retinal inputs to the brain.
At the backend, the model processes the separate input patterns through two
branches of convolutional neural networks (CNN) to mimic how the human brain
uses the dorsal and ventral cortical pathways for parallel visual processing.
The first branch (WhereCNN) samples a global view to learn spatial attention
and control eye movements. The second branch (WhatCNN) samples a local view to
represent the object around the fixation. Over time, the two branches interact
recurrently to build a scene representation from moving fixations. We compared
this model with human brains processing the same movie and evaluated their
functional alignment by linear transformation. The WhereCNN and WhatCNN
branches were found to differentially match the dorsal and ventral pathways of
the visual cortex, respectively, primarily due to their different learning
objectives. These model-based results lead us to speculate that the distinct
responses and representations of the ventral and dorsal streams are more
influenced by their distinct goals in visual attention and object recognition
than by their specific bias or selectivity in retinal inputs. This dual-stream
model takes a further step in brain-inspired computer vision, enabling parallel
neural networks to actively explore and understand the visual surroundings.
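A simplified sketch of the dual-stream layout described above: a "where" branch sees a coarse global view and proposes a fixation, and a "what" branch encodes a high-resolution crop around that fixation. The layer sizes and the single feedforward pass below are illustrative assumptions; the actual model interacts recurrently over time.

```python
# Toy dual-stream model: WhereCNN gets a downsampled global view and emits
# a fixation; WhatCNN gets a local crop at that fixation. Architecture
# details are placeholders, not the authors' network.
import torch
import torch.nn.functional as F

class Branch(torch.nn.Module):
    def __init__(self, out_dim: int):
        super().__init__()
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(3, 16, 3, stride=2, padding=1), torch.nn.ReLU(),
            torch.nn.Conv2d(16, 32, 3, stride=2, padding=1), torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool2d(1))
        self.fc = torch.nn.Linear(32, out_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

def crop_around(img, cy, cx, size=64):
    """Extract a local patch centered at the fixation, clamped to the frame."""
    _, _, h, w = img.shape
    y0 = max(0, min(h - size, cy - size // 2))
    x0 = max(0, min(w - size, cx - size // 2))
    return img[:, :, y0:y0 + size, x0:x0 + size]

where_cnn, what_cnn = Branch(2), Branch(128)
img = torch.randn(1, 3, 256, 256)
global_view = F.interpolate(img, size=64)        # coarse whole-scene input
fix = torch.sigmoid(where_cnn(global_view))[0]   # fixation (x, y) in [0, 1]
cy, cx = int(fix[1] * 256), int(fix[0] * 256)
obj_feat = what_cnn(crop_around(img, cy, cx))    # object code at the fixation
```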
Beyond Smoothing: Unsupervised Graph Representation Learning with Edge Heterophily Discriminating
Unsupervised graph representation learning (UGRL) has drawn increasing
research attention and achieved promising results in several graph analytic
tasks. Relying on the homophily assumption, existing UGRL methods tend to
smooth the learned node representations along all edges, ignoring the existence
of heterophilic edges that connect nodes with distinct attributes. As a result,
current methods struggle to generalize to heterophilic graphs, where dissimilar
nodes are widely connected, and are also vulnerable to adversarial attacks. To
address this issue, we propose a novel unsupervised Graph Representation
learning method with Edge hEterophily discriminaTing (GREET) which learns
representations by discriminating and leveraging homophilic edges and
heterophilic edges. To distinguish two types of edges, we build an edge
discriminator that infers edge homophily/heterophily from feature and structure
information. We train the edge discriminator in an unsupervised way through
minimizing the crafted pivot-anchored ranking loss, with randomly sampled node
pairs acting as pivots. Node representations are learned through contrasting
the dual-channel encodings obtained from the discriminated homophilic and
heterophilic edges. With an effective interplaying scheme, edge discriminating
and representation learning can mutually boost each other during the training
phase. We conducted extensive experiments on 14 benchmark datasets and multiple
learning scenarios to demonstrate the superiority of GREET.
Comment: 14 pages, 7 tables, 6 figures, accepted by AAAI 2023
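A schematic sketch of the edge-discriminating idea: score each edge's homophily from its endpoints' features, then encode nodes through two channels, smoothing over edges judged homophilic and sharpening over those judged heterophilic. The plain MLP discriminator and the sum/difference filters below are simplified assumptions, and the paper's pivot-anchored ranking loss is omitted; treat this as an outline, not GREET itself.

```python
# Schematic GREET-style pipeline: an edge discriminator plus a dual-channel
# encoder (low-pass on homophilic edges, high-pass on heterophilic ones).
# Both components are simplified assumptions based only on the abstract.
import torch
import torch.nn.functional as F

class EdgeDiscriminator(torch.nn.Module):
    """Predict P(edge is homophilic) from the two endpoints' features."""
    def __init__(self, in_dim: int, hid: int = 64):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(2 * in_dim, hid), torch.nn.ReLU(),
            torch.nn.Linear(hid, 1))

    def forward(self, x, edges):  # edges: (2, E) index tensor
        pair = torch.cat([x[edges[0]], x[edges[1]]], dim=1)
        return torch.sigmoid(self.mlp(pair)).squeeze(1)

def dual_channel_encode(x, edges, homo_prob):
    """Low-pass over likely-homophilic edges, high-pass over the rest."""
    n = x.size(0)
    a_lo = torch.zeros(n, n)
    a_lo[edges[0], edges[1]] = homo_prob      # weighted homophilic adjacency
    a_hi = torch.zeros(n, n)
    a_hi[edges[0], edges[1]] = 1 - homo_prob  # weighted heterophilic adjacency
    z_lo = x + a_lo @ x   # smooth: mix with similar neighbors
    z_hi = x - a_hi @ x   # sharpen: push away from dissimilar neighbors
    return torch.cat([z_lo, z_hi], dim=1)
```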
MolFM: A Multimodal Molecular Foundation Model
Molecular knowledge resides within three different modalities of information
sources: molecular structures, biomedical documents, and knowledge bases.
Effective incorporation of molecular knowledge from these modalities holds
paramount significance in facilitating biomedical research. However, existing
multimodal molecular foundation models exhibit limitations in capturing
intricate connections between molecular structures and texts, and more
importantly, none of them attempt to leverage a wealth of molecular expertise
derived from knowledge graphs. In this study, we introduce MolFM, a multimodal
molecular foundation model designed to facilitate joint representation learning
from molecular structures, biomedical texts, and knowledge graphs. We propose
cross-modal attention between atoms of molecular structures, neighbors of
molecule entities and semantically related texts to facilitate cross-modal
comprehension. We provide theoretical analysis that our cross-modal
pre-training captures local and global molecular knowledge by minimizing the
distance in the feature space between different modalities of the same
molecule, as well as molecules sharing similar structures or functions. MolFM
achieves state-of-the-art performance on various downstream tasks. On
cross-modal retrieval, MolFM outperforms existing models with 12.13% and 5.04%
absolute gains under the zero-shot and fine-tuning settings, respectively.
Furthermore, qualitative analysis showcases MolFM's implicit ability to provide
grounding from molecular substructures and knowledge graphs. Code and models
are available at https://github.com/BioFM/OpenBioMed.
Comment: 31 pages, 15 figures, and 15 tables
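A bare-bones sketch of the alignment objective the abstract describes: encoders for structure, text, and knowledge-graph views map the same molecule close together in feature space. The stand-in linear encoders, dimensions, and the symmetric InfoNCE loss below are assumptions about the objective's general shape, not MolFM's implementation.

```python
# Sketch of a cross-modal alignment objective: minimize feature-space
# distance between modality views of the same molecule. Encoders and loss
# form are illustrative placeholders, not MolFM's pre-training code.
import torch
import torch.nn.functional as F

def align_loss(z_a: torch.Tensor, z_b: torch.Tensor, temp: float = 0.07):
    """Symmetric InfoNCE between two modality views of the same molecules."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temp
    labels = torch.arange(z_a.size(0))  # matched pairs sit on the diagonal
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

# Stand-in modality encoders (structure / text / KG) for a toy batch.
struct_enc = torch.nn.Linear(300, 128)
text_enc = torch.nn.Linear(768, 128)
kg_enc = torch.nn.Linear(200, 128)
x_s, x_t, x_k = torch.randn(8, 300), torch.randn(8, 768), torch.randn(8, 200)
loss = (align_loss(struct_enc(x_s), text_enc(x_t)) +
        align_loss(struct_enc(x_s), kg_enc(x_k)))
```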