Collaborative Graph Neural Networks for Attributed Network Embedding
Graph neural networks (GNNs) have shown prominent performance on attributed
network embedding. However, existing efforts mainly focus on exploiting network
structures, while the exploitation of node attributes is rather limited as they
only serve as node features at the initial layer. This simple strategy impedes
the potential of node attributes in augmenting node connections, leading to
limited receptive field for inactive nodes with few or even no neighbors.
Furthermore, the training objectives (i.e., reconstructing network structures)
of most GNNs also do not include node attributes, although studies have shown
that reconstructing node attributes is beneficial. Thus, it is desirable to
involve node attributes more deeply in the key components of GNNs, including graph
convolution operations and training objectives. However, this is a nontrivial
task since an appropriate way of integration is required to maintain the merits
of GNNs. To bridge the gap, in this paper, we propose COllaborative graph
Neural Networks (CONN), a tailored GNN architecture for attributed network
embedding. It improves model capacity by 1) selectively diffusing messages from
neighboring nodes and involved attribute categories, and 2) jointly
reconstructing node-to-node and node-to-attribute-category interactions via
cross-correlation. Experiments on real-world networks demonstrate that CONN
outperforms state-of-the-art embedding algorithms by a large margin.
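The idea of letting attribute categories augment node connections can be pictured with a small sketch. This is not the CONN architecture; it only assumes, for illustration, that each attribute category becomes an extra vertex linked to the nodes that carry it. The helper names and the toy data are hypothetical.

```python
def build_augmented_adjacency(edges, node_attrs, num_nodes, num_attrs):
    """Return a dense symmetric 0/1 adjacency matrix of size (n + m) x (n + m).

    Indices 0..n-1 are nodes; indices n..n+m-1 are attribute categories.
    (Assumed layout, chosen for illustration only.)
    """
    size = num_nodes + num_attrs
    adj = [[0] * size for _ in range(size)]
    for u, v in edges:  # node-to-node links
        adj[u][v] = adj[v][u] = 1
    for node, attrs in node_attrs.items():  # node-to-attribute-category links
        for a in attrs:
            adj[node][num_nodes + a] = adj[num_nodes + a][node] = 1
    return adj


def two_hop_nodes(adj, node, num_nodes):
    """Nodes reachable from `node` by a two-step walk, excluding `node` itself."""
    reached = set()
    for mid, linked in enumerate(adj[node]):
        if linked:
            for dst in range(num_nodes):
                if adj[mid][dst] and dst != node:
                    reached.add(dst)
    return reached


# Toy data (hypothetical): node 2 has no neighbors but shares attribute 0 with node 0.
edges = [(0, 1)]
node_attrs = {0: [0], 1: [1], 2: [0]}
adj = build_augmented_adjacency(edges, node_attrs, num_nodes=3, num_attrs=2)
isolated_reach = two_hop_nodes(adj, 2, num_nodes=3)
```

In this toy graph the isolated node 2 gains a two-step path to node 0 through the shared attribute category, which is the kind of enlarged receptive field the abstract describes for inactive nodes.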
Multi-Task Learning for Post-transplant Cause of Death Analysis: A Case Study on Liver Transplant
Organ transplant is the essential treatment method for some end-stage
diseases, such as liver failure. Analyzing the post-transplant cause of death
(CoD) after organ transplant provides a powerful tool for clinical decision
making, including personalized treatment and organ allocation. However,
traditional methods like Model for End-stage Liver Disease (MELD) score and
conventional machine learning (ML) methods are limited in CoD analysis due to
two major data and model-related challenges. To address this, we propose a
novel framework called CoD-MTL leveraging multi-task learning to model the
semantic relationships between various CoD prediction tasks jointly.
Specifically, we develop a novel tree distillation strategy for multi-task
learning, which combines the strength of both the tree model and multi-task
learning. Experimental results show that our framework produces precise and
reliable CoD predictions. A case study demonstrates the clinical importance of
our method in liver transplant.
Towards Fair Patient-Trial Matching via Patient-Criterion Level Fairness Constraint
Clinical trials are indispensable in developing new treatments, but they face
obstacles in patient recruitment and retention, hindering the enrollment of
necessary participants. To tackle these challenges, deep learning frameworks
have been created to match patients to trials. These frameworks calculate the
similarity between patients and clinical trial eligibility criteria,
considering the discrepancy between inclusion and exclusion criteria. Recent
studies have shown that these frameworks outperform earlier approaches.
However, deep learning models may raise fairness issues in patient-trial
matching when certain sensitive groups of individuals are underrepresented in
clinical trials, leading to incomplete or inaccurate data and potential harm.
To tackle the issue of fairness, this work proposes a fair patient-trial
matching framework by generating a patient-criterion level fairness constraint.
The proposed framework considers the inconsistency between the embedding of
inclusion and exclusion criteria among patients of different sensitive groups.
The experimental results on real-world patient-trial and patient-criterion
matching tasks demonstrate that the proposed framework can successfully
mitigate biased predictions.
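One simple way to quantify an inconsistency between inclusion and exclusion criteria across sensitive groups is sketched below. It is an assumption made for illustration, not the constraint defined in the paper: each patient gets a margin (inclusion score minus exclusion score), and the penalty is the absolute difference in mean margin between two groups. The function names, scores, and group labels are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def criterion_margin(incl_scores, excl_scores):
    """Per-patient margin between inclusion and exclusion match scores."""
    return [i - e for i, e in zip(incl_scores, excl_scores)]


def group_fairness_penalty(incl_scores, excl_scores, groups):
    """Absolute difference in mean margin between sensitive groups 0 and 1."""
    margins = criterion_margin(incl_scores, excl_scores)
    g0 = [m for m, g in zip(margins, groups) if g == 0]
    g1 = [m for m, g in zip(margins, groups) if g == 1]
    return abs(mean(g0) - mean(g1))


# Toy scores (hypothetical): four patients, two per sensitive group.
incl = [0.75, 0.5, 0.25, 0.5]
excl = [0.25, 0.25, 0.25, 0.25]
groups = [0, 0, 1, 1]
penalty = group_fairness_penalty(incl, excl, groups)
```

A term of this kind could be added to a matching loss with a weighting coefficient, so that lowering it pushes the two groups toward similar inclusion and exclusion behavior.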
OpenGSL: A Comprehensive Benchmark for Graph Structure Learning
Graph Neural Networks (GNNs) have emerged as the de facto standard for
representation learning on graphs, owing to their ability to effectively
integrate graph topology and node attributes. However, the inherent suboptimal
nature of node connections, resulting from the complex and contingent formation
process of graphs, presents significant challenges in modeling them
effectively. To tackle this issue, Graph Structure Learning (GSL), a family of
data-centric learning approaches, has garnered substantial attention in recent
years. The core concept behind GSL is to jointly optimize the graph structure
and the corresponding GNN models. Despite the proposal of numerous GSL methods,
the progress in this field remains unclear due to inconsistent experimental
protocols, including variations in datasets, data processing techniques, and
splitting strategies. In this paper, we introduce OpenGSL, the first
comprehensive benchmark for GSL, aimed at addressing this gap. OpenGSL enables
a fair comparison among state-of-the-art GSL methods by evaluating them across
various popular datasets using uniform data processing and splitting
strategies. Through extensive experiments, we observe that existing GSL methods
do not consistently outperform vanilla GNN counterparts. However, we do observe
that the learned graph structure demonstrates a strong generalization ability
across different GNN backbones, despite its high computational and space
requirements. We hope that our open-sourced library will facilitate rapid and
equitable evaluation and inspire further innovative research in the field of
GSL. The code of the benchmark can be found at
https://github.com/OpenGSL/OpenGSL.
Comment: 9 pages, 4 figures
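A uniform splitting strategy of the kind the abstract calls for can be as simple as a seeded shuffle shared by every method under comparison. The sketch below is a generic illustration and does not use the OpenGSL API; the function name, fractions, and seed are assumptions.

```python
import random


def fixed_split(num_nodes, train_frac=0.6, val_frac=0.2, seed=0):
    """Return (train, val, test) index lists from a seeded shuffle.

    The same seed always yields the same split, so every method
    is evaluated on identical partitions.
    """
    indices = list(range(num_nodes))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_train = round(num_nodes * train_frac)
    n_val = round(num_nodes * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


split_a = fixed_split(100)
split_b = fixed_split(100)
```

Using a local `random.Random(seed)` instance rather than the module-level generator keeps the split reproducible even when other code draws random numbers in between.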
Effects of Different Exogenous Substances on the Protein Conformation and in Vitro Digestion Characteristics of Low-salt Tilapia Surimi
The effects of transglutaminase (TGase), hydroxypropyl distarch phosphate (HDP), gellan gum, and their complex (THG) on the water distribution and protein conformation of low-salt tilapia surimi gel prepared with microwave and ultrasound treatment were analyzed. In addition, the effects of the different exogenous substances on the characteristics of low-salt tilapia fish cake were explored through in vitro digestion experiments. The results showed that, compared with the control group, THG increased the bound water and immobilized water of surimi to 98.71% and 14.75%, respectively, and significantly decreased the free water content (P<0.05). Moreover, THG promoted the transformation of α-helix into β-sheet, β-turn, and random coil structures. TGase and THG (0.4%) played important roles in the gastric emptying rate, protein digestibility, and degree of protein hydrolysis of low-salt tilapia cake. THG significantly promoted protein decomposition into aggregates with smaller particle size (P<0.05). After gastric and duodenal digestion, the products of the THG group were more transparent and clear. Laser confocal microscopy showed that the red fluorescence highlights of the THG group were significantly reduced, indicating that the proteins had been fully digested. Hence, compared with a single exogenous substance, THG not only promoted the binding of water molecules to proteins and induced changes in protein conformation, but also facilitated the exposure of hydrophobic protein groups and the interaction between proteins, and promoted the digestion and absorption of surimi products in the stomach and duodenum. This study provides a theoretical reference for research on the gel properties of tilapia surimi and for the development and application of tilapia fish cake.
Towards Efficient Self-Supervised Learning on Graphs
Deep learning on graphs has garnered considerable attention across various machine learning applications, encompassing social science, transportation services, and biomedical informatics. Nonetheless, prevailing methods have predominantly focused on supervised learning, resulting in several limitations, such as heavy reliance on labels and subpar generalization.
To address the scarcity of labels, self-supervised learning (SSL) has emerged as a promising approach for graph data. Traditional SSL methods for graphs primarily concentrate on enhancing model performance through advanced data augmentation strategies and contrastive loss functions. Despite the significant progress made by existing studies, they encounter severe efficiency challenges when dealing with large-scale graphs and resource-limited applications, such as online services. To bridge this gap, I have developed a series of graph SSL models that systematically enhance the efficiency of self-supervised learning on graphs across the stages of model training, inference, and deployment. Firstly, to improve training efficiency, we propose automating the data augmentation process through Graph Personalized Augmentation (GPA) and conducting augmentation-free training via model perturbation (PerturbGCL). Secondly, to improve inference efficiency, we suggest distilling the fine-tuned classification model into a lightweight model using reliable knowledge distillation (Meta-MLP). Finally, to enhance deployment efficiency, we propose the development of a universal graph model (S2GAE) that enables the learned representation to generalize across different types of downstream tasks in the graph system.
My research presents a significant contribution to the research community by advancing the efficiency and applicability of self-supervised learning on graphs, addressing challenges related to label scarcity and resource limitations. These innovations have the potential to revolutionize various machine learning applications across disciplines, ranging from social science to transportation services and biomedical informatics, ultimately paving the way for more effective and widespread adoption of deep learning techniques in real-world graph scenarios.
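The distillation step mentioned in the abstract above, where a fine-tuned model is compressed into a lightweight one, can be illustrated with the standard temperature-softened distillation loss. This is the generic formulation, not the reliable knowledge distillation used in Meta-MLP; the function names, logits, and temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Toy logits (hypothetical) for a three-class problem.
teacher = [2.0, 0.5, -1.0]
same = distillation_loss(teacher, teacher)          # student matches teacher
diff = distillation_loss([0.0, 0.0, 0.0], teacher)  # uninformed student
```

The loss vanishes when the student reproduces the teacher's softened distribution and grows as the two diverge, which is what lets a small model inherit the behavior of a larger one.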
Multi-Label Classification Based on Low Rank Representation for Image Annotation
Annotating remote sensing images is a challenging task because of its labor-intensive annotation process and its requirement for expert knowledge, especially when images can be annotated with multiple semantic concepts (or labels). To automatically annotate these multi-label images, we introduce an approach called Multi-Label Classification based on Low Rank Representation (MLC-LRR). MLC-LRR first applies low rank representation in the feature space of images to compute a low-rank-constrained coefficient matrix, then adapts the coefficient matrix to define a feature-based graph and to capture the global relationships between images. Next, it applies low rank representation in the label space of labeled images to construct a semantic graph. Finally, these two graphs are exploited to train a graph-based multi-label classifier. To validate the performance of MLC-LRR against other related graph-based multi-label methods in annotating images, we conduct experiments on a publicly available multi-label remote sensing image dataset (Land Cover). We perform additional experiments on five real-world multi-label image datasets to further investigate the performance of MLC-LRR. The empirical study demonstrates that MLC-LRR achieves better performance in annotating images than the compared methods across various evaluation criteria; it can also effectively exploit the global structure and label correlations of multi-label images.