Scaling Up, Scaling Deep: Blockwise Graph Contrastive Learning
Oversmoothing is a common phenomenon in graph neural networks (GNNs), in
which an increase in the network depth leads to a deterioration in their
performance. Graph contrastive learning (GCL) is emerging as a promising way of
leveraging vast unlabeled graph data. As a marriage between GNNs and
contrastive learning, it remains unclear whether GCL inherits the same
oversmoothing defect from GNNs. This work undertakes a fundamental analysis of
GCL from the perspective of oversmoothing. We first demonstrate
empirically that increasing network depth in GCL also leads to oversmoothing in
the deep representations and, surprisingly, in the shallow ones. We refer to
this phenomenon in GCL as 'long-range starvation', wherein lower layers in deep
networks suffer from degradation due to the lack of sufficient guidance from
supervision (e.g., loss computation). Based on our findings, we present BlockGCL,
a remarkably simple yet effective blockwise training framework that prevents
oversmoothing in GCL. Without bells and whistles, BlockGCL consistently
improves robustness and stability for well-established GCL methods with
increasing numbers of layers on real-world graph benchmarks. We believe our
work will provide insights for future improvements of scalable and deep GCL
frameworks.
Comment: Preprint; Code is available at
https://github.com/EdisonLeeeee/BlockGC
Rethinking and Simplifying Bootstrapped Graph Latents
Graph contrastive learning (GCL) has emerged as a representative paradigm in
graph self-supervised learning, where negative samples are commonly regarded as
the key to preventing model collapse and producing distinguishable
representations. Recent studies have shown that GCL without negative samples
can achieve state-of-the-art performance as well as scalability improvement,
with bootstrapped graph latent (BGRL) as a prominent step forward. However,
BGRL relies on a complex architecture to maintain the ability to scatter
representations, and the underlying mechanisms enabling the success remain
largely unexplored. In this paper, we introduce an instance-level decorrelation
perspective to tackle the aforementioned issue and leverage it as a springboard
to reveal the potential unnecessary model complexity within BGRL. Based on our
findings, we present SGCL, a simple yet effective GCL framework that utilizes
the outputs from two consecutive iterations as positive pairs, eliminating the
negative samples. SGCL only requires a single graph augmentation and a single
graph encoder without additional parameters. Extensive experiments conducted on
various graph benchmarks demonstrate that SGCL can achieve competitive
performance with fewer parameters, lower time and space costs, and significant
convergence speedup.
Comment: Accepted by WSDM 202
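The core of SGCL's negative-free objective can be sketched in a few lines: treat each node's embeddings from two consecutive training iterations as the positive pair and maximize their cosine similarity. The following is an illustrative numpy sketch under that reading, not the paper's implementation; the function names are ours.

```python
import numpy as np

def l2_normalize(z, eps=1e-12):
    """Row-normalize embeddings to unit length."""
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)

def consecutive_pair_loss(z_curr, z_prev):
    """Negative mean cosine similarity between each node's embedding at
    iteration t and its embedding at iteration t-1 (the positive pair).
    No negative samples are involved anywhere."""
    zc, zp = l2_normalize(z_curr), l2_normalize(z_prev)
    return -np.mean(np.sum(zc * zp, axis=1))
```

Minimizing this loss pulls each node toward its own previous-iteration representation; the decorrelation perspective in the paper explains why this alone need not collapse.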
Multi-modality cardiac image computing: a survey
Multi-modality cardiac imaging plays a key role in the management of patients with cardiovascular diseases. It allows a combination of complementary anatomical, morphological and functional information, increases diagnosis accuracy, and improves the efficacy of cardiovascular interventions and clinical outcomes. Fully-automated processing and quantitative analysis of multi-modality cardiac images could have a direct impact on clinical research and evidence-based patient management. However, these require overcoming significant challenges including inter-modality misalignment and finding optimal methods to integrate information from different modalities.
This paper aims to provide a comprehensive review of multi-modality imaging in cardiology, the computing methods, the validation strategies, the related clinical workflows and future perspectives. For the computing methodologies, we focus on three tasks, i.e., registration, fusion and segmentation, which generally involve multi-modality imaging data, either combining information from different modalities or transferring information across modalities. The review highlights that multi-modality cardiac imaging data has the potential for wide applicability in the clinic, such as trans-aortic valve implantation guidance, myocardial viability assessment, and catheter ablation therapy and its patient selection. Nevertheless, many challenges remain unsolved, such as missing modality, modality selection, combination of imaging and non-imaging data, and uniform analysis and representation of different modalities. There is also work to do in defining how the well-developed techniques fit in clinical workflows and how much additional and relevant information they introduce. These problems are likely to remain an active field of research, with the above questions to be answered in the future.
DeepMerge: Deep-Learning-Based Region-Merging for Image Segmentation
Image segmentation aims to partition an image according to the objects in the
scene and is a fundamental step in analysing very high spatial-resolution (VHR)
remote sensing imagery. Current methods struggle to effectively consider land
objects with diverse shapes and sizes. Additionally, the determination of
segmentation scale parameters frequently adheres to a static and empirical
doctrine, posing limitations on the segmentation of large-scale remote sensing
images and yielding algorithms with limited interpretability. To address the
above challenges, we propose a deep-learning-based region merging method dubbed
DeepMerge to handle the segmentation of complete objects in large VHR images by
integrating deep learning and a region adjacency graph (RAG). This is the first
method to use deep learning to learn the similarity between, and merge, similar
adjacent super-pixels in a RAG. We propose a modified binary tree sampling method
to generate shift-scale data serving as inputs for transformer-based deep
learning networks, a shift-scale attention with 3-dimensional relative position
embedding to learn features across scales, and an embedding to fuse learned
features with hand-crafted features. DeepMerge can achieve high segmentation
accuracy in a supervised manner from large-scale remotely sensed images and
provides an interpretable optimal scale parameter, which is validated using a
remote sensing image of 0.55 m resolution covering an area of 5,660 km^2. The
experimental results show that DeepMerge achieves the highest F value (0.9550)
and the lowest total error TE (0.0895), correctly segmenting objects of
different sizes and outperforming all competing segmentation methods.
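The region-merging step on the RAG can be illustrated with a plain union-find sketch. In DeepMerge the pairwise similarity is learned by the transformer-based network described above; here it is an arbitrary callable supplied by the caller, and the greedy most-similar-first order is our assumption.

```python
def merge_regions(edges, similarity, threshold=0.5):
    """Greedy region merging on a region adjacency graph (RAG):
    adjacent regions whose similarity exceeds `threshold` are merged.
    `edges` is a list of (i, j) adjacent-region pairs; `similarity`
    maps a pair to a score in [0, 1]."""
    parent = {}
    for i, j in edges:
        parent.setdefault(i, i)
        parent.setdefault(j, j)

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Process the most similar adjacent pairs first.
    for i, j in sorted(edges, key=lambda e: -similarity(*e)):
        if similarity(i, j) >= threshold and find(i) != find(j):
            parent[find(j)] = find(i)
    # Map every region to the label of its merged component.
    return {x: find(x) for x in parent}
```

Regions sharing a returned label belong to one merged object; varying `threshold` plays the role of the scale parameter the paper makes interpretable.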
Femtosecond visualization of oxygen vacancies in metal oxides.
Oxygen vacancies often determine the electronic structure of metal oxides, but existing techniques cannot distinguish the oxygen-vacancy sites in the crystal structure. We report here that time-resolved optical spectroscopy can solve this challenge and determine the spatial locations of oxygen vacancies. Using tungsten oxides as examples, we identified the true oxygen-vacancy sites in WO2.9 and WO2.72, typical derivatives of WO3, and determined their fingerprint optoelectronic features. We find that a metastable band with a three-stage evolution dynamics of the excited states is present in WO2.9 but absent in WO2.72. By comparison with model band-structure calculations, this enables determination of the most closely neighbored oxygen-vacancy pairs in the crystal structure of WO2.72, for which two oxygen vacancies are ortho-positioned to a single W atom as the sole configuration among all O–W bonds. These findings verify the existence of preference rules for oxygen vacancies in metal oxides.
Random Style Transfer based Domain Generalization Networks Integrating Shape and Spatial Information
Deep learning (DL)-based models have demonstrated good performance in medical
image segmentation. However, the models trained on a known dataset often fail
when performed on an unseen dataset collected from different centers, vendors
and disease populations. In this work, we present a random style transfer
network to tackle the domain generalization problem for multi-vendor and center
cardiac image segmentation. Style transfer is used to generate training data
with a wider distribution/heterogeneity, namely domain augmentation. As the
target domain could be unknown, we randomly generate a modality vector for the
target modality in the style transfer stage, to simulate the domain shift for
unknown domains. The model can be trained in a semi-supervised manner by
simultaneously optimizing a supervised segmentation and an unsupervised style
translation objective. Besides, the framework incorporates the spatial
information and shape prior of the target by introducing two regularization
terms. We evaluated the proposed framework on 40 subjects from the M&Ms
challenge 2020, and obtained promising performance in the segmentation for data
from unknown vendors and centers.
Comment: 11 page
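The domain-augmentation idea, generating training images whose appearance mimics a randomly sampled unknown domain, can be approximated at the level of first-order intensity statistics. This stand-in is far weaker than the paper's learned style-transfer network conditioned on a random modality vector; it only illustrates the principle, and the sampled ranges are our assumptions.

```python
import numpy as np

def random_style_augment(image, rng=None):
    """Re-normalize an image's intensity statistics to a randomly
    sampled target mean/std, simulating the appearance of an unseen
    vendor/center (a statistics-only proxy for learned style transfer)."""
    if rng is None:
        rng = np.random.default_rng()
    mu, sigma = image.mean(), image.std() + 1e-8
    target_mu = rng.uniform(0.3, 0.7)      # hypothetical target-domain mean
    target_sigma = rng.uniform(0.1, 0.3)   # hypothetical target-domain std
    return (image - mu) / sigma * target_sigma + target_mu
```

Training on many such randomized views widens the intensity distribution the segmentation model sees, which is the "domain augmentation" effect the abstract describes.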
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders
Recent years have witnessed the emergence of a promising self-supervised
learning strategy, referred to as masked autoencoding. However, there is a lack
of theoretical understanding of how masking matters on graph autoencoders
(GAEs). In this work, we present masked graph autoencoder (MaskGAE), a
self-supervised learning framework for graph-structured data. Different from
standard GAEs, MaskGAE adopts masked graph modeling (MGM) as a principled
pretext task - masking a portion of edges and attempting to reconstruct the
missing part with partially visible, unmasked graph structure. To understand
whether MGM can help GAEs learn better representations, we provide both
theoretical and empirical evidence to comprehensively justify the benefits of
this pretext task. Theoretically, we establish close connections between GAEs
and contrastive learning, showing that MGM significantly improves the
self-supervised learning scheme of GAEs. Empirically, we conduct extensive
experiments on a variety of graph benchmarks, demonstrating the superiority of
MaskGAE over several state-of-the-art methods on both link prediction and node
classification tasks.
Comment: KDD 2023 research track. Code available at
https://github.com/EdisonLeeeee/MaskGA
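The MGM pretext task boils down to a simple edge split: hide a portion of edges as reconstruction targets and feed only the remainder to the encoder. A minimal sketch of that split (function name ours, not from the paper's code):

```python
import random

def mask_edges(edges, mask_ratio=0.5, seed=None):
    """Split an edge list for masked graph modeling: `mask_ratio` of
    the edges become reconstruction targets, the rest stay visible to
    the encoder as the partially observed graph structure."""
    rng = random.Random(seed)
    shuffled = edges[:]          # leave the caller's list untouched
    rng.shuffle(shuffled)
    k = int(len(shuffled) * mask_ratio)
    masked, visible = shuffled[:k], shuffled[k:]
    return visible, masked
```

The decoder is then trained to predict the `masked` edges from node representations computed on `visible` ones, which is the pretext task analyzed in the paper.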
Cross-modality multi-atlas segmentation via deep registration and label fusion
Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation. Generally, MAS methods register multiple atlases, i.e., medical images with corresponding labels, to a target image; the transformed atlas labels can then be combined to generate the target segmentation via label fusion schemes. Many conventional MAS methods employ atlases from the same modality as the target image. However, the number of atlases with the same modality may be limited or even zero in many clinical applications. Besides, conventional MAS methods suffer from the computational burden of registration or label fusion procedures. In this work, we design a novel cross-modality MAS framework, which uses available atlases from one modality to segment a target image from another modality. To boost the computational efficiency of the framework, both the image registration and label fusion are achieved by well-designed deep neural networks. For the atlas-to-target image registration, we propose a bi-directional registration network (BiRegNet), which can efficiently align images from different modalities. For the label fusion, we design a similarity estimation network (SimNet), which estimates the fusion weight of each atlas by measuring its similarity to the target image. SimNet can learn multi-scale information for similarity estimation to improve the performance of label fusion. The proposed framework was evaluated on left ventricle and liver segmentation tasks using the MM-WHS and CHAOS datasets, respectively. Results have shown that the framework is effective for cross-modality MAS in both registration and label fusion. Code is available at https://github.com/NanYoMy/cmmas.
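The label-fusion step can be sketched as similarity-weighted voting over warped atlas labels; in this sketch the weights are passed in directly, standing in for SimNet's learned estimates.

```python
import numpy as np

def fuse_labels(atlas_labels, weights):
    """Weighted label fusion: each warped atlas votes for its label at
    every pixel, weighted by its similarity to the target; the fused
    label is the argmax over accumulated votes.
    `atlas_labels`: (n_atlases, H, W) integer maps; `weights`: (n_atlases,)."""
    labels = np.unique(atlas_labels)
    votes = np.zeros((len(labels),) + atlas_labels.shape[1:])
    for k, lab in enumerate(labels):
        # Accumulate each atlas's weight wherever it voted for `lab`.
        votes[k] = np.tensordot(weights, (atlas_labels == lab).astype(float), axes=1)
    return labels[np.argmax(votes, axis=0)]
```

With uniform weights this reduces to plain majority voting; SimNet's contribution is precisely making the weights target-dependent.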
Right ventricular segmentation from short- and long-axis MRIs via information transition
Right ventricular (RV) segmentation from magnetic resonance imaging (MRI) is a crucial step for cardiac morphology and function analysis. However, automatic RV segmentation from MRI is still challenging, mainly due to the heterogeneous intensity, the complex variable shapes, and the unclear RV boundary. Moreover, current methods for RV segmentation tend to suffer from performance degradation at the basal and apical slices of MRI. In this work, we propose an automatic RV segmentation framework, where the information from long-axis (LA) views is utilized to assist the segmentation of short-axis (SA) views via information transition. Specifically, we employ the transformed segmentation from LA views as prior information to extract the ROI from SA views for better segmentation. The information transition aims to remove the surrounding ambiguous regions in the SA views. We tested our model on a public dataset with 360 multi-center, multi-vendor and multi-disease subjects that consist of both LA and SA MRIs. Our experimental results show that including LA views can effectively improve the accuracy of the SA segmentation. Our model is publicly available at https://github.com/NanYoMy/MMs-2.
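The information-transition step, using the LA-derived prior to restrict the SA view to a region of interest, can be read as a bounding-box crop around the warped prior mask. A minimal numpy sketch under that reading (names and the margin parameter are ours):

```python
import numpy as np

def crop_roi(sa_slice, prior_mask, margin=2):
    """Crop a short-axis slice to the bounding box of a prior mask
    (e.g., the LA-view segmentation warped into SA space) plus a margin,
    discarding the ambiguous surroundings before SA segmentation."""
    ys, xs = np.nonzero(prior_mask)
    y0 = max(ys.min() - margin, 0)
    y1 = min(ys.max() + margin + 1, sa_slice.shape[0])
    x0 = max(xs.min() - margin, 0)
    x1 = min(xs.max() + margin + 1, sa_slice.shape[1])
    # Return the offset so the predicted mask can be pasted back.
    return sa_slice[y0:y1, x0:x1], (y0, x0)
```

The SA segmentation network then runs only inside this crop, which is how the surrounding ambiguous regions are removed.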
A building change detection framework with patch-pairing single-temporal supervised learning and metric guided attention mechanism
Building change detection (CD) aims to detect changes in buildings from bi-temporal pairwise images obtained at different times. Typically, a deep learning-based building CD algorithm requires bi-temporal samples with significant building changes for training. However, obtaining such bi-temporal samples is challenging because building changes have a low probability of occurrence. Fortunately, it is relatively simple to obtain single-temporal samples that include a substantial number of buildings. By using these single-temporal building samples, pseudo bi-temporal building change samples can be generated, which effectively addresses the problem of limited bi-temporal building change samples. In view of this, this study proposes a metric-guided single-temporal supervised learning framework that uses single-temporal building samples for building CD. In the proposed framework, patch-pairing single-temporal supervised learning (PPSL) adopts a patch-pairing method to construct pseudo bi-temporal building change samples, while equipping the network to effectively suppress the negative impact of geometric offset and radiation difference in real samples. To further suppress the impact of radiation difference and enhance the effectiveness of our framework, a metric-guided spatial attention module (MGSAM) is designed to minimize the intra-class feature differences between temporal samples and augment the spatial context modeling ability. The proposed method is verified by experiments on different datasets, and the results demonstrate that it outperforms existing methods and achieves superior performance.
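The single-temporal supervision idea can be illustrated by synthesizing a pseudo bi-temporal pair from one image: one copy keeps a building, the other erases it, and the erased footprint becomes the change label. PPSL pastes paired patches rather than filling with flat background; this sketch only shows the principle, and all names are ours.

```python
import numpy as np

def make_pseudo_pair(image, building_mask, background_value=0):
    """Build a pseudo bi-temporal change sample from one single-temporal
    image: time-1 keeps the buildings; time-2 erases them (a flat
    background here, a paired patch in PPSL). The change label is
    exactly the erased building footprint."""
    t1 = image.copy()
    t2 = image.copy()
    t2[building_mask > 0] = background_value
    change_label = (building_mask > 0).astype(np.uint8)
    return t1, t2, change_label
```

Training a CD network on many such synthesized pairs sidesteps the scarcity of real bi-temporal change samples noted in the abstract.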