A Survey on Deep Semi-supervised Learning
Deep semi-supervised learning is a fast-growing field with a range of
practical applications. This paper provides a comprehensive survey on both
fundamentals and recent advances in deep semi-supervised learning methods from
model design perspectives and unsupervised loss functions. We first present a
taxonomy for deep semi-supervised learning that categorizes existing methods,
including deep generative methods, consistency regularization methods,
graph-based methods, pseudo-labeling methods, and hybrid methods. Then we offer
a detailed comparison of these methods in terms of the type of losses,
contributions, and architecture differences. In addition to the past few years'
progress, we further discuss some shortcomings of existing methods and provide
some tentative heuristic solutions for these open problems.
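As a concrete illustration of one family in this taxonomy, the sketch below shows a consistency-regularization style loss that combines supervised cross-entropy with an agreement term on unlabeled data. The model, augmentation functions, and weighting factor are illustrative placeholders, not taken from the survey:

```python
import torch
import torch.nn.functional as F

def semi_supervised_loss(model, x_lab, y_lab, x_unlab, weak_aug, strong_aug, lam=1.0):
    """Supervised cross-entropy plus a consistency term on unlabeled data.

    `model`, `weak_aug`, `strong_aug`, and `lam` are placeholders; the survey
    covers many variants of this idea (Pi-model, FixMatch-style, etc.)."""
    # Standard supervised loss on the labeled batch.
    sup_loss = F.cross_entropy(model(x_lab), y_lab)

    # Consistency regularization: predictions on two augmented views of the
    # same unlabeled example should agree.
    with torch.no_grad():
        target = F.softmax(model(weak_aug(x_unlab)), dim=-1)
    pred = F.log_softmax(model(strong_aug(x_unlab)), dim=-1)
    cons_loss = F.kl_div(pred, target, reduction="batchmean")

    return sup_loss + lam * cons_loss
```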
ResNorm: Tackling Long-tailed Degree Distribution Issue in Graph Neural Networks via Normalization
Graph Neural Networks (GNNs) have attracted much attention due to their
ability in learning representations from graph-structured data. Despite the
successful applications of GNNs in many domains, the optimization of GNNs is
less well studied, and the performance on node classification heavily suffers
from the long-tailed node degree distribution. This paper focuses on improving
the performance of GNNs via normalization.
In detail, by studying the long-tailed distribution of node degrees in the
graph, we propose a novel normalization method for GNNs, termed ResNorm
(Reshaping the long-tailed distribution into a normal-like distribution via
normalization). ResNorm reshapes the node-wise standard deviation (NStd)
distribution so as to improve the accuracy of tail nodes (i.e., low-degree
nodes). We provide a theoretical interpretation and empirical evidence for the
mechanism behind this reshaping. In addition to the long-tailed distribution
issue, over-smoothing is another fundamental problem plaguing the community. To
this end, we analyze the behavior of the standard shift and prove that the
standard shift serves as a preconditioner on the weight matrix, increasing the
risk of over-smoothing. With the over-smoothing issue in mind, we design an
operation for ResNorm that simulates the degree-specific parameter strategy in
a low-cost manner. Extensive experiments validate the effectiveness of ResNorm
on several node classification benchmark datasets.
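A minimal sketch of the node-wise statistic ResNorm manipulates, assuming a simple power-based reshaping of the NStd distribution; the exponent and the exact formulation are illustrative rather than the authors':

```python
import torch

def nstd_reshape(h, p=0.5, eps=1e-6):
    """Reshape the node-wise standard deviation (NStd) distribution.

    h: node feature matrix of shape (num_nodes, dim).
    p: illustrative exponent controlling how strongly the long tail is
       compressed; p=1 leaves the NStd distribution unchanged.
    Simplified sketch of the idea in the abstract, not the exact ResNorm op."""
    mean = h.mean(dim=1, keepdim=True)
    std = h.std(dim=1, keepdim=True) + eps   # node-wise std (NStd)
    h_centered = (h - mean) / std            # standardize each node
    return h_centered * std.pow(p)           # re-inject a reshaped NStd
```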
Container CT scanner: a solution for modular emergency radiology department during the COVID-19 pandemic
During the coronavirus disease 2019 (COVID-19) pandemic, container computed tomography (CT) scanners were developed and used for the first time in China to perform CT examinations for patients with clinically mild to moderate COVID-19 who did not need to be hospitalized for comprehensive treatment but needed to be isolated in Fangcang shelter hospitals (also known as makeshift hospitals) to receive supportive treatment. The container CT is a multidetector CT scanner installed within a radiation-protected stand-alone container (a detachable lead-shielded room) deployed outside the makeshift hospital buildings. The container CT approach provided various medical institutions with a solution not only for rapid CT installation and high adaptability to site environments, but also for significantly minimizing the risk of cross-infection between radiological personnel and patients during CT examinations in the pandemic. In this article, we describe the typical setup of a container CT and how it worked for chest CT examinations in Wuhan, the epicenter of the COVID-19 outbreak.
An Early Study on Intelligent Analysis of Speech under COVID-19: Severity, Sleep Quality, Fatigue, and Anxiety
The COVID-19 outbreak was announced as a global pandemic by the World Health
Organisation in March 2020 and has affected a growing number of people in the
past few weeks. In this context, advanced artificial intelligence techniques
are brought to the fore to help fight against and reduce the impact of this
global health crisis. In this study, we focus on developing some potential
use-cases of intelligent speech analysis for COVID-19 diagnosed patients. In
particular, by analysing speech recordings from these patients, we construct
audio-only-based models to automatically categorise the health state of
patients from four aspects, including the severity of illness, sleep quality,
fatigue, and anxiety. For this purpose, two established acoustic feature sets
and support vector machines are utilised. Our experiments show that an average
accuracy of .69 is obtained when estimating the severity of illness, which is
derived from the number of days in hospitalisation. We hope that this study can
foster an extremely fast, low-cost, and convenient way to automatically detect
the COVID-19 disease.
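A minimal sketch of the kind of pipeline the abstract describes (utterance-level acoustic features fed to a support vector machine), using scikit-learn with randomly generated stand-in data; the actual feature sets, dimensions, and labels follow the paper, not this sketch:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

# X: utterance-level acoustic feature vectors (e.g., openSMILE-style
# functionals); y: discretised severity labels. Both are random stand-ins here.
X = np.random.randn(200, 88)
y = np.random.randint(0, 3, size=200)

# Standardise features, then fit a linear SVM with cross-validation.
clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=10000))
scores = cross_val_score(clf, X, y, cv=5)
print("mean accuracy:", scores.mean())
```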
Hierarchical Heterogeneous Graph Attention Network for Syntax-Aware Summarization
The task of summarization often requires a non-trivial understanding of the given text at the semantic level. In this work, we incorporate constituent structure into single-document summarization via Graph Neural Networks to learn the semantic meaning of tokens. More specifically, we propose a novel hierarchical heterogeneous graph attention network over constituency-based parse trees for syntax-aware summarization. This approach reflects psychological findings that humans pinpoint specific selection patterns to construct summaries hierarchically. Extensive experiments demonstrate that our model is effective for both abstractive and extractive summarization on five benchmark datasets from various domains. Moreover, further performance improvement can be obtained with state-of-the-art pre-trained models.
A Diffusion-Based Pre-training Framework for Crystal Property Prediction
Many significant problems involving crystal property prediction from 3D structures have limited labeled data due to expensive and time-consuming physical simulations or lab experiments. To overcome this challenge, we propose a pretrain-finetune framework for crystal property prediction, named CrysDiff, based on diffusion models. In the pre-training phase, CrysDiff learns the latent marginal distribution of crystal structures via a reconstruction task. Subsequently, CrysDiff can be fine-tuned under the guidance of new, sparsely labeled data, fitting the conditional distribution of the target property given the crystal structure. To better model crystal geometry, CrysDiff captures the full symmetry properties of crystals, including invariance to reflection, rotation, and periodic translation. Extensive experiments demonstrate that CrysDiff significantly improves performance on the downstream crystal property prediction task across multiple target properties, outperforming all SOTA pre-training models for crystals by clear margins on the popular JARVIS-DFT dataset.
Spectral Feature Augmentation for Graph Contrastive Learning and Beyond
Although augmentations (e.g., perturbation of graph edges, image crops) boost the efficiency of Contrastive Learning (CL), feature-level augmentation is another plausible, complementary, yet not well researched strategy. Thus, we present a novel spectral feature augmentation for contrastive learning on graphs (and images). To this end, for each data view, we estimate a low-rank approximation per feature map and subtract that approximation from the map to obtain its complement. This is achieved by the herein proposed incomplete power iteration, a non-standard power iteration regime which enjoys two valuable byproducts (with merely one or two iterations): (i) it partially balances the spectrum of the feature map, and (ii) it injects noise into the rebalanced singular values of the feature map (spectral augmentation). For two views, we align these rebalanced feature maps, as such an improved alignment step can focus more on less dominant singular values of the matrices of both views, whereas the spectral augmentation does not affect the spectral angle alignment (singular vectors are not perturbed). We derive the analytical form for: (i) the incomplete power iteration, to capture its spectrum-balancing effect, and (ii) the variance of singular values augmented implicitly by the noise. We also show that the spectral augmentation improves the generalization bound. Experiments on graph/image datasets show that our spectral feature augmentation outperforms baselines, is complementary to other augmentation strategies, and is compatible with various contrastive losses.
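A rough sketch of the incomplete power iteration idea, assuming the dominant spectral direction is estimated with only a handful of un-converged iterations and then subtracted; the paper's exact iteration scheme and noise injection may differ:

```python
import torch

def incomplete_power_iteration_augment(Z, n_iter=1):
    """Estimate a rough rank-1 approximation of the feature map Z with only
    `n_iter` power-iteration steps and subtract it, damping the dominant
    singular direction. Simplified illustration of the abstract's idea."""
    n, d = Z.shape
    v = torch.randn(d, 1)
    v = v / v.norm()
    for _ in range(n_iter):                 # deliberately NOT run to convergence
        u = Z @ v
        u = u / (u.norm() + 1e-8)
        v = Z.t() @ u
        v = v / (v.norm() + 1e-8)
    low_rank = (Z @ v) @ v.t()              # rough leading-direction component
    return Z - low_rank                     # complement with partially rebalanced spectrum
```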
COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning
Graph contrastive learning (GCL) improves graph representation learning,
leading to state-of-the-art results on various downstream tasks. Graph augmentation is a
vital but scarcely studied step of GCL. In this paper, we show that the node
embedding obtained via the graph augmentations is highly biased, somewhat
limiting contrastive models from learning discriminative features for
downstream tasks. Thus, instead of investigating graph augmentation in the
input space, we alternatively propose to perform augmentations on the hidden
features (feature augmentation). Inspired by so-called matrix sketching, we
propose COSTA, a novel COvariance-preServing feaTure space Augmentation
framework for GCL, which generates augmented features by maintaining a "good
sketch" of original features. To highlight the superiority of feature
augmentation with COSTA, we investigate a single-view setting (in addition to
multi-view one) which conserves memory and computations. We show that the
feature augmentation with COSTA achieves comparable/better results than graph
augmentation based models.
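A simplified sketch of covariance-preserving feature augmentation via matrix sketching, in the spirit of COSTA but not its released implementation; the Gaussian sketching matrix and the sketch dimension k are illustrative:

```python
import torch

def costa_style_feature_augmentation(H, k):
    """Covariance-preserving feature augmentation via matrix sketching.

    H: node embeddings of shape (n, d). Returns a (k, d) "good sketch" A @ H
    whose covariance (A @ H).T @ (A @ H) approximates H.T @ H in expectation,
    since E[A.T @ A] = I for the scaled Gaussian sketching matrix below."""
    n, d = H.shape
    A = torch.randn(k, n) / (k ** 0.5)   # Johnson-Lindenstrauss-style sketch
    return A @ H
```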
Graph Component Contrastive Learning for Concept Relatedness Estimation
Concept relatedness estimation (CRE) aims to determine whether two given concepts are related. Existing methods only consider the pairwise relationship between concepts, while overlooking the higher-order relationships that could be encoded in a concept-level graph structure. We discover that this underlying graph satisfies a set of intrinsic properties of CRE, including reflexivity, commutativity, and transitivity. In this paper, we formalize the CRE properties and introduce a graph structure named ConcreteGraph. To address the data scarcity issue in CRE, we introduce a novel data augmentation approach to sample new concept pairs from the graph. As it is intractable for data augmentation to fully capture the structural information of the ConcreteGraph due to the large number of potential concept pairs, we further introduce a novel Graph Component Contrastive Learning framework to implicitly learn the complete structure of the ConcreteGraph. Empirical results on three datasets show significant improvement over the state-of-the-art model. Detailed ablation studies demonstrate that our proposed approach can effectively capture the higher-order relationships among concepts.
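A simplified sketch of how new concept pairs could be sampled from such a graph using the transitivity property; the function and its networkx-based implementation are illustrative, not the paper's sampling strategy:

```python
import itertools
import networkx as nx

def sample_transitive_pairs(related_pairs):
    """Mine new positive concept pairs from labeled related pairs.

    Treat labeled related pairs as edges and, by transitivity (plus
    reflexivity/commutativity), regard every unordered pair inside a
    connected component as related. Illustrative sketch only."""
    g = nx.Graph()
    g.add_edges_from(related_pairs)
    augmented = set()
    for component in nx.connected_components(g):
        for a, b in itertools.combinations(sorted(component), 2):
            augmented.add((a, b))
    # Return only the newly mined pairs.
    return augmented - {tuple(sorted(p)) for p in related_pairs}

# Example: ("neural network", "deep learning") and ("deep learning", "AI")
# yield the new pair ("AI", "neural network").
```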