HongTu: Scalable Full-Graph GNN Training on Multiple GPUs (via communication-optimized CPU data offloading)
Full-graph training of graph neural networks (GNNs) has emerged as a promising
training method because of its effectiveness, but it requires extensive memory
and computation resources. To accelerate this training process, researchers
have proposed employing multi-GPU processing. However, the scalability of
existing frameworks is limited because they must keep the training data for
every layer in GPU memory. To efficiently train on large
graphs, we present HongTu, a scalable full-graph GNN training system running on
GPU-accelerated platforms. HongTu stores vertex data in CPU memory and offloads
training to GPUs. HongTu employs a memory-efficient full-graph training
framework that reduces runtime memory consumption by using partition-based
training and recomputation-caching-hybrid intermediate data management. To
address the issue of increased host-GPU communication caused by duplicated
neighbor access among partitions, HongTu employs a deduplicated communication
framework that converts the redundant host-GPU communication to efficient
inter/intra-GPU data access. Further, HongTu uses a cost model-guided graph
reorganization method to minimize communication overhead. Experimental results
on a 4XA100 GPU server show that HongTu effectively supports billion-scale
full-graph GNN training while reducing host-GPU data communication by 25%-71%.
Compared to the full-graph GNN system DistGNN running on 16 CPU nodes, HongTu
achieves speedups ranging from 7.8X to 20.2X. For small graphs where the
training data fits into the GPUs, HongTu achieves performance comparable to
existing GPU-based GNN systems.
Comment: 28 pages, 11 figures, SIGMOD202
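The offloading idea can be illustrated with a minimal PyTorch sketch (all names hypothetical, not HongTu's actual implementation): vertex data stays in host memory, each partition is copied to the GPU for its forward/backward pass, and intermediate activations are either cached back on the host or dropped for later recomputation. A real system would also perform neighbor aggregation and deduplicated host-GPU transfers, which are omitted here.

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
num_vertices, feat_dim, hidden_dim = 100_000, 64, 32
num_partitions = 4
cache_budget = 2  # partitions whose intermediate activations are kept on the host

# Vertex features live in CPU memory (pinned for faster host-GPU copies).
features = torch.randn(num_vertices, feat_dim)
if device == "cuda":
    features = features.pin_memory()
weight = torch.randn(feat_dim, hidden_dim, device=device, requires_grad=True)

partitions = torch.arange(num_vertices).chunk(num_partitions)
host_activation_cache = {}

for step in range(3):
    for pid, vids in enumerate(partitions):
        # Offload only this partition's vertex data to the GPU.
        x = features[vids].to(device, non_blocking=True)
        h = torch.relu(x @ weight)   # stand-in for a GNN layer (no aggregation here)

        if pid < cache_budget:
            # "Caching" half of the hybrid: keep activations in host memory.
            host_activation_cache[pid] = h.detach().to("cpu")
        # Otherwise activations are discarded and would be recomputed when
        # needed again ("recomputation" half of the hybrid).

        loss = h.sum()
        loss.backward()              # gradients accumulate in weight.grad

    with torch.no_grad():            # simple SGD update after a full sweep
        weight -= 1e-4 * weight.grad
        weight.grad.zero_()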
NeutronOrch: Rethinking Sample-based GNN Training under CPU-GPU Heterogeneous Environments
Graph Neural Networks (GNNs) have demonstrated outstanding performance in
various applications. Existing frameworks utilize CPU-GPU heterogeneous
environments to train GNN models and integrate mini-batch and sampling
techniques to overcome the GPU memory limitation. In CPU-GPU heterogeneous
environments, we can divide sample-based GNN training into three steps: sample,
gather, and train. Existing GNN systems use different task orchestrating
methods to execute each step on the CPU or the GPU. After extensive experiments and
analysis, we find that existing task orchestrating methods fail to fully
utilize the heterogeneous resources, limited by inefficient CPU processing or
GPU resource contention. In this paper, we propose NeutronOrch, a system for
sample-based GNN training that incorporates a layer-based task orchestrating
method and ensures balanced utilization of the CPU and GPU. NeutronOrch
decouples the training process by layer and pushes down the training task of
the bottom layer to the CPU. This significantly reduces the computational load
and memory footprint of GPU training. To avoid inefficient CPU processing,
NeutronOrch only offloads the training of frequently accessed vertices to the
CPU and lets the GPU reuse their embeddings with bounded staleness. Furthermore,
NeutronOrch provides a fine-grained pipeline design for the layer-based task
orchestrating method, fully overlapping different tasks on heterogeneous
resources while strictly guaranteeing bounded staleness. The experimental
results show that, compared with state-of-the-art GNN systems, NeutronOrch
achieves a speedup of up to 11.51X.
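A rough Python sketch of the layer-based orchestration idea (hypothetical names, simplified to a single linear "layer" with no sampling or neighbor aggregation): bottom-layer embeddings of frequently accessed vertices are computed on the CPU and reused on the GPU as long as they stay within a staleness bound.

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
num_vertices, feat_dim, hidden_dim = 10_000, 64, 32
staleness_bound = 3                      # max version gap before recomputing

features = torch.randn(num_vertices, feat_dim)          # stays on the CPU
w_bottom = torch.randn(feat_dim, hidden_dim)            # bottom layer, run on CPU
w_top = torch.randn(hidden_dim, hidden_dim, device=device, requires_grad=True)

access_freq = torch.randint(0, 100, (num_vertices,))
hot = set(torch.topk(access_freq, k=num_vertices // 10).indices.tolist())

hot_cache = {}                           # vertex id -> (embedding, version)
version = 0

def bottom_layer_cpu(vid):
    """CPU-side bottom-layer computation (simplified: no neighbor aggregation)."""
    return torch.relu(features[vid] @ w_bottom)

for step in range(10):
    batch = torch.randint(0, num_vertices, (256,)).tolist()
    emb = torch.empty(len(batch), hidden_dim)
    for i, v in enumerate(batch):
        cached = hot_cache.get(v)
        if cached is not None and version - cached[1] <= staleness_bound:
            emb[i] = cached[0]           # reuse a bounded-stale CPU embedding
        else:
            emb[i] = bottom_layer_cpu(v)
            if v in hot:                 # only hot vertices are kept in the cache
                hot_cache[v] = (emb[i].clone(), version)

    out = torch.relu(emb.to(device) @ w_top)   # upper layer(s) run on the GPU
    out.sum().backward()
    with torch.no_grad():
        w_top -= 1e-4 * w_top.grad
        w_top.grad.zero_()
    version += 1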
NeutronStream: A Dynamic GNN Training Framework with Sliding Window for Graph Streams
Existing Graph Neural Network (GNN) training frameworks have been designed to
help developers easily create performant GNN implementations. However, most
existing GNN frameworks assume that the input graphs are static, ignoring that
most real-world graphs are constantly evolving. Though many dynamic GNN
models have emerged to learn from evolving graphs, the training process of
these dynamic GNNs differs dramatically from that of traditional GNNs in that it
captures both the spatial and temporal dependencies of graph updates. This
poses new challenges for designing dynamic GNN training frameworks. First, the
traditional batched training method fails to capture real-time structural
evolution information. Second, the time-dependent nature makes parallel
training hard to design. Third, existing frameworks lack system support for
users to efficiently implement dynamic GNNs. In this paper, we present
NeutronStream, a
framework for training dynamic GNN models. NeutronStream abstracts the input
dynamic graph into a chronologically updated stream of events and processes the
stream with an optimized sliding window to incrementally capture the
spatial-temporal dependencies of events. Furthermore, NeutronStream provides a
parallel execution engine to tackle the sequential event processing challenge
to achieve high performance. NeutronStream also integrates a built-in graph
storage structure that supports dynamic updates and provides a set of
easy-to-use APIs that allow users to express their dynamic GNNs. Our
experimental results demonstrate that, compared to state-of-the-art dynamic GNN
implementations, NeutronStream achieves speedups ranging from 1.48X to 5.87X
and an average accuracy improvement of 3.97%.
Comment: 12 pages, 15 figures
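The event-stream abstraction can be sketched as follows (hypothetical code, not NeutronStream's API): graph updates arrive as timestamped events, and an overlapping sliding window groups them so each incremental training step sees recent spatial-temporal context.

from collections import deque
from dataclasses import dataclass
import random

@dataclass
class Event:
    timestamp: float
    src: int
    dst: int          # e.g. an edge insertion between src and dst

def sliding_window_stream(events, window_size=64, slide=16):
    """Yield overlapping windows of events in timestamp order."""
    events = sorted(events, key=lambda e: e.timestamp)
    window = deque()
    for ev in events:
        window.append(ev)
        if len(window) == window_size:
            yield list(window)
            for _ in range(slide):      # slide forward, keeping the overlap
                window.popleft()

def train_on_window(window):
    # Placeholder for the incremental model update; a real dynamic GNN would
    # update the states of the vertices touched by these events.
    touched = {v for ev in window for v in (ev.src, ev.dst)}
    print(f"window [{window[0].timestamp:.1f}, {window[-1].timestamp:.1f}] "
          f"updates {len(touched)} vertices")

if __name__ == "__main__":
    stream = [Event(t + random.random(), random.randrange(100), random.randrange(100))
              for t in range(300)]
    for w in sliding_window_stream(stream, window_size=64, slide=16):
        train_on_window(w)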
Structure evolution and performance of poly (vinyl alcohol) fibers with controllable cross-section fabricated using a combination of melt-spinning and stretching
Profiled fibers have gained potential applications in both functional textiles and cement reinforcement. Commercial poly(vinyl alcohol) (PVA) fibers are generally produced by solution spinning, which makes it difficult to prepare profiled PVA fibers with controllable cross-sections because of bidirectional mass transfer in the coagulation bath. In this work, based on intermolecular complexation and plasticization, water was adopted to improve the thermal processability of PVA, and special-shaped PVA fibers with triangular, trilobal, cruciform, and wavy cross-sections were prepared by melt-spinning. The crystallinity of the profiled as-spun PVA fibers was lower than 32%, and their elongation at break was higher than 400%, giving good stretchability for hot-drawing. During hot-drawing of the profiled fibers, the tendency of the shaped cross-sections to become circular was controlled, and the reduction in profile degree was lower than 9.4%. Hot-drawing improved the crystalline structure of PVA and increased the crystallinity, orientation, tensile strength, and melting point of the profiled fibers. At a draw ratio of 9, the tensile strengths of the triangular, cruciform, trilobal, and wavy fibers were 585 MPa, 559 MPa, 481 MPa, and 585 MPa, respectively. This work proposes a new processing technology for special-shaped PVA fibers and enriches the variety of special-shaped fibers.
Lipid Metabolism-Related Gene Signature Predicts Prognosis and Indicates Immune Microenvironment Infiltration in Advanced Gastric Cancer
Objective. Abnormal lipid metabolism is known to influence the malignant behavior of gastric cancer, but the underlying mechanism remains elusive. In this study, we comprehensively analyzed the biological significance of genes involved in lipid metabolism in advanced gastric cancer (AGC).
Methods. We obtained gene expression profiles of early and advanced gastric cancer samples from The Cancer Genome Atlas (TCGA) database and performed differential expression analysis to identify lipid metabolism-related genes specific to AGC. We then used consensus clustering to classify AGC patients into lipid metabolism-based molecular subtypes and constructed a prognostic model using least absolute shrinkage and selection operator (LASSO)-Cox regression analysis and Gene Set Enrichment Analysis (GSEA). We evaluated the discriminative ability and clinical significance of the model using Kaplan-Meier (KM) curves, ROC curves, DCA curves, and a nomogram. We also estimated immune levels based on immune microenvironment expression, immune checkpoints, and immune cell infiltration, and obtained hub genes by weighted gene co-expression network analysis (WGCNA) of the differentially expressed genes between the two molecular subtypes.
Results. We identified 6 lipid metabolism genes associated with the prognosis of AGC and used consensus clustering to classify AGC patients into two subgroups with significantly different overall survival and immune microenvironments. Our risk model successfully classified patients in the training and validation sets into high-risk and low-risk groups. A high risk score predicted poor prognosis and indicated a low degree of immune infiltration. Subgroup analysis showed that the risk model was an independent predictor of prognosis in AGC. Furthermore, our results indicated that most chemotherapeutic agents are more effective for AGC patients in the low-risk group than in the high-risk group, and risk scores for AGC are strongly correlated with drug sensitivity. Finally, we performed qRT-PCR experiments to verify the relevant results.
Conclusion. Our findings suggest that lipid metabolism-related genes play an important role in predicting the prognosis of AGC and regulating immune infiltration. These results have important implications for the development of targeted therapies for AGC patients.
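For readers unfamiliar with this kind of pipeline, a purely illustrative sketch of the risk-score construction follows, using synthetic data, placeholder gene names, and the lifelines library; it is not the authors' code and omits the TCGA preprocessing, consensus clustering, GSEA, and WGCNA steps.

import numpy as np
import pandas as pd
from lifelines import CoxPHFitter, KaplanMeierFitter

rng = np.random.default_rng(0)
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"]  # placeholders
n = 200

# Synthetic expression matrix plus survival time (months) and event indicator.
df = pd.DataFrame(rng.normal(size=(n, len(genes))), columns=genes)
df["time"] = rng.exponential(scale=36, size=n)
df["event"] = rng.integers(0, 2, size=n)        # 1 = death observed

# LASSO-like penalized Cox regression (l1_ratio=1.0 -> pure L1 penalty).
cph = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)
cph.fit(df, duration_col="time", event_col="event")

# Risk score = linear predictor over the selected genes; split at the median.
risk = (df[genes] * cph.params_[genes]).sum(axis=1)
high_risk = risk > risk.median()

# Kaplan-Meier estimate per risk group.
for label, mask in [("high risk", high_risk), ("low risk", ~high_risk)]:
    km = KaplanMeierFitter()
    km.fit(df.loc[mask, "time"], event_observed=df.loc[mask, "event"], label=label)
    print(label, "median survival:", km.median_survival_time_)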