43 research outputs found

    Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses

    Graph Neural Networks (GNNs) are emerging as a powerful tool for learning from graph-structured data and performing sophisticated inference tasks in various application domains. Although GNNs have been shown to be effective on modest-sized graphs, training them on large-scale graphs remains a significant challenge due to the lack of efficient data access and data movement methods. Existing frameworks for training GNNs use CPUs for graph sampling and feature aggregation, while the training and updating of model weights are executed on GPUs. However, our in-depth profiling shows the CPUs cannot achieve the throughput required to saturate GNN model training throughput, causing gross under-utilization of expensive GPU resources. Furthermore, when the graph and its embeddings do not fit in the CPU memory, the overhead introduced by the operating system, say for handling page-faults, comes in the critical path of execution. To address these issues, we propose the GPU Initiated Direct Storage Access (GIDS) dataloader, to enable GPU-oriented GNN training for large-scale graphs while efficiently utilizing all hardware resources, such as CPU memory, storage, and GPU memory with a hybrid data placement strategy. By enabling GPU threads to fetch feature vectors directly from storage, the GIDS dataloader solves the memory capacity problem for GPU-oriented GNN training. Moreover, the GIDS dataloader leverages GPU parallelism to tolerate storage latency and eliminates expensive page-fault overhead. Doing so enables us to design novel optimizations for exploiting locality and increasing effective bandwidth for GNN training. Our evaluation using a single GPU on terabyte-scale GNN datasets shows that the GIDS dataloader accelerates the overall DGL GNN training pipeline by up to 392X when compared to the current, state-of-the-art DGL dataloader. Comment: Under submission. Source code: https://github.com/jeongminpark417/GID

    Venous Thromboembolism in Cancer: An Update of Treatment and Prevention in the Era of Newer Anticoagulants

    Cancer patients are at major risk of developing Venous Thromboembolism (VTE), resulting in increased morbidity and economic burden. While a number of theories try to explain its pathophysiology, its risk factors can be broadly divided into cancer-related, treatment-related, and patient-related factors. Studies report the prophylactic use of thrombolytic agents to be safe and effective in decreasing VTE-related mortality and morbidity, especially in postoperative cancer patients. Recent data also suggest the prophylactic use of low molecular weight heparins (LMWHs) and Warfarin to be effective in reducing VTEs related to long-term Central Venous Catheter (CVC) use. In a double-blind, multicenter trial, a new ultra-LMWH, Semuloparin, has been shown to be efficacious in preventing chemotherapy-associated VTEs, along with other drugs such as Certoparin and Nadroparin. LMWHs are reported to be very useful in preventing recurrent VTEs in advanced cancers and should be preferred over full-dose Warfarin. However, their long-term safety beyond 6 months has not yet been established. Further, this manuscript discusses the safety and efficacy of different drugs used in the treatment and prevention of recurrent VTEs, including Bemiparin, Semuloparin, oral direct thrombin inhibitors, and parenteral and direct oral factor Xa inhibitors.

    Potentially Guided Bidirectionalized RRT* for Fast Optimal Path Planning in Cluttered Environments

    Rapidly-exploring Random Tree star (RRT*) has recently gained immense popularity in the motion planning community as it provides a probabilistically complete and asymptotically optimal solution without requiring complete information about the obstacle space. Despite these advantages, RRT* converges to the optimal solution very slowly. To improve the convergence rate, bidirectional variants were introduced: the Bi-directional RRT* (B-RRT*) and the Intelligent Bi-directional RRT* (IB-RRT*). However, as both variants perform pure exploration, they tend to suffer in highly cluttered environments. To overcome these limitations, we introduce a new concept of potentially guided bidirectional trees in our proposed Potentially Guided Intelligent Bi-directional RRT* (PIB-RRT*) and Potentially Guided Bi-directional RRT* (PB-RRT*). The proposed algorithms greatly improve the convergence rate and use memory more efficiently. The proposed algorithms are evaluated theoretically and experimentally and compared with the latest state-of-the-art motion planning algorithms under different challenging environmental conditions, showing a remarkable improvement in efficiency and convergence rate.

    CODAG: Characterizing and Optimizing Decompression Algorithms for GPUs

    Data compression and decompression have become vital components of big-data applications to manage the exponential growth in the amount of data collected and stored. Furthermore, big-data applications have increasingly adopted GPUs due to their high compute throughput and memory bandwidth. Prior works presume that decompression is memory-bound and have dedicated most of the GPU's threads to data movement and adopted complex software techniques to hide memory latency for reading compressed data and writing uncompressed data. This paper shows that these techniques lead to poor GPU resource utilization as most threads end up waiting for the few decoding threads, exposing compute and synchronization latencies. Based on this observation, we propose CODAG, a novel and simple kernel architecture for high-throughput decompression on GPUs. CODAG eliminates the use of specialized groups of threads, frees up compute resources to increase the number of parallel decompression streams, and leverages the ample compute activities and the GPU's hardware scheduler to tolerate synchronization, compute, and memory latencies. Furthermore, CODAG provides a framework for users to easily incorporate new decompression algorithms without being burdened with implementing complex optimizations to hide memory latency. We validate our proposed architecture with three different encoding techniques, RLE v1, RLE v2, and Deflate, and a wide range of large datasets from different domains. We show that CODAG provides 13.46x, 5.69x, and 1.18x speedups for RLE v1, RLE v2, and Deflate, respectively, when compared to the state-of-the-art decompressors from NVIDIA RAPIDS.
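    For readers unfamiliar with run-length encoding, the following is a minimal sketch of the general idea only. The abstract does not define the RLE v1 or RLE v2 formats, so the (count, value) pair layout and the function names `rle_encode` and `rle_decode` are assumptions made for illustration. This is plain sequential Python, not the CODAG kernel design.

```python
from typing import List, Sequence, Tuple


def rle_encode(values: Sequence[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeats into (count, value) pairs (assumed layout)."""
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and runs[-1][1] == v:
            # Same value as the current run: extend its count.
            runs[-1] = (runs[-1][0] + 1, v)
        else:
            # New value: start a new run of length 1.
            runs.append((1, v))
    return runs


def rle_decode(runs: Sequence[Tuple[int, int]]) -> List[int]:
    """Expand (count, value) pairs back into the original sequence."""
    out: List[int] = []
    for count, value in runs:
        if count < 0:
            raise ValueError("run length must be non-negative")
        out.extend([value] * count)
    return out


if __name__ == "__main__":
    data = [7, 7, 7, 9, 4, 4]
    encoded = rle_encode(data)
    print(encoded)
    print(rle_decode(encoded) == data)
```

    Each run depends on the one before it only through the output position, which hints at why decoding is compute- and synchronization-sensitive rather than purely a data-movement problem, the point the abstract argues.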

    ADHD presenting as recurrent epistaxis: a case report

    Epistaxis is an important otorhinolaryngological emergency, which usually has an apparent etiology, frequently local trauma in children. Here we present a case report in which the epistaxis was recalcitrant and proved to have an underlying psychiatric disorder as its basis. The child was diagnosed with Attention Deficit/Hyperactivity Disorder, hyperactive type, which led to trauma to the nasal mucosa due to frequent and uncontrolled nose picking. Treatment with atomoxetine controlled the patient's symptoms and led to a remission of epistaxis.

    The global burden of trichiasis in 2016.

    BACKGROUND: Trichiasis is present when one or more eyelashes touch the eye. Uncorrected, it can cause blindness. Accurate estimates of numbers affected, and their geographical distribution, help guide resource allocation. METHODS: We obtained district-level trichiasis prevalence estimates in adults for 44 endemic and previously-endemic countries. We used (1) the most recent data for a district, if more than one estimate was available; (2) age- and sex-standardized corrections of historic estimates, where raw data were available; (3) historic estimates adjusted using a mean adjustment factor for districts where raw data were unavailable; and (4) expert assessment of available data for districts for which no prevalence estimates were available. FINDINGS: Internally age- and sex-standardized data represented 1,355 districts and contributed 662 thousand cases (95% confidence interval [CI] 324 thousand-1.1 million) to the global total. Age- and sex-standardized district-level prevalence estimates differed from raw estimates by a mean factor of 0.45 (range 0.03-2.28). Previously non-stratified estimates for 398 districts, adjusted by ×0.45, contributed a further 411 thousand cases (95% CI 283-557 thousand). Eight countries retained previous estimates, contributing 848 thousand cases (95% CI 225 thousand-1.7 million). New expert assessments in 14 countries contributed 862 thousand cases (95% CI 228 thousand-1.7 million). The global trichiasis burden in 2016 was 2.8 million cases (95% CI 1.1-5.2 million). INTERPRETATION: The 2016 estimate is lower than previous estimates, probably due to more and better data; scale-up of trichiasis management services; and reductions in incidence due to lower active trachoma prevalence.
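    As a quick arithmetic check, the four point estimates reported in FINDINGS can be summed to see how they relate to the stated global total. The sketch below uses only the figures quoted in the abstract; the dictionary labels, the helper `adjust_historic`, and the example raw count are hypothetical and added for illustration. It does not attempt to reproduce the confidence intervals, which cannot be obtained by simple addition.

```python
# Point estimates in thousands of cases, copied from the abstract's FINDINGS.
contributions_thousand = {
    "age- and sex-standardized data (1,355 districts)": 662,
    "historic estimates adjusted by 0.45 (398 districts)": 411,
    "retained previous estimates (8 countries)": 848,
    "new expert assessments (14 countries)": 862,
}

MEAN_ADJUSTMENT_FACTOR = 0.45  # mean ratio of standardized to raw estimates, per the abstract


def adjust_historic(raw_cases: float, factor: float = MEAN_ADJUSTMENT_FACTOR) -> float:
    """Scale a raw (non-stratified) estimate by the mean adjustment factor."""
    return raw_cases * factor


total_thousand = sum(contributions_thousand.values())
total_million = round(total_thousand / 1000, 1)

if __name__ == "__main__":
    print(f"Sum of contributions: {total_thousand} thousand (about {total_million} million)")
    # Hypothetical raw district estimate, used only to show the adjustment step.
    print(f"Adjusted example: {adjust_historic(1000):.0f} cases from a raw estimate of 1000")
```

    The sum, 2,783 thousand, rounds to the 2.8 million cases reported as the 2016 global burden.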