Performance of Graph Neural Networks for Point Cloud Applications
Graph Neural Networks (GNNs) have gained significant momentum recently due to
their capability to learn on unstructured graph data. Dynamic GNNs (DGNNs) are
the current state-of-the-art for point cloud applications; such applications
(viz. autonomous driving) require real-time processing at the edge with tight
latency and memory constraints. Conducting performance analysis on such DGNNs,
thus, becomes a crucial task to evaluate network suitability.
This paper presents a profiling analysis of EdgeConv-based DGNNs applied to
point cloud inputs. We assess their inference performance in terms of
end-to-end latency and memory consumption on state-of-the-art CPU and GPU
platforms. The EdgeConv layer has two stages: (1) dynamic graph generation
using k-Nearest Neighbors (kNN) and (2) node feature update. The addition of
dynamic graph generation via kNN in each (EdgeConv) layer enhances network
performance compared to networks that work with the same static graph in each
layer; such performance enhancement comes, however, at the added computational
cost associated with the dynamic graph generation stage (via kNN algorithm).
Understanding its costs is essential for identifying the performance bottleneck
and exploring potential avenues for hardware acceleration. To this end, this
paper aims to shed light on the performance characteristics of EdgeConv-based
DGNNs for point cloud inputs. Our performance analysis on a state-of-the-art
EdgeConv network for classification shows that the dynamic graph construction
via kNN accounts for upwards of 95% of network latency on the GPU and almost 90% on
the CPU. Moreover, we propose a quasi-Dynamic Graph Neural Network (qDGNN) that
halts dynamic graph updates after a specific depth within the network to
significantly reduce the latency on both CPU and GPU whilst matching the
original network's inference accuracy.
Comment: 27th Annual IEEE High Performance Extreme Computing Conference
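The kNN-based dynamic graph construction that dominates the latency can be sketched with a minimal brute-force version (the actual networks use optimized GPU kernels, and the point data here is illustrative):

```python
import numpy as np

def knn_graph(points, k):
    """Return, for each point, the indices of its k nearest neighbors
    (excluding itself) under Euclidean distance. points: (n, d) array."""
    # Pairwise squared distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (points ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(d2, np.inf)          # exclude self-edges
    return np.argsort(d2, axis=1)[:, :k]  # (n, k) neighbor indices

# Two well-separated pairs of points: each point's nearest neighbor is its partner.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(knn_graph(pts, 1).ravel().tolist())  # [1, 0, 3, 2]
```

Recomputing this graph in every EdgeConv layer, on features rather than raw coordinates, is what the abstract identifies as the dominant cost.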
DBRS: Directed Acyclic Graph based Reliable Scheduling Approach in Large Scale Computing
In large-scale environments, scheduling presents a significant challenge because it is an NP-hard problem. Tasks in execution fall into two basic types: dependent tasks and independent tasks. Dependent tasks must execute in a strict order because the output of one activity is typically the input of another. In this paper, a reliable fault-tolerant approach is proposed for scheduling dependent tasks in large-scale computing environments. The workflow of dependent tasks is represented with the help of a DAG (directed acyclic graph). The proposed methodology is evaluated over various parameters by applying it in a large-scale computing environment: grid computing, a high-performance computing paradigm for solving complex, large, and data-intensive problems in various fields. The result analysis shows that the proposed DAG-based reliable scheduling (DBRS) approach increases system performance by decreasing the makespan and the number of failures and increasing the performance improvement ratio (PIR).
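The strict ordering that a task DAG imposes is conventionally resolved by a topological sort before any scheduling policy is applied; a minimal sketch using Kahn's algorithm (the task names and dependencies are illustrative, not the DBRS approach itself):

```python
from collections import deque

def topo_schedule(tasks, deps):
    """tasks: iterable of task ids; deps: list of (u, v) meaning u must
    finish before v starts. Returns one valid execution order."""
    indeg = {t: 0 for t in tasks}
    succ = {t: [] for t in tasks}
    for u, v in deps:
        succ[u].append(v)
        indeg[v] += 1
    ready = deque(t for t in tasks if indeg[t] == 0)  # tasks with no pending deps
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for v in succ[t]:
            indeg[v] -= 1
            if indeg[v] == 0:   # all predecessors of v have finished
                ready.append(v)
    if len(order) != len(indeg):
        raise ValueError("dependency cycle: not a DAG")
    return order

# Diamond-shaped workflow: A feeds B and C, both feed D.
print(topo_schedule(["A", "B", "C", "D"],
                    [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]))
# ['A', 'B', 'C', 'D']
```

A fault-tolerant scheduler like the one proposed would layer failure handling and resource selection on top of an ordering of this kind.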
GraPE: fast and scalable Graph Processing and Embedding
Graph Representation Learning methods have enabled a wide range of learning
problems to be addressed for data that can be represented in graph form.
Nevertheless, several real world problems in economy, biology, medicine and
other fields raised relevant scaling problems with existing methods and their
software implementation, due to the size of real world graphs characterized by
millions of nodes and billions of edges. We present GraPE, a software resource
for graph processing and random walk based embedding, that can scale with large
and high-degree graphs and significantly speed up computation. GraPE comprises
specialized data structures, algorithms, and a fast parallel implementation
that displays several orders of magnitude improvement in empirical space and
time complexity compared to state of the art software resources, with a
corresponding boost in the performance of machine learning methods for edge and
node label prediction and for the unsupervised analysis of graphs. GraPE is
designed to run on laptop and desktop computers, as well as on high-performance
computing clusters.
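Random-walk-based embedding of the kind GraPE accelerates starts from a corpus of walks sampled over the graph; a minimal uniform-walk sketch (the adjacency list and walk length are illustrative, and GraPE's own sampling is far more optimized than this):

```python
import random

def random_walk(adj, start, length, rng):
    """One uniform random walk of at most `length` nodes over an adjacency list."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = adj[walk[-1]]
        if not nbrs:          # dead end: stop the walk early
            break
        walk.append(rng.choice(nbrs))
    return walk

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: []}  # node 3 is isolated
rng = random.Random(42)
walks = [random_walk(adj, n, 5, rng) for n in adj]
print(walks)
```

The resulting walks serve as "sentences" for a skip-gram-style model; the scaling problems the abstract mentions arise because billions of such walk steps must be sampled from adjacency structures too large for naive representations.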
Distributed Computing Architecture for Image-Based Wavefront Sensing and 2-D FFTs
Image-based wavefront sensing (WFS) provides significant advantages over interferometric wavefront sensors, such as optical design simplicity and stability. However, the image-based approach is computationally intensive, and therefore specialized high-performance computing architectures are required in applications utilizing it. The development and testing of these high-performance computing architectures are essential to such missions as the James Webb Space Telescope (JWST), Terrestrial Planet Finder-Coronagraph (TPF-C and CorSpec), and Spherical Primary Optical Telescope (SPOT). These specialized computing architectures require numerous two-dimensional Fourier transforms, which necessitate an all-to-all communication when applied on a distributed computational architecture. Several solutions for distributed computing are presented, with an emphasis on a 64-node cluster of DSPs, multiple DSP FPGAs, and an application of low-diameter graph theory. Timing results and performance analysis are presented. The solutions offered could be applied to other all-to-all communication problems and computationally complex scientific problems.
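The all-to-all pattern arises from the standard row-column decomposition of the 2-D FFT: each node computes 1-D FFTs over its local rows, and the transpose between the two passes requires every node to exchange data with every other. A single-process sketch of the decomposition (NumPy stands in for the distributed kernels):

```python
import numpy as np

def fft2_row_column(x):
    """2-D FFT via two 1-D passes separated by a transpose."""
    # Pass 1: 1-D FFTs along rows (each node would work on its own row block).
    y = np.fft.fft(x, axis=1)
    # The transpose is the all-to-all step: columns become locally contiguous rows.
    y = y.T
    # Pass 2: 1-D FFTs along the former columns.
    y = np.fft.fft(y, axis=1)
    return y.T  # transpose back to the original layout

x = np.random.default_rng(0).standard_normal((8, 8))
print(np.allclose(fft2_row_column(x), np.fft.fft2(x)))  # True
```

On a cluster, the interconnect cost of that transpose dominates, which is why the abstract emphasizes low-diameter network topologies.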
Implementing and evaluating graph algorithms for long vector architectures
High-performance computing can be accelerated using long-vector architectures. However, creating efficient implementations for these architectures can be challenging. This Master's thesis focuses on implementing four well-known and widely used graph processing algorithms using the RISC-V Vector Extension, leveraging an experimental system on an FPGA. I present a graph storage format that benefits from long vectors and describe how these four algorithms can be rewritten to utilize it. This thesis also introduces an instrumentation tool for FPGAs that I developed to link the output of electrical engineering software with performance analysis tools for HPC. This tool allows users to visualize information coming from the logic analyzer internal to the FPGA with powerful visualization tools, permitting fine-grained analysis of the FPGA signals correlated with the code running on it. It has been integrated into the experimental performance analysis tools of BSC. In this thesis, I leverage the tool to analyze and improve my implementations of graph algorithms for long-vector architectures, documenting the process and reasoning behind each optimization. Finally, I compare the performance of my vector implementations with other machines, such as the NEC SX-Aurora, a commercial RISC-V board, and an Intel chip.
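The thesis's own storage format is not reproduced here, but compressed sparse row (CSR) adjacency is the usual starting point for vector-friendly graph layouts, since each vertex's neighbor list becomes one contiguous, vector-loadable slice; a minimal sketch with an illustrative edge list:

```python
import numpy as np

def to_csr(n, edges):
    """Build CSR arrays (row_ptr, col_idx) from a directed edge list.
    Neighbors of vertex u live in col_idx[row_ptr[u]:row_ptr[u + 1]]."""
    deg = np.zeros(n, dtype=np.int64)
    for u, _ in edges:
        deg[u] += 1
    row_ptr = np.concatenate(([0], np.cumsum(deg)))
    col_idx = np.empty(len(edges), dtype=np.int64)
    fill = row_ptr[:-1].copy()  # next free slot per row
    for u, v in edges:
        col_idx[fill[u]] = v
        fill[u] += 1
    return row_ptr, col_idx

row_ptr, col_idx = to_csr(4, [(0, 1), (0, 2), (1, 3), (2, 3)])
print(row_ptr.tolist(), col_idx.tolist())  # [0, 2, 3, 4, 4] [1, 2, 3, 3]
```

With long vectors, a single strip-mined loop can then process an entire neighbor slice per vector instruction instead of iterating edge by edge.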
KADABRA is an ADaptive Algorithm for Betweenness via Random Approximation
We present KADABRA, a new algorithm to approximate betweenness centrality in
directed and undirected graphs, which significantly outperforms all previous
approaches on real-world complex networks. The efficiency of the new algorithm
relies on two new theoretical contributions, of independent interest. The first
contribution focuses on sampling shortest paths, a subroutine used by most
algorithms that approximate betweenness centrality. We show that, on realistic
random graph models, we can perform this task in time
with high probability, obtaining a significant speedup with respect to the
worst-case performance. We experimentally show that this new
technique achieves similar speedups on real-world complex networks, as well.
The second contribution is a new rigorous application of the adaptive sampling
technique. This approach decreases the total number of shortest paths that need
to be sampled to compute all betweenness centralities with a given absolute
error, and it also handles more general problems, such as computing the
most central nodes. Furthermore, our analysis is general, and it might be
extended to other settings.
Comment: Some typos corrected
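The shortest-path sampling subroutine can be sketched generically: sample node pairs, take a shortest path between them, and credit the path's interior vertices. This is the textbook estimator that such algorithms build on, not KADABRA itself; the graph and sample count are illustrative:

```python
import random
from collections import deque

def bfs_path(adj, s, t):
    """One shortest path from s to t via BFS, or None if t is unreachable."""
    parent = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                q.append(v)
    return None

def sampled_betweenness(adj, n_samples, rng):
    """Estimate betweenness by crediting interior nodes of sampled shortest paths."""
    nodes = list(adj)
    score = {v: 0.0 for v in nodes}
    for _ in range(n_samples):
        s, t = rng.sample(nodes, 2)
        path = bfs_path(adj, s, t)
        if path:
            for v in path[1:-1]:  # endpoints get no credit
                score[v] += 1.0 / n_samples
    return score

# Path graph 0-1-2-3-4: the middle node carries the most shortest paths,
# while the degree-1 endpoints are never interior.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
est = sampled_betweenness(adj, 500, random.Random(7))
print(est[0], est[4])  # 0.0 0.0
```

KADABRA's contributions sharpen both halves of this loop: faster shortest-path sampling on realistic graph models, and adaptive control of the number of samples needed for a given absolute error.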