GraphBLAST: A High-Performance Linear Algebra-based Graph Framework on the GPU

Author: Buluc Aydin
Owens John D.
Yang Carl
Publication date: 14/11/2020

High-performance implementations of graph algorithms are difficult to build on new parallel hardware such as GPUs for three reasons: (1) the difficulty of coming up with graph building blocks, (2) load imbalance on parallel hardware, and (3) the low arithmetic intensity of graph problems. To address some of these challenges, GraphBLAS is an innovative, ongoing effort by the graph analytics community to propose building blocks based on sparse linear algebra, which will allow graph algorithms to be expressed in a performant, succinct, composable, and portable manner. In this paper, we examine the performance challenges of a linear-algebra-based approach to building graph frameworks and describe new design principles for overcoming these bottlenecks. Among the new design principles is exploiting input sparsity, which allows users to write graph algorithms without specifying push and pull direction. Exploiting output sparsity allows users to tell the backend which values of the output in a single vectorized computation they do not want computed. Load balancing is an important feature for balancing work amongst parallel workers. We describe the important load-balancing features for handling graphs with different characteristics. The design principles described in this paper have been implemented in "GraphBLAST", the first open-source, high-performance, linear-algebra-based graph framework on NVIDIA GPUs. The results show that on a single GPU, GraphBLAST has on average at least an order of magnitude speedup over previous GraphBLAS implementations SuiteSparse and GBTL, comparable performance to the fastest GPU hardwired primitives and shared-memory graph frameworks Ligra and Gunrock, and better performance than any other GPU graph framework, while offering a simpler and more concise programming model.

Comment: 50 pages, 14 figures, 14 tables

arXiv.org e-Print Archive

eScholarship - University of California

Gunrock: A High-Performance Graph Processing Library on the GPU

Author: Cederman D.
Goel A.
Gonzalez J. E.
Gregor D.
Jia Y.
Low Y.
Pande P. R.
Siek J. G.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/01/2016

For large-scale graph analytics on the GPU, the irregularity of data access and control flow and the complexity of programming GPUs have been two significant challenges for developing a programmable high-performance graph library. "Gunrock", our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high-performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We evaluate Gunrock on five key graph primitives and show that Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives, and better performance than any other GPU high-level graph library.

Comment: 14 pages, accepted by PPoPP'16 (removed the text repetition in the previous version v5)

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Deep Space Habitability Design Guidelines Based on the NASA NextSTEP Phase 2 Ground Test Program

Author: Beaton Kara
Bekdash Omar
Chappell Steve
Gernhardt Michael
Litaker Harry
Newton Carolyn
Stoffel James

This report summarizes habitation design guidelines for deep space habitats, which were derived from the NASA Next Space Technologies for Exploration Partnerships (NextSTEP) Phase 2 Habitat Ground Test Program. All data presented in this document have been contractor-deidentified and approved for public release. The report prioritizes capabilities and recommends allocating those capabilities to either the Habitation and Logistics Outpost (HALO) or the International Habitat (I-Hab). A review of the design guidelines is presented in the main body of the report, along with a list of the 170 specific design guidelines with references to the specific data sources from which they were derived.

NASA Technical Reports Server

Gunrock: GPU Graph Analytics

Author: Davidson Andrew
Liu Weitang
Osama Muhammad
Owens John D.
Pan Yuechao
Riffel Andy T.
Wang Leyuan
Wang Yangzihao
Wu Yuduo
Yang Carl
Yuan Chenshan
Publication date: 04/01/2017

For large-scale graph analytics on the GPU, the irregularity of data access and control flow and the complexity of programming GPUs have presented two significant challenges to developing a programmable high-performance graph library. "Gunrock", our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high-performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We characterize the performance of various optimization strategies and evaluate Gunrock's overall performance on different GPU architectures on a wide range of graph primitives that span from traversal-based algorithms and ranking algorithms to triangle counting and bipartite-graph-based algorithms. The results show that on a single GPU, Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives and CPU shared-memory graph libraries such as Ligra and Galois, and better performance than any other GPU high-level graph library.

Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing (TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance Graph Processing Library on the GPU"

arXiv.org e-Print Archive

eScholarship - University of California

FigShare

The digital data processing concepts of the LOFT mission

Author: Argan A.
Cros A.
Favre Y.
Gschwender M.
Jetter F.
Santangelo A.
Schanne S.
Smith P.
Suchy S.
Tenzer C.
Uter P.
Walton D.
Wende H.
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 01/01/2014

The Large Observatory for X-ray Timing (LOFT) is one of the five mission candidates that were considered by ESA for an M3 mission (with a launch opportunity in 2022-2024). LOFT features two instruments: the Large Area Detector (LAD) and the Wide Field Monitor (WFM). The LAD is a 10 m²-class instrument with approximately 15 times the collecting area of the largest timing mission so far (RXTE), for the first time combined with CCD-class spectral resolution. The WFM will continuously monitor the sky, recognise changes in source states, and detect transient and bursting phenomena, allowing the mission to respond to them. Observing the brightest X-ray sources with the effective area of the LAD leads to enormous data rates that need to be processed on several levels, filtered, and compressed in real time on board. The WFM data processing, on the other hand, puts rather low constraints on the data rate but requires algorithms to find the photon interaction location on the detector and then to deconvolve the detector image in order to obtain the sky coordinates of observed transient sources. In the following, we give an overview of the data handling concepts that were developed during the study phase.

Comment: Proc. SPIE 9144, Space Telescopes and Instrumentation 2014: Ultraviolet to Gamma Ray, 91446

arXiv.org e-Print Archive

Crossref

OA@INAF - Istituto Nazionale di Astrofisica

The UARS and open data concept and analysis study

Author: Mittal M.
Nebb J.
Woodward H.

Alternative concepts for a common design for the UARS and OPEN Central Data Handling Facility (CDHF) are offered. Costs for alternative implementations of the UARS designs are presented, showing that the system design does not restrict the implementation to a single manufacturer. Processing demands on the alternative UARS CDHF implementations are then discussed. With this information at hand, together with estimates for OPEN processing demands, it is shown that any shortfall in system capability for OPEN support can be remedied by either component upgrades or array processing attachments rather than a system redesign. In addition to a common system design, it is shown that there is significant potential for common software design, especially in the areas of data management software and non-user-unique production software. Archiving of the CDHF data is discussed. Following that, cost examples for several modes of communications between the CDHF and Remote User Facilities are presented. Technology application is discussed.

NASA Technical Reports Server

Space shuttle avionics system

Author: Hanaway John F.
Moorehead Robert W.

The Space Shuttle avionics system, which was conceived in the early 1970s and became operational in the 1980s, represents a significant advancement of avionics system technology in the areas of systems and redundancy management, digital data base technology, flight software, flight control integration, digital fly-by-wire technology, crew display interface, and operational concepts. The origins and the evolution of the system are traced; the requirements, the constraints, and other factors which led to the final configuration are outlined; and the functional operation of the system is described. An overall system block diagram is included.

NASA Technical Reports Server

An improved cell controller for the aerospace manufacturing

Author: Asif Seemal
Webb Philip
Publication venue: Cranfield University Press
Publication date: 19/09/2013

The aerospace manufacturing industry is unique in that production typically focuses on high variety and quality but low volume. Existing flexible manufacturing cells are limited to certain types of machines, robots, and cells, which makes it difficult to introduce any changes. This paper introduces the idea of treating machines, robots, and any other hardware and software as resources. It describes the development of the Flexa Cell Coordinator (FCC), a system that provides a solution for managing cells and their resources in a new, flexible manner. It can control, organise, and coordinate between cells and resources and, because of its distributed nature, is capable of controlling remote cells. It also provides connectivity with company systems, e.g., Enterprise Resource Planner (ERP). It is extendable and capable of adding multiple cells inside the system. In FCC, resources (e.g., a tracker) can also be shared between cells. The paper presents its development and the results of initial successful testing.

Cranfield CERES

The revolution in data gathering systems

Author: Cambra J. M.
Trover W. F.

Data acquisition systems used in NASA's wind tunnels from the 1950s through the present time are summarized as a baseline for assessing the impact of minicomputers and microcomputers on data acquisition and data processing. Emphasis is placed on the cyclic evolution in computer technology which transformed the central computer system, and finally the distributed computer system. Other developments discussed include: medium scale integration, large scale integration, combining the functions of data acquisition and control, and micro and minicomputers.

NASA Technical Reports Server