GraphBLAST: A High-Performance Linear Algebra-based Graph Framework on the GPU
High-performance implementations of graph algorithms are difficult to
develop on new parallel hardware such as GPUs because of three challenges:
(1) the difficulty of coming up with graph building blocks, (2) load imbalance
on parallel hardware, and (3) graph problems having low arithmetic intensity.
To address some of these challenges, GraphBLAS is an innovative, on-going
effort by the graph analytics community to propose building blocks based on
sparse linear algebra, which will allow graph algorithms to be expressed in a
performant, succinct, composable and portable manner. In this paper, we examine
the performance challenges of a linear-algebra-based approach to building graph
frameworks and describe new design principles for overcoming these bottlenecks.
Among the new design principles is exploiting input sparsity, which allows
users to write graph algorithms without specifying push and pull direction.
Exploiting output sparsity allows users to tell the backend which values of the
output in a single vectorized computation they do not want computed.
Load-balancing is an important feature for balancing work amongst parallel
workers. We describe the important load-balancing features for handling graphs
with different characteristics. The design principles described in this paper
have been implemented in "GraphBLAST", the first high-performance linear
algebra-based graph framework on NVIDIA GPUs that is open-source. The results
show that on a single GPU, GraphBLAST has on average at least an order of
magnitude speedup over previous GraphBLAS implementations SuiteSparse and GBTL,
comparable performance to the fastest GPU hardwired primitives and
shared-memory graph frameworks Ligra and Gunrock, and better performance than
any other GPU graph framework, while offering a simpler and more concise
programming model. Comment: 50 pages, 14 figures, 14 tables
Gunrock: A High-Performance Graph Processing Library on the GPU
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs have been two
significant challenges for developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We evaluate Gunrock on five key graph
primitives and show that Gunrock has on average at least an order of magnitude
speedup over Boost and PowerGraph, comparable performance to the fastest GPU
hardwired primitives, and better performance than any other GPU high-level
graph library. Comment: 14 pages, accepted by PPoPP'16 (removed the text repetition in the
previous version v5)
Deep Space Habitability Design Guidelines Based on the NASA NextSTEP Phase 2 Ground Test Program
This report summarizes habitation design guidelines for deep space habitats, which were derived from the NASA Next Space Technologies for Exploration Partnerships (NextSTEP) Phase 2 Habitat Ground Test Program. All data presented in this document have been contractor-deidentified and approved for public release. The report prioritizes capabilities and recommends allocating those capabilities to either the Habitation and Logistics Outpost (HALO) or the International Habitat (I-Hab). A review of the design guidelines is presented in the main body of the report, along with a list of the 170 specific design guidelines with references to the specific data sources from which they were derived
Gunrock: GPU Graph Analytics
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs, have presented two
significant challenges to developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We characterize the performance of
various optimization strategies and evaluate Gunrock's overall performance on
different GPU architectures on a wide range of graph primitives that span from
traversal-based algorithms and ranking algorithms, to triangle counting and
bipartite-graph-based algorithms. The results show that on a single GPU,
Gunrock has on average at least an order of magnitude speedup over Boost and
PowerGraph, comparable performance to the fastest GPU hardwired primitives and
CPU shared-memory graph libraries such as Ligra and Galois, and better
performance than any other GPU high-level graph library. Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing
(TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance
Graph Processing Library on the GPU"
The digital data processing concepts of the LOFT mission
The Large Observatory for X-ray Timing (LOFT) is one of the five mission
candidates that were considered by ESA for an M3 mission (with a launch
opportunity in 2022 - 2024). LOFT features two instruments: the Large Area
Detector (LAD) and the Wide Field Monitor (WFM). The LAD is a 10 m²-class
instrument with approximately 15 times the collecting area of the largest
timing mission so far (RXTE) for the first time combined with CCD-class
spectral resolution. The WFM will continuously monitor the sky and recognise
changes in source states, detect transient and bursting phenomena and will
allow the mission to respond to this. Observing the brightest X-ray sources
with the effective area of the LAD leads to enormous data rates that need to be
processed on several levels, filtered and compressed in real-time already on
board. The WFM data processing on the other hand puts rather low constraints on
the data rate but requires algorithms to find the photon interaction location
on the detector and then to deconvolve the detector image in order to obtain
the sky coordinates of observed transient sources. In the following, we want to
give an overview of the data handling concepts that were developed during the
study phase. Comment: Proc. SPIE 9144, Space Telescopes and Instrumentation 2014:
Ultraviolet to Gamma Ray, 91446
The UARS and open data concept and analysis study
Alternative concepts for a common design for the UARS and OPEN Central Data Handling Facility (CDHF) are offered. Costs for alternative implementations of the UARS designs are presented, showing that the system design does not restrict the implementation to a single manufacturer. Processing demands on the alternative UARS CDHF implementations are then discussed. With this information at hand, together with estimates for OPEN processing demands, it is shown that any shortfall in system capability for OPEN support can be remedied by either component upgrades or array processing attachments rather than a system redesign. In addition to a common system design, it is shown that there is significant potential for common software design, especially in the areas of data management software and non-user-unique production software. Archiving of the CDHF data is discussed. Following that, cost examples for several modes of communications between the CDHF and Remote User Facilities are presented. Technology application is discussed
Space shuttle avionics system
The Space Shuttle avionics system, which was conceived in the early 1970's and became operational in the 1980's, represents a significant advancement of avionics system technology in the areas of systems and redundancy management, digital data base technology, flight software, flight control integration, digital fly-by-wire technology, crew display interface, and operational concepts. The origins and the evolution of the system are traced; the requirements, the constraints, and other factors which led to the final configuration are outlined; and the functional operation of the system is described. An overall system block diagram is included
An improved cell controller for the aerospace manufacturing
The aerospace manufacturing industry is unique in that production typically focuses on high variety and quality but low volume. Existing flexible manufacturing cells are limited to certain types of machines, robots and cells, which makes it difficult to introduce any changes. This paper introduces the idea of treating machines, robots, and any other hardware and software as resources. It describes the development of the Flexa Cell Coordinator (FCC), a system that provides a way to manage cells and their resources in a new, flexible manner. It can control, organise and coordinate between cells and resources and is capable of controlling remote cells because of its distributed nature. It also provides connectivity with company systems, e.g., Enterprise Resource Planner (ERP). It is extendable and capable of adding multiple cells inside the system. In FCC, resources (e.g., tracker) can also be shared between cells. The paper presents its development and results of initial successful testing
The revolution in data gathering systems
Data acquisition systems used in NASA's wind tunnels from the 1950's through the present time are summarized as a baseline for assessing the impact of minicomputers and microcomputers on data acquisition and data processing. Emphasis is placed on the cyclic evolution in computer technology which transformed the central computer system, and finally the distributed computer system. Other developments discussed include: medium scale integration, large scale integration, combining the functions of data acquisition and control, and micro and minicomputers