11,988 research outputs found

    Gunrock: GPU Graph Analytics

    Full text link
    For large-scale graph analytics on the GPU, the irregularity of data access and control flow, and the complexity of programming GPUs, have presented two significant challenges to developing a programmable high-performance graph library. "Gunrock", our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We characterize the performance of various optimization strategies and evaluate Gunrock's overall performance on different GPU architectures on a wide range of graph primitives that span from traversal-based algorithms and ranking algorithms, to triangle counting and bipartite-graph-based algorithms. The results show that on a single GPU, Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives and CPU shared-memory graph libraries such as Ligra and Galois, and better performance than any other GPU high-level graph library.Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing (TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance Graph Processing Library on the GPU

    Scalable Approach to Uncertainty Quantification and Robust Design of Interconnected Dynamical Systems

    Full text link
    Development of robust dynamical systems and networks such as autonomous aircraft systems capable of accomplishing complex missions faces challenges due to the dynamically evolving uncertainties coming from model uncertainties, necessity to operate in a hostile cluttered urban environment, and the distributed and dynamic nature of the communication and computation resources. Model-based robust design is difficult because of the complexity of the hybrid dynamic models including continuous vehicle dynamics, the discrete models of computations and communications, and the size of the problem. We will overview recent advances in methodology and tools to model, analyze, and design robust autonomous aerospace systems operating in uncertain environment, with stress on efficient uncertainty quantification and robust design using the case studies of the mission including model-based target tracking and search, and trajectory planning in uncertain urban environment. To show that the methodology is generally applicable to uncertain dynamical systems, we will also show examples of application of the new methods to efficient uncertainty quantification of energy usage in buildings, and stability assessment of interconnected power networks

    Complex networks in climate dynamics - Comparing linear and nonlinear network construction methods

    Full text link
    Complex network theory provides a powerful framework to statistically investigate the topology of local and non-local statistical interrelationships, i.e. teleconnections, in the climate system. Climate networks constructed from the same global climatological data set using the linear Pearson correlation coefficient or the nonlinear mutual information as a measure of dynamical similarity between regions, are compared systematically on local, mesoscopic and global topological scales. A high degree of similarity is observed on the local and mesoscopic topological scales for surface air temperature fields taken from AOGCM and reanalysis data sets. We find larger differences on the global scale, particularly in the betweenness centrality field. The global scale view on climate networks obtained using mutual information offers promising new perspectives for detecting network structures based on nonlinear physical processes in the climate system.Comment: 24 pages, 10 figure

    Extreme Scale De Novo Metagenome Assembly

    Full text link
    Metagenome assembly is the process of transforming a set of short, overlapping, and potentially erroneous DNA segments from environmental samples into the accurate representation of the underlying microbiomes's genomes. State-of-the-art tools require big shared memory machines and cannot handle contemporary metagenome datasets that exceed Terabytes in size. In this paper, we introduce the MetaHipMer pipeline, a high-quality and high-performance metagenome assembler that employs an iterative de Bruijn graph approach. MetaHipMer leverages a specialized scaffolding algorithm that produces long scaffolds and accommodates the idiosyncrasies of metagenomes. MetaHipMer is end-to-end parallelized using the Unified Parallel C language and therefore can run seamlessly on shared and distributed-memory systems. Experimental results show that MetaHipMer matches or outperforms the state-of-the-art tools in terms of accuracy. Moreover, MetaHipMer scales efficiently to large concurrencies and is able to assemble previously intractable grand challenge metagenomes. We demonstrate the unprecedented capability of MetaHipMer by computing the first full assembly of the Twitchell Wetlands dataset, consisting of 7.5 billion reads - size 2.6 TBytes.Comment: Accepted to SC1

    Energy Efficient Network Generation for Application Specific NoC

    Get PDF
    Networks-on-Chip is emerging as a communication platform for future complex SoC designs, composed of a large number of homogenous or heterogeneous processing resources. Most SoC platforms are customized to the domainspecific requirements of their applications, which communicate in a specific, mostly irregular way. The specific but often diverse communication requirements among cores of the SoC call for the design of application-specific network of SoC for improved performance in terms of communication energy, latency, and throughput. In this work, we propose a methodology for the design of customized irregular network architecture of SoC. The proposed method exploits priori knowledge of the application2019;s communication characteristic to generate an energy optimized network and corresponding routing tables

    Use of regular topology in logical topology design.

    Get PDF

    An infrared bootstrap of the Schur index with surface defects

    Get PDF
    The infrared formula relates the Schur index of a 4d N = 2 theory to its wall-crossing invariant, a.k.a. BPS monodromy. A further extension of this formula, proposed by Córdova, Gaiotto and Shao, includes contributions by various types of line and surface defects. We study BPS monodromies in the presence of vortex surface defects of arbitrary vorticity for general class SS theories of type A_1 engineered by UV curves with at least one regular puncture. The trace of these defect BPS monodromies is shown to coincide with the action of certain q-difference operators acting on the trace of the (pure) 4d BPS monodromy. We use these operators to develop a “bootstrap” (of traces) of BPS monodromies, relying only on their infrared properties, thereby reproducing the very general ultraviolet characterization of the Schur index
    corecore