16,076 research outputs found

    Gunrock: GPU Graph Analytics

    Full text link
    For large-scale graph analytics on the GPU, the irregularity of data access and control flow, and the complexity of programming GPUs, have presented two significant challenges to developing a programmable high-performance graph library. "Gunrock", our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We characterize the performance of various optimization strategies and evaluate Gunrock's overall performance on different GPU architectures on a wide range of graph primitives that span from traversal-based algorithms and ranking algorithms, to triangle counting and bipartite-graph-based algorithms. The results show that on a single GPU, Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives and CPU shared-memory graph libraries such as Ligra and Galois, and better performance than any other GPU high-level graph library.Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing (TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance Graph Processing Library on the GPU

    Adaptive data synchronization algorithm for IoT-oriented low-power wide-area networks

    Get PDF
    The Internet of Things (IoT) is by now very close to be realized, leading the world towards a new technological era where people’s lives and habits will be definitively revolutionized. Furthermore, the incoming 5G technology promises significant enhancements concerning the Quality of Service (QoS) in mobile communications. Having billions of devices simultaneously connected has opened new challenges about network management and data exchange rules that need to be tailored to the characteristics of the considered scenario. A large part of the IoT market is pointing to Low-Power Wide-Area Networks (LPWANs) representing the infrastructure for several applications having energy saving as a mandatory goal besides other aspects of QoS. In this context, we propose a low-power IoT-oriented file synchronization protocol that, by dynamically optimizing the amount of data to be transferred, limits the device level of interaction within the network, therefore extending the battery life. This protocol can be adopted with different Layer 2 technologies and provides energy savings at the IoT device level that can be exploited by different applications

    Efficient checkpointing over local area networks

    Get PDF
    Parallel and distributed computing on clusters of workstations is becoming very popular as it provides a cost effective way for high performance computing. In these systems, the bandwidth of the communication subsystem (Using Ethernet technology) is about an order of magnitude smaller compared to the bandwidth of the storage subsystem. Hence, storing a state in a checkpoint is much more efficient than comparing states over the network. In this paper we present a novel checkpointing approach that enables efficient performance over local area networks. The main idea is that we use two types of checkpoints: compare-checkpoints (comparing the states of the redundant processes to detect faults) and store-checkpoints (where the state is only stored). The store-checkpoints reduce the rollback needed after a fault is detected, without performing many unnecessary comparisons. As a particular example of this approach we analyzed the DMR checkpointing scheme with store-checkpoints. Our main result is that the overhead of the execution time can be significantly reduced when store-checkpoints are introduced. We have implemented a prototype of the new DMR scheme and run it on workstations connected by a LAN. The experimental results we obtained match the analytical results and show that in some cases the overhead of the DMR checkpointing schemes over LAN's can be improved by as much as 20%

    Robust Motion Segmentation from Pairwise Matches

    Full text link
    In this paper we address a classification problem that has not been considered before, namely motion segmentation given pairwise matches only. Our contribution to this unexplored task is a novel formulation of motion segmentation as a two-step process. First, motion segmentation is performed on image pairs independently. Secondly, we combine independent pairwise segmentation results in a robust way into the final globally consistent segmentation. Our approach is inspired by the success of averaging methods. We demonstrate in simulated as well as in real experiments that our method is very effective in reducing the errors in the pairwise motion segmentation and can cope with large number of mismatches

    Addressable time division multiplexer system /cable and connector study/ Final report

    Get PDF
    Reliability studies of single interrogation and single data cables used in prototype addressable time division multiplexer syste

    A Compression Technique Exploiting References for Data Synchronization Services

    Get PDF
    Department of Computer Science and EngineeringIn a variety of network applications, there exists significant amount of shared data between two end hosts. Examples include data synchronization services that replicate data from one node to another. Given that shared data may have high correlation with new data to transmit, we question how such shared data can be best utilized to improve the efficiency of data transmission. To answer this, we develop an encoding technique, SyncCoding, that effectively replaces bit sequences of the data to be transmitted with the pointers to their matching bit sequences in the shared data so called references. By doing so, SyncCoding can reduce data traffic, speed up data transmission, and save energy consumption for transmission. Our evaluations of SyncCoding implemented in Linux show that it outperforms existing popular encoding techniques, Brotli, LZMA, Deflate, and Deduplication. The gains of SyncCoding over those techniques in the perspective of data size after compression in a cloud storage scenario are about 12.4%, 20.1%, 29.9%, and 61.2%, and are about 78.3%, 79.6%, 86.1%, and 92.9% in a web browsing scenario, respectively.ope

    Panel discussion: Proposals for improving OCL

    Get PDF
    During the panel session at the OCL workshop, the OCL community discussed, stimulated by short presentations by OCL experts, potential future extensions and improvements of the OCL. As such, this panel discussion continued the discussion that started at the OCL meeting in Aachen in 2013 and on which we reported in the proceedings of the last year's OCL workshop. This collaborative paper, to which each OCL expert contributed one section, summarises the panel discussion as well as describes the suggestions for further improvements in more detail.Peer ReviewedPostprint (published version
    corecore