1,871 research outputs found

    GPUs as Storage System Accelerators

    Full text link
    Massively multicore processors, such as Graphics Processing Units (GPUs), provide, at a comparable price, a one order of magnitude higher peak performance than traditional CPUs. This drop in the cost of computation, as any order-of-magnitude drop in the cost per unit of performance for a class of system components, triggers the opportunity to redesign systems and to explore new ways to engineer them to recalibrate the cost-to-performance relation. This project explores the feasibility of harnessing GPUs' computational power to improve the performance, reliability, or security of distributed storage systems. In this context, we present the design of a storage system prototype that uses GPU offloading to accelerate a number of computationally intensive primitives based on hashing, and introduce techniques to efficiently leverage the processing power of GPUs. We evaluate the performance of this prototype under two configurations: as a content addressable storage system that facilitates online similarity detection between successive versions of the same file and as a traditional system that uses hashing to preserve data integrity. Further, we evaluate the impact of offloading to the GPU on competing applications' performance. Our results show that this technique can bring tangible performance gains without negatively impacting the performance of concurrently running applications.Comment: IEEE Transactions on Parallel and Distributed Systems, 201

    Big Data Caching for Networking: Moving from Cloud to Edge

    Full text link
    In order to cope with the relentless data tsunami in 5G5G wireless networks, current approaches such as acquiring new spectrum, deploying more base stations (BSs) and increasing nodes in mobile packet core networks are becoming ineffective in terms of scalability, cost and flexibility. In this regard, context-aware 55G networks with edge/cloud computing and exploitation of \emph{big data} analytics can yield significant gains to mobile operators. In this article, proactive content caching in 55G wireless networks is investigated in which a big data-enabled architecture is proposed. In this practical architecture, vast amount of data is harnessed for content popularity estimation and strategic contents are cached at the BSs to achieve higher users' satisfaction and backhaul offloading. To validate the proposed solution, we consider a real-world case study where several hours of mobile data traffic is collected from a major telecom operator in Turkey and a big data-enabled analysis is carried out leveraging tools from machine learning. Based on the available information and storage capacity, numerical studies show that several gains are achieved both in terms of users' satisfaction and backhaul offloading. For example, in the case of 1616 BSs with 30%30\% of content ratings and 1313 Gbyte of storage size (78%78\% of total library size), proactive caching yields 100%100\% of users' satisfaction and offloads 98%98\% of the backhaul.Comment: accepted for publication in IEEE Communications Magazine, Special Issue on Communications, Caching, and Computing for Content-Centric Mobile Network

    On the traffic offloading in Wi-Fi supported heterogeneous wireless networks

    Get PDF
    Heterogeneous small cell networks (HetSNet) comprise several low power, low cost (SBSa), (D2D) enabled links wireless-fidelity (Wi-Fi) access points (APs) to support the existing macrocell infrastructure, decrease over the air signaling and energy consumption, and increase network capacity, data rate and coverage. This paper presents an active user dependent path loss (PL) based traffic offloading (TO) strategy for HetSNets and a comparative study on two techniques to offload the traffic from macrocell to (SBSs) for indoor environments: PL and signal-to-interference ratio (SIR) based strategies. To quantify the improvements, the PL based strategy against the SIR based strategy is compared while considering various macrocell and (SBS) coverage areas and traffic–types. On the other hand, offloading in a dense urban setting may result in overcrowding the (SBSs). Therefore, hybrid traffic–type driven offloading technologies such as (WiFi) and (D2D) were proposed to en route the delay tolerant applications through (WiFi) (APs) and (D2D) links. It is necessary to illustrate the impact of daily user traffic profile, (SBSs) access schemes and traffic–type while deciding how much of the traffic should be offloaded to (SBSs). In this context, (AUPF) is introduced to account for the population of active small cells which depends on the variable traffic load due to the active users

    GPU Accelerated Particle Visualization with Splotch

    Get PDF
    Splotch is a rendering algorithm for exploration and visual discovery in particle-based datasets coming from astronomical observations or numerical simulations. The strengths of the approach are production of high quality imagery and support for very large-scale datasets through an effective mix of the OpenMP and MPI parallel programming paradigms. This article reports our experiences in re-designing Splotch for exploiting emerging HPC architectures nowadays increasingly populated with GPUs. A performance model is introduced for data transfers, computations and memory access, to guide our re-factoring of Splotch. A number of parallelization issues are discussed, in particular relating to race conditions and workload balancing, towards achieving optimal performances. Our implementation was accomplished by using the CUDA programming paradigm. Our strategy is founded on novel schemes achieving optimized data organisation and classification of particles. We deploy a reference simulation to present performance results on acceleration gains and scalability. We finally outline our vision for future work developments including possibilities for further optimisations and exploitation of emerging technologies.Comment: 25 pages, 9 figures. Astronomy and Computing (2014

    EbbRT: a customizable operating system for cloud applications

    Full text link
    Efficient use of hardware requires operating system components be customized to the application workload. Our general purpose operating systems are ill-suited for this task. We present Genesis, a new operating system that enables per-application customizations for cloud applications. Genesis achieves this through a novel heterogeneous distributed structure, a partitioned object model, and an event-driven execution environment. This paper describes the design and prototype implementation of Genesis, and evaluates its ability to improve the performance of common cloud applications. The evaluation of the Genesis prototype demonstrates memcached, run within a VM, can outperform memcached run on an unvirtualized Linux. The prototype evaluation also demonstrates an 14% performance improvement of a V8 JavaScript engine benchmark, and a node.js webserver that achieves a 50% reduction in 99th percentile latency compared to it run on Linux
    corecore