
    Performance Characterization of Multi-threaded Graph Processing Applications on Intel Many-Integrated-Core Architecture

    Intel Xeon Phi many-integrated-core (MIC) architectures usher in a new era of terascale integration. Among emerging killer applications, parallel graph processing has become a critical technique for analyzing connected data. In this paper, we empirically evaluate various computing platforms, including an Intel Xeon E5 CPU, an NVIDIA GeForce GTX 1070 GPU, and a Xeon Phi 7210 processor codenamed Knights Landing (KNL), in the domain of parallel graph processing. We show that the KNL achieves encouraging performance when processing graphs, making it a promising platform for accelerating multi-threaded graph applications. We further characterize the impact of KNL architectural enhancements on the performance of a state-of-the-art graph framework. We make four key observations: (1) Different graph applications require different numbers of threads to reach peak performance, and for the same application, different datasets need different thread counts to perform best. (2) Only a few graph applications benefit from the high-bandwidth MCDRAM, while others favor the low-latency DDR4 DRAM. (3) The vector processing units that execute AVX512 SIMD instructions on the KNL are underutilized when running the state-of-the-art graph framework. (4) The sub-NUMA cache clustering mode, which offers the lowest local memory access latency, hurts the performance of graph benchmarks that lack NUMA awareness. Finally, we suggest future work, including system auto-tuning tools and graph framework optimizations, to fully exploit the potential of KNL for parallel graph processing. Comment: published as L. Jiang, L. Chen and J. Qiu, "Performance Characterization of Multi-threaded Graph Processing Applications on Many-Integrated-Core Architecture," 2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Belfast, United Kingdom, 2018, pp. 199-20
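    Observation (1) suggests a simple form of the system auto-tuning the authors call for: sweep the thread count per application and dataset, and keep the fastest configuration. Below is a minimal Python sketch of such a sweep; the benchmark binary (./bfs), its --threads flag, and the input graph are hypothetical placeholders, and the candidate counts loosely track KNL's 64 cores with up to 4 hardware threads each.

        # Minimal thread-count auto-tuning sketch (hypothetical benchmark binary).
        import subprocess
        import time

        def best_thread_count(cmd, candidates=(1, 2, 4, 8, 16, 32, 64, 128, 256)):
            timings = {}
            for n in candidates:
                start = time.perf_counter()
                # Run the benchmark once per candidate; real tuning would average runs.
                subprocess.run(cmd + ["--threads", str(n)], check=True)
                timings[n] = time.perf_counter() - start
            return min(timings, key=timings.get), timings

        if __name__ == "__main__":
            best, timings = best_thread_count(["./bfs", "--graph", "twitter.el"])
            print(f"fastest with {best} threads: {timings[best]:.2f}s")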

    Exploring heterogeneity of unreliable machines for p2p backup

    A P2P architecture is a viable option for enterprise backup. In contrast to dedicated backup servers, nowadays the standard solution, making backups directly on an organization's workstations should be cheaper (existing hardware is used), more efficient (there is no single bottleneck server), and more reliable (the machines are geographically dispersed). We present the architecture of a p2p backup system that uses pairwise replication contracts between a data owner and a replicator. In contrast to standard p2p storage systems that use a DHT directly, the contracts allow our system to optimize replica placement according to a specific optimization strategy, and thus to take advantage of the heterogeneity of the machines and the network. Such optimization is particularly appealing in the context of backup: replicas can be geographically dispersed, the load sent over the network can be minimized, or the goal can be to minimize backup/restore time. However, managing the contracts, keeping them consistent, and adjusting them in response to a dynamically changing environment is challenging. We built a scientific prototype and ran experiments on 150 workstations in the university's computer laboratories and, separately, on 50 PlanetLab nodes. We found that the main factor affecting the quality of the system is the availability of the machines. Yet our main conclusion is that it is possible to build an efficient and reliable backup system on highly unreliable machines (our computers had just 13% average availability).
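    To make the contract abstraction concrete, the sketch below models a pairwise replication contract and a greedy placement strategy that prefers available, geographically dispersed replicators. The field names and the scoring heuristic are illustrative assumptions, not the system's actual policy.

        # Illustrative sketch of pairwise replication contracts and greedy placement.
        from dataclasses import dataclass

        @dataclass
        class Machine:
            host: str
            availability: float  # fraction of time online, e.g. 0.13
            site: str            # coarse location, used for geographic dispersion

        @dataclass
        class Contract:
            owner: Machine
            replicator: Machine
            quota_mb: int

        def choose_replicators(owner, candidates, k=3):
            # Prefer highly available machines at a different site than the owner.
            def score(m):
                dispersed = 1.0 if m.site != owner.site else 0.0
                return m.availability + dispersed
            ranked = sorted((m for m in candidates if m is not owner),
                            key=score, reverse=True)
            return [Contract(owner, m, quota_mb=1024) for m in ranked[:k]]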

    CoolCloud: improving energy efficiency in virtualized data centers

    In recent years, cloud computing services have continued to grow and have become more pervasive and indispensable in people's lives, and energy consumption continues to rise as more and more data centers are built. How to provide a more energy-efficient data center infrastructure that can support today's cloud computing services has become one of the most important issues in cloud computing research. In this thesis, we tackle three research problems: (1) how to achieve energy savings in a virtualized data center environment; (2) how to maintain service level agreements (SLAs); and (3) how to make our design practical for actual implementation in enterprise data centers. Combining these studies, we propose an optimization framework named CoolCloud that minimizes energy consumption in virtualized data centers while taking service level agreements into consideration. The proposed framework minimizes energy at two layers: (1) it minimizes local server energy using dynamic voltage & frequency scaling (DVFS) by exploiting runtime program phases, and (2) it minimizes global cluster energy using dynamic mapping between virtual machines (VMs) and servers based on each VM's resource requirements. Such optimization leads to the most economical way to operate an enterprise data center. On each local server, we develop a voltage and frequency scheduler that provides CPU energy savings under applications' or virtual machines' specified SLA requirements by exploiting applications' run-time program phases. At the cluster level, we propose a practical solution for managing the mapping of VMs to physical servers. This framework solves the problem of finding the most energy-efficient placement of VMs (least resource wastage and least power consumption) given their resource requirements.
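    As an illustration of the cluster-level idea, the sketch below packs VMs onto as few servers as possible so that idle hosts can be powered down. It assumes a one-dimensional CPU-demand model and uses the standard first-fit-decreasing heuristic; the thesis's framework considers fuller resource requirements and SLAs, so this is a simplification rather than its actual algorithm.

        # First-fit-decreasing VM placement: fewer active servers, less power.
        def place_vms(vm_demands, server_capacity):
            """Return a list of servers, each a list of the VM demands it hosts."""
            servers = []
            for demand in sorted(vm_demands, reverse=True):
                for server in servers:
                    if sum(server) + demand <= server_capacity:
                        server.append(demand)  # fits on an already-active server
                        break
                else:
                    servers.append([demand])   # power on a new server
            return servers

        # Example: five VMs with normalized CPU demands fit on two servers.
        print(place_vms([0.6, 0.3, 0.5, 0.2, 0.4], server_capacity=1.0))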

    Analysis and performance improvement of consumer-grade millimeter wave wireless networks

    Millimeter-wave (mmWave) networks are one of the key components of next-generation cellular networks and WLANs (Wireless Local Area Networks). mmWave networks can provide multi-gigabit-per-second rates over highly directional, low-interference links with high spatial reuse. In 2013, the first 60 GHz wireless solutions for WLANs appeared on the market: wireless docking stations based on the WiGig protocol. Today, in 2019, 60 GHz communications have gained importance with the IEEE 802.11ad amendment, with various products on the market, including routers, laptops, and wireless Ethernet solutions. More importantly, mmWave networks are going to be used in next-generation cellular networks, where smartphones will use the 28 GHz band, while 60 GHz communications have been proposed for backbone links due to their higher directionality and unlicensed use. This thesis fits into this frame of constant development of the mmWave bands to meet the latency and throughput demands of future communications. We first characterize the cost-effective design of COTS (commercial off-the-shelf) 60 GHz devices and then improve on their two main weaknesses: their low link distance and their non-ideal spatial reuse. It is critical to take the cost-effective design of COTS devices into account when designing networking mechanisms. This is why we perform a first-of-its-kind COTS analysis of 60 GHz devices, studying the D5000 WiGig docking station and the TP-Link Talon IEEE 802.11ad router. We include static measurements, such as the synthesized beam patterns of these devices, and an analysis of the area-wide coverage they can provide. We perform a spatial reuse analysis and study the performance of these devices under user mobility, showing how robust the link is while the user moves. We also study the feasibility of flying mmWave links: we mount a 60 GHz COTS device on a drone and perform several measurement campaigns. In this first analysis, we see that these 60 GHz devices fall well short of the achievable communication range and offer very low spatial reuse. However, they are still suitable for low-density WLANs and for next-generation aerial micro cell stations. Seeing that these COTS devices are not as directional as the literature suggests, we analyze how their channels are not as frequency-stable as expected, due to the large number of reflected signals. Frequency-selective techniques could therefore be applied to these frequency-selective channels to extend the range of 60 GHz devices. To validate this, we measure real-world 60 GHz indoor channels with a bandwidth of 2 GHz and study their behavior under techniques such as bitloading, subcarrier switch-off, and waterfilling. To this end, we consider an Orthogonal Frequency-Division Multiplexing (OFDM) channel as defined in the IEEE 802.11ad standard and show that these techniques are in fact highly beneficial in mmWave networks, allowing a range extension of up to 50%, equivalent to power savings of up to 7 dB. To increase the very limited spatial reuse of these networks, we propose a centralized system that carries out the beam training process not only to maximize received power but also to take other stations into account and minimize interference. This system is designed to work with unmodified clients. We implement and validate our system on commercial off-the-shelf IEEE 802.11ad hardware, achieving an average throughput gain of 24.67% for TCP traffic, and up to a twofold throughput gain in specific cases.
    Doctoral Program in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. Thesis committee: President: Andrés García Saavedra; Secretary: Matilde Pilar Sánchez Fernández; Member: Ljiljana Simi
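    The waterfilling technique evaluated above admits a compact illustration: given per-subcarrier channel gains and a total power budget, power is poured into the strongest subcarriers up to a common water level, and subcarriers whose allocation would be non-positive are switched off (mirroring subcarrier switch-off). The sketch below uses illustrative gains and budget, not the measured 60 GHz channels.

        # Classic waterfilling power allocation across OFDM subcarriers.
        import numpy as np

        def waterfilling(gains, total_power):
            """gains: per-subcarrier SNR per unit power; returns power per subcarrier."""
            order = np.argsort(gains)[::-1]  # strongest subcarriers first
            g = gains[order]
            for k in range(len(g), 0, -1):
                mu = (total_power + np.sum(1.0 / g[:k])) / k  # water level
                p = mu - 1.0 / g[:k]
                if p[-1] > 0:  # weakest active subcarrier still gets positive power
                    power = np.zeros_like(gains)
                    power[order[:k]] = p
                    return power
            return np.zeros_like(gains)

        gains = np.array([4.0, 1.0, 0.25, 2.0])  # illustrative channel gains
        print(waterfilling(gains, total_power=2.0))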