23 research outputs found

    Site-Based Partitioning and Repartitioning Techniques for Parallel PageRank Computation

    The PageRank algorithm is an important component of effective web search. At the core of this algorithm are repeated sparse matrix-vector multiplications, where the involved web matrices grow in parallel with the growth of the web and are stored in a distributed manner due to space limitations. Hence, the PageRank computation, which is frequently repeated, must be performed in parallel with high efficiency and low preprocessing overhead while taking the initially distributed nature of the web matrices into account. Our contributions in this work are twofold. We first investigate the application of state-of-the-art sparse matrix partitioning models to attain high efficiency in parallel PageRank computations, with a particular focus on reducing the preprocessing overhead they introduce. For this purpose, we evaluate two different compression schemes on the web matrix using the site information inherently available in links. Second, we consider the more realistic scenario of starting with initially distributed data and extend our algorithms to cover the repartitioning of such data for efficient PageRank computation. We report performance results using our parallelization of a state-of-the-art PageRank algorithm on two different PC clusters with 40 and 64 processors. Experiments show that the proposed techniques achieve considerable speedups while incurring a preprocessing overhead of several iterations (for some instances even less than a single iteration) of the underlying sequential PageRank algorithm. © 2011 IEEE
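The kernel described above is a power-iteration PageRank built on repeated sparse matrix-vector products. The following minimal single-machine sketch (not the paper's parallel, site-partitioned implementation; the column-stochastic layout and dangling-node handling are illustrative assumptions) shows that kernel:

```python
import numpy as np
from scipy.sparse import csr_matrix

def pagerank(A, alpha=0.85, tol=1e-8, max_iter=100):
    """A: n x n column-stochastic sparse link matrix; dangling pages have all-zero columns."""
    n = A.shape[0]
    dangling = A.getnnz(axis=0) == 0        # pages with no out-links
    x = np.full(n, 1.0 / n)                 # initial rank vector
    for _ in range(max_iter):
        # the sparse matrix-vector multiplication at the heart of every iteration
        x_new = alpha * (A @ x)
        x_new += alpha * x[dangling].sum() / n   # redistribute dangling mass uniformly
        x_new += (1.0 - alpha) / n               # teleportation term
        if np.abs(x_new - x).sum() < tol:
            break
        x = x_new
    return x_new

# toy 3-page web: page 0 links to 1, page 1 links to 0 and 2, page 2 has no out-links
A = csr_matrix(np.array([[0.0, 0.5, 0.0],
                         [1.0, 0.0, 0.0],
                         [0.0, 0.5, 0.0]]))
print(pagerank(A))
```

In a distributed setting, each processor would own a block of the matrix and of the rank vector, so the quality of that partitioning determines both load balance and communication volume per iteration.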

    Minimization of Resource Consumption through Workload Consolidation in Large-Scale Distributed Data Platforms

    The rapid increase in the data volumes encountered in many application domains has led to widespread adoption of parallel and distributed data management systems such as parallel databases and MapReduce-based frameworks (e.g., Hadoop) in recent years. Use of such parallel and distributed frameworks is expected to accelerate in the coming years, putting further strain on already-scarce resources like compute power, network bandwidth, and energy. To reduce total execution times, there is a trend towards increasing execution parallelism by spreading data across a large number of machines. However, this often increases total resource consumption, and especially energy consumption, significantly because of process startup costs and other overheads (e.g., communication overheads). In this dissertation, we develop several data management techniques to minimize resource consumption through workload consolidation. We introduce a key metric called query span, i.e., the number of machines involved in the execution of a query or a job, and propose to minimize per-query resource consumption by minimizing query span. To that end, we develop several workload-driven data partitioning and replica selection algorithms that attempt to minimize the average query span by exploiting the fact that most distributed environments need to use replication for fault tolerance. Extensive experiments on various datasets show that judicious data placement and replication can dramatically reduce average query spans, resulting in significant reductions in resource consumption. We show our results primarily on two applications: a distributed data warehouse system and distributed information retrieval. In the first case, we show that minimizing average query span can minimize overall resource consumption for a given workload and can also improve the performance of complex analytical queries. In the second case, our approach minimizes the overall search cost and effectively trades off search cost against load imbalance. The best case of resource efficiency for any underlying data processing system is achieved when a job or query can be run efficiently on a single machine (i.e., query span = 1). In the final part of the dissertation, we discuss an in-memory MapReduce system optimized for performing complex analytics tasks on input data sizes that fit in a single machine's memory. We argue that systems like Hadoop, which are designed to operate across a large number of machines, are not optimal for small and medium-sized complex analytics tasks because of high startup costs, heavy disk activity, and wasteful checkpointing. We have developed a prototype runtime called HONE that is API-compatible with standard (distributed) Hadoop. In other words, we can take existing Hadoop code and run it, without modification, on a multi-core shared-memory machine. This allows us to take existing Hadoop algorithms and find the most suitable runtime environment for execution on datasets of varying sizes.
Overall, our key contributions include the identification of the query span metric and its relationship with overall resource consumption in scale-out architectures, several workload-aware techniques to optimize this metric, and a demonstration of the effectiveness of query span minimization in different application scenarios. To take advantage of scale-up architectures effectively, we also develop HONE, a novel in-memory MapReduce system for a single machine. Thorough experiments on real and synthetic datasets demonstrate the efficacy of our proposed approaches.
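To make the central metric concrete, the sketch below computes the query span of a single query and greedily picks replicas to shrink it, in the spirit of a set-cover heuristic. The data structures and the greedy rule are illustrative assumptions, not the dissertation's actual algorithms:

```python
def query_span(required_partitions, replicas):
    """replicas: dict partition -> set of machines holding a copy.
    Returns the chosen machines; len(result) is the query span."""
    uncovered = set(required_partitions)
    chosen = set()
    while uncovered:
        # greedily pick the machine that covers the most still-uncovered partitions
        candidates = {m for p in uncovered for m in replicas[p]}
        best = max(candidates,
                   key=lambda m: sum(1 for p in uncovered if m in replicas[p]))
        chosen.add(best)
        uncovered -= {p for p in uncovered if best in replicas[p]}
    return chosen

replicas = {
    "p1": {"m1", "m2"},
    "p2": {"m2", "m3"},
    "p3": {"m2", "m4"},
}
print(query_span({"p1", "p2", "p3"}, replicas))  # {'m2'} -> query span 1
```

The same idea, applied at data placement time rather than per query, is what lets co-locating frequently co-accessed partitions on shared replicas drive the average span down.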

    A Scalable Framework for Parallelizing Sampling-Based Motion Planning Algorithms

    Motion planning is defined as the problem of finding a valid path that takes a robot (or any movable object) from a given start configuration to a goal configuration in an environment. While motion planning has its roots in robotics, it now finds application in many other areas of scientific computing such as protein folding, drug design, virtual prototyping, computer-aided design (CAD), and computer animation. These new areas test the limits of the best sequential planners available, motivating the need for methods that can exploit parallel processing. This dissertation focuses on the design and implementation of a generic and scalable framework for parallelizing motion planning algorithms. In particular, we focus on sampling-based motion planning algorithms, which are considered to be the state of the art. Our work covers the two broad classes of sampling-based motion planning algorithms: the graph-based and the tree-based methods. Central to our approach is the subdivision of the planning space into regions. These regions represent sub-problems that can be processed in parallel. Solutions to the sub-problems are later combined to form a solution to the entire problem. By subdividing the planning space and restricting the locality of connection attempts to adjacent regions, we reduce the work and inter-processor communication associated with nearest-neighbor calculation, a critical bottleneck for scalability in existing parallel motion planning methods. We also describe how load balancing strategies can be applied in complex environments. We present experimental results that scale to thousands of processors on different massively parallel machines for a range of motion planning problems.
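A minimal sketch of the core idea, assuming a 2-D planning space divided into a uniform grid: samples are drawn per region and roadmap connection attempts are restricted to the same or adjacent regions, which keeps nearest-neighbor work local. Function names and parameters are hypothetical, and collision checking is omitted:

```python
import math
import random

def build_regional_roadmap(grid=4, samples_per_region=20, connect_radius=0.3):
    # sample configurations inside each grid region of the unit square
    regions = {}
    for gx in range(grid):
        for gy in range(grid):
            regions[(gx, gy)] = [
                ((gx + random.random()) / grid, (gy + random.random()) / grid)
                for _ in range(samples_per_region)
            ]
    edges = []
    for (gx, gy), pts in regions.items():
        # candidate partners come only from this region and its 8 neighbors,
        # so nearest-neighbor search never touches distant parts of the space
        nearby = [q for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  for q in regions.get((gx + dx, gy + dy), [])]
        for p in pts:
            for q in nearby:
                if p is not q and math.dist(p, q) < connect_radius:
                    edges.append((p, q))  # symmetric duplicates kept for brevity
    return regions, edges
```

In a parallel setting each region (or group of regions) would be owned by a different processor, and only samples near region boundaries need to be exchanged with neighbors.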

    ๊ฐ€์ƒํ™” ํ™˜๊ฒฝ์„ ์œ„ํ•œ ์›๊ฒฉ ๋ฉ”๋ชจ๋ฆฌ

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2021.8. Bernhard Egger.
Because cloud environments do not require customers to keep massive computing resources running at all times and charge only for the amount of computation used when it is needed, demand for them has grown sharply with the recent popularity of artificial intelligence and big-data workloads. With the adoption of cloud computing, customers can greatly reduce the cost of maintaining servers, while service providers can maximize the utilization of their computing resources. In this scenario, improving the utilization of computational resources becomes an important goal from the data center's point of view. In particular, given the rapidly growing scale of today's data centers, even small efficiency improvements create enormous economic value. The efficiency of a data center is affected by many factors such as site selection, architecture, cooling systems, and hardware configuration; this thesis focuses on the design and implementation of the software that manages computational and memory resources. The thesis proposes two software-based techniques that substantially improve data center efficiency. First, it proposes a software-based memory disaggregation system for virtualized environments. Recent advances in high-speed networks have dramatically reduced the cost of remote memory access, and the thesis shows that, with high-performance networking hardware, virtual machines can run on remote memory without significant performance degradation. Evaluated with the QEMU/KVM hypervisor, the proposed technique improves the tail latency of remote paging by 98.2% over the existing system. In addition, experiments with a rack-scale job-processing simulation show that the proposed system reduces the total job-processing time by 40.9% compared to the existing system. Second, the thesis proposes an instant virtual machine migration technique based on remote memory. Extending virtualized environments with remote memory already contributes greatly to higher resource utilization, but performance can still degrade severely when several applications compete for resources on one server. The proposed instant migration technique makes it possible to migrate a virtual machine by transferring only a very small amount of metadata over remote memory; in an evaluation with virtual machines running an in-memory key-value database benchmark, it improves the effective service downtime by up to 92.6% over existing techniques.
The rising importance of big data and artificial intelligence (AI) has led to an unprecedented shift in moving local computation into the cloud. One of the key drivers behind this transformation was the exploding cost of owning and maintaining large computing systems powerful enough to process these new workloads. Customers experience a reduced cost by renting only the required resources and only when needed, while data center operators benefit from efficiency at scale. A key factor in operating a profitable data center is a high overall utilization of its resources.
Due to the scale of modern data centers, small improvements in efficiency translate to significant savings in the total cost of ownership (TCO). There are many important elements that constitute an efficient data center, such as its location, architecture, cooling system, or the employed hardware. In this thesis, we focus on software-related aspects, namely the utilization of computational and memory resources. Reports from data centers operated by Alibaba and Google show that the overall resource utilization has stagnated at a level of around 50 to 60 percent over the past decade. This low average utilization is mostly attributable to peak demand-driven resource allocation despite the high variability of modern workloads in their resource usage. In other words, data centers today lack an efficient way to put idle resources that are reserved but not used to work. In this dissertation, we present RackMem, a software-based solution to address the problem of low resource utilization through two main contributions. First, we introduce a disaggregated memory system tailored for virtual environments. We observe that virtual machines can use remote memory without noticeable performance degradation under moderate memory pressure on modern networking infrastructure. We implement a specialized remote paging system for QEMU/KVM that reduces the remote paging tail latency by 98.2% in comparison to the state of the art. A job processing simulation at rack scale shows that the total makespan can be reduced by 40.9% under our memory system. While seamless disaggregated memory helps to balance memory usage across nodes, individual nodes can still suffer from overloaded resources when co-located workloads exhibit high resource usage at the same time. In a second contribution, we present a novel live migration technique for virtual machines running on top of our remote paging system. Under this instant live migration technique, entire virtual machines can be migrated in as little as 100 milliseconds. An evaluation with in-memory key-value database workloads shows that the presented migration technique improves the state of the art by a wide margin in all key performance metrics. The presented software-based solutions lay the technical foundations that allow data center operators to significantly improve the utilization of their computational and memory resources. As future work, we propose new job schedulers and load balancers to make full use of these new technical foundations.
Table of contents (page numbers omitted):
Chapter 1. Introduction: 1.1 Contributions of the Dissertation
Chapter 2. Background: 2.1 Resource Disaggregation; 2.2 Transparent Remote Paging; 2.3 Remote Direct Memory Access (RDMA); 2.4 Live Migration of Virtual Machines
Chapter 3. RackMem Overview: 3.1 RackMem Virtual Memory; 3.2 RackMem Distributed Virtual Storage; 3.3 RackMem Networking; 3.4 Instant VM Live Migration
Chapter 4. Virtual Memory: 4.1 Design Considerations for Achieving Low-latency; 4.2 Pagefault Handling; 4.2.1 Fast-path and Slow-path in the Pagefault Handler; 4.2.2 State Transition of RackVM Pages; 4.3 Latency Hiding Techniques; 4.4 Implementation; 4.4.1 RackMem Virtual Memory Module; 4.4.2 Dynamic Rebalancing of Local Memory; 4.4.3 RackVM for Virtual Machines; 4.4.4 Running Unmodified Applications
Chapter 5. RackMem Distributed Virtual Storage: 5.1 The Distributed Storage Abstraction; 5.2 Memory Management; 5.2.1 Remote Memory Allocation; 5.2.2 Remote Memory Reclamation; 5.3 Fault Tolerance; 5.3.1 Fault-tolerance and Write-duplication; 5.4 Multiple Storage Support in RackMem; 5.5 Implementation; 5.5.1 The Remote Memory Backend; 5.5.2 Linux Demand Paging on RackDVS
Chapter 6. Networking: 6.1 Design of RackNet; 6.2 Implementation; 6.2.1 RPC Message Layout; 6.2.2 RackNet RPC Implementation
Chapter 7. Instant VM Live Migration: 7.1 Motivation; 7.1.1 The Need for a Tailored Live Migration Technique; 7.1.2 Software Bottlenecks; 7.1.3 Utilizing Workload Variability; 7.2 Design of Instant; 7.2.1 Instant Region Migration; 7.3 Implementation; 7.3.1 Extension of RackVM for Instant; 7.3.2 Instant Region Migration; 7.3.3 Pre-fetch Optimizations; 7.3.4 Downtime Optimizations; 7.3.5 QEMU Modification for Instant
Chapter 8. Evaluation - RackMem: 8.1 Execution Environment; 8.2 Pagefault Handler Latency; 8.3 Single Application Performance; 8.3.1 Batch-oriented Applications; 8.3.2 Internal Pagesize and Performance; 8.3.3 Write-duplication Overhead; 8.3.4 RackDVS Slab Size and Performance; 8.3.5 Latency-oriented Applications; 8.3.6 Network Bandwidth Analysis; 8.3.7 Dynamic Local Memory Partitioning; 8.3.8 Rack-scale Job Processing Simulation
Chapter 9. Evaluation - Instant VM Live Migration: 9.1 Experimental Setup; 9.2 Target Applications; 9.3 Comparison Targets; 9.4 Database and Client Setups; 9.5 Memory Disaggregation Scenarios; 9.6.1 Time-to-responsiveness; 9.6.2 Effective Downtime; 9.6.3 Effect of Instant Optimizations
Chapter 10. Conclusion: 10.1 Future Directions
Abstract in Korean
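As a rough illustration of the demand-paging idea behind a remote paging system like the one described above (a toy model under simplifying assumptions, not RackMem's kernel implementation; class and method names are hypothetical), a fault either takes a fast path into a free local frame or a slow path that first evicts a least-recently-used page to the remote store:

```python
from collections import OrderedDict

class RemotePager:
    def __init__(self, local_frames):
        self.local = OrderedDict()   # page -> data, ordered by recency (LRU first)
        self.remote = {}             # stand-in for RDMA-attached remote memory
        self.capacity = local_frames

    def access(self, page):
        if page in self.local:                    # hit: no page fault at all
            self.local.move_to_end(page)
            return "hit"
        if len(self.local) < self.capacity:       # fast path: a free local frame exists
            path = "fast"
        else:                                     # slow path: evict the LRU page to remote first
            victim, data = self.local.popitem(last=False)
            self.remote[victim] = data
            path = "slow"
        # fetch the page from remote memory, or zero-fill it on first touch
        self.local[page] = self.remote.pop(page, b"\0" * 4096)
        return path
```

Keeping the fast path free of eviction and remote writes is what makes the difference between average and tail fault latency in such a design.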

    Distributed graph processing and partitioning for spatiotemporal queries in the context of camera networks

    This work presents a scalable, distributed architecture for processing spatiotemporal queries in the context of camera networks based on a graph structure. With the ever-increasing presence of cameras and the emergence of camera networks, e.g., in the context of campus security, it becomes increasingly important to provide a robust and scalable architecture to store and retrieve detected events. In this work, a distributed graph processing engine is presented that is well suited for read and write tasks in the environment of spatiotemporal, image-similarity-based workloads. The key ideas presented in this work are the architecture of a scalable graph processing system well suited for processing spatiotemporal queries and the design of a distributed and robust vertex-partitioning strategy for the graph, which is defined by the spatiotemporal attributes of the stored events. The work shows multiple lightweight heuristics for partitioning the graph among the nodes participating in the system, focusing on load balancing between workers and high edge locality for vertices. The system and the partitioning strategies are evaluated, showing that the system scales with the number of workers and the problem size and is able to answer proportionally more queries per second. It is also shown that the lightweight heuristics for partitioning the graph produce a relatively good balancing of the vertices on the worker nodes and can be executed in an online fashion, resulting in performance similar to traditional hash partitioning while providing far superior edge locality.
This work presents a distributed and scalable architecture, based on a graph structure, for processing spatiotemporal queries over camera networks. Given the ubiquitous presence of cameras and the emergence of camera networks, for example in the surveillance of a university campus, it is becoming increasingly important to offer a robust and scalable architecture for storing and retrieving detected events. This work therefore presents a distributed graph processing architecture that is well suited for insert and query operations in the setting of spatiotemporal, image-similarity-based workloads. Its main ideas are the architecture of a scalable graph processing system for executing the spatiotemporal queries and the design of distributed and robust vertex-partitioning schemes for the graph of detected events. To this end, several lightweight heuristics for partitioning the graph among the compute nodes participating in the system are presented, with a focus on load balancing between the system's nodes while simultaneously optimizing edge locality. The resulting system and the partitioning strategies are evaluated, demonstrating that the system scales both in the number of participating compute nodes and in the size of the graph, which yields a higher throughput of queries per second as more compute nodes become available. It is further shown that the lightweight partitioning heuristics achieve a relatively good distribution of the vertices across the worker nodes and can be executed alongside other system functions, which allows performance similar to a naive, traditional hash-based partitioning while exhibiting considerably better edge locality.
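The lightweight online heuristics described above trade off worker load balance against edge locality. A minimal sketch of one such rule, in the style of linear deterministic greedy streaming partitioning (an illustrative stand-in, not necessarily the thesis's exact heuristics):

```python
def place_vertex(vertex, neighbors, assignment, loads, capacity):
    """assignment: vertex -> worker; loads: worker -> number of vertices placed."""
    def score(worker):
        # neighbors already on this worker reward edge locality,
        # scaled down as the worker approaches its capacity
        locality = sum(1 for n in neighbors if assignment.get(n) == worker)
        return locality * (1.0 - loads[worker] / capacity)
    # prefer the highest locality-weighted score, break ties toward the least-loaded worker
    worker = max(loads, key=lambda w: (score(w), -loads[w]))
    assignment[vertex] = worker
    loads[worker] += 1
    return worker

loads = {"worker-0": 0, "worker-1": 0}
assignment = {}
for v, nbrs in [("a", []), ("b", ["a"]), ("c", ["a", "b"]), ("d", [])]:
    place_vertex(v, nbrs, assignment, loads, capacity=4)
print(assignment, loads)
```

Because each placement decision only consults the vertex's neighbors and the current worker loads, the rule can run online as events arrive, unlike offline partitioners that need the whole graph up front.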

    Hypergraph Partitioning in the Cloud

    The thesis investigates the partitioning and load balancing problem, which has many applications in High Performance Computing (HPC). The application to be partitioned is described with a graph or hypergraph. The latter is of greater interest because hypergraphs, compared to graphs, have a more general structure and can be used to model more complex relationships between groups of objects, such as non-symmetric dependencies. Optimal graph and hypergraph partitioning is known to be NP-hard, but good polynomial-time heuristic algorithms have been proposed.
In this thesis, we propose two multi-level hypergraph partitioning algorithms based on rough set clustering techniques. The first algorithm, which is a serial algorithm, obtains high-quality partitionings and improves the partitioning cut by up to 71% compared to state-of-the-art serial hypergraph partitioning algorithms. However, the capacity of serial algorithms is limited by the rapid growth of problem sizes in distributed applications. Consequently, we also propose a parallel hypergraph partitioning algorithm. Given the generality of the hypergraph model, designing a parallel algorithm is difficult, and available parallel hypergraph partitioners offer less scalability than their graph counterparts. The difficulty is twofold: the design of the parallel algorithm itself and the complexity of the hypergraph structure. Our parallel algorithm provides a trade-off between global and local vertex clustering decisions. By employing novel techniques and approaches, it achieves better scalability than the state-of-the-art parallel hypergraph partitioner in the Zoltan tool on a set of benchmarks, especially ones with irregular structure.
Furthermore, recent advances in cloud computing and the services it provides have led to a trend of moving HPC and large-scale distributed applications into the cloud. Despite its advantages, some aspects of the cloud, such as limited network resources, present a challenge to running communication-intensive applications and make them non-scalable in the cloud. While hypergraph partitioning is proposed as a solution for decreasing the communication overhead within parallel distributed applications, it can also offer advantages for running these applications in the cloud. The partitioning is usually done as a preprocessing step before running the parallel application. As parallel hypergraph partitioning is itself a communication-intensive operation, running it in the cloud is hard and suffers from poor scalability. The thesis therefore also investigates the scalability of parallel hypergraph partitioning algorithms in the cloud and the challenges they present, and proposes solutions to improve the cost/performance ratio of running the partitioning problem in the cloud. Our algorithms are implemented as a new hypergraph partitioning package within Zoltan, an open-source, Linux-based toolkit for parallel partitioning, load balancing, and data management developed at Sandia National Laboratories. The algorithms are known as FEHG and PFEHG.
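For reference, the sketch below computes the connectivity-1 (lambda minus one) cut, a standard objective in multi-level hypergraph partitioning; whether FEHG and PFEHG optimize exactly this metric is an assumption made here for illustration:

```python
def connectivity_cut(hyperedges, part):
    """hyperedges: iterable of vertex lists; part: vertex -> partition id.
    Each hyperedge contributes (number of partitions it spans) - 1 to the cut."""
    cut = 0
    for edge in hyperedges:
        spanned = {part[v] for v in edge}
        cut += len(spanned) - 1
    return cut

edges = [["a", "b", "c"], ["c", "d"], ["a", "d"]]
parts = {"a": 0, "b": 0, "c": 1, "d": 1}
print(connectivity_cut(edges, parts))  # 1 + 0 + 1 = 2
```

Unlike the plain edge cut of a graph, this objective counts every extra partition a hyperedge touches, which is why it models the communication volume of non-symmetric, group-wise dependencies more faithfully.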