    Algorithms for Data Migration

    This thesis is concerned with problems related to data storage and management. A large storage server consists of several hundred disks. To balance the load across disks, the system computes data layouts that are typically adjusted according to the workload. As workloads change over time, the system recomputes the data layout, and rearranges the data items according to the new layout. We identify the problem of computing an efficient data migration plan that converts an initial layout to a target layout. We define the data migration problem as follows: for each item, there is a set of disks that have the item (sources) and a set of disks that want to receive the item (destinations). We want to migrate the data items from the sources to the destinations. The crucial constraint is that each disk can participate in only one transfer at a time. The most common objective has been to minimize the makespan, which is the time when we finish all the migrations. The problem is NP-hard, and we develop polynomial time algorithms with constant factor approximation guarantees and several other heuristic algorithms. We present the performance evaluation of the different methods through an experimental study. We also consider the data migration problem to minimize the sum of completion times over all migration jobs or storage devices. Minimizing the sum of completion times of jobs is one of the most common objectives in the scheduling literature. On the other hand, since a storage device may run inefficiently while the device is involved in migrations, another interesting objective is to minimize the sum of completion times over all storage devices. We present hardness results and constant factor approximation algorithms for these objectives. In addition, we consider the case when we have a heterogeneous collection of machines. We assume that heterogeneity is modeled by a non-uniform speed of the sending machine. For the basic problem of multicasting and broadcasting in the model, we show that the Fastest Node First scheme gives an approximation ratio of 1.5 for minimizing the makespan. We also prove that there is a polynomial time approximation scheme
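    The Fastest Node First idea named in this abstract can be illustrated with a small sketch. The thesis's exact model is not given here, so the code assumes the following: each node i has a fixed send_time[i] (the time it needs to transmit the message to one receiver), a node can send to only one receiver at a time, and a node may start sending as soon as it has received the message. The function names fastest_node_first and makespan are hypothetical. At every step the sender that can finish earliest delivers the message to the fastest node that is still uninformed.

```python
import heapq


def fastest_node_first(send_time, source=0):
    """Greedy FNF-style broadcast (illustrative sketch, assumed model).

    send_time[i] is the assumed time node i needs for one transmission.
    Returns a dict mapping each node to the time it receives the message.
    """
    # Uninformed nodes, fastest sender (smallest send time) first.
    waiting = sorted(
        (i for i in range(len(send_time)) if i != source),
        key=lambda i: send_time[i],
    )
    informed_at = {source: 0}
    # Heap of (time at which the sender's next transmission finishes, sender).
    heap = [(send_time[source], source)]
    for receiver in waiting:
        # The sender able to complete a transmission earliest goes next.
        finish, sender = heapq.heappop(heap)
        informed_at[receiver] = finish
        # Both the sender and the newly informed node can transmit again.
        heapq.heappush(heap, (finish + send_time[sender], sender))
        heapq.heappush(heap, (finish + send_time[receiver], receiver))
    return informed_at


def makespan(informed_at):
    """Time at which the last node receives the message."""
    return max(informed_at.values())
```

    With four identical nodes, for example makespan(fastest_node_first([1, 1, 1, 1])), the set of informed nodes doubles every time unit, so the sketch finishes at time 2.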

    Algorithms for Data Dissemination and Collection

    Broadcasting and gossiping are classical problems that have been widely studied for decades. In broadcasting, one source node wishes to send a message to every other node, while in gossiping, each node has a message that it wishes to send to everyone else. Both are among the most basic problems arising in communication networks. In this dissertation we study problems that generalize gossiping and broadcasting. For example, the source node may have several messages to broadcast or multicast. Much of the work on broadcasting in the literature focuses on homogeneous networks. The algorithms developed are more applicable to managing data on local-area networks. However, large-scale storage systems often consist of storage devices clustered over a wide-area network. Finding a suitable model and developing algorithms for broadcast that recognize the heterogeneous nature of the communication network is a significant part of this dissertation. We also address the problem of data collection in a wide-area network, which has largely been neglected, and is likely to become more significant as the Internet becomes more embedded in everyday life. We consider a situation where large amounts of data have to be moved from several different locations to a destination. In this work, we focus on two key properties: the available bandwidth can fluctuate, and the network may not choose the best route to transfer the data between two hosts. We focus on improving the task completion time by re-routing the data through intermediate hosts and show that under certain network conditions we can reduce the total completion time by a factor of two. This is done by developing an approach for computing coordinated data collection schedules using network flows

    Approximation Algorithms for Broadcasting in Simple Graphs with Intersecting Cycles

    Broadcasting is an information dissemination problem in a connected network in which one node, called the originator, must distribute a message to all other nodes by placing a series of calls along the communication lines of the network. At every step, the informed nodes aid the originator in distributing the message. Finding the minimum broadcast time of any vertex in an arbitrary graph is NP-Complete. The problem remains NP-Complete even for planar graphs of degree 3 and for a graph whose vertex set can be partitioned into a clique and an independent set. The best theoretical upper bound gives a logarithmic approximation. It has been shown that the broadcasting problem is NP-Hard to approximate within a factor of 3-ɛ. Polynomial time solvability is shown only for tree-like graphs: trees, unicyclic graphs, tree of cycles, necklace graphs, and some graphs where the underlying graph is a clique, such as fully connected trees and tree of cliques. In this thesis we study the broadcast problem in different classes of graphs where cycles intersect in at least one vertex. First we consider broadcasting in a simple graph where several cycles have common paths and two intersecting vertices, called a k-path graph. We present a constant approximation algorithm to find the broadcast time of an arbitrary k-path graph. We also study the broadcast problem in a simple cactus graph called a k-cycle graph, where several cycles of arbitrary lengths are connected by a central vertex on one end. We design a constant approximation algorithm to find the broadcast time of an arbitrary k-cycle graph. Next we study the broadcast problem in a hypercube of trees, for which we present a 2-approximation algorithm for any originator. We provide a linear time algorithm to find the broadcast time in a hypercube of trees with one tree. We extend the result to any arbitrary graph whose nodes contain trees and design a linear time constant approximation algorithm where the broadcast scheme in the arbitrary graph is already known. In Chapter 6 we study broadcasting in Harary graphs, for which we present an additive approximation that gives a 2-approximation in the worst case to find the broadcast time in an arbitrary Harary graph. Next, for even values of n, we introduce a new graph, called a modified-Harary graph, and present a 1-additive approximation algorithm to find the broadcast time. We also show that a modified-Harary graph is a broadcast graph when k is logarithmic in n. Finally we consider a diameter broadcast problem where we obtain a lower bound on the broadcast time of the graph which has at least (d+k-1 choose d) + 1 vertices that are at a distance d from the originator, where k >= 1
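    The abstract above lists trees among the graph classes where the broadcast time can be found in polynomial time. A minimal sketch of the standard recursion for trees follows; it is not taken from the thesis. It assumes the usual telephone model: each call takes one time unit and an informed node can call one neighbour per time unit. The function name tree_broadcast_time and the adjacency-dict input format are hypothetical. Each node serves its subtrees in decreasing order of their own broadcast times, so the i-th child called (counting from 1) contributes i plus its subtree's time.

```python
def tree_broadcast_time(adj, originator):
    """Broadcast time of `originator` in a tree (illustrative sketch).

    adj maps each node to the list of its neighbours in the tree.
    Assumes unit-time calls and one call per informed node per time unit.
    """
    # Iterative DFS: record each node's parent and a preorder visit order.
    parent = {originator: None}
    order = []
    stack = [originator]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                stack.append(w)

    b = {}  # b[v] = time to inform v's whole subtree once v is informed
    for v in reversed(order):  # reversed preorder: children before parents
        child_times = sorted(
            (b[w] for w in adj[v] if w != parent[v]), reverse=True
        )
        # Call the slowest subtree first; the i-th call ends at time i.
        b[v] = max(
            (i + t for i, t in enumerate(child_times, start=1)), default=0
        )
    return b[originator]
```

    On a star with centre 0 and three leaves, the centre must place three calls one after another, so tree_broadcast_time({0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}, 0) evaluates to 3.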