2,387 research outputs found

    Concurrent Distributed Serving with Mobile Servers

    This paper introduces a new resource allocation problem in distributed computing called distributed serving with mobile servers (DSMS). In DSMS, there are k identical mobile servers residing at the processors of a network. At arbitrary points of time, any subset of processors can invoke one or more requests. To serve a request, one of the servers must move to the processor that invoked the request. Resource allocation is performed in a distributed manner, since only the processor that invoked a request initially knows about it. All processors cooperate by passing messages to achieve correct resource allocation, with the goal of minimizing the communication cost. Routing servers in large-scale distributed systems requires a scalable location service. We introduce the distributed protocol Gnn that solves the DSMS problem on overlay trees. We prove that Gnn is starvation-free and correctly integrates locating the servers with synchronizing concurrent access to them despite asynchrony, even when requests are invoked over time. Further, we analyze Gnn for "one-shot" executions, i.e., when all requests are invoked simultaneously. We prove that when running Gnn on top of a special family of tree topologies, known as hierarchically well-separated trees (HSTs), we obtain a randomized distributed protocol with an expected competitive ratio of O(log n) on general network topologies with n processors. From a technical point of view, our main result is that Gnn optimally solves the DSMS problem on HSTs for one-shot executions, even if communication is asynchronous. Further, we present a lower bound of Omega(max{k, log n / log log n}) on the competitive ratio for DSMS. The lower bound holds even when communication is synchronous and requests are invoked sequentially.
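    As a rough, self-contained illustration of the cost model behind DSMS (not the distributed Gnn protocol itself), the Python sketch below greedily moves the nearest of k servers to each request on a small tree and charges the tree distance travelled; the tree, server placement, and request sequence are made-up examples.

        from collections import deque

        def tree_distances(adj, src):
            """BFS distances from src in an unweighted tree given as an adjacency dict."""
            dist = {src: 0}
            queue = deque([src])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        queue.append(v)
            return dist

        def serve_requests(adj, servers, requests):
            """Move the closest server to each request in turn; return the total movement cost."""
            servers = list(servers)
            total = 0
            for r in requests:
                dist = tree_distances(adj, r)
                i = min(range(len(servers)), key=lambda j: dist[servers[j]])
                total += dist[servers[i]]
                servers[i] = r  # the chosen server now resides at the requesting processor
            return total

        # A 5-processor tree (edges 0-1, 0-2, 0-3, 3-4) with k = 2 servers.
        adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3]}
        print(serve_requests(adj, servers=[1, 4], requests=[2, 4, 0]))  # total cost 3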

    Improving Utility of GPU in Accelerating Industrial Applications with User-centred Automatic Code Translation

    SMEs (small and medium-sized enterprises), particularly those whose business is focused on developing innovative products, face a major bottleneck in the speed of computation for many applications. Recent developments in GPUs (graphics processing units) have markedly increased their versatility across many computational areas. However, due to the lack of specialist GPU programming skills, this explosion of GPU power has not been fully exploited in general SME applications by inexperienced users. Moreover, existing automatic CPU-to-GPU code translators are mainly designed for research purposes, with poor user interface design, and are hard to use. Little attention has been paid to the applicability, usability, and learnability of these tools for ordinary users. In this paper, we present an online automated CPU-to-GPU source translation system (GPSME) for inexperienced users to exploit GPU capability in accelerating general SME applications. The system designs and implements a directive programming model with a new kernel generation scheme and a memory management hierarchy to optimize its performance. A web-service based interface is designed so that inexperienced users can easily and flexibly invoke the automatic source translator. Our experiments with non-expert GPU users in four SMEs show that the GPSME system can efficiently accelerate real-world applications by at least 4x and offers better applicability, usability, and learnability than existing automatic CPU-to-GPU source translators.
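    As a toy illustration of directive-based CPU-to-GPU source translation, the sketch below rewrites a loop annotated with a hypothetical "#pragma gpsme parallel" directive into a CUDA kernel skeleton; the directive name, the regular expression, and the kernel template are illustrative assumptions, not GPSME's actual interface or code generator.

        import re

        # Hypothetical directive syntax, used only for illustration.
        SOURCE = """
        #pragma gpsme parallel
        for (int i = 0; i < n; i++) {
            c[i] = a[i] + b[i];
        }
        """

        KERNEL_TEMPLATE = """\
        __global__ void generated_kernel(float *a, float *b, float *c, int n) {{
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) {{
                {body}
            }}
        }}
        """

        def translate(src):
            """Replace an annotated 1-D loop with a CUDA kernel skeleton."""
            m = re.search(
                r"#pragma gpsme parallel\s*\n\s*for\s*\([^)]*\)\s*\{(?P<body>[^}]*)\}",
                src)
            if not m:
                return None
            return KERNEL_TEMPLATE.format(body=m.group("body").strip())

        print(translate(SOURCE))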

    Compositional competitiveness for distributed algorithms

    We define a measure of competitive performance for distributed algorithms based on throughput, the number of tasks that an algorithm can carry out in a fixed amount of work. This new measure complements the latency measure of Ajtai et al., which measures how quickly an algorithm can finish tasks that start at specified times. The novel feature of the throughput measure, which distinguishes it from the latency measure, is that it is compositional: it supports a notion of algorithms that are competitive relative to a class of subroutines, with the property that an algorithm that is k-competitive relative to a class of subroutines, combined with an l-competitive member of that class, gives a combined algorithm that is kl-competitive. In particular, we prove the throughput-competitiveness of a class of algorithms for collect operations, in which each of a group of n processes obtains all values stored in an array of n registers. Collects are a fundamental building block of a wide variety of shared-memory distributed algorithms, and we show that several such algorithms are competitive relative to collects. Inserting a competitive collect in these algorithms gives the first examples of competitive distributed algorithms obtained by composition using a general construction. (Comment: 33 pages, 2 figures; full version of the STOC 1996 paper titled "Modular competitiveness for distributed algorithms.")
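    The sketch below gives a toy picture of the collect primitive described above (each of n processes reads all n single-writer registers) together with the k*l composition arithmetic; the register values, n = 4, and the ratios 3 and 2 are made-up numbers for illustration only.

        def collect(registers, counter):
            """One process obtains every register value; count the reads it performs."""
            view = []
            for r in registers:
                view.append(r)
                counter[0] += 1
            return view

        n = 4
        registers = [f"v{i}" for i in range(n)]  # value written by process i
        counter = [0]
        views = [collect(registers, counter) for _ in range(n)]  # every process collects
        print(counter[0])  # n * n = 16 reads for the naive collect

        # Composition property from the abstract: a k-competitive algorithm that uses
        # an l-competitive collect yields a (k * l)-competitive combined algorithm.
        k, l = 3, 2
        print(k * l)  # -> 6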

    Scalable and Reliable Middlebox Deployment

    Middleboxes are pervasive in modern computer networks, providing functionalities beyond mere packet forwarding. Load balancers, intrusion detection systems, and network address translators are typical examples of middleboxes. Despite their benefits, middleboxes come with several challenges with respect to their scalability and reliability. The goal of this thesis is to devise middlebox deployment solutions that are cost-effective, scalable, and fault tolerant. The thesis makes three main contributions: first, distributed service function chaining with multiple instances of a middlebox deployed on different physical servers to optimize resource usage; second, Constellation, a geo-distributed middlebox framework that enables a middlebox application to operate with high performance across wide area networks; and third, a fault-tolerant service function chaining system.
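    A minimal sketch of the service function chaining idea summarized above, assuming a made-up chain and instance placement: each middlebox function may have several instances on different servers, and a per-flow hash pins a flow to one instance of each function so that per-flow state stays in one place.

        import hashlib

        CHAIN = ["firewall", "nat", "load_balancer"]           # ordered service chain

        INSTANCES = {                                           # middlebox -> server placements
            "firewall":      ["srv-a", "srv-b"],
            "nat":           ["srv-c"],
            "load_balancer": ["srv-a", "srv-d"],
        }

        def pick_instance(middlebox, flow_id):
            """Stable per-flow instance choice, so a flow's state stays on one server."""
            instances = INSTANCES[middlebox]
            digest = hashlib.sha256(f"{middlebox}:{flow_id}".encode()).hexdigest()
            return instances[int(digest, 16) % len(instances)]

        def route(flow_id):
            """Return the sequence of (middlebox, server) hops for a flow."""
            return [(mb, pick_instance(mb, flow_id)) for mb in CHAIN]

        print(route("10.0.0.1:443->10.0.0.2:5000"))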

    On the Scalability of Addressing in Private Networks Using RPX

    In recent times, the imminent exhaustion of public IPv4 addresses has attracted the attention of both the research community and industry. The cellular industry has decided to combat this problem by using IPv6 for all new terminals. However, the success of 3G network deployment will depend on the services offered to end users. Currently, almost all services reside in the public IPv4 address space, making them inaccessible to users in IPv6 networks. Thus, an intermediate translation mechanism is required. Previous studies on network address translation methods have shown that REBEKAH-IP with Port Extension (RPX) supports all types of services that can be offered to IPv6 terminals from the public IPv4-based Internet and provides excellent scalability. However, this method suffers from an ambiguity problem which may lead to call blocking. In this paper, we present an improvement to the RPX scheme that removes this side effect and makes the system fully scalable. We first derive the expected utilization of public IPv4 addresses at the DNS of the RPX server. This utilization is computed in terms of the probability of socket-open requests from mobile terminals, the probability of call blocking, and the estimated number of mobile terminals at the network initialization phase. The mathematical model is also provided as a guideline for determining the range of public IPv4 addresses allocated to an RPX gateway in a cellular network. In addition, the results are presented through a set of simulations. We further observe that the originally proposed RPX scheme, which uses a simple round-robin scheduling algorithm, is sub-optimal in terms of call blocking probability, and we propose a priority queue algorithm to improve scalability. Finally, we present extensive simulation results on the practical scalability of RPX under different traffic compositions to provide a guideline on the expected scalability in large-scale networks such as 3G networks.
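    To make the call blocking behaviour concrete, the toy model below shares a small pool of (public IPv4 address, port) pairs among socket-open requests and counts the requests that block once the pool is exhausted; the addresses, pool size, and first-free allocation order are illustrative assumptions and not the actual RPX algorithm.

        from collections import deque

        ADDRESSES = ["198.51.100.1", "198.51.100.2"]   # shared public IPv4 addresses
        PORTS_PER_ADDRESS = 3                          # tiny pool so blocking is visible

        free_pairs = deque((a, p) for a in ADDRESSES
                           for p in range(40000, 40000 + PORTS_PER_ADDRESS))
        mappings, blocked = {}, 0

        for req_id in range(8):                        # 8 socket-open requests, 6 pairs
            if free_pairs:
                mappings[req_id] = free_pairs.popleft()  # hand out the next free pair
            else:
                blocked += 1                             # no (address, port) left: call blocks

        print(len(mappings), "mapped,", blocked, "blocked")  # -> 6 mapped, 2 blocked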