
    Towards Scalable Web Documents

    The current Web is running into serious scalability problems. The standard solution is to apply techniques such as caching, replication, and distribution. Unfortunately, as the variety of Web applications continues to grow, it will be impossible to find a single solution that fits all needs. The authors advocate a different approach to tackling scaling problems: instead of seeking a general-purpose solution, they argue that it makes more sense to look at each Web document separately. For each document, three issues need to be addressed: placement of replicas, required coherence, and the best coherence protocol. The authors examine each of these issues and identify the alternatives. However, forcing developers to decide on the best alternatives would turn the Web into an unworkable system, so the authors indicate a number of possible ways to reduce this complexity. They also briefly discuss a wide-area infrastructure that can be used as a flexible basis for developing per-document solutions.
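
    As an illustration of the per-document idea, the sketch below shows what such a per-document scaling policy might look like in Python. The class and field names are hypothetical, chosen only to mirror the three issues the abstract lists (replica placement, required coherence, coherence protocol); they are not from the paper.

        from dataclasses import dataclass
        from enum import Enum, auto

        class Coherence(Enum):
            """How stale a replica may become before it must be refreshed."""
            STRONG = auto()        # readers always see the latest version
            TIME_BOUNDED = auto()  # replicas may lag by a bounded interval
            EVENTUAL = auto()      # updates propagate in the background

        @dataclass
        class DocumentPolicy:
            """Per-document scaling decisions (hypothetical names)."""
            replica_sites: list[str]  # where replicas are placed
            coherence: Coherence      # required coherence level
            protocol: str             # e.g. "invalidate" or "propagate"

        # A frequently read, rarely updated page tolerates weak coherence;
        # a stock ticker would instead demand STRONG coherence.
        homepage = DocumentPolicy(
            replica_sites=["eu-west", "us-east", "ap-south"],
            coherence=Coherence.EVENTUAL,
            protocol="invalidate",
        )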

    Exploiting Semantic Proximity in Peer-to-Peer Content Searching

    Much recent work has dealt with improving the performance of content searching in peer-to-peer file-sharing systems. In this paper we attack this problem by modifying the overlay topology describing the peer relations in the system. More precisely, we create a semantic overlay, linking nodes that are "semantically close", by which we mean that they are interested in similar documents. This semantic overlay provides the primary search mechanism, while the initial peer-to-peer system provides the fail-over search mechanism. We focus on implicit approaches for discovering semantic proximity. We evaluate and compare three candidate methods, and review open questions.
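
    To make "implicit semantic proximity" concrete, here is a minimal sketch of one plausible scoring scheme: cosine similarity over per-peer download histories, with queries going to the most similar peers first. The representation and function names are assumptions for illustration, not the paper's exact methods.

        import math
        from collections import Counter

        def cosine(a: Counter, b: Counter) -> float:
            """Cosine similarity between two peers' download histories."""
            dot = sum(a[d] * b[d] for d in set(a) & set(b))
            norm = math.sqrt(sum(v * v for v in a.values())) * \
                   math.sqrt(sum(v * v for v in b.values()))
            return dot / norm if norm else 0.0

        def semantic_neighbors(me: Counter, peers: dict[str, Counter],
                               k: int = 5) -> list[str]:
            """Pick the k peers whose past downloads look most like mine."""
            return sorted(peers, key=lambda p: cosine(me, peers[p]),
                          reverse=True)[:k]

        # Queries are sent to semantic neighbors first; only on a miss does
        # the request fall back to the underlying peer-to-peer overlay.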

    Yes, Topology Matters in Decentralized Optimization: Refined Convergence and Topology Learning under Heterogeneous Data

    One of the key challenges in federated and decentralized learning is to design algorithms that efficiently deal with highly heterogeneous data distributions across agents. In this paper, we revisit the analysis of Decentralized Stochastic Gradient Descent (D-SGD), a popular decentralized learning algorithm, under data heterogeneity. We exhibit the key role played by a new quantity, which we call neighborhood heterogeneity, in the convergence rate of D-SGD. Unlike prior work, neighborhood heterogeneity is measured at the level of the neighborhood of an agent in the graph topology. By coupling the topology and the heterogeneity of the agents' distributions, our analysis sheds light on the poorly understood interplay between these two concepts in decentralized learning. We then argue that neighborhood heterogeneity provides a natural criterion for learning sparse data-dependent topologies that reduce (and can even eliminate) the otherwise detrimental effect of data heterogeneity on the convergence time of D-SGD. For the important case of classification with label skew, we formulate the problem of learning such a good topology as a tractable optimization problem that we solve with a Frank-Wolfe algorithm. Our approach provides a principled way to design a sparse topology that balances the number of iterations and the per-iteration communication costs of D-SGD under data heterogeneity.
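
    For readers unfamiliar with D-SGD, the sketch below shows one round of the algorithm under analysis: each agent takes a local stochastic gradient step and then averages with its neighbors through the mixing matrix that encodes the topology. This is the standard D-SGD update; the variable names and the closing comment are our gloss, not the paper's notation.

        import numpy as np

        def dsgd_round(x, W, grad, lr):
            """One D-SGD iteration.

            x:    (n_agents, dim) array of current local models
            W:    (n_agents, n_agents) doubly stochastic mixing matrix;
                  its sparsity pattern is the communication topology
            grad: callable (agent_index, model) -> stochastic gradient
            lr:   learning rate
            """
            n = x.shape[0]
            # Local stochastic gradient step on each agent's own data.
            stepped = np.stack([x[i] - lr * grad(i, x[i]) for i in range(n)])
            # Gossip averaging with neighbors.
            return W @ stepped

        # Intuition for neighborhood heterogeneity: if each agent's neighbors
        # jointly cover data that complements its own, the averaged iterate it
        # sees is closer to the global objective, so heterogeneity hurts less.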

    Adaptive gossip-based broadcast

    This paper presents a novel adaptation mechanism that allows every node of a gossip-based broadcast algorithm to adjust its rate of message emission 1) to the amount of resources available to the nodes within the same broadcast group and 2) to the global level of congestion in the system. The adaptation mechanism can be applied to all gossip-based broadcast algorithms we know of, and it makes their use more realistic in practical situations where nodes have limited resources whose quantity changes dynamically over time, without decreasing reliability.
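
    A toy sketch of the kind of adaptation described: each node scales its emission rate by a local resource factor and a global congestion factor. The two factors and their multiplicative combination are illustrative assumptions, not the paper's exact rule.

        def emission_rate(base_rate: float,
                          free_buffer_fraction: float,
                          congestion_level: float) -> float:
            """Adapt the gossip emission rate to local and global conditions.

            free_buffer_fraction: 0..1, share of this node's buffer still
                                  free (a proxy for available resources).
            congestion_level:     0..1, estimate of system-wide congestion.
            """
            local = free_buffer_fraction       # slow down when low on resources
            global_ = 1.0 - congestion_level   # slow down under congestion
            return base_rate * local * global_

        # Example: with 40% of the buffer free during 50% congestion, the
        # node emits at 10 * 0.4 * 0.5 = 2 messages per round.
        rate = emission_rate(base_rate=10.0, free_buffer_fraction=0.4,
                             congestion_level=0.5)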

    Distributed Slicing in Dynamic Systems

    Peer-to-peer (P2P) systems are moving from application-specific architectures to a generic, service-oriented design philosophy. This raises interesting problems in connection with providing useful P2P middleware services capable of dealing with resource assignment and management in a large-scale, heterogeneous, and unreliable environment. The slicing service has been proposed to allow for an automatic partitioning of P2P networks into groups (slices) that represent a controllable amount of some resource and that are also relatively homogeneous with respect to that resource. In this paper we propose two gossip-based algorithms to solve the distributed slicing problem. The first algorithm speeds up an existing approach based on sorting a set of uniform random numbers. The second algorithm statistically approximates the rank of nodes in the ordering. The scalability, efficiency, and resilience to dynamics of both algorithms rely on their gossip-based models. These algorithms are shown to be viable both theoretically and experimentally.
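
    As a rough illustration of the second, rank-approximating flavor: each node can estimate its rank in the attribute ordering from a stream of random pairwise gossip comparisons, and map that rank to a slice. This is a simplified sketch, not the paper's exact protocol.

        import random

        class Node:
            """Estimates its slice by gossiping attribute comparisons."""
            def __init__(self, attribute: float):
                self.attribute = attribute  # the resource being sliced on
                self.seen = 0               # comparisons performed so far
                self.smaller = 0            # sampled peers ranked below us

            def gossip(self, peer: "Node") -> None:
                self.seen += 1
                if peer.attribute < self.attribute:
                    self.smaller += 1

            def slice_of(self, n_slices: int) -> int:
                """Approximate normalized rank -> slice index."""
                rank = self.smaller / self.seen if self.seen else 0.0
                return min(int(rank * n_slices), n_slices - 1)

        nodes = [Node(random.random()) for _ in range(100)]
        for _ in range(50):  # a few gossip rounds
            for node in nodes:
                node.gossip(random.choice(nodes))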

    Dynamics of Rumor Spreading in Complex Networks

    We derive the mean-field equations characterizing the dynamics of a rumor process that takes place on top of complex heterogeneous networks. These equations are solved numerically by means of a stochastic approach. First, we present analytical and Monte Carlo calculations for homogeneous networks and compare the results with those obtained by the numerical method. Then, we study the spreading process in detail for random scale-free networks. The time profiles of several quantities are numerically computed, which allows us to distinguish among different variants of rumor-spreading algorithms. Our conclusions are directed to possible applications in replicated database maintenance, peer-to-peer communication networks, and social spreading phenomena.
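
    For reference, rumor models of this kind are usually written in terms of the densities of ignorants i(t), spreaders s(t), and stiflers r(t). On a homogeneous network of degree k, a standard mean-field formulation (with spreading rate \lambda and stifling rate \alpha; the paper's exact parameterization may differ) reads:

        \begin{aligned}
        \frac{di(t)}{dt} &= -\lambda k\, i(t)\, s(t),\\
        \frac{ds(t)}{dt} &= \lambda k\, i(t)\, s(t) - \alpha k\, s(t)\bigl(s(t) + r(t)\bigr),\\
        \frac{dr(t)}{dt} &= \alpha k\, s(t)\bigl(s(t) + r(t)\bigr).
        \end{aligned}

    A spreader converts the ignorants it meets at rate \lambda and turns into a stifler when it contacts someone who already knows the rumor.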

    NEEM: network-friendly epidemic multicast

    Epidemic, or probabilistic, multicast protocols have emerged as a viable mechanism to circumvent the scalability problems of reliable multicast protocols. However, most existing epidemic approaches use connectionless transport protocols to exchange messages and rely on the intrinsic robustness of the epidemic dissemination to mask network omissions. Unfortunately, such an approach is not network-friendly, since the epidemic protocol makes no effort to reduce the load imposed on the network when the system is congested. In this paper, we propose a novel epidemic protocol whose main characteristic is to be network-friendly. This property is achieved by relying on connection-oriented transport protocols, such as TCP/IP, to support the communication among peers. Since messages accumulate at the border of the network during congestion, the protocol uses an innovative buffer management scheme that combines different selection techniques to discard messages upon overflow. This technique improves the quality of the information delivered to the application during periods of network congestion. The protocol has been implemented, and the benefits of the approach are illustrated using a combination of experimental and simulation results.
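
    A minimal sketch of a purge-on-overflow buffer in the spirit of the one described, combining selection criteria such as message age and how often a message has already been relayed. The specific policy mix is an assumption for illustration, not NEEM's actual scheme.

        from dataclasses import dataclass

        @dataclass
        class Message:
            payload: bytes
            age: int = 0   # rounds since reception
            hops: int = 0  # times already relayed

        class EpidemicBuffer:
            """Bounded buffer that discards the least valuable messages."""
            def __init__(self, capacity: int):
                self.capacity = capacity
                self.items: list[Message] = []

            def offer(self, msg: Message) -> None:
                self.items.append(msg)
                if len(self.items) > self.capacity:
                    # Combine selection criteria: keep young messages that
                    # have been relayed few times (they still need help).
                    self.items.sort(key=lambda m: (m.hops, m.age))
                    del self.items[self.capacity:]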