2,650 research outputs found

    Overlay networks monitoring

    Get PDF
    The phenomenal growth of the Internet and its entry into many aspects of daily life has led to a great dependency on its services. Multimedia and content distribution applications (e.g., video streaming, online gaming, VoIP) require Quality of Service (QoS) guarantees in terms of bandwidth, delay, loss, and jitter to maintain a certain level of performance. Moreover, E-commerce applications and retail websites are faced with increasing demand for better throughput and response time performance. The most practical way to realize such applications is through the use of overlay networks, which are logical networks that implement service and resource management functionalities at the application layer. Overlays offer better deployability, scalability, security, and resiliency properties than network layer based implementation of services. Network monitoring and routing are among the most important issues in the design and operation of overlay networks. Accurate monitoring of QoS parameters is a challenging problem due to: (i) unbounded link stress in the underlying IP network, and (ii) the conflict in measurements caused by spatial and temporal overlap among measurement tasks. In this context, the focus of this dissertation is on the design and evaluation of efficient QoS monitoring and fault location algorithms using overlay networks. First, the issue of monitoring accuracy provided by multiple concurrent active measurements is studied on a large-scale overlay test-bed (PlanetLab), the factors affecting the accuracy are identified, and the measurement conflict problem is introduced. Then, the problem of conducting conflict-free measurements is formulated as a scheduling problem of real-time tasks, its complexity is proven to be NP-hard, and efficient heuristic algorithms for the problem are proposed. Second, an algorithm for minimizing monitoring overhead while controlling the IP link stress is proposed. Finally, the use of overlay monitoring to locate IP links\u27 faults is investigated. Specifically, the problem of designing an overlay network for verifying the location of IP links\u27 faults, under cost and link stress constraints, is formulated as an integer generalized flow problem, and its complexity is proven to be NP-hard. An optimal polynomial time algorithm for the relaxed problem (relaxed link stress constraints) is proposed. A combination of simulation and experimental studies using real-life measurement tools and Internet topologies of major ISP networks is conducted to evaluate the proposed algorithms. The studies show that the proposed algorithms significantly improve the accuracy and link stress of overlay monitoring, while incurring low overheads. The evaluation of fault location algorithms show that fast and highly accurate verification of faults can be achieved using overlay monitoring. In conclusion, the holistic view taken and the solutions developed for network monitoring provide a comprehensive framework for the design, operation, and evolution of overlay networks

    Applying Prolog to Develop Distributed Systems

    Get PDF
    Development of distributed systems is a difficult task. Declarative programming techniques hold a promising potential for effectively supporting programmer in this challenge. While Datalog-based languages have been actively explored for programming distributed systems, Prolog received relatively little attention in this application area so far. In this paper we present a Prolog-based programming system, called DAHL, for the declarative development of distributed systems. DAHL extends Prolog with an event-driven control mechanism and built-in networking procedures. Our experimental evaluation using a distributed hash-table data structure, a protocol for achieving Byzantine fault tolerance, and a distributed software model checker - all implemented in DAHL - indicates the viability of the approach

    Tiny Groups Tackle Byzantine Adversaries

    Full text link
    A popular technique for tolerating malicious faults in open distributed systems is to establish small groups of participants, each of which has a non-faulty majority. These groups are used as building blocks to design attack-resistant algorithms. Despite over a decade of active research, current constructions require group sizes of O(logn)O(\log n), where nn is the number of participants in the system. This group size is important since communication and state costs scale polynomially with this parameter. Given the stubbornness of this logarithmic barrier, a natural question is whether better bounds are possible. Here, we consider an attacker that controls a constant fraction of the total computational resources in the system. By leveraging proof-of-work (PoW), we demonstrate how to reduce the group size exponentially to O(loglogn)O(\log\log n) while maintaining strong security guarantees. This reduction in group size yields a significant improvement in communication and state costs.Comment: This work is supported by the National Science Foundation grant CCF 1613772 and a C Spire Research Gif

    Network-provider-independent overlays for resilience and quality of service.

    Get PDF
    PhDOverlay networks are viewed as one of the solutions addressing the inefficiency and slow evolution of the Internet and have been the subject of significant research. Most existing overlays providing resilience and/or Quality of Service (QoS) need cooperation among different network providers, but an inter-trust issue arises and cannot be easily solved. In this thesis, we mainly focus on network-provider-independent overlays and investigate their performance in providing two different types of service. Specifically, this thesis addresses the following problems: Provider-independent overlay architecture: A provider-independent overlay framework named Resilient Overlay for Mission-Critical Applications (ROMCA) is proposed. We elaborate its structure including component composition and functions and also provide several operational examples. Overlay topology construction for providing resilience service: We investigate the topology design problem of provider-independent overlays aiming to provide resilience service. To be more specific, based on the ROMCA framework, we formulate this problem mathematically and prove its NP-hardness. Three heuristics are proposed and extensive simulations are carried out to verify their effectiveness. Application mapping with resilience and QoS guarantees: Assuming application mapping is the targeted service for ROMCA, we formulate this problem as an Integer Linear Program (ILP). Moreover, a simple but effective heuristic is proposed to address this issue in a time-efficient manner. Simulations with both synthetic and real networks prove the superiority of both solutions over existing ones. Substrate topology information availability and the impact of its accuracy on overlay performance: Based on our survey that summarizes the methodologies available for inferring the selective substrate topology formed among a group of nodes through active probing, we find that such information is usually inaccurate and additional mechanisms are needed to secure a better inferred topology. Therefore, we examine the impact of inferred substrate topology accuracy on overlay performance given only inferred substrate topology information

    LHView: Location Aware Hybrid Partial View

    Get PDF
    The rise of the Cloud creates enormous business opportunities for companies to provide global services, which requires applications supporting the operation of those services to scale while minimizing maintenance costs, either due to unnecessary allocation of resources or due to excessive human supervision and administration. Solutions designed to support such systems have tackled fundamental challenges from individual component failure to transient network partitions. A fundamental aspect that all scalable large systems have to deal with is the membership of the system, i.e, tracking the active components that compose the system. Most systems rely on membership management protocols that operate at the application level, many times exposing the interface of a logical overlay network, that should guarantee high scalability, efficiency, and robustness. Although these protocols are capable of repairing the overlay in face of large numbers of individual components faults, when scaling to global settings (i.e, geo-distributed scenarios), this robustness is a double edged-sword because it is extremely complex for a node in a system to distinguish between a set of simultaneously node failures and a (transient) network partition. Thus the occurrence of a network partition creates isolated sub-sets of nodes incapable of reconnecting even after the recovery from the partition. This work address this challenges by proposing a novel datacenter-aware membership protocol to tolerate network partitions by applying existing overlay management techniques and classification techniques that may allow the system to efficiently cope with such events without compromising the remaining properties of the overlay network. Furthermore, we strive to achieve these goals with a solution that requires minimal human intervention
    corecore