
    On Lifetime Maximization and Fault Tolerance Measurement in Wireless Ad Hoc and Sensor Networks

    In this dissertation we study two important issues in wireless ad hoc and sensor networks: lifetime maximization and fault tolerance. The first part investigates how to maximally extend the lifetime of randomly deployed wireless sensor networks under limited resource constraints, and the second part focuses on how to measure the fault tolerance and attack resilience of wireless ad hoc networks. We take the approach of adaptive traffic distribution and power control to maximize the lifetime of randomly deployed wireless sensor networks. After abstracting the network into multiple layers, we model the lifetime maximization problem as a linear program. We study both the scenario in which receiving/processing power consumption is ignored and the one in which it is included. In both cases, we make the same observation: for each packet to be sent, the sender should either transmit it using the transmission range with the highest energy efficiency per bit per meter, or transmit it directly to the sink. We then prove that this holds in general. Finally, we propose a fully distributed algorithm to adaptively split traffic and adjust transmission power. Extensive simulation studies demonstrate that the network lifetime can be dramatically extended by applying the proposed approach in various scenarios. Besides studying the lifetime extension problem for fully deployed wireless sensor networks, we also investigate how to extend the network lifetime via joint relay node deployment and adaptive traffic distribution. We formulate this problem as a mixed-integer nonlinear program, which is NP-hard in general. We then propose a greedy heuristic to attack it. Both numerical and simulation results show that significant network lifetime extension can be achieved. In the second part of this dissertation, we investigate how to measure the fault tolerance and attack resilience of randomly deployed wireless ad hoc networks. 
We first propose two new metrics to measure average-case network service quality: average pairwise connectivity and pairwise connected ratio. We then propose a fault tolerance and attack resilience metric, alpha-p-resilience: a network is alpha-p-resilient if at least an alpha fraction of node pairs remains connected as long as no more than a p fraction of nodes is removed from the network.
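
The alpha-p-resilience definition above can be illustrated with a small brute-force check. This is only a sketch under stated assumptions, not the dissertation's algorithm: it assumes the connected ratio is taken over the surviving node pairs, that removals are worst-case rather than random, and that at most floor(p * n) nodes are removed. The function names are hypothetical.

```python
from itertools import combinations
from math import floor


def pairwise_connected_ratio(nodes, edges):
    """Fraction of unordered node pairs that are joined by some path."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 1.0  # assumption: a network with no pairs counts as fully connected
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if u in parent and v in parent:
            parent[find(u)] = find(v)

    # Count the size of each connected component.
    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1

    connected_pairs = sum(s * (s - 1) // 2 for s in sizes.values())
    return connected_pairs / (n * (n - 1) // 2)


def is_alpha_p_resilient(nodes, edges, alpha, p):
    """Check every removal of at most floor(p * n) nodes (exponential cost)."""
    nodes = list(nodes)
    budget = floor(p * len(nodes) + 1e-9)  # small epsilon guards float rounding
    for k in range(budget + 1):
        for removed in combinations(nodes, k):
            gone = set(removed)
            survivors = [v for v in nodes if v not in gone]
            kept = [(u, v) for u, v in edges if u not in gone and v not in gone]
            if pairwise_connected_ratio(survivors, kept) < alpha:
                return False
    return True


# A 5-node ring: removing any single node leaves a connected path.
ring_nodes = range(5)
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(is_alpha_p_resilient(ring_nodes, ring_edges, alpha=1.0, p=0.2))
print(is_alpha_p_resilient(ring_nodes, ring_edges, alpha=0.5, p=0.4))
```

The exhaustive search is only practical for very small networks; for realistic sizes one would presumably sample removal sets or use targeted attack orders instead.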

    Resource efficient redundancy using quorum-based cycle routing in optical networks

    In this paper we propose a cycle redundancy technique that provides optical networks with almost fault-tolerant point-to-point and multipoint-to-multipoint communications. More importantly, the technique is shown to approximately halve the necessary light-trail resources in the network while maintaining the fault tolerance and dependability expected from cycle-based routing. For efficiency and distributed control, it is common in distributed systems and algorithms to group nodes into intersecting sets referred to as quorum sets. Optimal communication quorum sets forming optical cycles based on light-trails have been shown to flexibly and efficiently route both point-to-point and multipoint-to-multipoint traffic requests. Cycle routing techniques commonly use pairs of cycles to achieve both routing and fault tolerance, which consumes substantial resources and creates the potential for underutilization. Instead, we intentionally utilize redundancy within the quorum cycles for fault tolerance, such that almost every point-to-point communication occurs in more than one cycle. The result is a set of cycles with 96.60% - 99.37% fault coverage, while using 42.9% - 47.18% fewer resources. Comment: 17th International Conference on Transparent Optical Networks (ICTON), 5-9 July 2015. arXiv admin note: substantial text overlap with arXiv:1608.05172, arXiv:1608.0516
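
To make the idea of intersecting quorum sets with built-in redundancy concrete, the sketch below builds cyclic quorums by rotating a base set modulo the node count, checks that every two quorums intersect, and measures what fraction of node pairs appears together in more than one quorum. This is an illustration under assumptions, not the paper's construction: the base sets, the 7-node network, and all function names are hypothetical, and mapping quorums onto optical cycles is not modeled.

```python
from itertools import combinations


def cyclic_quorums(base, n):
    """Rotate a base set through all n offsets modulo n."""
    return [frozenset((i + b) % n for b in base) for i in range(n)]


def all_pairs_intersect(quorums):
    """Quorum property: every two quorums share at least one node."""
    return all(a & b for a, b in combinations(quorums, 2))


def redundant_pair_fraction(quorums, n):
    """Fraction of unordered node pairs contained in two or more quorums."""
    counts = {}
    for q in quorums:
        for pair in combinations(sorted(q), 2):
            counts[pair] = counts.get(pair, 0) + 1
    total_pairs = n * (n - 1) // 2
    redundant = sum(1 for c in counts.values() if c >= 2)
    return redundant / total_pairs


# Base {0, 1, 3} mod 7: every node pair shares exactly one quorum.
minimal = cyclic_quorums((0, 1, 3), 7)
# Base {0, 1, 2, 4} mod 7: every node pair shares exactly two quorums.
redundant = cyclic_quorums((0, 1, 2, 4), 7)

print(all_pairs_intersect(minimal), redundant_pair_fraction(minimal, 7))
print(all_pairs_intersect(redundant), redundant_pair_fraction(redundant, 7))
```

In this toy setting, a pair that sits in two quorums could still communicate if one of them failed, which is the kind of redundancy the abstract describes exploiting in place of dedicated backup cycles.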

    Edge-Fault Tolerance of Hypercube-like Networks

    This paper considers a generalized measure $\lambda_s^{(h)}$ of fault tolerance in a hypercube-like graph $G_n$, a class that contains several well-known interconnection networks such as hypercubes, varietal hypercubes, twisted cubes, crossed cubes and Möbius cubes, and proves that $\lambda_s^{(h)}(G_n) = 2^h(n-h)$ for any $h$ with $0 \leqslant h \leqslant n-1$ by induction on $n$ and a new technique. This result shows that at least $2^h(n-h)$ edges of $G_n$ have to be removed to get a disconnected graph that contains no vertices of degree less than $h$. Compared with previous results, this result theoretically strengthens the fault tolerance of the above-mentioned networks.
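
As a quick worked instance of the stated formula (an illustration, not taken from the paper), substituting small values of $h$ for $n = 4$ gives:

$$
\begin{aligned}
h = 0 &: \quad \lambda_s^{(0)}(G_4) = 2^0 (4 - 0) = 4, \\
h = 1 &: \quad \lambda_s^{(1)}(G_4) = 2^1 (4 - 1) = 6, \\
h = 2 &: \quad \lambda_s^{(2)}(G_4) = 2^2 (4 - 2) = 8, \\
h = 3 &: \quad \lambda_s^{(3)}(G_4) = 2^3 (4 - 3) = 8.
\end{aligned}
$$

The $h = 0$ case reduces to $\lambda_s^{(0)}(G_n) = n$, which agrees with the ordinary edge connectivity of the $n$-dimensional hypercube.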

    What does fault tolerant Deep Learning need from MPI?

    Deep Learning (DL) algorithms have become the de facto Machine Learning (ML) algorithms for large scale data analysis. DL algorithms are computationally expensive: even distributed DL implementations which use MPI require days of training (model learning) time on commonly studied datasets. Long running DL applications become susceptible to faults, requiring development of a fault tolerant system infrastructure in addition to fault tolerant DL algorithms. This raises an important question: What is needed from MPI for designing fault tolerant DL implementations? In this paper, we address this problem for permanent faults. We motivate the need for a fault tolerant MPI specification by an in-depth consideration of recent innovations in DL algorithms and their properties, which drive the need for specific fault tolerance features. We present an in-depth discussion on the suitability of different parallelism types (model, data and hybrid); a need (or lack thereof) for checkpointing of any critical data structures; and most importantly, consideration for several fault tolerance proposals (user-level fault mitigation (ULFM), Reinit) in MPI and their applicability to fault tolerant DL implementations. We leverage a distributed memory implementation of Caffe, currently available under the Machine Learning Toolkit for Extreme Scale (MaTEx). We implement our approaches by extending MaTEx-Caffe to use a ULFM-based implementation. Our evaluation using the ImageNet dataset and the AlexNet and GoogLeNet neural network topologies demonstrates the effectiveness of the proposed fault tolerant DL implementation using OpenMPI based ULFM.

    A method for analyzing the performance aspects of the fault-tolerance mechanisms in FDDI

    The ability of error recovery mechanisms to make the Fiber Distributed Data Interface (FDDI) satisfy real-time performance constraints in the presence of errors is analyzed. A complicating factor in these analyses is the rarity of the error occurrences, which makes direct simulation unattractive. Therefore, a fast simulation technique called injection simulation was developed, which makes it possible to analyze the performance of FDDI, including its fault tolerance behavior. The implementation of injection simulation for polling models of FDDI is discussed, along with simulation results.