6 research outputs found

    Investigation of the robustness of star graph networks

    The star interconnection network is known as an attractive alternative to the n-cube for interconnecting a large number of processors. It possesses many properties that are desirable when building an interconnection topology for a parallel and distributed system, such as vertex/edge symmetry, recursiveness, sublogarithmic degree and diameter, and maximal fault tolerance. Because the star network has potential use in critical applications, investigating the robustness of its architecture is essential. In this study, three different reliability measures are proposed to investigate that robustness. First, a constrained two-terminal reliability measure, referred to as Distance Reliability (DR), is introduced between the source node u and the destination node I at the shortest distance in an n-dimensional star network, S_n. A combinatorial analysis of DR, especially for u having a single cycle, is performed under different failure models (node, link, and combined node/link failure). Lower bounds on a special case of DR, antipode reliability, are derived and compared with the n-cube; they show the star network to be more fault-tolerant than the n-cube. The degradation of a container in S_n having at least one operational optimal path between u and I is also examined to measure system effectiveness in the presence of failures under the different failure models. The values of mean time to failure (MTTF) to each transition state are calculated and compared with those of similar-size containers in the n-cube. Finally, an upper bound under the probability fault model and an approximation under the fixed partitioning approach on the (n-1)-star reliability are derived, and proved to be similarly accurate and close to the simulation results. Conservative comparisons between similar-size star networks and n-cubes show that the star network is more robust than the n-cube in terms of (n-1)-network reliability.
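
The topology this abstract refers to can be illustrated with a short sketch. It assumes the standard definition of S_n from the star graph literature (nodes are permutations of n symbols; each edge swaps the first symbol with the i-th), which the abstract itself does not spell out. The function names are hypothetical. The sketch computes the shortest distance from every node u to the identity node I by breadth-first search, and compares it with the well-known cycle-structure expression: with c cycles of length at least 2 covering m symbols, the distance is c + m if the first symbol is already in place and c + m - 2 otherwise.

```python
from collections import deque


def neighbors(perm):
    """Yield the n-1 neighbors of a node in S_n (swap first symbol with the i-th)."""
    for i in range(1, len(perm)):
        p = list(perm)
        p[0], p[i] = p[i], p[0]
        yield tuple(p)


def bfs_distances(n):
    """Shortest distance from every node of S_n to the identity node I."""
    identity = tuple(range(1, n + 1))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        node = queue.popleft()
        for nxt in neighbors(node):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def cycle_distance(perm):
    """Distance to I from the cycle structure of perm (assumed standard formula)."""
    n = len(perm)
    seen = [False] * n
    cycles = 0   # c: number of cycles of length >= 2
    symbols = 0  # m: number of symbols in those cycles
    for start in range(n):
        if seen[start]:
            continue
        if perm[start] == start + 1:  # fixed point, not counted
            seen[start] = True
            continue
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j] - 1  # follow the symbol to its home position
            symbols += 1
        cycles += 1
    if perm[0] == 1:
        return cycles + symbols
    return cycles + symbols - 2
```

For n = 4, `bfs_distances(4)` visits all 24 nodes and the largest distance is 4, matching the known diameter floor(3(n-1)/2). The sketch covers only fault-free distances; the reliability measures studied in the paper are not modeled here.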

    Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems

    Monitoring and troubleshooting distributed systems is notoriously difficult; potential problems are complex, varied, and unpredictable. The monitoring and diagnosis tools commonly used today (logs, counters, and metrics) have two important limitations: what gets recorded is defined a priori, and the information is recorded in a component- or machine-centric way, making it extremely hard to correlate events that cross these boundaries. This paper presents Pivot Tracing, a monitoring framework for distributed systems that addresses both limitations by combining dynamic instrumentation with a novel relational operator: the happened-before join. Pivot Tracing gives users, at runtime, the ability to define arbitrary metrics at one point of the system, while being able to select, filter, and group by events meaningful at other parts of the system, even when crossing component or machine boundaries. We have implemented a prototype of Pivot Tracing for Java-based systems and evaluate it on a heterogeneous Hadoop cluster comprising HDFS, HBase, MapReduce, and YARN. We show that Pivot Tracing can effectively identify a diverse range of root causes such as software bugs, misconfiguration, and limping hardware. We show that Pivot Tracing is dynamic, extensible, and enables cross-tier analysis between inter-operating applications, with low execution overhead.
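
The idea of grouping a metric measured at one point by an event seen earlier in the same request can be sketched as a toy model. This is not the Pivot Tracing prototype or its API. Every name here (`Request`, `client_tracepoint`, `disk_read_tracepoint`, the client labels) is hypothetical, and carrying upstream tuples along with the request is an assumption of the sketch, since the abstract does not describe the mechanism.

```python
from collections import defaultdict


class Request:
    """Hypothetical request context that travels across component boundaries."""

    def __init__(self):
        # Tuples emitted by earlier tracepoints on this request's path.
        self.carried = []


def client_tracepoint(request, client_name):
    """Upstream event: record which client issued the request."""
    request.carried.append({"client": client_name})


def disk_read_tracepoint(request, nbytes, totals):
    """Downstream event: join the local measurement with upstream tuples.

    Only tuples carried by this request are joined, so every match
    happened before the read on the same causal path.
    """
    for upstream in request.carried:
        totals[upstream["client"]] += nbytes


def run_example():
    """Sum bytes read downstream, grouped by the upstream client name."""
    totals = defaultdict(int)
    workload = [("client_a", [4, 4]), ("client_b", [64])]
    for client, read_sizes in workload:
        request = Request()
        client_tracepoint(request, client)
        for size in read_sizes:
            disk_read_tracepoint(request, size, totals)
    # A read with no preceding client event joins with nothing.
    disk_read_tracepoint(Request(), 100, totals)
    return dict(totals)
```

Calling `run_example()` groups the downstream byte counts by the upstream client, while the read that has no causally preceding client event contributes to no group.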

    Second Annual Workshop on Space Operations Automation and Robotics (SOAR 1988)

    Papers presented at the Second Annual Workshop on Space Operations Automation and Robotics (SOAR '88), hosted by Wright State University at Dayton, Ohio, on July 20, 21, 22, and 23, 1988, are documented herein. During the 4 days, approximately 100 technical papers were presented by experts from NASA, the USAF, universities, and technical companies. Panel discussions on Human Factors, Artificial Intelligence, Robotics, and Space Systems were held but are not documented herein. Technical topics addressed included knowledge-based systems, human factors, and robotics.