2,730 research outputs found

    Routing on the Channel Dependency Graph:: A New Approach to Deadlock-Free, Destination-Based, High-Performance Routing for Lossless Interconnection Networks

    Get PDF
    In the pursuit for ever-increasing compute power, and with Moore's law slowly coming to an end, high-performance computing started to scale-out to larger systems. Alongside the increasing system size, the interconnection network is growing to accommodate and connect tens of thousands of compute nodes. These networks have a large influence on total cost, application performance, energy consumption, and overall system efficiency of the supercomputer. Unfortunately, state-of-the-art routing algorithms, which define the packet paths through the network, do not utilize this important resource efficiently. Topology-aware routing algorithms become increasingly inapplicable, due to irregular topologies, which either are irregular by design, or most often a result of hardware failures. Exchanging faulty network components potentially requires whole system downtime further increasing the cost of the failure. This management approach becomes more and more impractical due to the scale of today's networks and the accompanying steady decrease of the mean time between failures. Alternative methods of operating and maintaining these high-performance interconnects, both in terms of hardware- and software-management, are necessary to mitigate negative effects experienced by scientific applications executed on the supercomputer. However, existing topology-agnostic routing algorithms either suffer from poor load balancing or are not bounded in the number of virtual channels needed to resolve deadlocks in the routing tables. Using the fail-in-place strategy, a well-established method for storage systems to repair only critical component failures, is a feasible solution for current and future HPC interconnects as well as other large-scale installations such as data center networks. Although, an appropriate combination of topology and routing algorithm is required to minimize the throughput degradation for the entire system. This thesis contributes a network simulation toolchain to facilitate the process of finding a suitable combination, either during system design or while it is in operation. On top of this foundation, a key contribution is a novel scheduling-aware routing, which reduces fault-induced throughput degradation while improving overall network utilization. The scheduling-aware routing performs frequent property preserving routing updates to optimize the path balancing for simultaneously running batch jobs. The increased deployment of lossless interconnection networks, in conjunction with fail-in-place modes of operation and topology-agnostic, scheduling-aware routing algorithms, necessitates new solutions to solve the routing-deadlock problem. Therefore, this thesis further advances the state-of-the-art by introducing a novel concept of routing on the channel dependency graph, which allows the design of an universally applicable destination-based routing capable of optimizing the path balancing without exceeding a given number of virtual channels, which are a common hardware limitation. This disruptive innovation enables implicit deadlock-avoidance during path calculation, instead of solving both problems separately as all previous solutions

    Survey of Inter-satellite Communication for Small Satellite Systems: Physical Layer to Network Layer View

    Get PDF
    Small satellite systems enable whole new class of missions for navigation, communications, remote sensing and scientific research for both civilian and military purposes. As individual spacecraft are limited by the size, mass and power constraints, mass-produced small satellites in large constellations or clusters could be useful in many science missions such as gravity mapping, tracking of forest fires, finding water resources, etc. Constellation of satellites provide improved spatial and temporal resolution of the target. Small satellite constellations contribute innovative applications by replacing a single asset with several very capable spacecraft which opens the door to new applications. With increasing levels of autonomy, there will be a need for remote communication networks to enable communication between spacecraft. These space based networks will need to configure and maintain dynamic routes, manage intermediate nodes, and reconfigure themselves to achieve mission objectives. Hence, inter-satellite communication is a key aspect when satellites fly in formation. In this paper, we present the various researches being conducted in the small satellite community for implementing inter-satellite communications based on the Open System Interconnection (OSI) model. This paper also reviews the various design parameters applicable to the first three layers of the OSI model, i.e., physical, data link and network layer. Based on the survey, we also present a comprehensive list of design parameters useful for achieving inter-satellite communications for multiple small satellite missions. Specific topics include proposed solutions for some of the challenges faced by small satellite systems, enabling operations using a network of small satellites, and some examples of small satellite missions involving formation flying aspects.Comment: 51 pages, 21 Figures, 11 Tables, accepted in IEEE Communications Surveys and Tutorial

    Towards low cost prototyping of mobile opportunistic disconnection tolerant networks and systems

    Get PDF
    Fast emerging mobile edge computing, mobile clouds, Internet of Things (IoT) and cyber physical systems require many novel realistic real time multi-layer algorithms for a wide range of domains, such as intelligent content provision and processing, smart transport, smart manufacturing systems and mobile end user applications. This paper proposes a low cost open source platform, MODiToNeS, which uses commodity hardware to support prototyping and testing of fully distributed multi-layer complex algorithms over real world (or pseudo real) traces. MODiToNeS platform is generic and comprises multiple interfaces that allow real time topology and mobility control, deployment and analysis of different self-organised and self-adaptive routing algorithms, real time content processing, and real time environment sensing with predictive analytics. Our platform also allows rich interactivity with the user. We show deployment and analysis of two vastly different complex networking systems: fault and disconnection aware smart manufacturing sensor network and cognitive privacy for personal clouds. We show that our platform design can integrate both contexts transparently and organically and allows a wide range of analysis
    • …
    corecore