5 research outputs found

    Topological Characterization of Hamming and Dragonfly Networks and its Implications on Routing

    Get PDF
    Current HPC and datacenter networks rely on large-radix routers. Hamming graphs (Cartesian products of complete graphs) and dragonflies (two-level direct networks with nodes organized in groups) are some direct topologies proposed for such networks. The original definition of the dragonfly topology is very loose, with several degrees of freedom such as the inter- and intra-group topology, the specific global connectivity and the number of parallel links between groups (or trunking level). This work provides a comprehensive analysis of the topological properties of the dragonfly network, providing balancing conditions for network dimensioning, as well as introducing and classifying several alternatives for the global connectivity and trunking level. From a topological study of the network, it is noted that a Hamming graph can be seen as a canonical dragonfly topology with a large level of trunking. Based on this observation and by carefully selecting the global connectivity, the Dimension Order Routing (DOR) mechanism safely used in Hamming graphs is adapted to dragonfly networks with trunking. The resulting routing algorithms approximate the performance of minimal, non-minimal and adaptive routings typically used in dragonflies, but without requiring virtual channels to avoid packet deadlock, thus allowing for lower-cost router implementations. This is obtained by selecting properly the link to route between groups, based on a graph coloring of the network routers. Evaluations show that the proposed mechanisms are competitive to traditional solutions when using the same number of virtual channels, and enable for simpler implementations with lower cost. Finally, multilevel dragonflies are discussed, considering how the proposed mechanisms could be adapted to them

    Network unfairness in dragonfly topologies

    Get PDF
    Dragonfly networks arrange network routers in a two-level hierarchy, providing a competitive cost-performance solution for large systems. Non-minimal adaptive routing (adaptive misrouting) is employed to fully exploit the path diversity and increase the performance under adversarial traffic patterns. Network fairness issues arise in the dragonfly for several combinations of traffic pattern, global misrouting and traffic prioritization policy. Such unfairness prevents a balanced use of the resources across the network nodes and degrades severely the performance of any application running on an affected node. This paper reviews the main causes behind network unfairness in dragonflies, including a new adversarial traffic pattern which can easily occur in actual systems and congests all the global output links of a single router. A solution for the observed unfairness is evaluated using age-based arbitration. Results show that age-based arbitration mitigates fairness issues, especially when using in-transit adaptive routing. However, when using source adaptive routing, the saturation of the new traffic pattern interferes with the mechanisms employed to detect remote congestion, and the problem grows with the network size. This makes source adaptive routing in dragonflies based on remote notifications prone to reduced performance, even when using age-based arbitration.Peer ReviewedPostprint (author's final draft

    Scalable and Accurate Memory System Simulation

    Get PDF
    Memory systems today possess more complexity than ever. On one hand, main memory technology has a much more diverse portfolio. Other than the mainstream DDR DRAMs, a variety of DRAM protocols have been proliferating in certain domains. Non-Volatile Memory(NVM) also finally has commodity main memory products, introducing more heterogeneity to the main memory media. On the other hand, the scale of computer systems, from personal computers, server computers, to high performance computing systems, has been growing in response to increasing computing demand. Memory systems have to be able to keep scaling to avoid bottlenecking the whole system. However, current memory simulation works cannot accurately or efficiently model these developments, making it hard for researchers and developers to evaluate or to optimize designs for memory systems. In this study, we attack these issues from multiple angles. First, we develop a fast and validated cycle accurate main memory simulator that can accurately model almost all existing DRAM protocols and some NVM protocols, and it can be easily extended to support upcoming protocols as well. We showcase this simulator by conducting a thorough characterization over existing DRAM protocols and provide insights on memory system designs. Secondly, to efficiently simulate the increasingly paralleled memory systems, we propose a lax synchronization model that allows efficient parallel DRAM simulation. We build the first ever practical parallel DRAM simulator that can speedup the simulation by up to a factor of three with single digit percentage loss in accuracy comparing to cycle accurate simulations. We also developed mitigation schemes to further improve the accuracy with no additional performance cost. Moreover, we discuss the limitation of cycle accurate models, and explore the possibility of alternative modeling of DRAM. We propose a novel approach that converts DRAM timing simulation into a classification problem. By doing so we can make predictions on DRAM latency for each memory request upon first sight, which makes it compatible for scalable architecture simulation frameworks. We developed prototypes based on various machine learning models and they demonstrate excellent performance and accuracy results that makes them a promising alternative to cycle accurate models. Finally, for large scale memory systems where data movement is often the performance limiting factor, we propose a set of interconnect topologies and implement them in a parallel discrete event simulation framework. We evaluate the proposed topologies through simulation and prove that their scalability and performance exceeds existing topologies with increasing system size or workloads

    Optimization for Decision Making II

    Get PDF
    In the current context of the electronic governance of society, both administrations and citizens are demanding the greater participation of all the actors involved in the decision-making process relative to the governance of society. This book presents collective works published in the recent Special Issue (SI) entitled “Optimization for Decision Making II”. These works give an appropriate response to the new challenges raised, the decision-making process can be done by applying different methods and tools, as well as using different objectives. In real-life problems, the formulation of decision-making problems and the application of optimization techniques to support decisions are particularly complex and a wide range of optimization techniques and methodologies are used to minimize risks, improve quality in making decisions or, in general, to solve problems. In addition, a sensitivity or robustness analysis should be done to validate/analyze the influence of uncertainty regarding decision-making. This book brings together a collection of inter-/multi-disciplinary works applied to the optimization of decision making in a coherent manner
    corecore