10 research outputs found

    Hypergraph-Based Interconnection Networks for Large Multicomputers

    This thesis deals with issues pertaining to multicomputer interconnection networks, namely topology, technology, switching method, and routing algorithm. It argues that a new class of regular low-dimensional hypergraph networks, the distributed crossbar switch hypermesh (DCSH), represents a promising high-performance alternative for future large multicomputers to graph networks such as meshes, tori, and binary n-cubes, which have been widely used in current multicomputers. Channels in existing hypergraph and graph structures suffer from bandwidth limitations imposed by implementation technology. The first part of the thesis shows how the low-dimensional DCSH can use an innovative implementation scheme to alleviate this problem. The scheme relies on separating processing and communication functions into distinct physical layers in order to accommodate high wiring density and the necessary message buffering, improving performance considerably. Various mathematical models of the DCSH, validated through discrete-event simulation, are then introduced. Effects of different switching methods (e.g., wormhole routing, virtual cut-through, and message switching), routing algorithms (e.g., restricted and random), and switching element designs are investigated. Further, the impact on performance of different communication patterns, such as those exhibiting locality and hot-spots, is assessed. The remainder of the thesis compares the DCSH to other common hypergraph and graph networks under different implementation technologies, such as VLSI, multiple-chip technology, and the new layered implementation scheme. More realistic assumptions are introduced, such as pipelined bit transmission and non-zero delays through switching elements. The results show that the proposed structure has superior characteristics assuming equal implementation cost in both VLSI and multiple-chip technology, and that optimal performance is offered by the new layered implementation scheme.
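    To make the topology concrete, the sketch below (a minimal illustration assuming a plain 2-D hypermesh of size k = 8, with the distributed-crossbar hardware abstracted away) checks the property that gives the DCSH family its low diameter: each node reaches its whole row and column in one hop, and any node in two.

```python
# Toy sketch (assumptions: a plain 2-D hypermesh, k = 8): every node shares a
# hyperedge (crossbar) with its whole row and another with its whole column,
# so any destination is at most two hops away.
from itertools import product

def hypermesh_neighbours(node, k):
    r, c = node
    row = {(r, j) for j in range(k)} - {node}   # row crossbar
    col = {(i, c) for i in range(k)} - {node}   # column crossbar
    return row | col

k = 8
nodes = set(product(range(k), repeat=2))
src = (0, 0)
one_hop = hypermesh_neighbours(src, k)
two_hop = set().union(*(hypermesh_neighbours(v, k) for v in one_hop))
assert nodes <= {src} | one_hop | two_hop       # diameter 2
print(len(one_hop), "nodes in one hop,", len(nodes), "total")
```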

    Performance evaluation of distributed crossbar switch hypermesh

    The interconnection network is one of the most crucial components in any multicomputer as it greatly influences the overall system performance. Several recent studies have suggested that hypergraph networks, such as the Distributed Crossbar Switch Hypermesh (DCSH), exhibit superior topological and performance characteristics over many traditional graph networks, e.g. k-ary n-cubes. Previous work on the DCSH has focused on issues related to implementation and on performance comparisons with existing networks. These comparisons have so far been confined to deterministic routing and unicast (one-to-one) communication. Using analytical models validated through simulation experiments, this thesis extends that analysis to include adaptive routing and broadcast communication. The study concentrates on wormhole switching, which has been widely adopted in practical multicomputers thanks to its low buffering requirement and the reduced dependence of latency on distance under low traffic. Adaptive routing has recently been proposed as a means of improving network performance, but while the comparative evaluation of adaptive and deterministic routing has been widely reported in the literature, the focus has been on graph networks. The first part of this thesis deals with adaptive routing, developing an analytical model for measuring latency in the DCSH, which is used throughout the rest of the work for performance comparisons. An investigation of different routing algorithms in this network is also presented. Conventional k-ary n-cubes have been the underlying topology of contemporary multicomputers, but it is only recently that adaptive routing has been incorporated into such systems. The thesis studies the relative performance merits of the DCSH and k-ary n-cubes under an adaptive routing strategy. The analysis takes into consideration real-world factors, such as router complexity and bandwidth constraints imposed by implementation technology. However, in any network, the routing of unicast messages is not the only factor in traffic control. In many situations (for example, parallel iterative algorithms, memory update and invalidation procedures in shared memory systems, and global notification of network errors), there is a significant requirement for broadcast traffic. The DCSH, by virtue of its use of hypergraph links, can implement broadcast operations particularly efficiently. The second part of the thesis examines how DCSH and k-ary n-cube performance is affected by the presence of a broadcast traffic component. In general, these studies demonstrate that because of their relatively high diameter, k-ary n-cubes perform poorly when message lengths are short. This is consistent with earlier, more simplistic analyses which led to the proposal for the express-cube, an enhancement of the basic k-ary n-cube structure, which provides additional express channels allowing messages to bypass groups of nodes along their paths. The final part of the thesis investigates whether this "partial bypassing" can compete with the "total bypassing" capability provided inherently by the DCSH topology.

    Small-world interconnection networks for large parallel computer systems

    The use of small-world graphs as interconnection networks of multicomputers is proposed and analysed in this work. Small-world interconnection networks are constructed by adding (or modifying) edges to an underlying local graph. Graphs with a rich local structure but a large diameter are shown to be the most suitable candidates for the underlying graph. Generation models based on random and deterministic wiring processes are proposed and analysed. For the random case, basic properties such as degree, diameter, average path length and bisection width are analysed, and the results show that a fast transition from a large diameter to a small diameter occurs as the number of new edges introduced is increased. Random traffic analysis on these networks is undertaken, and it is shown that although the average latency experiences a similar reduction, networks with a small number of shortcuts have a tendency to saturate as most of the traffic flows through a small number of links. An analysis of the congestion of the networks corroborates this result and provides a way of estimating the minimum number of shortcuts required to avoid saturation. To overcome these problems, deterministic wiring is proposed and analysed. A linear feedback shift register is used to introduce shortcuts, yielding the LFSR graphs. A simple routing algorithm has been constructed for the LFSR graphs and extended with a greedy local optimisation technique. It has been shown that a small search depth gives good results and is less costly to implement than a full shortest-path algorithm. The Hilbert graph, on the other hand, provides some additional characteristics, such as support for incremental expansion, efficient layout in two-dimensional space (using two layers), and a small fixed degree of four. Small-world hypergraphs have also been studied. In particular, incomplete hypermeshes have been introduced and analysed, and it has been shown that they outperform the complete traditional implementations under a constant-pinout argument. Since it has been shown that complete hypermeshes outperform the mesh, the torus, low-dimensional m-ary d-cubes (with and without bypass channels), and multi-stage interconnection networks (when realistic decision times are accounted for and with a constant pinout), it follows that incomplete hypermeshes outperform them as well.
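    As a rough sketch of the LFSR-driven wiring idea (the taps, seed, ring size, and shortcut-placement rule here are illustrative assumptions, not the thesis's construction), the following code adds LFSR-chosen shortcuts to a ring and measures the resulting diameter:

```python
# Hypothetical construction: drive shortcut placement on a ring with a 16-bit
# Galois LFSR and watch the diameter collapse as shortcuts are added.
from collections import deque

def lfsr_stream(seed=0xACE1, taps=0xB400):
    state = seed
    while True:
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= taps                       # maximal-length 16-bit taps
        yield state

def ring_with_shortcuts(n, n_shortcuts):
    adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    gen = lfsr_stream()
    for _ in range(n_shortcuts):
        u, v = next(gen) % n, next(gen) % n
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def diameter(adj):
    def ecc(s):                                 # BFS eccentricity
        dist, q = {s: 0}, deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return max(dist.values())
    return max(ecc(s) for s in adj)

for shortcuts in (0, 8, 32, 128):
    print(shortcuts, diameter(ring_with_shortcuts(512, shortcuts)))
```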

    System level modelling and design of hypergraph based wireless system area networks for multi-computer systems

    This thesis deals with issues pertaining to wireless multicomputer interconnection networks, namely topology and Medium Access Control (MAC). It argues that a new channel assignment technique based on regular low-dimensional hypergraph networks, the dual-radio wireless hypermesh, represents a promising high-performance alternative for future wireless multicomputers to shared-communication-medium networks and ordinary wireless mesh networks, which have been widely used in current wireless systems. The focus of this work is on improving network throughput while maintaining relatively low latency. By means of a Carrier Sense Multiple Access (CSMA) based MAC protocol design, built on the desirable features of the hypermesh topology, a relatively high-performance network is introduced. Compared to the CSMA shared-communication-channel model, which is currently the de facto MAC protocol for most wireless networks, the design is shown to achieve a significant increase in network throughput with lower average latency for large numbers of communicating nodes. A SystemC model of the proposed wireless hypermesh, validated through mathematical models, is then introduced. The analysis is incorporated into a proper SystemC design methodology, which facilitates the integration of communication modelling into design modelling at the early stages of system development. Another important application of SystemC modelling techniques is to perform meaningful comparative studies of different protocols or new implementations, to determine which communication scenario performs better, with the ability to modify models to test system sensitivity and tune performance. The effects of different design parameters (e.g., packet sizes, number of nodes) are investigated throughout this work. The results show that the proposed structure outperforms the existing shared-medium network structure and can support a relatively large number of wirelessly connected computers compared with conventional networks.
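    The throughput argument can be illustrated with a back-of-envelope calculation (this is a textbook non-persistent CSMA model, not the thesis's SystemC analysis): splitting the same offered load across several independent channels, a crude stand-in for the hypermesh's channel assignment, raises aggregate throughput well above a single shared medium.

```python
# Back-of-envelope model: throughput of non-persistent CSMA (Kleinrock-Tobagi),
# S = G*exp(-aG) / (G*(1 + 2a) + exp(-aG)), with the same total offered load
# either on one shared channel or split across c independent channels.
import math

def csma_throughput(G, a=0.01):                 # a = normalized propagation delay
    return G * math.exp(-a * G) / (G * (1 + 2 * a) + math.exp(-a * G))

offered = 8.0                                   # total normalized offered load
for c in (1, 2, 4, 8):
    aggregate = c * csma_throughput(offered / c)
    print(f"{c} channel(s): aggregate throughput {aggregate:.2f}")
```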

    New fault-tolerant routing algorithms for k-ary n-cube networks

    The interconnection network is one of the most crucial components in a multicomputer as it greatly influences the overall system performance. Networks belonging to the family of k-ary n-cubes (e.g., tori and hypercubes) have been widely adopted in practical machines due to their desirable properties, including a low diameter, symmetry, regularity, and the ability to exploit the communication locality found in many real-world parallel applications. A routing algorithm specifies how a message selects a path from source to destination, and has a great impact on network performance. Routing in fault-free networks has been extensively studied in the past. As the network size scales up, the probability of processor and link failure also increases. It is therefore essential to design fault-tolerant routing algorithms that allow messages to reach their destinations even in the presence of faulty components (links and nodes). Although many fault-tolerant routing algorithms have been proposed for common multicomputer networks, e.g., hypercubes and meshes, little research has been devoted to developing fault-tolerant routing for well-known versions of k-ary n-cubes, such as 2- and 3-dimensional tori. Previous work on fault-tolerant routing has focused on designing algorithms with strict conditions imposed on the number of faulty components (nodes and links) or their locations in the network. Most existing fault-tolerant routing algorithms have assumed that a node knows either only the status of its neighbours (such a model is called local-information-based) or the status of all nodes (global-information-based). The main challenge is to devise a simple and efficient way of representing limited global fault information that allows optimal or near-optimal fault-tolerant routing. This thesis proposes two new limited-global-information-based fault-tolerant routing algorithms for k-ary n-cubes, namely the unsafety vectors and probability vectors algorithms. While the first algorithm uses a deterministic approach, which has been widely employed by other existing algorithms, the second is the first to use probability-based fault-tolerant routing. These two algorithms have two important advantages over those already in the relevant literature. Both ensure fault tolerance under relaxed assumptions regarding the number of faulty components and their locations in the network. Furthermore, the new algorithms are more general in that they can easily be adapted to different topologies, including those that belong to the family of k-ary n-cubes (e.g., tori and hypercubes) and those that do not (e.g., generalised hypercubes and meshes). Since very little work has considered fault-tolerant routing in k-ary n-cubes, this study compares the relative performance merits of the two proposed algorithms, the unsafety and probability vectors, on these networks. The results reveal that for practical numbers of faulty nodes, both algorithms achieve good performance levels. However, the probability vectors algorithm has the advantage of being simpler to implement. Since previous research has focused mostly on the hypercube, this study adapts the new algorithms to the hypercube in order to conduct a comparative study against the recently proposed safety vectors algorithm. Results from extensive simulation experiments demonstrate that our algorithms exhibit superior performance to the safety vectors algorithm in terms of reachability (the chances of a message reaching its destination), deviation from optimality (the average difference between the minimum distance and the actual routing distance), and looping (the chances of a message looping continuously in the network without reaching its destination).
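    A sketch of the probability-vector idea appears below; it is illustrative only, as the thesis's exact update and routing rules may differ. Each node estimates, per distance, the chance that a message can be delivered, builds the estimates from neighbour exchanges, and routes through the fault-free minimal neighbour with the best estimate.

```python
# Illustrative probability-vector routing on a 4-cube: P[u][d] estimates the
# chance of delivering from node u to a destination d hops away.
from itertools import product

N_DIM = 4
nodes = [tuple(b) for b in product((0, 1), repeat=N_DIM)]
faulty = {(1, 1, 0, 0), (0, 1, 1, 0)}           # example faulty nodes

def neighbours(u):
    out = []
    for i in range(N_DIM):
        v = list(u)
        v[i] ^= 1
        out.append(tuple(v))
    return out

P = {u: [1.0] + [0.0] * N_DIM for u in nodes}
for d in range(1, N_DIM + 1):                   # d-th round of neighbour exchanges
    for u in nodes:
        ok = [v for v in neighbours(u) if v not in faulty]
        P[u][d] = sum(P[v][d - 1] for v in ok) / N_DIM

def route_step(u, dst):
    dist = lambda a, b: sum(x != y for x, y in zip(a, b))
    cand = [v for v in neighbours(u)
            if v not in faulty and dist(v, dst) == dist(u, dst) - 1]
    return max(cand, key=lambda v: P[v][dist(u, dst) - 1]) if cand else None

print(route_step((0, 0, 0, 0), (1, 1, 1, 0)))   # minimal neighbour, best estimate
```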

    Performance analysis of wormhole routing in multicomputer interconnection networks

    Perhaps the most critical component in determining the ultimate performance potential of a multicomputer is its interconnection network, the hardware fabric supporting communication among individual processors. The message latency and throughput of such a network are affected by many factors, of which topology, switching method, routing algorithm and traffic load are the most significant. In this context, the present study focuses on a performance analysis of k-ary n-cube networks employing wormhole switching, virtual channels and adaptive routing, a scenario of special interest to current research. This project aims to build upon earlier work in two main ways: constructing new analytical models for k-ary n-cubes, and comparing the performance merits of cubes of different dimensionality. To this end, some important topological properties of k-ary n-cubes are explored initially; in particular, expressions are derived to calculate the number of nodes at and within a given distance from a chosen centre. These results are important in their own right, but their primary significance here is to assist in the construction of new and more realistic analytical models of wormhole-routed k-ary n-cubes. An accurate analytical model for wormhole-routed k-ary n-cubes with adaptive routing and uniform traffic is then developed, incorporating the use of virtual channels and the effect of locality in the traffic pattern. New models are constructed for wormhole k-ary n-cubes with the ability to capture behaviour under adaptive routing and non-uniform communication workloads, such as hot-spot traffic and matrix-transpose and digit-reversal permutation patterns. The models are equally applicable to unidirectional and bidirectional k-ary n-cubes and are significantly more realistic than any in use up to now. With this level of accuracy, the effect of each important network parameter on overall network performance can be investigated more comprehensively than before. Finally, k-ary n-cubes of different dimensionality are compared using the new models. The comparison takes account of various traffic patterns and implementation costs, using both pin-out and bisection bandwidth as metrics. Networks with both normal and pipelined channels are considered. While previous similar studies have only taken account of network channel costs, our model incorporates router costs as well, thus generating more realistic results. In fact, the results of this work differ markedly from those yielded by earlier studies which assumed deterministic routing and uniform traffic, illustrating the importance of using accurate models to conduct such analyses.
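    The distance-counting expressions can be sanity-checked by brute force; the snippet below (a simple enumeration, independent of the thesis's closed forms) counts nodes at each Lee distance from a chosen centre in a bidirectional k-ary n-cube:

```python
# Brute-force check of the kind of "nodes at a given distance" expressions the
# analytical models build on: Lee distance from the origin in a torus.
from itertools import product

def nodes_at_distance(k, n):
    counts = [0] * (n * (k // 2) + 1)
    for node in product(range(k), repeat=n):
        d = sum(min(x, k - x) for x in node)    # per-dimension ring distance
        counts[d] += 1
    return counts

print(nodes_at_distance(8, 2))   # [1, 4, 8, 12, 14, 12, 8, 4, 1], summing to 64
```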

    A Data Mining Methodology for Vehicle Crashworthiness Design

    This study develops a systematic design methodology based on data mining theory for decision-making in the development of crashworthy vehicles. The new data mining methodology allows the exploration of a large crash simulation dataset to discover the underlying relationships among vehicle crash responses and design variables at multiple levels, and to derive design rules based on whole-vehicle safety requirements to make decisions about component-level and subcomponent-level design. The method can resolve a major issue with existing design approaches related to vehicle crashworthiness: limited ability to explore information from large datasets, which may hamper decision-making in the design process. At the component level, two structural design approaches were implemented for detailed component design with the data mining method: a dimension-based approach and a node-based approach, to handle structures with regular and irregular shapes, respectively. These two approaches were used to design a thin-walled vehicular structure, the S-shaped beam, against crash loading. A large number of design alternatives were created, and their responses under loading were evaluated by finite element simulations. The design variables and computed responses formed a large design dataset. This dataset was then mined to build a decision tree. Based on the decision tree, the interrelationships among the design parameters were revealed, and design rules were generated to produce a set of good designs. After the data mining, the critical design parameters were identified and the design space was reduced, which can simplify the design process. To partially replace the expensive finite element simulations, a surrogate model was used to model the relationships between design variables and response. Four machine learning algorithms that can be used for surrogate model development were compared. Based on the results, Gaussian process regression was determined to be the most suitable technique in the present scenario, and an optimization process was developed to tune the algorithm's hyperparameters, which govern the model structure and training process. To account for engineering uncertainty in the data mining method, a new decision tree for uncertain data was proposed based on the joint probability in uncertain spaces, and it was implemented to design the S-shaped beam structure again. The findings show that the new decision tree can produce effective decision-making rules for engineering design under uncertainty. To evaluate the new approaches developed in this work, a comprehensive case study was conducted by designing a vehicle system against frontal crash. A publicly available vehicle model was simplified and validated. Using the newly developed approaches, new component designs for this vehicle were generated and integrated back into the vehicle model so that their crash behavior could be simulated. Based on the simulation results, one can conclude that designs produced with the new method outperform the original design in terms of mass, intrusion and peak acceleration. Therefore, the performance of the new design methodology has been confirmed. The current study demonstrates that the new data mining method can be used in vehicle crashworthiness design and has the potential to be applied to other complex engineering systems with large amounts of design data.
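    A minimal sketch of the surrogate-modelling step is given below, using scikit-learn's Gaussian process regression; the sampling plan, kernel choice, and response function are illustrative stand-ins for the study's finite element data and tuned settings.

```python
# Minimal Gaussian process surrogate sketch; data and kernel are stand-ins.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = rng.uniform(1.0, 3.0, size=(60, 4))            # e.g., wall thicknesses (mm)
y = (X ** 2).sum(axis=1) + rng.normal(0, 0.1, 60)  # stand-in for FE responses

kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(4))  # anisotropic RBF
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                               n_restarts_optimizer=8)  # restarts tune hyperparameters
gpr.fit(X, y)
mean, std = gpr.predict(rng.uniform(1.0, 3.0, (5, 4)), return_std=True)
print(mean.round(2), std.round(3))
```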

    Design of a Meta-Material with Targeted Nonlinear Deformation Response

    The M1 Abrams tank uses track pads consisting of high-density rubber. This rubber fails prematurely due to heat buildup caused by the hysteretic nature of elastomers. It is therefore desirable to replace this elastomer with a meta-material that has equivalent nonlinear deformation characteristics without this primary failure mode. A meta-material is an artificial material in the form of a periodic structure that exhibits behavior differing from that of its constituent material. After a thorough literature review, topology optimization was found to be the only method previously used to design meta-materials. Further investigation determined topology optimization to be infeasible for designing meta-materials with the targeted nonlinear deformation characteristics. Therefore, a method was developed in this thesis to logically and systematically design meta-material unit cells using engineering principles to achieve the desired nonlinear response. This method, called the Unit Cell Synthesis Method, requires the designer to have a fundamental understanding of the geometric nonlinearity of an elemental geometry. One or more of these elemental geometries are then systematically combined into a unit cell. A size optimization is performed on promising unit cell concepts to tune the geometry and converge its response towards that of the target. Application of this method was successful in generating a meta-material that meets the response of the rubber pad. The method presented in this thesis is meant to serve as a framework for future designers to develop meta-materials for targeted nonlinear responses.
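    The size-optimization step can be sketched as a curve-fitting problem; in the snippet below, both the target response and the closed-form unit-cell model are hypothetical stand-ins for the thesis's rubber-pad target and finite element cell analysis.

```python
# Sketch of size optimization: tune unit-cell parameters so the cell's
# force-deflection curve tracks a target nonlinear response.
import numpy as np
from scipy.optimize import least_squares

deflection = np.linspace(0.0, 10.0, 50)             # mm
target = 2.0 * np.tanh(0.4 * deflection)            # assumed target force, kN

def cell_force(params, x):
    k_lin, k_soft = params                          # hypothetical sizing variables
    return k_lin * x / (1.0 + k_soft * x)           # softening force-deflection law

res = least_squares(lambda p: cell_force(p, deflection) - target,
                    x0=[1.0, 0.1], bounds=([0.0, 0.0], [10.0, 5.0]))
print(res.x)                                        # tuned cell parameters
```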

    Multi-objective and multi-model shape optimization of turbocharger turbines over real-world drive cycles for low carbon vehicles

    Turbocharging is the established method for downsizing internal combustion (IC) engines to lower CO2 emissions and fuel consumption while meeting the desired performance. Turbochargers for automotive engines commonly utilize radial turbines for exhaust energy extraction. However, the design of a turbocharger turbine is subject to conflicting requirements. A crucial consideration when matching a turbocharger to an engine is the ability to meet the specified low-end torque target while minimizing the turbine inlet pressure (particularly at high engine speed) to reduce the engine pumping work. Conventionally, the matching procedure used in the industry relies on experimentally measured compressor and turbine performance maps to model turbocharger operation within engine cycle simulation software. In this way, the compressor and turbine configuration that best meets the specified customer requirements is down-selected. Thus, only existing turbine geometries can be evaluated during the conventional matching process. This makes it a passive process, as the turbine aerodynamic performance and inertia cannot be modified during the matching evaluations. Ideally, what is needed is a framework that physically models both the turbine and the engine with sufficient accuracy and allows turbine geometric changes to be accounted for. To this end, the objective of this work is to establish a novel, fast-running framework that allows turbine shape optimization based on engine-level objectives and constraints, and to understand, from a fluid-dynamics perspective, why a given turbine design is better for the engine. An in-house reduced-order model (meanline code) to estimate aerodynamic performance and a neural network-based inertia prediction tool for radial turbines are developed. These are integrated into a validated engine model to provide a framework for modelling the engine-turbine interaction using a numerically inexpensive technique. It allows the effect of turbine geometric changes on inertia and aerodynamic performance to be reflected in the exhaust boundary conditions, and thereby in the overall performance of the engine. A genetic algorithm is employed within the framework, providing an opportunity for single-objective (for example, weighted cycle-averaged BSFC) or multi-objective (for example, weighted cycle-averaged BSFC and engine transient response) shape optimization of the turbine meridional geometry. The framework has been applied to a Renault 1.2L turbocharged gasoline engine to minimize the fuel consumption, and therefore CO2 emissions, while meeting a sensible transient response constraint. Turbine shape optimization was carried out over a cluster of weighted part-load operating points that represent the Worldwide harmonised Light vehicles Test Cycle (WLTC). The design candidates lying on the Pareto front present improvements of up to 0.4% in the weighted cycle-averaged fuel consumption, and up to 8% in transient response. Dynamic vehicle simulations over the WLTC are used to confirm the improvement observed in fuel consumption. Based on the meridional parameters obtained from the 1D optimization, 3D designs are created for both the turbine housing and the rotor. Finally, CFD evaluation and experimental testing are performed to verify the performance of the optimized designs. 3D CFD predictions showed good agreement with experimental results, lying within the range of experimental uncertainty. The CFD analysis also showed a significant reduction in secondary flow features in the optimized design compared with the baseline turbine. While the developed framework can be used to improve existing turbine designs, it also facilitates the development and optimization of 'tailor-made' turbines for new low-carbon engine projects. Even though, for the particular case described, the optimization process indicates a moderate 0.2% to 0.4% reduction in the weighted cycle-averaged BSFC, this would translate to a reduction of at least 270,000 tonnes of CO2 over the lifetime of all GDI engines manufactured each year in the EU. Thus, the developed turbine optimization framework has massive potential, especially because it requires no new or additional technology.
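    The shape of the optimization loop can be sketched as follows; everything model-specific is stubbed out, since the meanline code, inertia predictor, and engine model are in-house tools, so evaluate_engine() and the parameter bounds below are assumptions.

```python
# Schematic GA loop only: evaluate_engine() is a placeholder for the
# meanline-plus-engine-model framework, and the transient-response
# constraint is folded in as a simple penalty.
import random

BOUNDS = [(0.8, 1.2)] * 5                # normalized meridional parameters

def evaluate_engine(x):
    """Placeholder for meanline code + inertia model + engine cycle simulation."""
    bsfc = sum((v - 1.0) ** 2 for v in x)             # stand-in cycle-averaged BSFC
    inertia = sum(x) / len(x)                         # stand-in transient proxy
    return bsfc + (100.0 if inertia > 1.05 else 0.0)  # penalize slow response

def ga(pop_size=30, gens=40, sigma=0.05):
    pop = [[random.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    for _ in range(gens):
        parents = sorted(pop, key=evaluate_engine)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = [random.choice(pair) for pair in zip(a, b)]    # uniform crossover
            child = [min(max(v + random.gauss(0, sigma), lo), hi)  # Gaussian mutation
                     for v, (lo, hi) in zip(child, BOUNDS)]
            children.append(child)
        pop = parents + children
    return min(pop, key=evaluate_engine)

print(ga())
```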

    Investigation of an Adaptable Crash Energy Management System to Enhance Vehicle Crashworthiness

    The crashworthiness enhancement of vehicle structures is a very challenging task during the vehicle design process due to the complicated nature of vehicle structures, which must comply with different, conflicting design requirements. Although different safety agencies have issued and modified standardized crash tests to guarantee structural integrity and occupant survivability, there is a continued rise in fatalities in vehicle crashes, especially for passenger cars. This dissertation research explores the applicability of a crash energy management system that provides variable energy-absorbing properties as a function of impact speed to achieve enhanced occupant safety. The study employs an optimal crash pulse to seek designs of effective energy absorption mechanisms for reducing occupant impact severity. The study is conducted in four phases. In the initial phase, the performance potentials of different concepts for add-on energy absorbing/dissipating elements are investigated using a simple lumped-parameter model. For this purpose, a number of performance measures related to crash safety are defined, particularly those directly related to occupant deceleration and compartment intrusion. Moreover, the effects of the linear, quadratic and cubic damping properties of the add-on elements are investigated in view of structure deformation and the occupant's Head Injury Criterion (HIC). In the second phase of this study, optimal design parameters of the proposed add-on energy absorber concept are identified through solutions of single- and weighted multi-objective minimization problems using different methods, namely sequential quadratic programming (SQP), genetic algorithms (GA) and hybrid genetic algorithms. The solutions obtained suggest that conducting multi-objective optimization of conflicting functions via genetic algorithms can yield an improved design compromise over a wider range of impact speeds. The effectiveness of the optimal add-on energy absorber configurations is subsequently investigated through their integration into a full-scale vehicle model in the third phase. The elasto-plastic stress-strain and force-deflection properties of different substructures are incorporated in the full-scale vehicle model integrating the absorber concept. A scaling method is further proposed to adapt the vehicle model to the sizes of current automobile models. The influences of different design parameters on the crash energy management safety performance measures are studied through a comprehensive sensitivity analysis. In the final phase, the proposed add-on absorber concept is implemented in a high-fidelity nonlinear finite element (FE) model of a small passenger car on the LS-DYNA platform. The simulation results of the model with the add-on system, obtained at different impact speeds, are compared with those of the baseline model to illustrate the crashworthiness enhancement and energy management properties of the proposed concept. The results show that vehicle crashworthiness can be greatly enhanced using the proposed add-on crash energy management system, which can be implemented in conjunction with the crush elements.
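    For reference, the snippet below computes the standard windowed form of the HIC from a deceleration pulse; the window cap and the example pulse are illustrative, not taken from the dissertation.

```python
# Head Injury Criterion: HIC = max over [t1, t2] of
# (t2 - t1) * (mean acceleration in g)^2.5, with the window capped
# (15 ms here; the variant used in the dissertation may differ).
import numpy as np

def hic(t, a_g, max_window=0.015):
    """t in seconds, a_g in g's (assumed non-negative over the pulse)."""
    # cumulative trapezoidal integral for fast window averages
    ca = np.concatenate(([0.0],
                         np.cumsum(0.5 * (a_g[1:] + a_g[:-1]) * np.diff(t))))
    best = 0.0
    for i in range(len(t) - 1):
        for j in range(i + 1, len(t)):
            dt = t[j] - t[i]
            if dt > max_window:
                break
            avg = (ca[j] - ca[i]) / dt
            best = max(best, dt * avg ** 2.5)
    return best

t = np.linspace(0.0, 0.08, 801)                 # 80 ms half-sine crash pulse
print(hic(t, 60.0 * np.sin(np.pi * t / 0.08)))  # 60 g peak
```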