17 research outputs found

    Symmetric rearrangeable networks and algorithms

    A class of symmetric rearrangeable nonblocking networks has been considered in this thesis. A particular focus of this thesis is on Benes networks built with 2 x 2 switching elements. Symmetric rearrangeable networks built with larger switching elements have also been considered. New applications of these networks are found in the areas of System on Chip (SoC) and Network on Chip (NoC). Deterministic routing algorithms used in NoC applications suffer from low scalability and slow execution time. On the other hand, faster algorithms are blocking and thus limit throughput. This will be an acceptable trade-off for many applications where achieving "wire speed" on the on-chip network would require extensive optimisation of the attached devices. In this thesis I designed an algorithm that has much lower blocking probabilities than other suboptimal algorithms but a much faster execution time than deterministic routing algorithms. The suboptimal method uses the looping algorithm in its outermost stages and then in the two distinct subnetworks deeper in the switch uses a fast but suboptimal path search method to find available paths. The worst case time complexity of this new routing method is O(N log N) using a single processor, which matches the best known results reported in the literature. Disruption of the ongoing communications in this class of networks during rearrangements is an open issue. In this thesis I explored a modification of the topology of these networks which gives rise to what are termed repackable networks. A repackable topology allows paths to be rearranged without momentarily breaking existing communication paths, so connectivity is never intermittently lost. The repackable network structure proposed in this thesis is efficient in its use of hardware when compared to other proposals in the literature.
As most of the deterministic algorithms designed for Benes networks implement a permutation of all inputs to find the routing tags for the requested input-output pairs, I proposed a new algorithm that can work for partial permutations. If the network load is defined as ρ, the mean number of active inputs in a partial permutation is m = ρN, where N is the network size. This new method is based on mapping the network stages into a set of sub-matrices and then determining the routing tags for each pair of requests by populating the cells of the sub-matrices without creating a blocking state. Overall, the serial time complexity of this method is O(N log N) when all N inputs are active and O(m log N) with m < N active inputs. With minor modification to the serial algorithm, this method can be made to work in the parallel domain. The time complexity of this routing algorithm in a parallel machine with N completely connected processors is O(log^2 N). With m active requests the time complexity goes down to O(log m log N), which is better than the O(log^2 m + log N) reported in the literature for 2^(0.5((log^2 N - 4 log N)^0.5 - log N)) <= ρ <= 1. I also designed multistage symmetric rearrangeable networks using larger switching elements and implemented a new routing algorithm for these classes of networks. The network topology and routing algorithms presented in this thesis should allow large scale networks of modest cost, with low setup times and moderate blocking rates, to be constructed. Such switching networks will be required to meet the bandwidth requirements of future communication networks
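
The load range quoted for the parallel bound can be checked numerically. The sketch below is not from the thesis: it assumes base-2 logarithms, ignores the constants hidden by the O-notation, and uses hypothetical helper names (`rho_threshold`, `cost_new`, `cost_lit`). Writing x = log m and L = log N, the condition x L <= x^2 + L holds on the upper branch when x >= (L + (L^2 - 4L)^0.5)/2, and substituting x = log ρ + L gives the lower limit on ρ.

```python
import math


def rho_threshold(n: int) -> float:
    """Lower limit on the load rho for which log m * log N <= log^2 m + log N.

    Assumes m = rho * N and base-2 logarithms (both are assumptions of this sketch).
    """
    big_l = math.log2(n)
    disc = big_l * big_l - 4 * big_l
    if disc < 0:
        # No real roots: the inequality holds for every load when log N < 4.
        return 0.0
    return 2 ** (0.5 * (math.sqrt(disc) - big_l))


def cost_new(m: float, n: int) -> float:
    """Expression inside O(log m log N), constants dropped."""
    return math.log2(m) * math.log2(n)


def cost_lit(m: float, n: int) -> float:
    """Expression inside O(log^2 m + log N), constants dropped."""
    return math.log2(m) ** 2 + math.log2(n)


if __name__ == "__main__":
    n = 1024
    rho = rho_threshold(n)
    print(f"N = {n}: threshold load rho = {rho:.4f}")
    for m in (400, 600, 1024):
        print(m, cost_new(m, n), cost_lit(m, n))
```

For N = 1024 the threshold is about 0.46, so under these assumptions the log m log N expression is the smaller of the two once roughly 46% or more of the inputs are active.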

    Spacecraft in switch matrix for wide band service applications in 30/20 GHz communications satellite systems

    Bandwidth, switching speed, off-state isolation, and reliability over a ten-year mission were factors in determining the optimum available technology for satellite communications switching in 1982. A proof of concept model for a 20 x 20 coupled crossbar switch matrix designed with FET devices for microwave switching and with high speed CMOS LSI for switch crosspoint addressing was fabricated and tested. Results show the design is feasible for application in a multichannel SS-TDMA communications system. Expandability can readily be achieved with this design. A conceptual design study for a 100 x 100 switch matrix utilizing a coupled crossbar architecture implemented with monolithic microwave integrated circuits revealed technology needs for high capacity switch matrices

    Second year technical report on-board processing for future satellite communications systems

    Advanced baseband and microwave switching techniques for large domestic communications satellites operating in the 30/20 GHz frequency bands are discussed. The nominal baseband processor throughput is one million packets per second (1.6 Gb/s) from one thousand T1 carrier rate customer premises terminals. A frequency reuse factor of sixteen is assumed by using 16 spot antenna beams with the same 100 MHz bandwidth per beam and a modulation with a one b/s per Hz bandwidth efficiency. Eight of the beams are fixed on major metropolitan areas and eight are scanning beams which periodically cover the remainder of the U.S. under dynamic control. User signals are regenerated (demodulated/remodulated) and message packages are reformatted on board. Frequency division multiple access and time division multiplex are employed on the uplinks and downlinks, respectively, for terminals within the coverage area and dwell interval of a scanning beam. Link establishment and packet routing protocols are defined. Also described is a detailed design of a separate 100 x 100 microwave switch capable of handling nonregenerated signals occupying the remaining 2.4 GHz bandwidth with 60 dB of isolation, at an estimated weight and power consumption of approximately 400 kg and 100 W, respectively
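
The headline figures in this abstract are mutually consistent, which a few lines of arithmetic make visible. The sketch below is an illustration only: the names are hypothetical, the 1.544 Mb/s T1 rate is the standard value rather than a number from the report, and reading the 1.6 Gb/s throughput as beams x bandwidth x efficiency is an assumption.

```python
# Figures taken from the abstract.
BEAMS = 16                 # frequency reuse factor (spot beams)
BANDWIDTH_HZ = 100e6       # bandwidth per beam
EFFICIENCY_BPS_PER_HZ = 1  # modulation bandwidth efficiency
PACKETS_PER_S = 1e6        # nominal baseband processor throughput
TERMINALS = 1000           # customer premises terminals

# Assumption: standard T1 line rate, not stated in the abstract.
T1_RATE_BPS = 1.544e6


def aggregate_capacity_bps() -> float:
    """Total capacity if every beam reuses the same 100 MHz."""
    return BEAMS * BANDWIDTH_HZ * EFFICIENCY_BPS_PER_HZ


def implied_bits_per_packet() -> float:
    """Packet size implied by 1.6 Gb/s at one million packets per second."""
    return aggregate_capacity_bps() / PACKETS_PER_S


def offered_t1_load_bps() -> float:
    """Load if all terminals transmit at the full T1 rate."""
    return TERMINALS * T1_RATE_BPS


if __name__ == "__main__":
    print(aggregate_capacity_bps())   # capacity in b/s
    print(implied_bits_per_packet())  # bits per packet
    print(offered_t1_load_bps())      # offered load in b/s
```

Under these assumptions the sixteen beams carry 1.6 Gb/s, each packet is 1600 bits, and one thousand T1 terminals offer about 1.54 Gb/s, just within the stated throughput.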

    High-speed, economical design implementation of transit network router

    Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (p. 88-90). By Kazuhiro Hara

    Optimizing Communication for Massively Parallel Processing

    The current trends in high performance computing show that large machines with tens of thousands of processors will soon be readily available. The IBM Blue Gene/L machine with 128k processors (which is currently being deployed) is an important step in this direction. In this scenario, it is going to be a significant burden for the programmer to manually scale his applications. This task of scaling involves addressing issues like load imbalance and communication overhead. In this thesis, we explore several communication optimizations to help parallel applications scale easily on a large number of processors. We also present automatic runtime techniques to relieve the programmer from the burden of optimizing communication in his applications. This thesis explores processor virtualization to improve communication performance in applications. With processor virtualization, the computation is mapped to virtual processors (VPs). After one VP has finished computation and is waiting for responses to its messages, another VP can compute, thus overlapping communication with computation. This overlap is only effective if the processor overhead of the communication operation is a small fraction of the total communication time. Fortunately, with network interfaces having co-processors, this happens to be true and processor virtualization has a natural advantage on such interconnects. The communication optimizations we present in this thesis are motivated by applications such as NAMD (a classical molecular dynamics application) and CPAIMD (a quantum chemistry application). Applications like NAMD and CPAIMD consume a fair share of the time available on supercomputers, so improving their performance would be of great value. We have successfully scaled NAMD to 1 TF of peak performance on 3000 processors of PSC Lemieux, using the techniques presented in this thesis.
We study both point-to-point communication and collective communication (specifically all-to-all communication). On a large number of processors all-to-all communication can take several milliseconds to finish. With synchronous collectives defined in MPI, the processor idles while the collective messages are in flight. Therefore, we demonstrate an asynchronous collective communication framework, to let the CPU compute while the all-to-all messages are in flight. We also show that the best strategy for all-to-all communication depends on the message size, number of processors and other dynamic parameters. This suggests that these parameters can be observed at runtime and used to choose the optimal strategy for all-to-all communication. In this thesis, we demonstrate adaptive strategy switching for all-to-all communication. The communication optimization framework presented in this thesis has been designed to optimize communication in the context of processor virtualization and dynamically migrating objects. We present the streaming strategy to optimize fine-grained object-to-object communication. In this thesis, we motivate the need for hardware collectives, as processor-based collectives can be delayed by intermediate processors that are busy with computation. We explore a next generation interconnect that supports collectives in the switching hardware. We show the performance gains of hardware collectives through synthetic benchmarks
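
The idea of choosing an all-to-all strategy from observed parameters can be illustrated with a simple cost model. The sketch below is not the framework from the thesis: the two strategies (direct sends versus a two-phase exchange over a virtual 2-D mesh), the latency-bandwidth cost formulas, and the constants `ALPHA` and `BETA` are all assumptions chosen for illustration.

```python
import math

# Assumed machine parameters (hypothetical values, not from the thesis).
ALPHA = 10e-6  # per-message overhead, seconds
BETA = 1e-9    # per-byte transfer time, seconds


def direct_cost(p: int, msg_bytes: int) -> float:
    """Each processor sends one message to each of the other p - 1 processors."""
    return (p - 1) * (ALPHA + msg_bytes * BETA)


def mesh_cost(p: int, msg_bytes: int) -> float:
    """Two phases on a sqrt(p) x sqrt(p) virtual mesh.

    In each phase a processor sends sqrt(p) - 1 combined messages,
    each carrying sqrt(p) of the original messages.
    """
    side = math.sqrt(p)
    return 2 * (side - 1) * (ALPHA + side * msg_bytes * BETA)


def choose_strategy(p: int, msg_bytes: int) -> str:
    """Pick the cheaper strategy for the observed processor count and message size."""
    if direct_cost(p, msg_bytes) <= mesh_cost(p, msg_bytes):
        return "direct"
    return "mesh"


if __name__ == "__main__":
    for size in (8, 1_000, 1_000_000):
        print(size, choose_strategy(1024, size))
```

With these assumed constants, 8-byte messages on 1024 processors favor the mesh exchange (fewer, larger messages), while 1 MB messages favor direct sends (less total data moved). A runtime could re-evaluate the choice as the measured parameters change.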

    Cutting Edge Nanotechnology

    The main purpose of this book is to describe important issues in various types of devices, ranging from conventional transistors (opening chapters of the book) to molecular electronic devices, whose fabrication and operation are discussed in the last few chapters of the book. As such, this book can serve as a guide for identifying important areas of research in micro, nano and molecular electronics. We gratefully acknowledge the valuable contributions that each of the authors made in writing these excellent chapters

    On-board processing for future satellite communications systems: Satellite-Routed FDMA

    A frequency division multiple access (FDMA) 30/20 GHz satellite communications architecture without on-board baseband processing is investigated. Conceptual system designs are suggested for domestic traffic models totaling 4 Gb/s of customer premises service (CPS) traffic and 6 Gb/s of trunking traffic. Emphasis is given to the CPS portion of the system which includes thousands of earth terminals with digital traffic ranging from a single 64 kb/s voice channel to hundreds of channels of voice, data, and video with an aggregate data rate of 33 Mb/s. A unique regional design concept that effectively smooths the non-uniform traffic distribution and greatly simplifies the satellite design is employed. The satellite antenna system forms thirty-two 0.33 deg beams on both the uplinks and the downlinks in one design. In another design matched to a traffic model with more dispersed users, there are twenty-four 0.33 deg beams and twenty-one 0.7 deg beams. Detailed system design techniques show that a single satellite producing approximately 5 kW of dc power is capable of handling at least 75% of the postulated traffic. A detailed cost model of the ground segment and estimated system costs based on current information from manufacturers are presented

    34th Midwest Symposium on Circuits and Systems-Final Program

    Organized by the Naval Postgraduate School, Monterey, California. Cosponsored by the IEEE Circuits and Systems Society. Symposium Organizing Committee: General Chairman: Sherif Michael, Technical Program: Roberto Cristi, Publications: Michael Soderstrand, Special Sessions: Charles W. Therrien, Publicity: Jeffrey Burl, Finance: Ralph Hippenstiel, and Local Arrangements: Barbara Cristi

    Devices and networks for optical switching

    This thesis is concerned with some aspects of the application of optics to switching and computing. Two areas are dealt with: the design of switching networks which use optical interconnects, and the development and application of the t-SEED optical logic device. The work on optical interconnects looks at the multistage interconnection network which has been proposed as a hybrid switch using both electronics and optics. It is shown that the architecture can be mapped from one-dimensional to two-dimensional format, so that the machine makes full use of the space available to the optics. Other mapping rules are described which allow the network to make optimum use of the optical interconnects, and the endpoint is a hybrid optical-electronic machine which should be able to outperform an all-electronic equivalent. The development of the t-SEED optical logic device, which integrates a phototransistor with a multiple quantum well optical modulator, is described. It is found to be important to have the modulator underneath rather than on top of the transistor to avoid unwanted thyristor action. In order for the transistor to have a high gain, the collector must have a low doping level, the exit window in the substrate must be etched all the way to the emitter layer, and the etch must not damage the emitter-base junction. A real optical gain of 1.6 has been obtained, which is higher than has ever been reached before but is not as high as should be possible. Improvements to the device are suggested. A new model of the Fabry-Perot cavity is introduced which helps considerably in the interpretation of experimental measurements made on the quantum well modulators. Also, a method of improving the contrast of the multiple quantum well modulator by grading the well widths is proposed, which may find application in long wavelength transmission modulators. Some systems which make use of the t-SEED are considered.
It is shown that the t-SEED device has the right characteristics for use as a neuron element in the optical implementation of a neural network. A new image processing network for clutter removal in binary images is introduced which uses the t-SEED, and a brief performance analysis suggests that the network may be superior to an all-electronic machine

    Simulation and analytical performance studies of generic ATM switch fabrics.

    As technology improves, exciting new services such as video phone become possible and economically viable, but their deployment is hampered by the inability of the present networks to carry them. The long term vision is to have a single network able to carry all present and future services. Asynchronous Transfer Mode, ATM, is the versatile new packet-based switching and multiplexing technique proposed for the single network. Interest in ATM is currently high as both industrial and academic institutions strive to understand more about the technique. Using both simulation and analysis, this research has investigated how the performance of ATM switches is affected by architectural variations in the switch fabric design and how the stochastic nature of ATM affects the timing of constant bit rate services. As a result, the research has contributed new ATM switch performance data, a general purpose ATM switch simulator and analytic models that further research may utilise, and has uncovered a significant timing problem of the ATM technique. The thesis will also be of interest and assistance to anyone planning on using simulation as a research tool to model an ATM switch