Search CORE

1,922 research outputs found

Speeding up multiprocessor machines with reconfigurable optical interconnects - art. no. 61240K

Author: ARTUNDO I
Dambre Joni
DEBAES C
DESMET L
Heirman Wim
Thienpont Hugo
Van Campenhout Jan
Publication venue: SPIE-INT SOCIETY OPTICAL ENGINEERING
Publication date: 01/01/2006
Field of study

Low Cost Quality of Service Multicast Routing in High Speed Networks

Author: Crawford John
Waters A. Gill
Publication venue: University of Kent at Canterbury
Publication date: 01/12/1997
Field of study

Many of the services envisaged for high speed networks, such as B-ISDN/ATM, will support real-time applications with large numbers of users. Examples of these types of application range from those used by closed groups, such as private video meetings or conferences, where all participants must be known to the sender, to applications used by open groups, such as video lectures, where partcipants need not be known by the sender. These types of application will require high volumes of network resources in addition to the real-time delay constraints on data delivery. For these reasons, several multicast routing heuristics have been proposed to support both interactive and distribution multimedia services, in high speed networks. The objective of such heuristics is to minimise the multicast tree cost while maintaining a real-time bound on delay. Previous evaluation work has compared the relative average performance of some of these heuristics and concludes that they are generally efficient, although some perform better for small multicast groups and others perform better for larger groups. Firstly, we present a detailed analysis and evaluation of some of these heuristics which illustrates that in some situations their average performance is reversed; a heuristic that in general produces efficient solutions for small multicasts may sometimes produce a more efficient solution for a particular large multicast, in a specific network. Also, in a limited number of cases using Dijkstra's algorithm produces the best result. We conclude that the efficiency of a heuristic solution depends on the topology of both the network and the multicast, and that it is difficult to predict. Because of this unpredictability we propose the integration of two heuristics with Dijkstra's shortest path tree algorithm to produce a hybrid that consistently generates efficient multicast solutions for all possible multicast groups in any network. These heuristics are based on Dijkstra's algorithm which maintains acceptable time complexity for the hybrid, and they rarely produce inefficient solutions for the same network/multicast. The resulting performance attained is generally good and in the rare worst cases is that of the shortest path tree. The performance of our hybrid is supported by our evaluation results. Secondly, we examine the stability of multicast trees where multicast group membership is dynamic. We conclude that, in general, the more efficient the solution of a heuristic is, the less stable the multicast tree will be as multicast group membership changes. For this reason, while the hybrid solution we propose might be suitable for use with closed user group multicasts, which are likely to be stable, we need a different approach for open user group multicasting, where group membership may be highly volatile. We propose an extension to an existing heuristic that ensures multicast tree stability where multicast group membership is dynamic. Although this extension decreases the efficiency of the heuristics solutions, its performance is significantly better than that of the worst case, a shortest path tree. Finally, we consider how we might apply the hybrid and the extended heuristic in current and future multicast routing protocols for the Internet and for ATM Networks.

Kent Academic Repository

Multilayer optical learning networks

Author: Psaltis Demetri
Wagner Kelvin
Publication venue: Optical Society of America
Publication date: 01/12/1987
Field of study

A new approach to learning in a multilayer optical neural network based on holographically interconnected nonlinear devices is presented. The proposed network can learn the interconnections that form a distributed representation of a desired pattern transformation operation. The interconnections are formed in an adaptive and self-aligning fashioias volume holographic gratings in photorefractive crystals. Parallel arrays of globally space-integrated inner products diffracted by the interconnecting hologram illuminate arrays of nonlinear Fabry-Perot etalons for fast thresholding of the transformed patterns. A phase conjugated reference wave interferes with a backward propagating error signal to form holographic interference patterns which are time integrated in the volume of a photorefractive crystal to modify slowly and learn the appropriate self-aligning interconnections. This multilayer system performs an approximate implementation of the backpropagation learning procedure in a massively parallel high-speed nonlinear optical network

Caltech Authors

Reconfiguration for Fault Tolerance and Performance Analysis

Author: Kollmeier Harold Henry
Publication venue: ScholarlyCommons
Publication date: 01/11/1987
Field of study

Architecture reconfiguration, the ability of a system to alter the active interconnection among modules, has a history of different purposes and strategies. Its purposes develop from the relatively simple desire to formalize procedures that all processes have in common to reconfiguration for the improvement of fault-tolerance, to reconfiguration for performance enhancement, either through the simple maximizing of system use or by sophisticated notions of wedding topology to the specific needs of a given process. Strategies range from straightforward redundancy by means of an identical backup system to intricate structures employing multistage interconnection networks. The present discussion surveys the more important contributions to developments in reconfigurable architecture. The strategy here is in a sense to approach the field from an historical perspective, with the goal of developing a more coherent theory of reconfiguration. First, the Turing and von Neumann machines are discussed from the perspective of system reconfiguration, and it is seen that this early important theoretical work contains little that anticipates reconfiguration. Then some early developments in reconfiguration are analyzed, including the work of Estrin and associates on the fixed plus variable restructurable computer system, the attempt to theorize about configurable computers by Miller and Cocke, and the work of Reddi and Feustel on their restructable computer system. The discussion then focuses on the most sustained systems for fault tolerance and performance enhancement that have been proposed. An attempt will be made to define fault tolerance and to investigate some of the strategies used to achieve it. By investigating four different systems, the Tandern computer, the C.vmp system, the Extra Stage Cube, and the Gamma network, the move from dynamic redundancy to reconfiguration is observed. Then reconfiguration for performance enhancement is discussed. A survey of some proposals is attempted, then the discussion focuses on the most sustained systems that have been proposed: PASM, the DC architecture, the Star local network, and the NYU Ultracomputer. The discussion is organized around a comparison of control, scheduling, communication, and network topology. Finally, comparisons are drawn between fault tolerance and performance enhancement, in order to clarify the notion of reconfiguration and to reveal the common ground of fault tolerance and performance enhancement as well as the areas in which they diverge. An attempt is made in the conclusion to derive from this survey and analysis some observations on the nature of reconfiguration, as well as some remarks on necessary further areas of research

CiteSeerX

ScholarlyCommons@Penn

Interconnect architectures for dynamically partially reconfigurable systems

Author: Bui Thanh Thi Thanh
Publication venue
Publication date: 01/01/2017
Field of study

Dynamically partially reconfigurable FPGAs (Field-Programmable Gate Arrays) allow hardware modules to be placed and removed at runtime while other parts of the system keep working. With their potential benefits, they have been the topic of a great deal of research over the last decade. To exploit the partial reconfiguration capability of FPGAs, there is a need for efficient, dynamically adaptive communication infrastructure that automatically adapts as modules are added to and removed from the system. Many bus and network-on-chip (NoC) architectures have been proposed to exploit this capability on FPGA technology. However, few realizations have been reported in the public literature to demonstrate or compare their performance in real world applications. While partial reconfiguration can offer many benefits, it is still rarely exploited in practical applications. Few full realizations of partially reconfigurable systems in current FPGA technologies have been published. More application experiments are required to understand the benefits and limitations of implementing partially reconfigurable systems and to guide their further development. The motivation of this thesis is to fill this research gap by providing empirical evidence of the cost and benefits of different interconnect architectures. The results will provide a baseline for future research and will be directly useful for circuit designers who must make a well-reasoned choice between the alternatives. This thesis contains the results of experiments to compare different NoC and bus interconnect architectures for FPGA-based designs in general and dynamically partially reconfigurable systems. These two interconnect schemes are implemented and evaluated in terms of performance, area and power consumption using FFT (Fast Fourier Transform) andANN(Artificial Neural Network) systems as benchmarks. Conclusions drawn from these results include recommendations concerning the interconnect approach for different kinds of applications. It is found that a NoC provides much better performance than a single channel bus and similar performance to a multi-channel bus in both parallel and parallel-pipelined FFT systems. This suggests that a NoC is a better choice for systems with multiple simultaneous communications like the FFT. Bus-based interconnect achieves better performance and consume less area and power than NoCbased scheme for the fully-connected feed-forward NN system. This suggests buses are a better choice for systems that do not require many simultaneous communications or systems with broadcast communications like a fully-connected feed-forward NN. Results from the experiments with dynamic partial reconfiguration demonstrate that buses have the advantages of better resource utilization and smaller reconfiguration time and memory than NoCs. However, NoCs are more flexible and expansible. They have the advantage of placing almost all of the communication infrastructure in the dynamic reconfiguration region. This means that different applications running on the FPGA can use different interconnection strategies without the overhead of fixed bus resources in the static region. Another objective of the research is to examine the partial reconfiguration process and reconfiguration overhead with current FPGA technologies. Partial reconfiguration allows users to efficiently change the number of running PEs to choose an optimal powerperformance operating point at the minimum cost of reconfiguration. However, this brings drawbacks including resource utilization inefficiency, power consumption overhead and decrease in system operating frequency. The experimental results report a 50% of resource utilization inefficiency with a power consumption overhead of less than 5% and a decrease in frequency of up to 32% compared to a static implementation. The results also show that most of the drawbacks of partial reconfiguration implementation come from the restrictions and limitations of partial reconfiguration design flow. If these limitations can be addressed, partial reconfiguration should still be considered with its potential benefits.Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 201

Adelaide Research & Scholarship

Efficient parallel processing with optical interconnections

Author: Hai Lili
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/1997
Field of study

With the advances in VLSI technology, it is now possible to build chips which can each contain thousands of processors. The efficiency of such chips in executing parallel algorithms heavily depends on the interconnection topology of the processors. It is not possible to build a fully interconnected network of processors with constant fan-in/fan-out using electrical interconnections. Free space optics is a remedy to this limitation. Qualities exclusive to the optical medium are its ability to be directed for propagation in free space and the property that optical channels can cross in space without any interference. In this thesis, we present an electro-optical interconnected architecture named Optical Reconfigurable Mesh (ORM). It is based on an existing optical model of computation. There are two layers in the architecture. The processing layer is a reconfigurable mesh and the deflecting layer contains optical devices to deflect light beams. ORM provides three types of communication mechanisms. The first is for arbitrary planar connections among sets of locally connected processors using the reconfigurable mesh. The second is for arbitrary connections among N of the processors using the electrical buses on the processing layer and N2 fixed passive deflecting units on the deflection layer. The third is for arbitrary connections among any of the N2 processors using the N2 mechanically reconfigurable deflectors in the deflection layer. The third type of communication mechanisms is significantly slower than the other two. Therefore, it is desirable to avoid reconfiguring this type of communication during the execution of the algorithms. Instead, the optical reconfiguration can be done before the execution of each algorithm begins. Determining a right configuration that would be suitable for the entire configuration of a task execution is studied in this thesis. The basic data movements for each of the mechanisms are studied. Finally, to show the power of ORM, we use all three types of communication mechanisms in the first O(logN) time algorithm for finding the convex hulls of all figures in an N x N binary image presented in this thesis

Digital Commons @ New Jersey Institute of Technology (NJIT)

Performance Evaluation of Large Reconfigurable Interconnects for Multiprocessor Systems

Author: ARTUNDO I
BUI VIET K
Dambre Joni
DEBAES C
Heirman Wim
PHAM DOAN T
Thienpont Hugo
Van Campenhout Jan
Publication venue
Publication date: 01/01/2007
Field of study

Ghent University Academic Bibliography