191 research outputs found

    Performance Study of Multilayered Multistage Interconnection Networks under Hotspot Traffic Conditions

    Get PDF
    The performance of Multistage Interconnection Networks (MINs) under hotspot traffic, where some percentage of the traffic is targeted at single nodes called hot spots, is of crucial interest. Packet prioritization has already been proposed in previous work as a way to alleviate the tree saturation problem, leading to a scheme that natively supports 2-class priority traffic. In order to prevent hotspot traffic from degrading uniform traffic, we extend previous studies by introducing multilayer Switching Elements (SEs) at the last stages, in an attempt to balance MIN performance against cost. In this paper, the performance of dual-priority, double-buffered, multilayer MINs under single-hotspot setups is evaluated and analyzed using simulation experiments. The findings can be used by MIN designers to optimally configure their networks.
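
    As a rough illustration of the traffic model described above, the sketch below (Python, not taken from the paper) marks a configurable fraction of packets as destined for a single hot output and serves a 2-class, double-buffered switching element with strict priority; the number of outputs, buffer depth, and hotspot fraction are assumed values.

        # Illustrative sketch only: hotspot destination generation and
        # 2-class priority service at one switching element.
        import random
        from collections import deque

        N_OUTPUTS = 8      # outputs of one switching element (assumed)
        HOT_OUTPUT = 0     # the single hotspot destination
        H = 0.10           # fraction of traffic aimed at the hotspot (assumed)

        def pick_destination():
            """With probability H target the hot output, else a uniform output."""
            if random.random() < H:
                return HOT_OUTPUT
            return random.randrange(N_OUTPUTS)

        # One double buffer per priority class; high-priority packets are
        # served first, mirroring the 2-class scheme the paper builds on.
        high, low = deque(maxlen=2), deque(maxlen=2)

        def enqueue(packet, priority):
            q = high if priority == 1 else low
            if len(q) < q.maxlen:
                q.append(packet)
                return True
            return False          # packet blocked: buffer full

        def serve():
            if high:
                return high.popleft()
            if low:
                return low.popleft()
            return None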

    Performance Evaluation of Routing Protocols in Wireless Sensor Networks

    Get PDF
    The growing field of information technology has enhanced the capabilities of wireless communication, and Wireless Sensor Networks (WSNs) are now used in a wide variety of real-world applications, which makes evaluating their performance a challenging task. Performance issues in WSNs have been addressed in many studies, yet further work is needed because user and application demands keep increasing. This study addresses these challenges by examining the performance of routing protocols in WSNs, using a set of established performance metrics for the evaluation. Simulation experiments are carried out for the directed diffusion (DD) and LEACH routing protocols in terms of energy consumption, congestion, and reliability in a low-power WSN environment. The experiments consider reliability, delay, and related constraints in order to compare the speed, reliability, and energy efficiency of data communication, and the discussion covers the protocol trade-offs and the complexity of the data traffic. The NS2 simulator is used for the experiments, demonstrating the comparative effectiveness of the routing protocols. The simulation results point to ways of minimizing delay and enhancing reliability in wireless sensor networks.
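
    For readers unfamiliar with LEACH, the sketch below (Python) shows the textbook cluster-head election threshold used by the protocol; it is only an illustration of the election rule, not the NS2 models used in this study, and the value of P is an assumed example.

        # Standard LEACH cluster-head election threshold T(n): a node that has
        # not recently been a cluster head elects itself with this probability.
        import random

        P = 0.05          # desired fraction of cluster heads per round (assumed)

        def is_cluster_head(node_was_head_recently, current_round):
            """Return True if the node elects itself cluster head this round."""
            if node_was_head_recently:           # not eligible in this epoch
                return False
            t = P / (1 - P * (current_round % int(1 / P)))
            return random.random() < t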

    Concepts for 18/30 GHz satellite communication system, volume 1

    Get PDF
    Concepts for 18/30 GHz satellite communication systems are presented. Major terminal trunking as well as direct-to-user configurations were evaluated. Critical technologies in support of millimeter wave satellite communications were determined

    A low-cost high-speed twin-prefetching DSP-based shared-memory system for real-time image processing applications

    Get PDF
    This dissertation introduces, investigates, and evaluates a low-cost high-speed twin-prefetching DSP-based bus-interconnected shared-memory system for real-time image processing applications. The proposed architecture can effectively support 32 DSPs, in contrast to a maximum of 4 DSPs supported by existing DSP-based bus-interconnected systems. This significant enhancement is achieved by introducing two small programmable fast memories (Twins) between the processor and the shared bus interconnect. While one memory is transferring data from/to the shared memory, the other is supplying the core processor with data. Eliminating the traditional direct linkage between the shared bus and the processor data bus makes a wider shared bus feasible, i.e., the shared bus width becomes independent of the data bus width of the processors. The fast prefetching memories and the wider shared bus provide additional bus bandwidth, which eliminates the large memory latencies that constitute the major drawback for the performance of shared-memory multiprocessors. Furthermore, in contrast to existing DSP-based uniprocessor or multiprocessor systems, the proposed architecture does not require all data to be placed in expensive on-chip or off-chip fast memory in order to reach or maintain peak performance. Further, it can maintain peak performance regardless of whether the processed image is small or large. The performance of the proposed architecture has been extensively investigated while executing computationally intensive applications such as real-time high-resolution image processing, and the effect of a wide variety of hardware design parameters on performance has been examined. More specifically, tables and graphs comprehensively analyze the performance of 1, 2, 4, 8, 16, 32 and 64 DSP-based systems for a wide variety of shared data interconnect widths (32, 64, 128, 256 and 512 bits). In addition, the effect of the wide variance of temporal and spatial locality (present in different applications) on the multiprocessor's execution time is investigated and analyzed. Finally, the prefetching cache size was varied from a few kilobytes to 4 Mbytes and the corresponding effect on the execution time was investigated. Our performance analysis has clearly shown that the execution time converges to a shallow minimum, i.e., it is not sensitive to the size of the prefetching cache. The significance of this observation is that near-optimum performance can be achieved with a small (16 to 300 Kbytes) amount of prefetching cache.
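
    The benefit of the twin buffers can be seen with a back-of-the-envelope timing model: while one memory is being filled from the shared bus, the other feeds the DSP, so the per-block cost approaches the maximum of compute and transfer time rather than their sum. The sketch below (Python) captures only this idea; the block counts and times are invented, not figures from the dissertation.

        # Rough timing model of twin-prefetching overlap (assumed numbers).
        def time_without_prefetch(n_blocks, t_compute, t_transfer):
            # Each block must be fetched, then processed, sequentially.
            return n_blocks * (t_compute + t_transfer)

        def time_with_twin_buffers(n_blocks, t_compute, t_transfer):
            # First block must arrive before computation starts; afterwards
            # transfer of the next block overlaps with computing the current one.
            return t_transfer + n_blocks * max(t_compute, t_transfer)

        print(time_without_prefetch(100, 1.0, 0.8))   # 180.0
        print(time_with_twin_buffers(100, 1.0, 0.8))  # 100.8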

    34th Midwest Symposium on Circuits and Systems-Final Program

    Get PDF
    Organized by the Naval Postgraduate School, Monterey, California. Cosponsored by the IEEE Circuits and Systems Society. Symposium Organizing Committee: General Chairman: Sherif Michael; Technical Program: Roberto Cristi; Publications: Michael Soderstrand; Special Sessions: Charles W. Therrien; Publicity: Jeffrey Burl; Finance: Ralph Hippenstiel; Local Arrangements: Barbara Cristi.

    A formalism for describing and simulating systems with interacting components.

    Get PDF
    This thesis addresses the problem of descriptive complexity presented by systems involving a high number of interacting components. It investigates the evaluation measure of performability and its application to such systems. A new description and simulation language, ICE, and its application to performability modelling are presented. ICE (Interacting ComponEnts) is based upon an earlier description language which was first proposed for defining reliability problems. ICE is declarative in style and has a limited number of keywords. The ethos in the development of the language has been to provide an intuitive formalism with a powerful descriptive space. The full syntax of the language is presented with discussion as to its philosophy. The implementation of a discrete event simulator using an ICE interface is described, with examples used to illustrate the functionality of the code and the semantics of the language. Random numbers are used to provide the required stochastic behaviour within the simulator. The behaviour of an industry-standard generator within the simulator and different methods of number allocation are shown. A new generator is proposed that is a development of a fast hardware shift-register generator and is demonstrated to possess good statistical properties and operational speed. For the purpose of providing a rigorous description of the language and clarification of its semantics, a computational model is developed using the formalism of extended coloured Petri nets. This model also gives an indication of the language's descriptive power relative to that of a recognised and well-developed technique. Some recognised temporal and structural problems of system event modelling are identified, and ICE solutions are given. The growing research area of ATM communication networks is introduced and a sophisticated top-down model of an ATM switch is presented. This model is simulated and interesting results are given. A generic ICE framework for performability modelling is developed and demonstrated. This is considered a positive contribution to the general field of performability research.
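
    To make the simulator side concrete, the sketch below (Python) shows a minimal discrete-event engine of the kind an ICE description could drive: a clock, a time-ordered event queue, and handlers that schedule further events. It is an illustration only; it is not the thesis' simulator, and the component name is invented.

        # Minimal discrete-event simulation core: events are (time, handler)
        # pairs kept in a priority queue and executed in time order.
        import heapq

        class Simulator:
            def __init__(self):
                self.clock = 0.0
                self.queue = []    # entries: (time, seq, handler, payload)
                self._seq = 0      # tie-breaker for events at equal times

            def schedule(self, delay, handler, payload=None):
                heapq.heappush(self.queue,
                               (self.clock + delay, self._seq, handler, payload))
                self._seq += 1

            def run(self, until=float("inf")):
                while self.queue and self.queue[0][0] <= until:
                    self.clock, _, handler, payload = heapq.heappop(self.queue)
                    handler(self, payload)

        def component_fires(sim, name):
            print(f"t={sim.clock:.1f}: {name} interacts")
            sim.schedule(1.5, component_fires, name)   # the component re-arms itself

        sim = Simulator()
        sim.schedule(0.0, component_fires, "componentA")
        sim.run(until=5.0)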

    Compile-time optimization of near-neighbor communication for scalable shared-memory multiprocessors

    Full text link
    Scalable shared-memory multiprocessor systems are typically NUMA (nonuniform memory access) machines, where the exploitation of the memory hierarchy is critical to achieving high performance. Iterative data parallel loops with near-neighbor communication account for many important numerical applications. In such loops, the communication of partial results stresses the memory system performance. In this paper, we develop data placement schemes that minimize communication time where the near-neighbor interaction is determined by a stencil. Under a given loop partition, our compile-time algorithm partitions global data into four classes for each processor, with each class having specific consistency maintenance requirements. The ADAPT (Automatic Data Allocation and Partitioning Tool) system was implemented to automatically partition parallel code segments for the BBN TC2000, a scalable shared-memory multiprocessor. ADAPT caches global arrays and maintains data consistency in software through instructions that flush data from private caches. Restructuring of a fluid flow code segment by ADAPT improved performance by a factor of more than 3 on the BBN TC2000. Features in current generation pipelined processors with multiple functional units permit the overlap of memory accesses with computation. Our experiments on the BBN TC2000 show that the degree of overlap is limited by architectural parameters, such as the number of CPU registers. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/30342/1/0000744.pd
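
    As a simplified illustration of this kind of compile-time classification (not the paper's four-class scheme, whose details are not reproduced here), the sketch below (Python) partitions the rows of a grid block-wise across processors for a 5-point stencil and marks which rows a neighboring processor will read, and hence which writes must be made visible (e.g. flushed), versus which rows stay private.

        # Classify a processor's rows for a 5-point stencil under a block-row
        # partition: only the first and last rows of the block are read by
        # neighboring processors and therefore need consistency maintenance.
        def classify_rows(n_rows, n_procs, my_id):
            rows_per_proc = n_rows // n_procs     # assume even division
            lo = my_id * rows_per_proc
            hi = lo + rows_per_proc - 1
            classes = {}
            for r in range(lo, hi + 1):
                shared_up = (r == lo and my_id > 0)
                shared_down = (r == hi and my_id < n_procs - 1)
                if shared_up or shared_down:
                    classes[r] = "boundary: flush after write"
                else:
                    classes[r] = "interior: private, no flush"
            return classes

        print(classify_rows(n_rows=12, n_procs=4, my_id=1))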

    Three Highly Parallel Computer Architectures and Their Suitability for Three Representative Artificial Intelligence Problems

    Get PDF
    Virtually all current Artificial Intelligence (AI) applications are designed to run on sequential (von Neumann) computer architectures. As a result, current systems do not scale up. As knowledge is added to these systems, a point is reached where their performance quickly degrades. The performance of a von Neumann machine is limited by the bandwidth between memory and processor (the von Neumann bottleneck). The bottleneck is avoided by distributing the processing power across the memory of the computer. In this scheme the memory becomes the processor (a "smart memory"). This paper highlights the relationship between three representative AI application domains, namely knowledge representation, rule-based expert systems, and vision, and their parallel hardware realizations. Three machines, covering a wide range of fundamental properties of parallel processors, namely module granularity, concurrency control, and communication geometry, are reviewed: the Connection Machine (a fine-grained SIMD hypercube), DADO (a medium-grained MIMD/SIMD/MSIMD tree machine), and the Butterfly (a coarse-grained MIMD Butterfly-switch machine).
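
    As a small aside on communication geometry, the sketch below (Python) computes the neighbors of a node in a d-dimensional hypercube, the topology used by the Connection Machine: each node is linked to the nodes whose addresses differ from its own in exactly one bit. The example values are arbitrary.

        # Neighbors of a hypercube node: flip each of the dim address bits in turn.
        def hypercube_neighbors(node, dim):
            return [node ^ (1 << k) for k in range(dim)]

        print(hypercube_neighbors(5, 4))   # [4, 7, 1, 13]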