66 research outputs found
Differentiated ABR: A new architecture for flow control and service differentiation in optical burst switched networks
In this paper, we study a new control plane protocol, called Differentiated ABR (D-ABR), for flow control and service differentiation in optical burst switched networks. Using D-ABR, we show using simulations that the optical network can be designed to work at any desired burst blocking probability by the flow control service of the proposed architecture. This architecture requires certain modifications to the existing control plane mechanisms as well as incorporation of certain scheduling mechanisms at the ingress nodes; however we do not make any specific assumptions on the data plane for the optical core nodes. Moreover, with this protocol, it is possible to almost perfectly isolate high priority and low priority traffic throughout the optical network as in the strict priority-based service differentiation in electronically switched networks. © 2005 IEEE
A Survey of Techniques for Architecting TLBs
A translation lookaside buffer (TLB) caches virtual-to-physical address translation information and is used
in systems ranging from embedded devices to high-end servers. Since the TLB is accessed very frequently
and a TLB miss is extremely costly, prudent management of the TLB is important for improving the performance
and energy efficiency of processors. In this paper, we present a survey of techniques for architecting and
managing TLBs. We characterize the techniques across several dimensions to highlight their similarities and
distinctions. We believe that this paper will be useful for chip designers, computer architects, and system
engineers.
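As a concrete illustration of what a TLB caches, here is a toy fully-associative TLB with LRU replacement; the entry count, page size, and class names are illustrative assumptions, not drawn from the survey:

```python
from collections import OrderedDict

class TLB:
    """Toy fully-associative TLB with LRU replacement (illustrative only)."""
    def __init__(self, entries=4, page_size=4096):
        self.entries = entries
        self.page_size = page_size
        self.map = OrderedDict()  # virtual page number -> physical frame number

    def translate(self, vaddr, page_table):
        vpn, offset = divmod(vaddr, self.page_size)
        if vpn in self.map:                    # TLB hit: fast path
            self.map.move_to_end(vpn)          # refresh LRU position
            hit = True
        else:                                  # TLB miss: walk the page table
            self.map[vpn] = page_table[vpn]
            if len(self.map) > self.entries:
                self.map.popitem(last=False)   # evict least recently used
            hit = False
        return self.map[vpn] * self.page_size + offset, hit
```

Two accesses to the same page illustrate the miss-then-hit behavior that makes TLB management so performance-critical: the first access walks the page table, the second is served from the cached entry.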
Application of advanced on-board processing concepts to future satellite communications systems: Bibliography
Abstracts are presented from a literature survey of reports concerning the application of signal processing concepts. Approximately 300 references are included.
Design of a scheduling mechanism for an ATM switch
In this dissertation, the candidate proposes the use of a ratio to multiply the weights used in the matching algorithm, in order to control the delay that individual connections encounter. We demonstrate the improved characteristics of a switch using this ratio by presenting simulation results. The candidate also proposes a novel scheduling mechanism for an input-queued ATM switch. To evaluate the performance of the scheduling mechanism in terms of throughput and fairness, various metrics initially proposed in the literature for output-buffered switches are evaluated, adjusted, and applied to input scheduling. In particular, the Worst-case Fairness Index (WFI), which measures the maximum delay a connection will encounter, is derived for use in input-queued switches.
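The ratio-scaled weighting idea can be sketched as a greedy maximal matching over a scaled weight matrix; the greedy heuristic and names below are illustrative assumptions, not the dissertation's exact algorithm:

```python
def ratio_weighted_match(weights, ratios):
    """Greedy maximal matching for an input-queued switch, with each
    input/output weight scaled by a per-connection delay-control ratio
    (an assumed sketch, not the dissertation's actual mechanism)."""
    scaled = {(i, j): w * ratios[i][j]
              for i, row in enumerate(weights)
              for j, w in enumerate(row) if w > 0}
    used_in, used_out, match = set(), set(), []
    # Consider edges in decreasing scaled weight; take each edge whose
    # input and output ports are both still free.
    for (i, j), w in sorted(scaled.items(), key=lambda kv: -kv[1]):
        if i not in used_in and j not in used_out:
            match.append((i, j))
            used_in.add(i)
            used_out.add(j)
    return match
```

Raising a connection's ratio boosts its effective weight, so a delayed connection wins the matching sooner; this is the sense in which the ratio controls per-connection delay.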
Flow control and service differentiation in optical burst switching networks
Optical Burst Switching (OBS) is being considered as a candidate architecture
for the next generation optical Internet. The central idea behind OBS is the assembly
of client packets into longer bursts at the edge of an OBS domain and the
promise of optical technologies to enable switch reconfiguration at the burst level
therefore providing a near-term optical networking solution with finer switching
granularity in the optical domain. In conventional OBS, bursts are injected into
the network immediately after their assembly, irrespective of the loading on the
links, which in turn leads to uncontrolled burst losses and deteriorating performance
for end users. Another key concern related to OBS is the difficulty of
supporting QoS (Quality of Service) in the optical domain whereas support of
differentiated services via per-class queueing is very common in current electronically
switched networks. In this thesis, we propose a new control plane protocol,
called Differentiated ABR (D-ABR), for flow control (i.e., burst shaping) and
service differentiation in optical burst switching networks. Using D-ABR, we
show with the aid of simulations that the optical network can be designed to
work at any desired burst blocking probability by the flow control service of the proposed architecture. The proposed architecture requires certain modifications
to the existing control plane mechanisms as well as incorporation of advanced
scheduling mechanisms at the ingress nodes; however we do not make any specific
assumptions on the data plane of the optical nodes. With this protocol, it is
possible to almost perfectly isolate high priority and low priority traffic throughout
the optical network as in the strict priority-based service differentiation in
electronically switched networks. Moreover, the proposed architecture moves the
congestion away from the OBS domain to the edges of the network where it is
possible to employ advanced queueing and buffer management mechanisms. We
also conjecture that such a controlled OBS architecture may reduce the number
of costly Wavelength Converters (WC) and Fiber Delay Lines (FDL) that are
used for contention resolution inside an OBS domain.Boyraz, HakanM.S
Performance analysis and improvement of InfiniBand networks. Modelling and effective Quality-of-Service mechanisms for interconnection networks in cluster computing systems.
The InfiniBand Architecture (IBA) network has been proposed as a new
industrial standard with high-bandwidth and low-latency suitable for constructing
high-performance interconnected cluster computing systems. This architecture
replaces the traditional bus-based interconnection with a switch-based network for
the server Input-Output (I/O) and inter-processor communications. An efficient
Quality-of-Service (QoS) mechanism is fundamental to ensuring the important QoS
metrics, such as maximum throughput and minimum latency, without neglecting other
aspects such as guarantees on delay, blocking probability, and mean queue
length.
Performance modelling and analysis has been and continues to be of great
theoretical and practical importance in the design and development of
communication networks. This thesis aims to investigate efficient and cost-effective
QoS mechanisms for performance analysis and improvement of InfiniBand
networks in cluster-based computing systems.
Firstly, a rate-based source-response link-by-link admission and congestion
control function with an improved Explicit Congestion Notification (ECN) packet
marking scheme is developed. This function adopts rate control to reduce
congestion of multiple-class traffic. Secondly, a credit-based flow control scheme is
presented to reduce the mean queue length and response time of the system and to improve its throughput. In order to evaluate the performance of this scheme, a new queueing
network model is developed. Theoretical analysis and simulation experiments show
that these two schemes are quite effective and suitable for InfiniBand networks.
Finally, to obtain a thorough and deep understanding of the performance attributes
of InfiniBand Architecture network, two efficient threshold function flow control
mechanisms are proposed to enhance the QoS of InfiniBand networks; one is Entry
Threshold, which sets the threshold for each entry in the arbitration table, and the other is
Arrival Job Threshold, which sets the threshold based on the number of jobs in each
Virtual Lane. Furthermore, the principle of Maximum Entropy is adopted to analyse
these two new mechanisms with the Generalized Exponential (GE)-Type
distribution for modelling the inter-arrival times and service times of the input traffic.
Extensive simulation experiments are conducted to validate the accuracy of the
analytical models.
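The credit-based flow control mentioned above can be sketched as the link-level scheme used on InfiniBand-style fabrics: a sender may transmit only while it holds credits, and the receiver returns a credit for each buffer it frees; buffer counts and names below are illustrative assumptions:

```python
class CreditLink:
    """Toy link-level credit-based flow control: credits equal free
    receiver buffers, so the sender can never overrun the receiver
    (an assumed sketch, not the thesis's queueing model)."""
    def __init__(self, receiver_buffers=3):
        self.credits = receiver_buffers   # advertised at link initialization
        self.rx_queue = []

    def send(self, packet):
        if self.credits == 0:
            return False                  # back-pressure: sender must stall
        self.credits -= 1                 # consume one credit per packet
        self.rx_queue.append(packet)
        return True

    def receiver_drain(self):
        if self.rx_queue:
            self.rx_queue.pop(0)
            self.credits += 1             # freed buffer -> credit returns
```

Because transmission stops exactly when receiver buffers run out, packets are never dropped at the link; congestion instead shows up as bounded queue growth, which is what the queueing-network model in the thesis analyzes.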
Optimizing Communication for Massively Parallel Processing
The current trends in high performance computing show that large machines with tens of thousands of processors will soon be readily available. The IBM Bluegene-L machine with 128k processors (which is currently being deployed) is an important step in this direction. In this scenario, it is going to be a significant burden for the programmer to manually scale their applications. This task of scaling involves addressing issues like load imbalance and communication overhead. In this thesis, we explore several communication optimizations to help parallel applications easily scale on a large number of processors. We also present automatic runtime techniques to relieve the programmer from the burden of optimizing communication in their applications.
This thesis explores processor virtualization to improve communication performance in applications. With processor virtualization, the computation is mapped to virtual processors (VPs). After one VP has finished computation and is waiting for responses to its messages, another VP can compute, thus overlapping communication with computation. This overlap is only effective if the processor overhead of the communication operation is a small fraction of the total communication time. Fortunately, with network interfaces having co-processors, this happens to be true and processor virtualization has a natural advantage on such interconnects.
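The overlap enabled by processor virtualization can be sketched with a toy cooperative scheduler: when one virtual processor yields (its message is "in flight"), another gets the CPU; names and the round-robin policy are illustrative assumptions, not the thesis's runtime:

```python
from collections import deque

def run_vps(vps):
    """Round-robin over virtual processors implemented as generators:
    a yield models a VP blocking on communication, at which point
    another VP computes (a toy sketch of the overlap, not the actual
    runtime system)."""
    ready, log = deque(vps), []
    while ready:
        vp_gen = ready.popleft()
        try:
            log.append(next(vp_gen))   # run VP until it waits on a message
            ready.append(vp_gen)       # requeue it behind the others
        except StopIteration:
            pass                       # VP finished all its work
    return log

def vp(name, steps):
    for s in range(steps):
        yield f"{name}:compute{s}"     # yield = message now in flight
```

The interleaved log shows the point of the abstract: while one VP's messages are outstanding, the processor is never idle as long as another VP has work.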
The communication optimizations we present in this thesis, are motivated by applications such as NAMD (a classical molecular dynamics application) and CPAIMD (a quantum chemistry application). Applications like NAMD and CPAIMD consume a fair share of the time available on supercomputers. So, improving their performance would be of great value. We have successfully scaled NAMD to 1TF of peak performance on 3000 processors of PSC Lemieux, using the techniques presented in this thesis.
We study both point-to-point communication and collective communication (specifically all-to-all communication). On a large number of processors, all-to-all communication can take several milliseconds to finish. With synchronous collectives defined in MPI, the processor idles while the collective messages are in flight. Therefore, we demonstrate an asynchronous collective communication framework, to let the CPU compute while the all-to-all messages are in flight. We also show that the best strategy for all-to-all communication depends on the message size, number of processors and other dynamic parameters. This suggests that these parameters can be observed at runtime and used to choose the optimal strategy for all-to-all communication. In this thesis, we demonstrate adaptive strategy switching for all-to-all communication.
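The runtime strategy choice can be sketched as a selector keyed on the observed parameters; the threshold value and strategy names below are illustrative assumptions, not the thesis's measured crossover points:

```python
def choose_alltoall_strategy(msg_bytes, nprocs, switch_bytes=1024):
    """Pick an all-to-all strategy from runtime-observed parameters,
    in the spirit of adaptive strategy switching (threshold and names
    are assumed for illustration)."""
    if msg_bytes < switch_bytes:
        # Small messages: per-message overhead dominates, so a
        # message-combining scheme with fewer, larger messages wins.
        return "combining"
    # Large messages: bandwidth dominates, so sending each message
    # directly to its destination is preferable.
    return "direct"
```

A runtime can sample `msg_bytes` and `nprocs` on each collective and re-evaluate this choice, which is the essence of switching strategies adaptively rather than fixing one at compile time.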
The communication optimization framework presented in this thesis, has been designed to optimize communication in the context of processor virtualization and dynamic migrating objects. We present the streaming strategy to optimize fine grained object-to-object communication.
In this thesis, we motivate the need for hardware collectives, as processor-based collectives can be delayed by intermediate processors that are busy with computation. We explore a next-generation interconnect that supports collectives in the switching hardware. We show the performance gains of hardware collectives through synthetic benchmarks.
- …