Cross-Layer Platform for Dynamic, Energy-Efficient Optical Networks
The design of the next-generation Internet infrastructure is driven by the need to sustain the massive growth in bandwidth demands. Novel, energy-efficient, optical networking technologies and architectures are required to effectively meet the stringent performance requirements with low cost and ultrahigh energy efficiencies. In this thesis, a cross-layer communications platform is proposed to enable greater intelligence and functionality on the physical layer. Providing the optical layer with advanced networking capabilities will facilitate the dynamic management and optimization of optical switching based on performance monitoring measurements and higher-layer attributes. The cross-layer platform aims to create a new framework for networks to incorporate packet-scale measurement subsystems and techniques for monitoring the health of the optical channel. This will allow for quality-of-service- and energy-aware routing schemes, as well as an enhanced awareness of the optical data signals. This thesis first presents the design and development of an optical packet switching fabric. Leveraging a networking test-bed environment to validate networking hypotheses, advanced switching functionalities are demonstrated, including the support for quality-of-service based routing and packet multicasting. The investigated cross-layering is based on emerging optical technologies, enabling packet protection techniques and packet-rate switching fabric reconfiguration. Coupled with fast performance monitoring, the platform will achieve significant performance gains within the endeavor of all-optical switching. Allowing for a more intelligent, programmable optical layer aims to support greater flexibility with respect to bandwidth allocation and potentially a significant reduction in the network's energy consumption. The ultimate deliverable of this work is a high-performance, cross-layer enabled optical network node. 
The experimental demonstration of an initial prototype creates a dynamic network element with distributed control plane management, featuring fast packet-rate optical switching capabilities and embedded physical-layer performance monitoring modules. The cross-layer box enables an intelligent traffic delivery system that can dynamically manipulate optical switching on a packet-granular scale. With the goal of achieving advanced multi-layer routing and control algorithms, the network node requires an intelligent co-optimization across all the layers. The proposed cross-layer design should drive optical technologies and architectures in an innovative way, in order to fill the void between the design of basic photonic devices and the networking protocols that use them. The performance of the entire network -- from the optical components, to the routing algorithms and user applications -- should be optimized in concert. This contribution to the area of cross-layer network design creates an adaptable optical pipe that is extremely flexible and intelligently aware of both the physical optical signals and higher-layer requirements. The impact of this work will be seen in the realization of dynamic, energy-efficient optical communication links in future networking infrastructures.
Multi-Terabit/s IP Switching with Guaranteed Service for Streaming Traffic
As traffic on the Internet continues to grow exponentially, there is a real need to solve transmission and switching scalability. Moreover, future Internet traffic will be dominated by streaming media flows, such as video-telephony, video-conferencing, 3D video, virtual reality, and many more. Consequently, network solutions will need to offer quality of service and traffic engineering together with the above-mentioned scalability, i.e., over-provisioning is not likely to be a viable solution to accommodate streaming media traffic. This paper describes the architecture of an ultra-scalable IP switch and the first experiments with a prototype implementation. The switch's scalability is a consequence of its pipeline forwarding of packets, which also results in quality of service guarantees for UDP-based streaming applications, while preserving elastic TCP-based traffic as is, i.e., without affecting any existing applications based on "best-effort" services. Moreover, the prototype demonstrates the low complexity of pipeline forwarding implementation, as the deployed network gear was realized from off-the-shelf components in only nine months through the design, implementation, and testing efforts of the authors.
An efficient design space exploration framework to optimize power-efficient heterogeneous many-core multi-threading embedded processor architectures
By the middle of this decade, uniprocessor architecture performance had hit a roadblock due to a combination of factors, such as excessive power dissipation due to high operating frequencies, growing memory access latencies, diminishing returns on deeper instruction pipelines, and a saturation of available instruction level parallelism in applications. An attractive and viable alternative embraced by all the processor vendors was multi-core architectures, where throughput is improved by using micro-architectural features such as multiple processor cores, interconnects and low latency shared caches integrated on a single chip. The individual cores are often simpler than uniprocessor counterparts, use hardware multi-threading to exploit thread-level parallelism and latency hiding, and typically achieve better performance-power figures. The overwhelming success of the multi-core microprocessors in both high performance and embedded computing platforms motivated chip architects to dramatically scale the multi-core processors to many-cores, which will include hundreds of cores on-chip to further improve throughput. With such complex large scale architectures, however, several key design issues need to be addressed. First, a wide range of micro-architectural parameters such as L1 caches, load/store queues, shared cache structures and interconnection topologies, and the non-linear interactions between them, define a vast non-linear multi-variate micro-architectural design space of many-core processors; the traditional method of using extensive in-loop simulation to explore the design space is simply not practical. Second, to accurately evaluate the performance (measured in terms of cycles per instruction (CPI)) of a candidate design, the contention at the shared cache must be accounted for in addition to the cycle-by-cycle behavior of the large number of cores, which superlinearly increases the number of simulation cycles per iteration of the design exploration.
Third, single thread performance does not scale linearly with the number of hardware threads per core and the number of cores, due to the memory wall effect. This means that at every step of the design process designers must ensure that single thread performance is not unacceptably slowed down while increasing overall throughput. While all these factors affect design decisions in both high performance and embedded many-core processors, the design of embedded processors required for complex embedded applications such as networking, smart power grids, battlefield decision-making, consumer electronics and biomedical devices, to name a few, is fundamentally different from its high performance counterpart because of the need to consider (i) low power and (ii) real-time operations. This implies that the design objective for embedded many-core processors cannot be to simply maximize performance, but to improve it in such a way that overall power dissipation is minimized and all real-time constraints are met. This necessitates additional power estimation models right at the design stage to accurately measure the cost and reliability of all the candidate designs during the exploration phase.
In this dissertation, a statistical machine learning (SML) based design exploration framework is presented which employs an execution-driven cycle-accurate simulator to accurately measure power and performance of embedded many-core processors. The embedded many-core processor domain is Network Processors (NePs) used to process network IP packets. Future generation NePs required to operate at terabits per second network speeds capture all the aspects of a complex embedded application consisting of shared data structures, a large volume of compute-intensive and data-intensive real-time bound tasks and a high level of task (packet) level parallelism. Statistical machine learning (SML) is used to efficiently model performance and power of candidate designs in terms of wide ranges of micro-architectural parameters. The method inherently minimizes the number of in-loop simulations in the exploration framework and also efficiently captures the non-linear interactions between the micro-architectural design parameters. To ensure scalability, the design space is partitioned into (i) core-level micro-architectural parameters to optimize single core architectures subject to the real-time constraints and (ii) shared memory level micro-architectural parameters to explore the shared interconnection network and shared cache memory architectures and achieve overall optimality. The cost function of our exploration algorithm is the total power dissipation, which is minimized subject to the constraints of real-time throughput (as determined from the terabit optical network router line-speed) required in the IP packet processing embedded application.
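The exploration loop described above (simulate a few designs, model the rest, minimize power subject to a throughput constraint) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three design parameters, the toy `simulate` function standing in for a cycle-accurate simulator, and the nearest-neighbour surrogate, which is a simple substitute for whatever statistical model the dissertation actually uses.

```python
import itertools
import random

# Hypothetical design space: (cores, threads per core, L1 size in KB).
DESIGN_SPACE = list(itertools.product([4, 8, 16, 32], [1, 2, 4], [8, 16, 32]))


def simulate(design):
    """Toy stand-in for a cycle-accurate simulator (assumption, not a real model).

    Returns (power, throughput) in arbitrary units.
    """
    cores, threads, l1_kb = design
    throughput = cores * (1.0 + 0.6 * (threads - 1)) * (1.0 + 0.01 * l1_kb)
    power = 0.5 * cores * (1.0 + 0.2 * threads) + 0.02 * cores * l1_kb
    return power, throughput


def knn_predict(design, simulated, k=3):
    """Surrogate: average the results of the k nearest already-simulated designs."""
    def dist(a, b):
        # Squared distance with each parameter scaled by its maximum value.
        return sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, (32, 4, 32)))

    nearest = sorted(simulated, key=lambda d: dist(design, d))[:k]
    power = sum(simulated[d][0] for d in nearest) / len(nearest)
    throughput = sum(simulated[d][1] for d in nearest) / len(nearest)
    return power, throughput


def explore(required_throughput, n_samples=12, seed=0):
    """Return (design, power, throughput, simulations_run), or None if infeasible."""
    rng = random.Random(seed)
    simulated = {d: simulate(d) for d in rng.sample(DESIGN_SPACE, n_samples)}
    predicted = {d: knn_predict(d, simulated) for d in DESIGN_SPACE}
    # Visit designs predicted to be feasible first, lowest predicted power first.
    order = sorted(
        DESIGN_SPACE,
        key=lambda d: (predicted[d][1] < required_throughput, predicted[d][0]),
    )
    for design in order:
        if design not in simulated:
            simulated[design] = simulate(design)  # verify with the "simulator"
        power, throughput = simulated[design]
        if throughput >= required_throughput:
            return design, power, throughput, len(simulated)
    return None
```

The surrogate only orders the candidates; each candidate is still checked with the simulator before it is accepted, so the returned design meets the constraint even when the predictions are poor. The first verified design is not guaranteed to be the global power minimum.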
Enhancing QoS provisioning and granularity in next generation internet
Next Generation IP technology has the potential to prevail, both in the access and in the core networks, as we are moving towards a multi-service, multimedia and high-speed networking environment. Many new applications, including multimedia applications, have been developed and deployed, and demand Quality of Service (QoS) support from the Internet, in addition to the current best effort service. Therefore, QoS provisioning techniques in the Internet to guarantee some specific QoS parameters are more a requirement than a desire. Due to the large number of data flows and the bandwidth demand, as well as the various QoS requirements, scalability and fine granularity in QoS provisioning are required. In this dissertation, the end-to-end QoS provisioning mechanisms are mainly studied, in order to provide scalable services with fine granularity to the users, so that both users and network service providers can achieve more benefits from the QoS provisioned in the network.
To provide the end-to-end QoS guarantee, single-node QoS provisioning schemes have to be deployed at each router, and therefore, in this dissertation, such schemes are studied prior to the study of the end-to-end QoS provisioning mechanisms. Specifically, the effective sharing of the output bandwidth among the large number of data flows is studied, so that fairness in the bandwidth allocation among the flows can be achieved in a scalable fashion. A dual-rate grouping architecture is proposed in this dissertation, in which the granularity in rate allocation can be enhanced, while the scalability of the one-rate grouping architecture is still maintained. It is demonstrated that the dual-rate grouping architecture approximates the ideal per-flow based PFQ architecture better than the one-rate grouping architecture, and provides better immunity capability.
On the end-to-end QoS provisioning, a new Endpoint Admission Control scheme for Diffserv networks, referred to as Explicit Endpoint Admission Control (EEAC), is proposed, in which the admission control decision is made by the end hosts based on the end-to-end performance of the network. A novel concept, namely the service vector, is introduced, by which an end host can choose different services at different routers along its data path. Thus, the proposed service provisioning paradigm decouples the end-to-end QoS provisioning from the service provisioning at each router, and the end-to-end QoS granularity in the Diffserv networks can be enhanced, while the implementation complexity of the Diffserv model is maintained. Furthermore, several aspects of the implementation of the EEAC and service vector paradigm, referred to as EEAC-SV, in the Diffserv architecture are also investigated. The performance analysis and simulation results demonstrate that the proposed EEAC-SV scheme not only increases the benefit to the service users, but also enhances the benefit to the network service provider in terms of network resource utilization. The study also indicates that the proposed EEAC-SV scheme can provide a compatible and friendly networking environment to the conventional TCP flows, and the scheme can be deployed in the current Internet in an incremental and gradual fashion.
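The service vector idea, in which an end host picks a possibly different service class at each router along its path, can be illustrated with a small sketch. The abstract does not say how the host makes that choice, so the selection rule below (cheapest vector that meets an end-to-end delay bound, found by brute force) is an assumption, as are the per-class delay and price figures. EF, AF and BE are used only as familiar Diffserv class names.

```python
import itertools

# Hypothetical per-router service classes: name -> (delay in ms, price).
# The numbers are illustrative assumptions, not measurements.
SERVICES = {"EF": (2.0, 5.0), "AF": (5.0, 2.0), "BE": (12.0, 0.0)}


def choose_service_vector(num_routers, delay_bound_ms):
    """Return (vector, price, delay) for the cheapest feasible vector, or None.

    A vector assigns one service class to each router on the path. Brute force
    is used for clarity; it is only practical for short paths.
    """
    best = None
    for vector in itertools.product(SERVICES, repeat=num_routers):
        delay = sum(SERVICES[s][0] for s in vector)
        price = sum(SERVICES[s][1] for s in vector)
        if delay <= delay_bound_ms and (best is None or price < best[1]):
            best = (vector, price, delay)
    return best


if __name__ == "__main__":
    print(choose_service_vector(3, 14.0))
```

With these toy numbers and a 14 ms bound over three routers, the only uniform choice that meets the bound is EF at every hop (price 15.0), whereas mixing one EF hop with two AF hops meets it at price 9.0. This is the kind of finer end-to-end granularity the abstract attributes to the service vector.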