
    Branch Prediction For Network Processors

    Originally designed to favour flexibility over packet processing performance, the programmable network processor now faces the challenge of meeting increasing line rates while also providing additional processing capabilities. To meet these requirements, networking research has tended to focus on techniques such as offloading computation-intensive tasks to dedicated hardware logic or increasing parallelism. While parallelism retains flexibility, challenges such as load balancing limit its scope. Hardware offloading, on the other hand, allows complex algorithms to be implemented at high speed but sacrifices flexibility. To this end, the work in this thesis focuses on a more fundamental aspect of a network processor: the data-plane processing engine. By performing both system modelling and analysis of packet processing functions, this thesis aims to identify and extract salient information regarding the performance of multi-processor workloads. Following on from a traditional software-based analysis of programme workloads, we develop a method of modelling and analysing hardware accelerators when applied to network processors. Using this quantitative information, this thesis proposes an architecture which allows deeply pipelined micro-architectures to be implemented on the data plane while reducing the branch penalty associated with these architectures.

    Effective Mobile Routing Through Dynamic Addressing

    Military communications have always been an important factor in military victory and will surely play an important part in future combat. In modern warfare, military units are usually deployed without existing network infrastructure. The IP routing protocol, designed for hierarchical networks, cannot easily be applied in military networks because of the dynamic topology expected in military environments. Mobile ad-hoc networks (MANETs) are an appropriate choice for small military networks. However, most ad-hoc routing protocols scale poorly to large networks. Hierarchical routing schemes based on the IP address structure are more scalable than ad-hoc routing but are not flexible enough for a network with a very dynamic topology. This research seeks a compromise between the two: a hybrid routing structure which combines mobile ad-hoc network routing with hierarchical network routing, using pre-planned knowledge about where the various military units will be located and which connections are likely to be available. This research evaluates the performance of the hybrid routing and compares it with a flat ad-hoc routing protocol, namely the Ad-hoc On-demand Distance Vector (AODV) routing protocol, with respect to goodput ratio, end-to-end packet delay, and routing packet overhead. It shows that hybrid routing generates lower routing control overhead, a better goodput ratio, and lower end-to-end packet delay than the AODV routing protocol in situations where some a priori knowledge is available.

    Network architecture for large-scale distributed virtual environments

    Distributed Virtual Environments (DVEs) provide 3D graphical computer-generated environments with stereo sound, supporting real-time collaboration between potentially large numbers of users distributed around the world. Early DVEs were used over local area networks (LANs). Recently, as the Internet has become the most common setting for DVEs, these distributed applications have moved towards exploiting IP networks. This has brought scalability challenges into the evolution of DVEs. Network bandwidth is the most limited resource of a DVE system, and improving the DVE's scalability requires managing this resource carefully. To save network bandwidth, the different types of network traffic produced by DVEs have to be considered. DVE applications demand the exchange of data that forms different types of traffic, such as computer data, video and audio, and 3D data, to keep the application's state consistent. Existing research has already covered how to meet the QoS requirements of both control and continuous media traffic, but QoS for the transfer of 3D information has not really been considered. The 3D DVE geometry traffic is very bursty in nature and places high demands on the network for short intervals of time, because 3D models are quite large and the DVE application requires 3D data to be transmitted as quickly as possible. The main motivation for the work presented in this thesis is to improve the scalability of DVE applications by considering the QoS requirements of the 3D DVE geometrical data type. In this work we investigate the possibility of decreasing the network bandwidth used by 3D DVE traffic using the level of detail (LOD) concept and the active networking approach.
The background work of the thesis surveys DVE applications and the scalability requirements of DVE systems. It also discusses active networks, multiresolution representation and progressive transmission of 3D data. This thesis proposes a new active networking approach to the transmission of 3D geometry data within DVE systems. This approach enhances the currently applied peer-to-peer DVE architecture by supplementing the network-layer multicast filtering of 3D flows with application-level filtering on the active intermediate nodes. The active router keeps application-level information about the placement of users. Active routers use this information to prune the more detailed 3D data flows (higher LODs) in the multicast tree arcs that lead to distant DVE participants. The possible benefits of the proposed active approach are explored by comparing it with the non-active approach using simulation-based performance modelling. The complex interactions between participants in a DVE application and the large number of analyzed variables indicate that flexible simulation is more appropriate than mathematical modelling, and building a test bed would not be feasible. Results from the evaluation demonstrate that the proposed active approach can potentially improve the DVE's scalability, but the degree of improvement depends on the users' movement pattern. Therefore, other active networking methods to support 3D DVE geometry transmission may also be required.
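The pruning rule described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the distance thresholds, the `prune_flows` name, and the convention that LOD 0 is the coarsest flow are all assumptions made for illustration.

```python
import math

# Hypothetical thresholds: (maximum distance, highest LOD worth forwarding).
# LOD 0 is assumed to be the coarsest model; higher numbers add detail.
LOD_BY_DISTANCE = [(10.0, 3), (50.0, 2), (200.0, 1)]


def max_lod_for_distance(distance):
    """Map a viewer distance to the highest LOD flow that viewer needs."""
    for limit, lod in LOD_BY_DISTANCE:
        if distance <= limit:
            return lod
    return 0  # very distant viewers only receive the base model


def prune_flows(source_pos, downstream_positions, lod_flows):
    """Return the LOD flows an active router forwards on one multicast arc.

    The router uses application-level knowledge of user positions: the
    nearest downstream participant decides how much detail the arc carries.
    """
    if not downstream_positions:
        return []  # nobody downstream, forward nothing
    nearest = min(math.dist(source_pos, p) for p in downstream_positions)
    allowed = max_lod_for_distance(nearest)
    return [lod for lod in lod_flows if lod <= allowed]
```

For example, an arc whose nearest participant is 100 units from the object would carry only flows 0 and 1 under these made-up thresholds, while the higher-detail flows are dropped at the router.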

    Towards Terabit Carrier Ethernet and Energy Efficient Optical Transport Networks

    An efficient design space exploration framework to optimize power-efficient heterogeneous many-core multi-threading embedded processor architectures

    By the middle of this decade, uniprocessor architecture performance had hit a roadblock due to a combination of factors, such as excessive power dissipation due to high operating frequencies, growing memory access latencies, diminishing returns on deeper instruction pipelines, and a saturation of available instruction-level parallelism in applications. An attractive and viable alternative embraced by all the processor vendors was multi-core architectures, where throughput is improved by using micro-architectural features such as multiple processor cores, interconnects and low-latency shared caches integrated on a single chip. The individual cores are often simpler than their uniprocessor counterparts, use hardware multi-threading to exploit thread-level parallelism and latency hiding, and typically achieve better performance-power figures. The overwhelming success of multi-core microprocessors in both high-performance and embedded computing platforms motivated chip architects to dramatically scale multi-core processors to many-cores, which will include hundreds of cores on-chip to further improve throughput. With such complex large-scale architectures, however, several key design issues need to be addressed. First, a wide range of micro-architectural parameters, such as L1 caches, load/store queues, shared cache structures and interconnection topologies, together with the non-linear interactions between them, define a vast non-linear multi-variate micro-architectural design space of many-core processors; the traditional method of using extensive in-loop simulation to explore the design space is simply not practical. Second, to accurately evaluate the performance (measured in cycles per instruction (CPI)) of a candidate design, the contention at the shared cache must be accounted for in addition to the cycle-by-cycle behavior of the large number of cores, which superlinearly increases the number of simulation cycles per iteration of the design exploration.
Third, single-thread performance does not scale linearly with the number of hardware threads per core and the number of cores due to the memory wall effect. This means that at every step of the design process designers must ensure that single-thread performance is not unacceptably slowed down while overall throughput increases. While all these factors affect design decisions in both high-performance and embedded many-core processors, the design of embedded processors for complex embedded applications such as networking, smart power grids, battlefield decision-making, consumer electronics and biomedical devices, to name a few, is fundamentally different from its high-performance counterpart because of the need to consider (i) low power and (ii) real-time operations. This implies that the design objective for embedded many-core processors cannot be simply to maximize performance, but to improve it in such a way that overall power dissipation is minimized and all real-time constraints are met. This necessitates additional power estimation models right at the design stage to accurately measure the cost and reliability of all the candidate designs during the exploration phase. In this dissertation, a statistical machine learning (SML) based design exploration framework is presented which employs an execution-driven cycle-accurate simulator to accurately measure power and performance of embedded many-core processors. The embedded many-core processor domain is Network Processors (NePs), used to process network IP packets. Future-generation NePs, required to operate at terabit-per-second network speeds, capture all the aspects of a complex embedded application: shared data structures, a large volume of compute-intensive and data-intensive real-time-bound tasks, and a high level of task (packet) level parallelism.
Statistical machine learning (SML) is used to efficiently model the performance and power of candidate designs in terms of wide ranges of micro-architectural parameters. The method inherently minimizes the number of in-loop simulations in the exploration framework and also efficiently captures the non-linear interactions between the micro-architectural design parameters. To ensure scalability, the design space is partitioned into (i) core-level micro-architectural parameters, used to optimize single-core architectures subject to the real-time constraints, and (ii) shared-memory-level micro-architectural parameters, used to explore the shared interconnection network and shared cache memory architectures and achieve overall optimality. The cost function of our exploration algorithm is the total power dissipation, which is minimized subject to the real-time throughput constraints (as determined from the terabit optical network router line speed) required by the embedded IP packet processing application.
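The final selection step, minimizing predicted power subject to a throughput constraint, can be sketched as below. This is only an illustration under assumptions: `select_design`, `toy_power` and `toy_throughput` are hypothetical names, and the two toy formulas merely stand in for the trained SML models, which the abstract does not specify.

```python
from itertools import product


def select_design(candidates, predict_power, predict_throughput, required_throughput):
    """Pick the lowest-power candidate whose predicted throughput meets the constraint.

    predict_power and predict_throughput stand in for learned surrogate models,
    so no cycle-accurate simulation runs inside this loop.
    """
    feasible = [c for c in candidates if predict_throughput(c) >= required_throughput]
    if not feasible:
        return None  # no candidate satisfies the real-time constraint
    return min(feasible, key=predict_power)


# Toy surrogate models (assumed, not from the dissertation).
# A design is a (cores, threads_per_core) tuple.
def toy_power(design):
    cores, threads = design
    return 2.0 * cores + 0.5 * cores * threads


def toy_throughput(design):
    cores, threads = design
    return 10 * cores * threads


designs = list(product((8, 16, 32), (2, 4, 8)))
best = select_design(designs, toy_power, toy_throughput, required_throughput=1000)
```

Keeping the constraint as a hard filter and the cost as a separate key mirrors the abstract's formulation, in which power is the objective and line-rate throughput is a requirement rather than something to trade away.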

    Segment Routing: a Comprehensive Survey of Research Activities, Standardization Efforts and Implementation Results

    Fixed and mobile telecom operators, enterprise network operators and cloud providers strive to face the challenging demands coming from the evolution of IP networks (e.g. huge bandwidth requirements, integration of billions of devices and millions of services in the cloud). Proposed in the early 2010s, Segment Routing (SR) architecture helps face these challenging demands, and it is currently being adopted and deployed. SR architecture is based on the concept of source routing and has interesting scalability properties, as it dramatically reduces the amount of state information to be configured in the core nodes to support complex services. SR architecture was first implemented with the MPLS dataplane and then, quite recently, with the IPv6 dataplane (SRv6). IPv6 SR architecture (SRv6) has been extended from the simple steering of packets across nodes to a general network programming approach, making it very suitable for use cases such as Service Function Chaining and Network Function Virtualization. In this paper we present a tutorial and a comprehensive survey on SR technology, analyzing standardization efforts, patents, research activities and implementation results. We start with an introduction on the motivations for Segment Routing and an overview of its evolution and standardization. Then, we provide a tutorial on Segment Routing technology, with a focus on the novel SRv6 solution. We discuss the standardization efforts and the patents providing details on the most important documents and mentioning other ongoing activities. We then thoroughly analyze research activities according to a taxonomy. We have identified 8 main categories during our analysis of the current state of play: Monitoring, Traffic Engineering, Failure Recovery, Centrally Controlled Architectures, Path Encoding, Network Programming, Performance Evaluation and Miscellaneous...Comment: SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIAL
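The source-routing idea behind SR can be shown with a small Python sketch, loosely modelled on the SRv6 Segment Routing Header, where the segment list is stored in reverse order and a "segments left" index points at the active segment. The `Packet` class, the function names and the addresses (taken from the 2001:db8::/32 documentation prefix) are illustrative assumptions, not code from the survey.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dst: str             # current destination address (the active segment)
    segment_list: list   # segments in reverse order: index 0 is the last segment
    segments_left: int   # index of the active segment in segment_list


def encapsulate(path):
    """Source node: encode an ordered, non-empty list of segments into the packet."""
    reversed_path = list(reversed(path))
    return Packet(dst=path[0], segment_list=reversed_path, segments_left=len(path) - 1)


def process_at_endpoint(pkt):
    """Segment endpoint: activate the next segment if any remain."""
    if pkt.segments_left > 0:
        pkt.segments_left -= 1
        pkt.dst = pkt.segment_list[pkt.segments_left]
    return pkt


path = ["2001:db8::a", "2001:db8::b", "2001:db8::c"]
pkt = encapsulate(path)
visited = [pkt.dst]
while pkt.segments_left > 0:
    process_at_endpoint(pkt)
    visited.append(pkt.dst)
```

All path state travels in the packet itself, which is why core nodes need no per-flow configuration: each endpoint only reads the index and rewrites the destination.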

    Efficient traffic trajectory error detection

    Our recent survey of publicly reported router bugs shows that many router bugs, once triggered, can cause various traffic trajectory errors, including traffic deviating from its intended forwarding paths, traffic being mistakenly dropped, and unauthorized traffic bypassing packet filters. These traffic trajectory errors are serious problems because they may cause network applications to fail and create security loopholes for network intruders to exploit. Therefore, traffic trajectory errors must be detected quickly and efficiently so that corrective action can be taken in a timely fashion. Detecting traffic trajectory errors requires real-time tracking of the control states (e.g., forwarding tables, packet filters) of routers and scalable monitoring of the actual traffic trajectories in the network. Traffic trajectory errors can then be detected by efficiently comparing the observed traffic trajectories against the intended control states. Making such trajectory error detection efficient and practical for large-scale high-speed networks requires us to address many challenges. First, existing traffic trajectory monitoring algorithms require simultaneous monitoring of all network interfaces in a network for the packets of interest, which causes a daunting monitoring overhead. To improve the efficiency of traffic trajectory monitoring, we propose the router group monitoring technique, which monitors only the periphery interfaces of a set of selected router groups. We analyze a large number of real network topologies and show that effective router groups with high trajectory error detection rates exist in all cases. We then develop an analytical model for quickly and accurately estimating the detection rates of different router groups. Based on this model, we propose an algorithm to select a set of router groups that can achieve complete error detection and low monitoring overhead.
Second, maintaining the control states of all the routers in the network requires a significant amount of memory. However, there are no existing studies on how to efficiently store multiple complex packet filters. We propose to store multiple packet filters using a shared HyperCuts decision tree. To help decide which subset of packet filters should share a HyperCuts decision tree, we first identify a number of important factors that collectively impact the efficiency of the resulting shared HyperCuts decision tree. Based on the identified factors, we then propose to use machine learning techniques to predict whether any pair of packet filters should share a tree. Given the pair-wise prediction matrix, a greedy heuristic algorithm is used to classify packet filters into a number of shared HyperCuts decision trees. Our experiments using both real packet filters and synthetic packet filters show that our shared HyperCuts decision trees require considerably less memory while having the same or a slightly higher average height than separate trees. In addition, the shared HyperCuts decision trees enable concurrent lookup of multiple packet filters sharing the same tree. Finally, based on the two proposed techniques, we have implemented a complete prototype system that is compatible with Juniper's JUNOS. We show in the thesis that, to detect traffic trajectory errors, it is sufficient to selectively implement only a small set of key functions of a full-fledged router on our prototype, which makes our prototype simpler and less error-prone. We conduct both Emulab experiments and micro-benchmark experiments to show that the system can efficiently track router control states, monitor traffic trajectories and detect traffic trajectory errors.
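The grouping step, which turns a pair-wise prediction matrix into sets of filters that share a tree, could look like the following Python sketch. The abstract does not give the thesis's heuristic, so this first-fit rule (a filter joins the first group whose members are all predicted compatible with it) and the `greedy_group` name are assumptions.

```python
def greedy_group(num_filters, should_share):
    """Partition filter indices into groups that would each share one decision tree.

    should_share[i][j] is True when the (assumed) pair-wise predictor says
    filters i and j are worth placing in the same tree.
    """
    groups = []
    for f in range(num_filters):
        for group in groups:
            # Join the first group where every existing member is compatible.
            if all(should_share[f][member] for member in group):
                group.append(f)
                break
        else:
            groups.append([f])  # no compatible group found, start a new tree
    return groups


# Example matrix: 0 pairs well with 1 and 2, but 1 and 2 do not pair; 3 pairs with nobody.
T, F = True, False
matrix = [
    [T, T, T, F],
    [T, T, F, F],
    [T, F, T, F],
    [F, F, F, T],
]
groups = greedy_group(4, matrix)
```

A first-fit pass like this depends on the order in which filters are visited, so a fuller version would probably sort filters first, for instance by how many partners each is predicted to have.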

    Energy management in communication networks: a journey through modelling and optimization glasses

    The widespread proliferation of Internet and wireless applications has produced a significant increase of ICT energy footprint. As a response, in the last five years, significant efforts have been undertaken to include energy-awareness into network management. Several green networking frameworks have been proposed by carefully managing the network routing and the power state of network devices. Even though approaches proposed differ based on network technologies and sleep modes of nodes and interfaces, they all aim at tailoring the active network resources to the varying traffic needs in order to minimize energy consumption. From a modeling point of view, this has several commonalities with classical network design and routing problems, even if with different objectives and in a dynamic context. With most researchers focused on addressing the complex and crucial technological aspects of green networking schemes, there has been so far little attention on understanding the modeling similarities and differences of proposed solutions. This paper fills the gap surveying the literature with optimization modeling glasses, following a tutorial approach that guides through the different components of the models with a unified symbolism. A detailed classification of the previous work based on the modeling issues included is also proposed