1,070 research outputs found

    Energy Saving and Virtualization Technologies in Switching

    Switching is the key functionality of many devices, such as electronic routers and switches, optical routers, and Networks on Chip (NoCs). Basically, switching is responsible for moving data units from one port or location to one or more other ports or locations. In past years, high capacity and low delay were the main concerns when designing high-end switching units. As new demands, requirements and technologies emerge, flexibility and low power consumption have come to weigh as much as throughput and delay in switching design. On the one hand, highly flexible (i.e., programmable) switching can cope with the variable needs that stem from new applications (e.g., VoIP) and popular user behavior (e.g., P2P downloading); on the other hand, reducing the energy and power dissipation of switching not only lowers energy bills and benefits the environment but also extends component lifetime. Many research efforts have been devoted to increasing switching flexibility and reducing its power cost. In this thesis, we exploit virtualization as the main technique to build flexible software routers in the first part; in the second part, we turn our attention to energy saving in NoCs (i.e., switching fabrics designed to handle on-chip data transmission) and software routers. In the first part of the thesis, we consider virtualization inside Software Routers (SRs). SRs, i.e., routers running on commodity Personal Computers (PCs), have become an appealing alternative to traditional Proprietary Routing Devices (PRDs) for various reasons, such as cost (the multi-vendor hardware used by SRs can be cheap, while the equipment needed by PRDs is more expensive and their training cost is higher), openness (SRs can make use of a large number of open source networking applications, while PRDs are more closed) and flexibility. The forwarding performance provided by SRs has been an obstacle to their deployment in real networks.
For this reason, we proposed aggregating multiple routing units to form a powerful SR, known as the Multistage Software Router (MSR), to overcome the performance limitation of a single SR. Our results show that throughput increases almost linearly with the number of internal routing devices. However, other features related to flexibility (such as power saving, programmability, router migration and easy management) have previously been investigated less than performance. Virtualization techniques have become a reality thanks to the quick development of PC architectures, which can now easily support several logical PCs running in parallel on the same hardware. Virtualization can provide many flexibility features, such as hardware and software decoupling, encapsulation of virtual machine state, failure recovery and security, to name a few. Virtualization makes it possible to build multiple SRs inside one physical host, as well as a multistage architecture that exploits only logical devices. By doing so, physical resources can be used more efficiently, energy-saving features (switching devices on and off as needed) can be introduced, and logical resources can be rented on demand instead of being owned. Since virtualization techniques are still difficult to deploy, several challenges must be faced when integrating them into routers. The main aims of the first part of this thesis are to assess the feasibility of the virtualization approach, to build and test a virtualized SR (VSR), to implement the MSR using logical (i.e., virtualized) resources, to analyze virtualized routing performance, and to propose techniques that improve the VSR and the virtual MSR (VMSR). More specifically, we considered different virtualization solutions, such as VMware, XEN and KVM, to build the VSR and VMSR; VMware is a closed-source solution with higher performance, while XEN and KVM are open source solutions.
First, we built and tested each component of our multistage architecture (i.e., back-end router and load balancer) inside the virtual infrastructure; we then extended the performance experiments to more complex scenarios, such as multiple Back-end Routers (BRs) or Load Balancers (LBs) cooperating to route packets. Our results show that virtualization can introduce a 40% performance penalty compared with the hardware-only solution. Keeping this performance limitation in mind, we developed the whole VMSR and, as expected, obtained low throughput with a flow of 64-byte packets. To increase VMSR throughput, two directions can be considered: the first is to improve the performance of the single component (i.e., the VSR), and the other is to work on the topology (i.e., the best allocation of the VMs onto the hardware). For the first, we tuned the VSR inside KVM and closely studied factors such as the Linux driver, the scheduler and the interconnection method, which can significantly affect performance when properly configured; for the second, we proposed two ways of allocating VMs to physical servers to enhance VMSR performance. Our results show that, with good tuning and VM allocation, we can minimize the virtualization penalty, obtain reasonable throughput when running SRs inside a virtual infrastructure, and easily add flexibility features to SRs. In the second part of the thesis, we consider the energy-efficient switching design problem and focus on two main architectures, the NoC and the MSR. As many research works suggest, the energy cost of Information and Communication Technologies (ICT) is constantly increasing. Among the main ICT sectors, a large portion of the energy consumption is attributable to the telecommunication infrastructure and its devices, e.g., routers, switches, cell phones, IPTV set-top boxes, storage home gateways, etc.
In more detail, the linecards, links and Systems on Chip (SoCs), including the transmitters/receivers on these various devices, are the main power-consuming units. We first present our work on reducing the power of data transmission in the SoC, which is carried out by the NoC. The NoC is an approach to designing the communication subsystem between different Processing Elements (PEs) in a SoC. PEs can be different elements, such as CPUs, memories, digital or analog signal processors, etc. Different PEs perform specific tasks depending on the applications running on the chip. Different tasks need to exchange data with each other, so flits (chopped packets with limited header information) are generated by the PEs. The flits are injected into the NoC through the proper interface and routed until they reach the destination PEs. Throughout this procedure, the NoC behaves as a packet-switched network. Studies show that, in general, information processing in the PEs consumes only 60% of the energy, while the remaining 40% is consumed by the NoC. More importantly, following current network design principles, the NoC capacity is dimensioned to handle the peak load. This leaves clear room for energy saving when the network load is low. In our work, we exploited the Dynamic Voltage and Frequency Scaling (DVFS) technique, which can jointly decrease or increase the system voltage and frequency when necessary, e.g., decreasing the voltage and frequency in low-load scenarios to save energy and reduce power dissipation. More precisely, we studied two different NoC architectures for energy saving, namely the single-plane and multi-plane chip architectures. In both cases, we impose a very strict constraint: all the links and transmitters/receivers on the same plane work at the same frequency/voltage, to avoid synchronization problems.
This is the main difference from many existing works in the literature, which usually assume that different links can work at different frequencies, something that is hard to implement in reality. For the single-plane NoC, we exploited different routing schemes combined with DVFS to reduce the power of the whole chip. Our results have been compared with the optimal value obtained by formally modeling the power saving as a quadratic programming problem. The results suggest that, just by using a simple load-balancing routing algorithm, we can save considerable energy in the single-plane NoC architecture. Furthermore, we noticed that in the single-plane NoC architecture the bottleneck link can limit the effectiveness of DVFS. We then found that the multi-plane NoC architecture is fairly easy to implement and can help with energy saving. Thus, we focused on the multi-plane architecture and found that DVFS can be more efficient when we concentrate more traffic into one plane and send the remaining flows to the other planes. We compared load concentration and load balancing under different power models, and all simulation results show that load concentration outperforms load balancing for the multi-plane NoC architecture. Finally, we also present an energy-efficient MSR design technique, which allows the MSR to follow the day-night traffic pattern more efficiently with our online energy-saving algorithm.
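
The single-plane constraint described in this abstract (every link on a plane shares one frequency/voltage, so the most loaded link dictates how far DVFS can scale down) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the thesis's model: link loads are given as fractions of link capacity at full frequency, voltage is assumed to scale linearly with frequency so that dynamic power per link grows as the cube of the frequency, and the function names are hypothetical.

```python
def plane_frequency(link_loads, f_max=1.0):
    """Lowest normalized frequency that still serves the most loaded link.

    link_loads: per-link load as a fraction of capacity at f_max (assumed input).
    All links on the plane must share this one frequency.
    """
    bottleneck = max(link_loads)
    if bottleneck > 1.0:
        raise ValueError("bottleneck link exceeds capacity at f_max")
    return bottleneck * f_max


def plane_dynamic_power(link_loads, f_max=1.0):
    """Plane power under the ASSUMED model P ~ f^3 per link (V scales with f)."""
    freq = plane_frequency(link_loads, f_max)
    return len(link_loads) * freq ** 3


# Same total traffic (1.6), routed two different ways over four links.
unbalanced = [0.8, 0.2, 0.2, 0.4]  # one bottleneck link forces a high frequency
balanced = [0.4, 0.4, 0.4, 0.4]    # load-balancing routing lowers the bottleneck

power_unbalanced = plane_dynamic_power(unbalanced)
power_balanced = plane_dynamic_power(balanced)
```

Under these assumptions the balanced routing lets the plane run at half the frequency of the unbalanced one, which is why a bottleneck link limits what DVFS can save on a single plane. The sketch says nothing about the multi-plane comparison, which depends on the power models used in the thesis.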

    Fast Scheduling of Robot Teams Performing Tasks With Temporospatial Constraints

    The application of robotics to traditionally manual manufacturing processes requires careful coordination between human and robotic agents in order to support safe and efficient coordinated work. Tasks must be allocated to agents and sequenced according to temporal and spatial constraints. Also, systems must be capable of responding on the fly to disturbances and to people working in close physical proximity to robots. In this paper, we present a centralized algorithm, named 'Tercio,' that handles tightly intercoupled temporal and spatial constraints. Our key innovation is a fast, satisficing multi-agent task sequencer inspired by real-time processor scheduling techniques and adapted to leverage a hierarchical problem structure. We use this sequencer in conjunction with a mixed-integer linear program solver and empirically demonstrate the ability to generate near-optimal schedules for real-world problems an order of magnitude larger than those reported in prior art. Finally, we demonstrate the use of our algorithm in a multirobot hardware testbed.

    Real time stream processing for Internet of things and sensing environments

    Includes bibliographical references. 2015 Fall. Improvements in miniaturization and networking capabilities of sensors have contributed to the proliferation of Internet of Things (IoT) and continuous sensing environments. Data streams generated in such settings must be processed in real time, at a pace that matches their generation rates. Challenges in accomplishing this include high data arrival rates, buffer overflows, context switches during processing, and object creation overheads. We propose a holistic framework that addresses the CPU, memory, network, and kernel issues involved in stream processing. Our prototype, Neptune, builds on the Granules cloud runtime and leverages its support for scheduling packets and for communications based on publish/subscribe, peer-to-peer, and point-to-point. The framework maximizes bandwidth utilization in the presence of small messages via the use of buffering and dynamic compaction of packets based on their entropy. Our use of thread pools and batched processing reduces context switches and improves effective CPU utilization. The framework alleviates memory pressure that can lead to swapping, page faults, and thrashing through efficient reuse of objects. To cope with buffer overflows, we rely on flow control and on throttling the preceding stages of a processing pipeline. Our correctness criteria included deadlock/livelock avoidance, and ordered and exactly-once processing. Our benchmarks demonstrate the suitability of the Granules/Neptune combination, and we contrast our performance with Apache Storm, the dominant stream-processing framework developed by Twitter. At a single node, we are able to achieve a processing rate of ~2 million stream packets per second. In a distributed cluster setup, we are able to achieve a processing rate of ~100 million stream packets per second with near-optimal bandwidth utilization.
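
Two of the ideas in this abstract, buffering small messages into batches and throttling an upstream stage when a buffer fills, can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not Neptune's or Granules' API: the function names, the byte limit, the queue depth and the uppercase "processing" step are all hypothetical, and entropy-based compaction is not modeled.

```python
import queue
import threading


def batch_messages(messages, max_batch_bytes):
    """Group small byte messages into batches of at most max_batch_bytes.

    A single message larger than the limit still gets its own batch.
    """
    batch, size = [], 0
    for msg in messages:
        if batch and size + len(msg) > max_batch_bytes:
            yield batch
            batch, size = [], 0
        batch.append(msg)
        size += len(msg)
    if batch:
        yield batch


def run_pipeline(messages, max_batch_bytes=16, queue_depth=2):
    """Producer/consumer pair joined by a bounded queue.

    When the queue is full, put() blocks, which throttles the producer
    (a simple stand-in for flow control between pipeline stages).
    """
    buffer = queue.Queue(maxsize=queue_depth)
    processed = []

    def consumer():
        while True:
            batch = buffer.get()
            if batch is None:  # sentinel: no more batches
                break
            # Placeholder processing step: one wake-up handles a whole batch.
            processed.extend(m.upper() for m in batch)

    worker = threading.Thread(target=consumer)
    worker.start()
    for batch in batch_messages(messages, max_batch_bytes):
        buffer.put(batch)  # blocks while the downstream stage is saturated
    buffer.put(None)
    worker.join()
    return processed
```

With a single consumer and a FIFO queue, messages come out in arrival order and each is handled once, which loosely mirrors the ordering criterion named in the abstract for this toy case only.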

    Optimum Allocation of Inspection Stations in Multistage Manufacturing Processes by Using Max-Min Ant System

    In multistage manufacturing processes it is common to locate inspection stations after some or all of the processing workstations. The purpose of the inspection is to reduce the total manufacturing cost resulting from unidentified defective items being processed unnecessarily through subsequent manufacturing operations. This total cost is the sum of the costs of production, inspection and failures (during production and after shipment). Introducing inspection stations into a serial multistage manufacturing process, although constituting an additional cost, is expected to be a profitable course of action. Specifically, at some positions the associated inspection costs will be recovered from the benefits realised through the detection of defective items before additional cost is wasted by continuing to process them. In this research, a novel general cost model for allocating a limited number of inspection stations in serial multistage manufacturing processes is formulated. In the allocation of inspection stations (AOIS) problem, as the number of workstations increases, the number of inspection station allocation possibilities increases exponentially. To identify the appropriate approach for the AOIS problem, different optimisation methods are investigated. The MAX-MIN Ant System (MMAS) algorithm is proposed as a novel approach to explore AOIS in serial multistage manufacturing processes. MMAS is an ant colony optimisation algorithm that was originally designed to begin with an explorative search phase and, subsequently, to make a slow transition to the intensive exploitation of the best solutions found during the search, by allowing only one ant to update the pheromone trails. Two novel forms of heuristic information for the MMAS algorithm are created. The heuristic information is exploited as a novel means to guide ants to build reasonably good solutions from the very beginning of the search.
To improve the performance of the MMAS algorithm, six local search methods that are well known and suitable for the AOIS problem are used. Selecting relevant parameter values for the MMAS algorithm can have a great impact on the algorithm’s performance. As a result, a method for tuning the most influential parameter values for the MMAS algorithm is developed. The contribution of this research is that, for the first time, a methodology using MMAS to solve the AOIS problem in serial multistage manufacturing processes has been developed. The methodology takes into account the constraints on inspection resources, in terms of a limited number of inspection stations. As a result, the total manufacturing cost of a product can be reduced while maintaining the quality of the product. Four numerical experiments are conducted to assess the MMAS algorithm for the AOIS problem. The performance of the MMAS algorithm is compared with a number of other methods, including the complete enumeration method (CEM), rule of thumb, a pure random search algorithm, particle swarm optimisation, simulated annealing and a genetic algorithm. The experimental results show that the effectiveness of the MMAS algorithm lies in its considerably shorter execution time and its robustness. Further, under certain conditions the results obtained by the MMAS algorithm are identical to those of the CEM. In addition, the results show that applying local search to the MMAS algorithm significantly improves the performance of the algorithm. The results also demonstrate that it is essential to use heuristic information with the MMAS algorithm for the AOIS problem in order to obtain a high-quality solution. It was found that the main parameters of MMAS, including the pheromone trail intensity, heuristic information and evaporation of pheromone, are less sensitive within the specified range as the number of workstations is significantly increased.
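
The pheromone rule that gives MMAS its name (only one ant deposits pheromone, and trails are kept between a minimum and a maximum value) can be sketched as follows. This follows the standard MAX-MIN Ant System formulation; the trail bounds are part of that general formulation and are not stated in the abstract. The component encoding, the parameter values and the function name are hypothetical and do not reproduce the thesis's implementation.

```python
def mmas_update(pheromone, best_solution, best_cost,
                rho=0.1, tau_min=0.01, tau_max=1.0):
    """One MMAS pheromone update step.

    pheromone:     dict mapping a solution component to its trail value, e.g.
                   a workstation index after which an inspection station may
                   be placed (hypothetical encoding for the AOIS problem).
    best_solution: set of components used by the single ant allowed to deposit.
    best_cost:     total cost of that solution (lower is better, must be > 0).
    rho:           evaporation rate (assumed value).
    """
    deposit = 1.0 / best_cost
    updated = {}
    for component, tau in pheromone.items():
        tau = (1.0 - rho) * tau           # evaporation on every trail
        if component in best_solution:
            tau += deposit                # only the best ant reinforces
        # Clamp to [tau_min, tau_max] so no choice becomes impossible or dominant.
        updated[component] = min(tau_max, max(tau_min, tau))
    return updated


# Three candidate inspection positions; the best ant inspected after station 1.
trails = {0: 1.0, 1: 0.5, 2: 0.01}
new_trails = mmas_update(trails, best_solution={1}, best_cost=4.0)
```

In this toy run the trail for position 1 grows, the trail for position 0 evaporates, and the trail for position 2 is held at the lower bound instead of decaying toward zero, which keeps unexplored positions selectable.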