
    Intra-Domain Pathlet Routing

    Internal routing inside an ISP network is the foundation for many services that generate revenue from the ISP's customers. Fine-grained control of the paths taken by network traffic once it enters the ISP's network is therefore a crucial means to achieve a top-quality offer and, equally important, to enforce SLAs. Many widespread network technologies and approaches (most notably, MPLS) offer limited (e.g., with RSVP-TE), tricky (e.g., with OSPF metrics), or no control over internal routing paths. On the other hand, recent advances in the research community are a good starting point to address this shortcoming, but miss elements that would enable their applicability in an ISP's network. We extend pathlet routing by introducing a new control plane for internal routing that has the following qualities: it is designed to operate in the internal network of an ISP; it enables fine-grained management of network paths with suitable configuration primitives; it is scalable because routing changes are only propagated to the network portion that is affected by the changes; it supports independent configuration of specific network portions without the need to know the configuration of the whole network; it is robust thanks to the adoption of multipath routing; it supports the enforcement of QoS levels; it is independent of the specific data plane used in the ISP's network; and it can be incrementally deployed and nicely coexist with other control planes. Besides formally introducing the algorithms and messages of our control plane, we propose an experimental validation in the OMNeT++ simulation framework, which we use to assess the effectiveness and scalability of our approach. Comment: 13 figures, 1 table.
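    To make the pathlet abstraction concrete, the sketch below (Python, illustrative only; the Pathlet class and compose_route helper are our own naming, not the paper's control plane) shows the core idea: a route is expressed as a stack of forwarding identifiers obtained by concatenating pathlets that share endpoints.

```python
# Minimal sketch of pathlet composition, assuming a simple forwarding-identifier
# (FID) model; names and fields are illustrative, not the paper's definitions.
from dataclasses import dataclass
from typing import List

@dataclass
class Pathlet:
    fid: int            # forwarding identifier carried in packets
    vnodes: List[str]   # ordered virtual nodes the pathlet traverses

    @property
    def head(self) -> str:
        return self.vnodes[0]

    @property
    def tail(self) -> str:
        return self.vnodes[-1]

def compose_route(pathlets: List[Pathlet]) -> List[int]:
    """Concatenate pathlets into an end-to-end route and return the FID stack a
    sender would place in the packet header. Adjacent pathlets must share an
    endpoint, which is what makes the composition a valid path."""
    for a, b in zip(pathlets, pathlets[1:]):
        if a.tail != b.head:
            raise ValueError(f"pathlets {a.fid} and {b.fid} do not join")
    return [p.fid for p in pathlets]

# Example: two advertised pathlets joined at vnode 'C'
p1 = Pathlet(fid=7, vnodes=["A", "B", "C"])
p2 = Pathlet(fid=12, vnodes=["C", "D"])
print(compose_route([p1, p2]))   # [7, 12]
```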

    An SDN-based firewall shunt for data-intensive science applications

    A dissertation submitted to the Faculty of Engineering and the Built Environment, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science in Engineering, 2016. Data-intensive research computing requires the capability to transfer files over long distances at high throughput. Stateful firewalls introduce sufficient packet loss to prevent researchers from fully exploiting high bandwidth-delay network links [25]. To work around this challenge, the science DMZ design [19] trades off stateful packet filtering capability for loss-free forwarding via an ordinary Ethernet switch. We propose a novel extension to the science DMZ design, which uses an SDN-based firewall. This report introduces NFShunt, a firewall based on Linux's Netfilter combined with OpenFlow switching. Implemented as an OpenFlow 1.0 controller coupled to Netfilter's connection tracking, NFShunt allows the bypass-switching policy to be expressed as part of an iptables firewall rule-set. Our implementation is described in detail, and the latency of the control-plane mechanism is reported. TCP throughput and packet loss are shown at various round-trip latencies, with comparisons to pure switching as well as to a high-end Cisco firewall. Cost, as well as operations and maintenance aspects, are compared and analysed. The results support reported observations regarding firewall-introduced packet loss, and indicate that the SDN design of NFShunt is a technically viable and cost-effective approach to enhancing a traditional firewall to meet the performance needs of data-intensive researchers.
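    A minimal sketch of the shunting idea follows; it assumes a plain dict representation of an OpenFlow-1.0-style exact-match rule, and the bypass_flow helper and its interface are hypothetical, not NFShunt's API or rule syntax.

```python
# Illustrative sketch: once the stateful firewall has vetted a connection,
# install a switch flow that forwards that 5-tuple directly, bypassing the
# firewall path. Field names follow OpenFlow 1.0 conventions; the helper and
# timeout value are assumptions for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str  # "tcp" or "udp"

def bypass_flow(conn: FiveTuple, in_port: int, out_port: int) -> dict:
    """Build an exact-match entry that short-circuits the firewall for an
    already-approved connection."""
    return {
        "match": {
            "in_port": in_port,
            "nw_src": conn.src_ip, "nw_dst": conn.dst_ip,
            "tp_src": conn.src_port, "tp_dst": conn.dst_port,
            "nw_proto": conn.proto,
        },
        "actions": [{"output": out_port}],  # forward in the switch, skip the firewall
        "idle_timeout": 60,                 # fall back to the firewall once the flow goes idle
    }

# Example: a long-lived bulk data transfer flagged for bypass by the policy
print(bypass_flow(FiveTuple("10.0.0.5", "192.0.2.9", 49152, 2811, "tcp"), 1, 2))
```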

    Massively-Parallel Feature Selection for Big Data

    We present the Parallel, Forward-Backward with Pruning (PFBP) algorithm for feature selection (FS) in Big Data settings (high dimensionality and/or sample size). To tackle the challenges of Big Data FS, PFBP partitions the data matrix both in terms of rows (samples, training examples) and columns (features). By employing the concepts of p-values of conditional independence tests and meta-analysis techniques, PFBP manages to rely only on computations local to a partition while minimizing communication costs. It then employs powerful and safe (asymptotically sound) heuristics to make early, approximate decisions, such as Early Dropping of features from consideration in subsequent iterations, Early Stopping of consideration of features within the same iteration, and Early Return of the winner in each iteration. PFBP provides asymptotic guarantees of optimality for data distributions faithfully representable by a causal network (Bayesian network or maximal ancestral graph). Our empirical analysis confirms a super-linear speedup of the algorithm with increasing sample size and linear scalability with respect to the number of features and processing cores, while dominating other competitive algorithms in its class.
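    The sketch below illustrates two of the ingredients described above, under simplifying assumptions: per-partition p-values are combined with Fisher's method (one common meta-analysis choice, not necessarily the one used by the authors), and Early Dropping removes features whose combined p-value shows no association at a chosen threshold. The function names and threshold are illustrative.

```python
# Illustrative sketch of p-value combination and Early Dropping; not the
# authors' implementation.
import math
from scipy.stats import chi2

def fisher_combine(pvalues):
    """Fisher's method: X = -2 * sum(log p) ~ chi-squared with 2k dof under the null."""
    stat = -2.0 * sum(math.log(max(p, 1e-300)) for p in pvalues)
    return chi2.sf(stat, df=2 * len(pvalues))

def early_drop(candidate_pvalues, alpha=0.05):
    """Keep only features whose combined p-value still suggests association;
    the rest are dropped from consideration in subsequent iterations."""
    kept = {}
    for feature, partition_pvalues in candidate_pvalues.items():
        p = fisher_combine(partition_pvalues)
        if p <= alpha:
            kept[feature] = p
    return kept

# Example: per-partition p-values for three candidate features
candidates = {"f1": [0.001, 0.003, 0.02],
              "f2": [0.40, 0.55, 0.31],
              "f3": [0.04, 0.06, 0.01]}
print(early_drop(candidates))   # f2 is dropped; f1 and f3 survive
```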

    Serverless computing for the Internet of Things

    Cloud-based services have evolved significantly over the years. Cloud computing models such as IaaS, PaaS and SaaS serve as an alternative to the traditional in-house, infrastructure-based approach. Furthermore, serverless computing is a cloud computing model for ephemeral, stateless and event-driven applications that scale up and down instantly. In contrast to the effectively unlimited resources of cloud computing, the Internet of Things is a network of resource-constrained, heterogeneous and intelligent devices that generate a significant amount of data. Due to the resource-constrained nature of IoT devices, cloud resources are used to process the data they generate. However, data processing in the cloud also has limitations, such as latency and privacy concerns. These limitations give rise to a requirement for local processing of data generated by IoT devices. A serverless platform can be deployed on a cluster of IoT devices using software containers to enable local processing of the sensor data. This work proposes a hybrid multi-layered architecture that not only establishes the possibility of local processing of sensor data but also addresses issues such as the heterogeneity and resource-constrained nature of IoT devices. We use software containers and a multi-layered architecture to provide high availability and fault tolerance in our proposed solution.
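    As a rough illustration of the kind of stateless, event-driven function such a platform would invoke per sensor reading, consider the sketch below; the handle() signature, the JSON payload shape, and the threshold are assumptions for illustration, not any specific framework's API.

```python
# Illustrative, stateless per-event handler: parse one sensor reading, decide
# at the edge whether it needs an alert or can simply be stored locally.
import json

def handle(event: str) -> str:
    reading = json.loads(event)                   # e.g. {"sensor": "t1", "celsius": 41.2}
    if reading.get("celsius", 0.0) > 40.0:        # threshold chosen purely for illustration
        return json.dumps({"action": "alert", "sensor": reading["sensor"]})
    return json.dumps({"action": "store-locally", "sensor": reading["sensor"]})

# Example invocation, as the platform would perform on each incoming event
print(handle('{"sensor": "t1", "celsius": 41.2}'))
```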

    Experimental analysis of computer system dependability

    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: the design phase, the prototype phase, and the operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by a discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software dependability, and fault diagnosis. The discussion involves several important issues studied in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance.
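    The sketch below gives a toy, function-level flavor of simulated fault injection: flip one bit of an input, rerun the workload, and compare against the fault-free result. Tools such as FOCUS and DEPEND operate at far larger scale and lower abstraction levels; the helper names and the checksum workload here are purely illustrative.

```python
# Toy simulated fault-injection experiment (illustrative only).
import random

def inject_bit_flip(value: int, width: int = 32, rng=random) -> int:
    """Flip one uniformly chosen bit in a width-bit integer."""
    bit = rng.randrange(width)
    return value ^ (1 << bit)

def checksum(data):
    """The 'workload': a toy computation whose result we compare to a golden run."""
    return sum(data) & 0xFFFFFFFF

data = list(range(100))
golden = checksum(data)

# One experiment: corrupt one input word, rerun, classify the outcome
faulty = data[:]
faulty[10] = inject_bit_flip(faulty[10])
outcome = "fault masked" if checksum(faulty) == golden else "data corruption"
print(outcome)
```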

    Decentralized Scheduling for Many-Task Applications in the Hybrid Cloud

    While Cloud Computing has transformed how we solve many computing tasks, some scientific and many-task applications are not efficiently executed on cloud resources. Decentralized scheduling, as studied in grid computing, can provide a scalable system to organize cloud resources and schedule a variety of work. By measuring simulations of two algorithms, the fully decentralized Organic Grid and the partially decentralized Air Traffic Controller from IBM, we establish that decentralization is a workable approach, and that there are bottlenecks that can impact partially centralized algorithms. Through measurements in the cloud, we verify that our simulation approach is sound, and assess the variable performance of cloud resources. We propose a scheduler that measures the capabilities of the resources available to execute a task and distributes work dynamically at run time. Our scheduling algorithm is evaluated experimentally, and we show that performance-aware scheduling in a cloud environment can provide improvements in execution time. This provides a framework by which a variety of parameters can be weighed to make job-specific and context-aware scheduling decisions. Our measurements examine the usefulness of benchmarking as a metric to measure a node's performance and drive scheduling. Benchmarking provides an advantage over simple queue-based scheduling on distributed systems whose members vary in actual performance, but the NAS benchmark we use does not always correlate perfectly with actual performance. The utilized hardware is examined, as are enforced performance variations, and we observe changes in performance that result from running on a system in which different workers receive different CPU allocations. As we see that performance metrics are useful near the end of the execution of a large job, we create a new metric from historical data of partially completed work and use it to drive execution time down further. Interdependent task-graph work is introduced and described as a next step in improving cloud scheduling. Realistic task-graph problems are defined and a scheduling approach is introduced. This dissertation lays the groundwork to expand the types of problems that can be solved efficiently in the cloud environment.
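    The sketch below illustrates the general idea of performance-aware dispatch under stated assumptions: a worker's observed throughput from partially completed work is preferred over a static benchmark score once enough history exists. The Worker fields, the history threshold, and the blending rule are illustrative, not the dissertation's algorithm.

```python
# Illustrative performance-aware dispatch: prefer workers with higher measured
# throughput, falling back to a benchmark score when no history is available.
from dataclasses import dataclass

@dataclass
class Worker:
    name: str
    benchmark_score: float       # static capability estimate (e.g., a benchmark result)
    tasks_done: int = 0
    busy_seconds: float = 0.0

    def throughput(self) -> float:
        """Observed tasks/second from partially completed work, if any."""
        return self.tasks_done / self.busy_seconds if self.busy_seconds > 0 else 0.0

def pick_worker(workers):
    """Use history once a worker has completed enough tasks; otherwise trust the benchmark."""
    def score(w: Worker) -> float:
        return w.throughput() if w.tasks_done >= 5 else w.benchmark_score
    return max(workers, key=score)

workers = [Worker("a", benchmark_score=1.0, tasks_done=40, busy_seconds=10.0),
           Worker("b", benchmark_score=3.0)]
print(pick_worker(workers).name)   # "a": observed throughput (4.0) beats b's benchmark (3.0)
```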

    Runtime MPI Correctness Checking with a Scalable Tools Infrastructure

    The increasing computational demand of simulations motivates the use of parallel computing systems. At the same time, this parallelism poses challenges to application developers. The Message Passing Interface (MPI) is a de-facto standard for distributed-memory programming in high performance computing. However, its use also enables complex parallel programming errors such as races, communication errors, and deadlocks. Automatic tools can assist application developers in the detection and removal of such errors. This thesis considers tools that detect such errors during an application run and advances them towards a combination of precise checks (neither false positives nor false negatives) and scalability. This includes novel hierarchical checks that provide scalability, as well as a formal basis for a distributed deadlock detection approach. At the same time, the development of parallel runtime tools is challenging and time consuming, especially if scalability and portability are key design goals. Current tool development projects often create similar tool components, while component reuse remains low. To provide a perspective towards more efficient tool development that simplifies scalable implementations, component reuse, and tool integration, this thesis proposes an abstraction for a parallel tools infrastructure along with a prototype implementation. This abstraction overcomes the use of multiple interfaces for different types of tool functionality, which limits flexible component reuse. Thus, this thesis advances runtime error detection tools and uses their redesign and their increased scalability requirements to apply and evaluate a novel tool infrastructure abstraction. The new abstraction ultimately allows developers to focus on their tool functionality rather than on developing or integrating common tool components. The use of such an abstraction in a wide range of parallel runtime tool development projects could greatly increase component reuse, decreasing tool development time and cost. An application study with up to 16,384 application processes demonstrates the applicability of both the proposed runtime correctness concepts and the proposed tools infrastructure.
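    A minimal sketch of the wait-for-graph idea behind runtime deadlock detection follows: each blocked process points to the process(es) it waits for, and a cycle in that graph indicates a deadlock. A runtime tool would construct this graph from intercepted MPI calls and use more refined AND/OR wait semantics; here the graph is a plain dict and the check is simple depth-first cycle detection, so the code is illustrative rather than the thesis's approach.

```python
# Illustrative wait-for-graph deadlock check (not the thesis's distributed algorithm).
def has_deadlock(wait_for):
    """wait_for: dict mapping a rank to the set of ranks it is currently blocked on."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {r: WHITE for r in wait_for}

    def dfs(rank) -> bool:
        color[rank] = GREY
        for peer in wait_for.get(rank, ()):       # edge: rank waits for peer
            if color.get(peer, WHITE) == GREY:
                return True                       # back edge => cycle => deadlock
            if color.get(peer, WHITE) == WHITE and dfs(peer):
                return True
        color[rank] = BLACK
        return False

    return any(color[r] == WHITE and dfs(r) for r in list(wait_for))

# Ranks 0 and 1 each block in a receive from the other: a classic send/recv deadlock
print(has_deadlock({0: {1}, 1: {0}, 2: set()}))   # True
```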