
    Practical resource monitoring for robust high throughput computing

    Robust high throughput computing requires effective monitoring and enforcement of a variety of resources, including CPU cores, memory, disk, and network traffic. Without effective monitoring and enforcement, it is easy to overload machines, causing failures and slowdowns, or to underutilize machines, which results in wasted opportunities. This paper explores how to describe, measure, and enforce resources used by computational tasks. We focus on tasks running in distributed execution systems, in which a task requests the resources it needs, and the execution system ensures the availability of such resources. This presents two non-trivial problems: how to measure the resources consumed by a task, and how to monitor and report resource exhaustion in a robust and timely manner. For both of these problems, operating systems have a variety of mechanisms with different degrees of availability, accuracy, overhead, and intrusiveness. We describe various forms of monitoring and the available mechanisms in contemporary operating systems. We then present two specific monitoring tools that choose different tradeoffs in overhead and accuracy, and evaluate them on a selection of benchmarks.
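    As a rough illustration of the first problem named in the abstract above (measuring the resources a task consumed), the following Python sketch runs a command as a child process and reads the usage summary the operating system reports when the child is reaped. It is not one of the two tools the paper presents; the names `TaskUsage` and `run_and_measure` are hypothetical, the sketch assumes a Unix-like system where `os.wait4` is available, and the unit of `ru_maxrss` is platform-dependent (kilobytes on Linux).

```python
import os
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class TaskUsage:
    """Hypothetical record of what one finished task consumed."""
    wall_seconds: float
    cpu_seconds: float   # user + system CPU time
    max_rss: int         # peak resident set size; kilobytes on Linux
    exit_code: int       # negative signal number if killed by a signal


def run_and_measure(argv):
    """Run argv as a child process and return its resource usage.

    This is a post-hoc measurement only: the numbers become available
    when the child exits, so it cannot detect exhaustion while the
    task is still running.
    """
    start = time.monotonic()
    proc = subprocess.Popen(argv)
    # os.wait4 reaps this specific child and returns its rusage record.
    _, status, usage = os.wait4(proc.pid, 0)
    wall = time.monotonic() - start

    if os.WIFEXITED(status):
        exit_code = os.WEXITSTATUS(status)
    else:
        exit_code = -os.WTERMSIG(status)
    # Tell the Popen object the child is already reaped.
    proc.returncode = exit_code

    return TaskUsage(
        wall_seconds=wall,
        cpu_seconds=usage.ru_utime + usage.ru_stime,
        max_rss=usage.ru_maxrss,
        exit_code=exit_code,
    )


if __name__ == "__main__":
    # Example task: allocate roughly 20 MB and exit.
    print(run_and_measure([sys.executable, "-c", "x = bytearray(20_000_000)"]))
```

    A summary collected at exit is cheap but coarse. Reporting exhaustion in a timely manner, the second problem in the abstract, would need polling or kernel-level limits, which this sketch does not attempt.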

    The Motivation, Architecture and Demonstration of Ultralight Network Testbed

    In this paper we describe progress in the NSF-funded Ultralight project and a recent demonstration of Ultralight technologies at SuperComputing 2005 (SC|05). The goal of the Ultralight project is to help meet the data-intensive computing challenges of the next generation of particle physics experiments with a comprehensive, network-focused approach. Ultralight adopts a new approach to networking: instead of treating it traditionally, as a static, unchanging and unmanaged set of inter-computer links, we are developing and using it as a dynamic, configurable, and closely monitored resource that is managed from end-to-end. Thus we are constructing a next-generation global system that is able to meet the data processing, distribution, access and analysis needs of the particle physics community. In this paper we present the motivation for, and an overview of, the Ultralight project. We then cover early results in the various working areas of the project. The remainder of the paper describes our experience with the Ultralight network architecture, kernel setup, application tuning and configuration used during the bandwidth challenge event at SC|05. During this challenge, we achieved a record-breaking aggregate data rate in excess of 150 Gbps while moving physics datasets between many sites interconnected by the Ultralight backbone network. The exercise highlighted the benefits of Ultralight's research and development efforts, which are enabling new and advanced methods of distributed scientific data analysis.

    Understanding the Computational Requirements of Virtualized Baseband Units using a Programmable Cloud Radio Access Network Testbed

    Cloud Radio Access Network (C-RAN) is emerging as a transformative architecture for the next generation of mobile cellular networks. In C-RAN, the Baseband Unit (BBU) is decoupled from the Base Station (BS) and consolidated in a centralized processing center. While the potential benefits of C-RAN have been studied extensively from the theoretical perspective, only a few works address the system implementation issues and characterize the computational requirements of the virtualized BBU. In this paper, a programmable C-RAN testbed is presented where the BBU is virtualized using the OpenAirInterface (OAI) software platform, and the eNodeB and User Equipment (UEs) are implemented using USRP boards. Extensive experiments have been performed in an FDD downlink LTE emulation system to characterize the performance and computing resource consumption of the BBU under various conditions. It is shown that the processing time and CPU utilization of the BBU increase with the channel resources and with the Modulation and Coding Scheme (MCS) index, and that the CPU utilization percentage can be well approximated as a linearly increasing function of the maximum downlink data rate. These results provide real-world insights into the characteristics of the BBU in terms of computing resource and power consumption, which may serve as inputs for the design of efficient resource-provisioning and allocation strategies in C-RAN systems. Comment: In Proceedings of the IEEE International Conference on Autonomic Computing (ICAC), July 201

    The Design and Demonstration of the Ultralight Testbed

    In this paper we present the motivation, the design, and a recent demonstration of the UltraLight testbed at SC|05. The goal of the Ultralight testbed is to help meet the data-intensive computing challenges of the next generation of particle physics experiments with a comprehensive, network-focused approach. UltraLight adopts a new approach to networking: instead of treating it traditionally, as a static, unchanging and unmanaged set of inter-computer links, we are developing and using it as a dynamic, configurable, and closely monitored resource that is managed from end-to-end. To achieve its goal we are constructing a next-generation global system that is able to meet the data processing, distribution, access and analysis needs of the particle physics community. In this paper we first present early results in the various working areas of the project. We then describe our experience with the network architecture, kernel setup, application tuning and configuration used during the bandwidth challenge event at SC|05. During this challenge, we achieved a record-breaking aggregate data rate in excess of 150 Gbps while moving physics datasets between many Grid computing sites.

    Benchmarking Distributed Stream Data Processing Systems

    The need for scalable and efficient stream analysis has led to the development of many open-source streaming data processing systems (SDPSs) with highly diverging capabilities and performance characteristics. While first initiatives try to compare the systems for simple workloads, there is a clear lack of detailed analyses of the systems' performance characteristics. In this paper, we propose a framework for benchmarking distributed stream processing engines. We use our suite to evaluate the performance of three widely used SDPSs in detail, namely Apache Storm, Apache Spark, and Apache Flink. Our evaluation focuses in particular on measuring the throughput and latency of windowed operations, which are the basic type of operations in stream analytics. For this benchmark, we design workloads based on real-life, industrial use-cases inspired by the online gaming industry. The contribution of our work is threefold. First, we give a definition of latency and throughput for stateful operators. Second, we carefully separate the system under test and the driver, in order to correctly represent the open world model of typical stream processing deployments, and can therefore measure system performance under realistic conditions. Third, we build the first benchmarking framework to define and test the sustainable performance of streaming systems. Our detailed evaluation highlights the individual characteristics and use-cases of each system. Comment: Published at ICDE 201
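    To make "latency of a windowed operation" concrete, here is a small Python sketch of a tumbling-window sum with one possible latency measure. The abstract above does not spell out the paper's definitions, so the measure used here (emission time minus the latest event time that contributed to the window) is an assumption for illustration, and the function names `tumbling_window_sums` and `window_latencies` are hypothetical.

```python
def tumbling_window_sums(events, window_size):
    """Aggregate (event_time, key, value) tuples into tumbling windows.

    Returns {(window_start, key): (sum_of_values, latest_event_time)}.
    Windows are fixed-size, non-overlapping intervals
    [window_start, window_start + window_size).
    """
    windows = {}
    for event_time, key, value in events:
        start = (event_time // window_size) * window_size
        total, latest = windows.get((start, key), (0, event_time))
        windows[(start, key)] = (total + value, max(latest, event_time))
    return windows


def window_latencies(windows, emit_times):
    """Latency per window under the assumed definition:
    time the result was emitted minus the latest contributing event time.
    """
    return {
        window: emit_times[window] - latest
        for window, (_, latest) in windows.items()
    }


if __name__ == "__main__":
    # Made-up events purely for illustration: (event_time, key, value).
    events = [(0, "a", 1), (3, "a", 2), (5, "a", 4), (12, "a", 8)]
    windows = tumbling_window_sums(events, window_size=10)
    # Hypothetical times at which the system under test emitted each result.
    emit_times = {(0, "a"): 11, (10, "a"): 21}
    print(windows)
    print(window_latencies(windows, emit_times))
```

    In a real benchmark the emission times would be recorded by a driver kept separate from the system under test, as the abstract describes; here they are supplied by hand.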

    NOMA based resource allocation and mobility enhancement framework for IoT in next generation cellular networks

    With the unprecedented technological advances witnessed in the last two decades, more devices are connected to the internet, forming what is called the internet of things (IoT). IoT devices with heterogeneous characteristics and quality of experience (QoE) requirements may engage in a dynamic spectrum market due to the scarcity of radio resources. We propose a framework to efficiently quantify and supply radio resources to the IoT devices by developing intelligent systems. The primary goal of the paper is to study the characteristics of the next generation of cellular networks with non-orthogonal multiple access (NOMA) to enable connectivity to clustered IoT devices. First, we demonstrate how the distribution and QoE requirements of IoT devices impact the required number of radio resources in real time. Second, we prove that an extended auction algorithm, implemented as a series of complementary functions, enhances the radio resource utilization efficiency. The results show a substantial reduction in the number of sub-carriers required when compared to conventional orthogonal multiple access (OMA), and that the intelligent clustering is scalable and adaptable to the cellular environment. The ability to move borrowed spectrum from one cluster to other clusters when a cluster has fewer users, or when users move out of its boundary, is another soft feature that contributes to the reported radio resource utilization efficiency. Moreover, the proposed framework provides IoT service providers with cost estimates to control their spectrum acquisition and achieve the required quality of service (QoS) with guaranteed bit rate (GBR) and non-guaranteed bit rate (Non-GBR).