9,828 research outputs found

    An Analysis of Linux Scalability to Many Cores

    Get PDF
    URL to paper from conference siteThis paper analyzes the scalability of seven system applications (Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on Linux on a 48- core computer. Except for gmake, all applications trigger scalability bottlenecks inside a recent Linux kernel. Using mostly standard parallel programming techniques— this paper introduces one new technique, sloppy counters— these bottlenecks can be removed from the kernel or avoided by changing the applications slightly. Modifying the kernel required in total 3002 lines of code changes. A speculative conclusion from this analysis is that there is no scalability reason to give up on traditional operating system organizations just yet.Quanta Computer (Firm)National Science Foundation (U.S.) (0834415)National Science Foundation (U.S.) (0915164)Microsoft Research (Fellowship)Irwin Mark Jacobs and Joan Klein Jacobs Presidential Fellowshi

    An Efficient Thread Mapping Strategy for Multiprogramming on Manycore Processors

    Full text link
    The emergence of multicore and manycore processors is set to change the parallel computing world. Applications are shifting towards increased parallelism in order to utilise these architectures efficiently. This leads to a situation where every application creates its desirable number of threads, based on its parallel nature and the system resources allowance. Task scheduling in such a multithreaded multiprogramming environment is a significant challenge. In task scheduling, not only the order of the execution, but also the mapping of threads to the execution resources is of a great importance. In this paper we state and discuss some fundamental rules based on results obtained from selected applications of the BOTS benchmarks on the 64-core TILEPro64 processor. We demonstrate how previously efficient mapping policies such as those of the SMP Linux scheduler become inefficient when the number of threads and cores grows. We propose a novel, low-overhead technique, a heuristic based on the amount of time spent by each CPU doing some useful work, to fairly distribute the workloads amongst the cores in a multiprogramming environment. Our novel approach could be implemented as a pragma similar to those in the new task-based OpenMP versions, or can be incorporated as a distributed thread mapping mechanism in future manycore programming frameworks. We show that our thread mapping scheme can outperform the native GNU/Linux thread scheduler in both single-programming and multiprogramming environments.Comment: ParCo Conference, Munich, Germany, 201

    Towards Loosely-Coupled Programming on Petascale Systems

    Full text link
    We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new-and potentially far larger-class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of I/O performance encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor-cores with high efficiency, and can achieve sustained execution rates of thousands of tasks per second.Comment: IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SuperComputing/SC) 200

    Performance Evaluation of Microservices Architectures using Containers

    Get PDF
    Microservices architecture has started a new trend for application development for a number of reasons: (1) to reduce complexity by using tiny services; (2) to scale, remove and deploy parts of the system easily; (3) to improve flexibility to use different frameworks and tools; (4) to increase the overall scalability; and (5) to improve the resilience of the system. Containers have empowered the usage of microservices architectures by being lightweight, providing fast start-up times, and having a low overhead. Containers can be used to develop applications based on monolithic architectures where the whole system runs inside a single container or inside a microservices architecture where one or few processes run inside the containers. Two models can be used to implement a microservices architecture using containers: master-slave, or nested-container. The goal of this work is to compare the performance of CPU and network running benchmarks in the two aforementioned models of microservices architecture hence provide a benchmark analysis guidance for system designers.Comment: Submitted to the 14th IEEE International Symposium on Network Computing and Applications (IEEE NCA15). Partially funded by European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 639595) - HiEST Projec

    The state of SQL-on-Hadoop in the cloud

    Get PDF
    Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS), analytical services like Hive and Spark come preconfigured for general-purpose and ready to use. Thus, giving companies a quick entry and on-demand deployment of ready SQL-like solutions for their big data needs. This study evaluates cloud services from an end-user perspective, comparing providers including: Microsoft Azure, Amazon Web Services, Google Cloud, and Rackspace. The study focuses on performance, readiness, scalability, and cost-effectiveness of the different solutions at entry/test level clusters sizes. Results are based on over 15,000 Hive queries derived from the industry standard TPC-H benchmark. The study is framed within the ALOJA research project, which features an open source benchmarking and analysis platform that has been recently extended to support SQL-on-Hadoop engines. The ALOJA Project aims to lower the total cost of ownership (TCO) of big data deployments and study their performance characteristics for optimization. The study benchmarks cloud providers across a diverse range instance types, and uses input data scales from 1GB to 1TB, in order to survey the popular entry-level PaaS SQL-on-Hadoop solutions, thereby establishing a common results-base upon which subsequent research can be carried out by the project. Initial results already show the main performance trends to both hardware and software configuration, pricing, similarities and architectural differences of the evaluated PaaS solutions. Whereas some providers focus on decoupling storage and computing resources while offering network-based elastic storage, others choose to keep the local processing model from Hadoop for high performance, but reducing flexibility. Results also show the importance of application-level tuning and how keeping up-to-date hardware and software stacks can influence performance even more than replicating the on-premises model in the cloud.This work is partially supported by the Microsoft Azure for Research program, the European Research Council (ERC) under the EUs Horizon 2020 programme (GA 639595), the Spanish Ministry of Education (TIN2015-65316-P), and the Generalitat de Catalunya (2014-SGR-1051).Peer ReviewedPostprint (author's final draft

    Distributed Hybrid Simulation of the Internet of Things and Smart Territories

    Full text link
    This paper deals with the use of hybrid simulation to build and compose heterogeneous simulation scenarios that can be proficiently exploited to model and represent the Internet of Things (IoT). Hybrid simulation is a methodology that combines multiple modalities of modeling/simulation. Complex scenarios are decomposed into simpler ones, each one being simulated through a specific simulation strategy. All these simulation building blocks are then synchronized and coordinated. This simulation methodology is an ideal one to represent IoT setups, which are usually very demanding, due to the heterogeneity of possible scenarios arising from the massive deployment of an enormous amount of sensors and devices. We present a use case concerned with the distributed simulation of smart territories, a novel view of decentralized geographical spaces that, thanks to the use of IoT, builds ICT services to manage resources in a way that is sustainable and not harmful to the environment. Three different simulation models are combined together, namely, an adaptive agent-based parallel and distributed simulator, an OMNeT++ based discrete event simulator and a script-language simulator based on MATLAB. Results from a performance analysis confirm the viability of using hybrid simulation to model complex IoT scenarios.Comment: arXiv admin note: substantial text overlap with arXiv:1605.0487
    • …
    corecore