3,665 research outputs found

    A load-sharing architecture for high performance optimistic simulations on multi-core machines

    Get PDF
    In Parallel Discrete Event Simulation (PDES), the simulation model is partitioned into a set of distinct Logical Processes (LPs) which are allowed to concurrently execute simulation events. In this work we present an innovative approach to load-sharing on multi-core/multiprocessor machines, targeted at the optimistic PDES paradigm, where LPs are speculatively allowed to process simulation events with no preventive verification of causal consistency, and actual consistency violations (if any) are recovered via rollback techniques. In our approach, each simulation kernel instance, in charge of hosting and executing a specific set of LPs, runs a set of worker threads, which can be dynamically activated/deactivated on the basis of a distributed algorithm. The latter relies in turn on an analytical model that provides indications on how to reassign processor/core usage across the kernels in order to handle the simulation workload as efficiently as possible. We also present a real implementation of our load-sharing architecture within the ROme OpTimistic Simulator (ROOT-Sim), namely an open-source C-based simulation platform implemented according to the PDES paradigm and the optimistic synchronization approach. Experimental results for an assessment of the validity of our proposal are presented as well

    Building Computing-As-A-Service Mobile Cloud System

    Get PDF
    The last five years have witnessed the proliferation of smart mobile devices, the explosion of various mobile applications and the rapid adoption of cloud computing in business, governmental and educational IT deployment. There is also a growing trends of combining mobile computing and cloud computing as a new popular computing paradigm nowadays. This thesis envisions the future of mobile computing which is primarily affected by following three trends: First, servers in cloud equipped with high speed multi-core technology have been the main stream today. Meanwhile, ARM processor powered servers is growingly became popular recently and the virtualization on ARM systems is also gaining wide ranges of attentions recently. Second, high-speed internet has been pervasive and highly available. Mobile devices are able to connect to cloud anytime and anywhere. Third, cloud computing is reshaping the way of using computing resources. The classic pay/scale-as-you-go model allows hardware resources to be optimally allocated and well-managed. These three trends lend credence to a new mobile computing model with the combination of resource-rich cloud and less powerful mobile devices. In this model, mobile devices run the core virtualization hypervisor with virtualized phone instances, allowing for pervasive access to more powerful, highly-available virtual phone clones in the cloud. The centralized cloud, powered by rich computing and memory recourses, hosts virtual phone clones and repeatedly synchronize the data changes with virtual phone instances running on mobile devices. Users can flexibly isolate different computing environments. In this dissertation, we explored the opportunity of leveraging cloud resources for mobile computing for the purpose of energy saving, performance augmentation as well as secure computing enviroment isolation. We proposed a framework that allows mo- bile users to seamlessly leverage cloud to augment the computing capability of mobile devices and also makes it simpler for application developers to run their smartphone applications in the cloud without tedious application partitioning. This framework was built with virtualization on both server side and mobile devices. It has three building blocks including agile virtual machine deployment, efficient virtual resource management, and seamless mobile augmentation. We presented the design, imple- mentation and evaluation of these three components and demonstrated the feasibility of the proposed mobile cloud model

    Load sharing for optimistic parallel simulations on multicore machines

    Get PDF
    Parallel Discrete Event Simulation (PDES) is based on the partitioning of the simulation model into distinct Logical Processes (LPs), each one modeling a portion of the entire system, which are allowed to execute simulation events concurrently. This allows exploiting parallel computing architectures to speedup model execution, and to make very large models tractable. In this article we cope with the optimistic approach to PDES, where LPs are allowed to concurrently process their events in a speculative fashion, and rollback/ recovery techniques are used to guarantee state consistency in case of causality violations along the speculative execution path. Particularly, we present an innovative load sharing approach targeted at optimizing resource usage for fruitful simulation work when running an optimistic PDES environment on top of multi-processor/multi-core machines. Beyond providing the load sharing model, we also define a load sharing oriented architectural scheme, based on a symmetric multi-threaded organization of the simulation platform. Finally, we present a real implementation of the load sharing architecture within the open source ROme OpTimistic Simulator (ROOT-Sim) package. Experimental data for an assessment of both viability and effectiveness of our proposal are presented as well. Copyright is held by author/owner(s)

    Lock-free Concurrent Data Structures

    Full text link
    Concurrent data structures are the data sharing side of parallel programming. Data structures give the means to the program to store data, but also provide operations to the program to access and manipulate these data. These operations are implemented through algorithms that have to be efficient. In the sequential setting, data structures are crucially important for the performance of the respective computation. In the parallel programming setting, their importance becomes more crucial because of the increased use of data and resource sharing for utilizing parallelism. The first and main goal of this chapter is to provide a sufficient background and intuition to help the interested reader to navigate in the complex research area of lock-free data structures. The second goal is to offer the programmer familiarity to the subject that will allow her to use truly concurrent methods.Comment: To appear in "Programming Multi-core and Many-core Computing Systems", eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and Distributed Computin

    Verification of the Tree-Based Hierarchical Read-Copy Update in the Linux Kernel

    Full text link
    Read-Copy Update (RCU) is a scalable, high-performance Linux-kernel synchronization mechanism that runs low-overhead readers concurrently with updaters. Production-quality RCU implementations for multi-core systems are decidedly non-trivial. Giving the ubiquity of Linux, a rare "million-year" bug can occur several times per day across the installed base. Stringent validation of RCU's complex behaviors is thus critically important. Exhaustive testing is infeasible due to the exponential number of possible executions, which suggests use of formal verification. Previous verification efforts on RCU either focus on simple implementations or use modeling languages, the latter requiring error-prone manual translation that must be repeated frequently due to regular changes in the Linux kernel's RCU implementation. In this paper, we first describe the implementation of Tree RCU in the Linux kernel. We then discuss how to construct a model directly from Tree RCU's source code in C, and use the CBMC model checker to verify its safety and liveness properties. To our best knowledge, this is the first verification of a significant part of RCU's source code, and is an important step towards integration of formal verification into the Linux kernel's regression test suite.Comment: This is a long version of a conference paper published in the 2018 Design, Automation and Test in Europe Conference (DATE

    Assessing load-sharing within optimistic simulation platforms

    Get PDF
    The advent of multi-core machines has lead to the need for revising the architecture of modern simulation platforms. One recent proposal we made attempted to explore the viability of load-sharing for optimistic simulators run on top of these types of machines. In this article, we provide an extensive experimental study for an assessment of the effects on run-time dynamics by a load-sharing architecture that has been implemented within the ROOT-Sim package, namely an open source simulation platform adhering to the optimistic synchronization paradigm. This experimental study is essentially aimed at evaluating possible sources of overheads when supporting load-sharing. It has been based on differentiated workloads allowing us to generate different execution profiles in terms of, e.g., granularity/locality of the simulation events. © 2012 IEEE

    Advanced synchronization techniques for task-based runtime systems

    Get PDF
    Task-based programming models like OmpSs-2 and OpenMP provide a flexible data-flow execution model to exploit dynamic, irregular and nested parallelism. Providing an efficient implementation that scales well with small granularity tasks remains a challenge, and bottlenecks can manifest in several runtime components. In this paper, we analyze the limiting factors in the scalability of a task-based runtime system and propose individual solutions for each of the challenges, including a wait-free dependency system and a novel scalable scheduler design based on delegation. We evaluate how the optimizations impact the overall performance of the runtime, both individually and in combination. We also compare the resulting runtime against state of the art OpenMP implementations, showing equivalent or better performance, especially for fine-grained tasks.This project is supported by the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No.s 754304 (DEEP-EST), by the Spanish Ministry of Science and Innovation (contract PID2019-107255GB and TIN2015-65316P) and by the Generalitat de Catalunya (2017-SGR-1414).Peer ReviewedPostprint (author's final draft

    Virtual Machine Support for Many-Core Architectures: Decoupling Abstract from Concrete Concurrency Models

    Get PDF
    The upcoming many-core architectures require software developers to exploit concurrency to utilize available computational power. Today's high-level language virtual machines (VMs), which are a cornerstone of software development, do not provide sufficient abstraction for concurrency concepts. We analyze concrete and abstract concurrency models and identify the challenges they impose for VMs. To provide sufficient concurrency support in VMs, we propose to integrate concurrency operations into VM instruction sets. Since there will always be VMs optimized for special purposes, our goal is to develop a methodology to design instruction sets with concurrency support. Therefore, we also propose a list of trade-offs that have to be investigated to advise the design of such instruction sets. As a first experiment, we implemented one instruction set extension for shared memory and one for non-shared memory concurrency. From our experimental results, we derived a list of requirements for a full-grown experimental environment for further research
    corecore