34,780 research outputs found

    Virtual Machine Support for Many-Core Architectures: Decoupling Abstract from Concrete Concurrency Models

    Get PDF
    The upcoming many-core architectures require software developers to exploit concurrency to utilize available computational power. Today's high-level language virtual machines (VMs), which are a cornerstone of software development, do not provide sufficient abstraction for concurrency concepts. We analyze concrete and abstract concurrency models and identify the challenges they impose for VMs. To provide sufficient concurrency support in VMs, we propose to integrate concurrency operations into VM instruction sets. Since there will always be VMs optimized for special purposes, our goal is to develop a methodology to design instruction sets with concurrency support. Therefore, we also propose a list of trade-offs that have to be investigated to advise the design of such instruction sets. As a first experiment, we implemented one instruction set extension for shared memory and one for non-shared memory concurrency. From our experimental results, we derived a list of requirements for a full-grown experimental environment for further research

    Tuning the Level of Concurrency in Software Transactional Memory: An Overview of Recent Analytical, Machine Learning and Mixed Approaches

    Get PDF
    Synchronization transparency offered by Software Transactional Memory (STM) must not come at the expense of run-time efficiency, thus demanding from the STM-designer the inclusion of mechanisms properly oriented to performance and other quality indexes. Particularly, one core issue to cope with in STM is related to exploiting parallelism while also avoiding thrashing phenomena due to excessive transaction rollbacks, caused by excessively high levels of contention on logical resources, namely concurrently accessed data portions. A means to address run-time efficiency consists in dynamically determining the best-suited level of concurrency (number of threads) to be employed for running the application (or specific application phases) on top of the STM layer. For too low levels of concurrency, parallelism can be hampered. Conversely, over-dimensioning the concurrency level may give rise to the aforementioned thrashing phenomena caused by excessive data contention—an aspect which has reflections also on the side of reduced energy-efficiency. In this chapter we overview a set of recent techniques aimed at building “application-specific” performance models that can be exploited to dynamically tune the level of concurrency to the best-suited value. Although they share some base concepts while modeling the system performance vs the degree of concurrency, these techniques rely on disparate methods, such as machine learning or analytic methods (or combinations of the two), and achieve different tradeoffs in terms of the relation between the precision of the performance model and the latency for model instantiation. Implications of the different tradeoffs in real-life scenarios are also discussed

    Achieving Starvation-Freedom with Greater Concurrency in Multi-Version Object-based Transactional Memory Systems

    Full text link
    To utilize the multi-core processors properly concurrent programming is needed. Concurrency control is the main challenge while designing a correct and efficient concurrent program. Software Transactional Memory Systems (STMs) provides ease of multithreading to the programmer without worrying about concurrency issues such as deadlock, livelock, priority inversion, etc. Most of the STMs works on read-write operations known as RWSTMs. Some STMs work at high-level operations and ensure greater concurrency than RWSTMs. Such STMs are known as Object-Based STMs (OSTMs). The transactions of OSTMs can return commit or abort. Aborted OSTMs transactions retry. But in the current setting of OSTMs, transactions may starve. So, we proposed a Starvation-Free OSTM (SF-OSTM) which ensures starvation-freedom in object based STM systems while satisfying the correctness criteria as co-opacity. Databases, RWSTMs and OSTMs say that maintaining multiple versions corresponding to each key of transaction reduces the number of aborts and improves the throughput. So, to achieve greater concurrency, we proposed Starvation-Free Multi-Version OSTM (SF-MVOSTM) which ensures starvation-freedom while storing multiple versions corresponding to each key and satisfies the correctness criteria such as local opacity. To show the performance benefits, We implemented three variants of SF-MVOSTM (SF-MVOSTM, SF-MVOSTM-GC and SF-KOSTM) and compared it with state-of-the-art STMs.Comment: 68 pages, 24 figures. arXiv admin note: text overlap with arXiv:1709.0103

    MGSim - Simulation tools for multi-core processor architectures

    Get PDF
    MGSim is an open source discrete event simulator for on-chip hardware components, developed at the University of Amsterdam. It is intended to be a research and teaching vehicle to study the fine-grained hardware/software interactions on many-core and hardware multithreaded processors. It includes support for core models with different instruction sets, a configurable multi-core interconnect, multiple configurable cache and memory models, a dedicated I/O subsystem, and comprehensive monitoring and interaction facilities. The default model configuration shipped with MGSim implements Microgrids, a many-core architecture with hardware concurrency management. MGSim is furthermore written mostly in C++ and uses object classes to represent chip components. It is optimized for architecture models that can be described as process networks.Comment: 33 pages, 22 figures, 4 listings, 2 table

    Influences on Throughput and Latency in Stream Programs

    Get PDF
    Vu Thien Nga Nguyen and Raimund Kirner, 'Influences on Throughput and Latency in Stream Programs' paper presented at the 2nd Workshop on Feedback-Directed Compiler Optimization for Multi-Core Architectures. Berlin, Germany. 22 January 2013Stream programming is a promising approach to execute programs on parallel hardware such as multi-core systems. It allows to reuse sequential code at component level and to extend such code with concurrency-handling at the communication level. In this paper we investigate in the performance of stream programs in terms of throughput and latency. We identify factors that affect these performance metrics and propose an efficient scheduling approach to obtain the maximal performance

    Sorting as a Streaming Application Executing on Chip Multiprocessors

    Get PDF
    Expressing concurrency in applications has always been a difficult and error-prone endeavor, yet effective utilization of multi-core processors requires that the concurrency in applications be understood. One approach to the expression of concurrency is streaming, which has shown real promise as a safe and effective method for many application classes. Here, we express a classic problem, sorting, in the streaming paradigm and explore the implications of various algorithm and architectural design parameters on the performance of the application
    corecore