
    Performance Debugging and Tuning using an Instruction-Set Simulator

    Instruction-set simulators allow programmers a detailed level of insight into, and control over, the execution of a program, including parallel programs and operating systems. In principle, instruction-set simulation can model any target computer and gather any statistic. Furthermore, such simulators are usually portable, independent of compiler tools, and deterministic, allowing bugs to be recreated or measurements repeated. Though often viewed as too slow for use as a general programming tool, their performance has improved considerably over the last several years. We describe SimICS, an instruction-set simulator of SPARC-based multiprocessors developed at SICS, in its role as a general programming tool. We discuss some of the benefits of using a tool such as SimICS to support various tasks in software engineering, including debugging, testing, analysis, and performance tuning. We present in some detail two test cases where we used SimICS to support analysis and performance tuning of two applications, Penny and EQNTOTT. This work resulted in improved parallelism in, and understanding of, Penny, as well as a performance improvement for EQNTOTT of over an order of magnitude. We also present some early work on analyzing SPARC/Linux, demonstrating the ability of tools like SimICS to analyze operating systems.
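
    As a rough illustration of the mechanism such tools build on, the sketch below implements a fetch-decode-execute loop over simulated machine state, gathering per-opcode statistics deterministically. The three-instruction ISA is invented for illustration and is not SimICS's SPARC model.

        # Toy fetch-decode-execute loop: executes a program deterministically
        # over simulated machine state while gathering per-opcode statistics.
        # The three-instruction ISA is invented, not SimICS's SPARC model.
        from collections import Counter

        def simulate(program, max_steps=10_000):
            regs = [0] * 4                  # simulated register file
            pc = 0                          # simulated program counter
            stats = Counter()               # per-opcode execution counts
            steps = 0
            while pc < len(program) and steps < max_steps:
                op, *args = program[pc]
                stats[op] += 1
                steps += 1
                if op == "li":              # li rd, imm : load immediate
                    regs[args[0]] = args[1]
                    pc += 1
                elif op == "add":           # add rd, rs1, rs2
                    regs[args[0]] = regs[args[1]] + regs[args[2]]
                    pc += 1
                elif op == "bnez":          # bnez rs, target : branch if non-zero
                    pc = args[1] if regs[args[0]] != 0 else pc + 1
                else:
                    raise ValueError(f"unknown opcode {op}")
            return regs, stats

        # Sum 5+4+3+2+1 into r0 by counting r1 down to zero.
        prog = [
            ("li", 1, 5), ("li", 0, 0), ("li", 2, -1),
            ("add", 0, 0, 1),               # r0 += r1
            ("add", 1, 1, 2),               # r1 -= 1
            ("bnez", 1, 3),                 # loop while r1 != 0
        ]
        regs, stats = simulate(prog)
        print(regs[0], dict(stats))         # 15, with exact instruction counts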

    A Massively Scalable Architecture For Instant Messaging & Presence

    This paper analyzes the scalability of Instant Messaging & Presence (IM&P) architectures. We take a queueing-based modelling and analysis approach to find the bottlenecks of the current IM&P architecture at the Dutch social network Hyves, as well as of alternative architectures. We use the Hierarchical Evaluation Tool (HIT) to create and analyse the models analytically. Based on these results, we recommend a new architecture that provides better scalability than the current one.
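
    The abstract does not spell out the queueing model, so as a generic stand-in the sketch below applies textbook M/M/1 formulas (utilization rho = lambda/mu, mean response time 1/(mu - lambda)) to invented per-tier rates to show how a bottleneck tier is identified. The tier names and numbers are assumptions; the paper itself relies on the HIT tool rather than this closed form.

        # Back-of-the-envelope M/M/1 analysis of candidate bottleneck tiers.
        # Tier names and rates are invented for illustration; the paper uses
        # the HIT tool and Hyves's measurements, not this closed form.
        def mm1(arrival_rate, service_rate):
            """Utilization and mean response time of an M/M/1 queue."""
            rho = arrival_rate / service_rate
            if rho >= 1.0:
                return rho, float("inf")    # unstable: queue grows without bound
            return rho, 1.0 / (service_rate - arrival_rate)

        tiers = {                           # arrivals/sec, service capacity/sec
            "frontend":       (800.0, 1000.0),
            "presence_store": (800.0,  900.0),
            "notifier":       (800.0, 2000.0),
        }
        for name, (lam, mu) in tiers.items():
            rho, resp = mm1(lam, mu)
            print(f"{name:14s} utilization={rho:.2f} response={resp * 1000:.1f} ms")
        # The tier with utilization closest to 1 (here presence_store) dominates
        # end-to-end latency and is the first scalability bottleneck.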

    Tuning the Level of Concurrency in Software Transactional Memory: An Overview of Recent Analytical, Machine Learning and Mixed Approaches

    Synchronization transparency offered by Software Transactional Memory (STM) must not come at the expense of run-time efficiency, thus demanding from the STM designer the inclusion of mechanisms properly oriented to performance and other quality indexes. In particular, one core issue in STM is exploiting parallelism while avoiding thrashing caused by excessive transaction rollbacks, which arise under excessively high contention on logical resources, namely concurrently accessed data portions. One means of addressing run-time efficiency is to dynamically determine the best-suited level of concurrency (number of threads) to be employed for running the application (or specific application phases) on top of the STM layer. If the concurrency level is too low, parallelism is hampered. Conversely, over-dimensioning the concurrency level may give rise to the aforementioned thrashing due to excessive data contention, which also reduces energy efficiency. In this chapter we overview a set of recent techniques aimed at building "application-specific" performance models that can be exploited to dynamically tune the level of concurrency to the best-suited value. Although they share some base concepts while modeling the system performance versus the degree of concurrency, these techniques rely on disparate methods, such as machine learning or analytic methods (or combinations of the two), and achieve different tradeoffs in terms of the relation between the precision of the performance model and the latency of model instantiation. Implications of the different tradeoffs in real-life scenarios are also discussed.
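
    None of the surveyed techniques is reproduced here; as a minimal baseline for comparison, the sketch below tunes the thread count by hill climbing on measured throughput, with a synthetic concave throughput curve standing in for a real STM application. All names and the curve shape are assumptions for illustration.

        # Minimal hill-climbing controller for the STM concurrency level: probe
        # a neighbouring thread count and keep it if throughput improves. The
        # synthetic throughput curve peaks at 8 threads, mimicking the trade-off
        # between parallelism gains and rollback thrashing.
        import random

        def measure_throughput(threads):
            # Stand-in for running an application phase and measuring commits/sec.
            return threads * 100.0 / (1.0 + (threads / 8.0) ** 2) + random.gauss(0, 5)

        def tune(start=1, max_threads=64, rounds=30):
            best, best_tp = start, measure_throughput(start)
            for _ in range(rounds):
                cand = max(1, min(max_threads, best + random.choice([-1, 1])))
                tp = measure_throughput(cand)
                if tp > best_tp:            # move only on measured improvement
                    best, best_tp = cand, tp
            return best, best_tp

        threads, tp = tune()
        print(f"selected concurrency level: {threads} (~{tp:.0f} commits/sec)")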

    Staring into the abyss: An evaluation of concurrency control with one thousand cores

    Computer architectures are moving towards an era dominated by many-core machines with dozens or even hundreds of cores on a single chip. This unprecedented level of on-chip parallelism introduces a new dimension to scalability that current database management systems (DBMSs) were not designed for. In particular, as the number of cores increases, the problem of concurrency control becomes extremely challenging. With hundreds of threads running in parallel, the complexity of coordinating competing accesses to data will likely diminish the gains from increased core counts. To better understand just how unprepared current DBMSs are for future CPU architectures, we performed an evaluation of concurrency control for on-line transaction processing (OLTP) workloads on many-core chips. We implemented seven concurrency control algorithms in a main-memory DBMS and, using computer simulations, scaled our system to 1024 cores. Our analysis shows that all algorithms fail to scale to this magnitude, but for different reasons. In each case, we identify fundamental bottlenecks that are independent of the particular database implementation and argue that even state-of-the-art DBMSs suffer from these limitations. We conclude that rather than pursuing incremental solutions, many-core chips may require a completely redesigned DBMS architecture that is built from the ground up and is tightly coupled with the hardware.
    Intel Corporation (Science and Technology Center for Big Data)
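
    The paper's seven algorithms are not reproduced here; for a flavour of the family, the sketch below implements the classic read/write rules of basic timestamp ordering (T/O) in a toy single-threaded form, showing the late-operation aborts that waste work as contention grows. The class and demo are illustrative assumptions, not the paper's implementation.

        # Toy single-threaded sketch of basic timestamp ordering (T/O), one of
        # the classic concurrency-control families. Real DBMS implementations
        # are multi-threaded; only the abort rules are shown here.
        class Aborted(Exception):
            pass

        class TOManager:
            def __init__(self):
                self.data = {}              # key -> value
                self.rts = {}               # key -> largest reader timestamp
                self.wts = {}               # key -> timestamp of last writer

            def read(self, ts, key):
                if ts < self.wts.get(key, 0):       # would read "into the past"
                    raise Aborted(f"txn {ts}: read of {key} too late")
                self.rts[key] = max(self.rts.get(key, 0), ts)
                return self.data.get(key)

            def write(self, ts, key, value):
                if ts < self.rts.get(key, 0) or ts < self.wts.get(key, 0):
                    raise Aborted(f"txn {ts}: write of {key} too late")
                self.data[key] = value
                self.wts[key] = ts

        mgr = TOManager()
        mgr.write(1, "x", 10)               # txn 1 writes x
        print(mgr.read(3, "x"))             # txn 3 reads x -> 10
        try:
            mgr.write(2, "x", 99)           # txn 2 arrives too late: aborted
        except Aborted as e:
            print(e)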

    Modelling parallel database management systems for performance prediction

    Abstract unavailable; please refer to the PDF.

    De/construction sites: Romans and the digital playground

    The Roman world, as attested to archaeologically and as interacted with today, has its expression in a great many computational and other media. The place of visualisation within this has been paramount. This paper argues that the process of digitally constructing the Roman world and the exploration of the resultant models are useful methods for interpretation and influential factors in the creation of a popular Roman aesthetic. Furthermore, it suggests ways in which novel computational techniques enable the systematic deconstruction of such models, in turn re-purposing the many extant representations of Roman architecture and material culture.

    Performance models of concurrency control protocols for transaction processing systems

    Transaction processing plays a key role in many IT infrastructures. It is widely used in a variety of contexts, spanning from database management systems to concurrent programming tools. Transaction processing systems rely on concurrency control protocols, which allow them to process transactions concurrently while preserving essential properties such as isolation and atomicity. Performance is a critical aspect of transaction processing systems, and it is unavoidably affected by concurrency control. For this reason, methods and techniques to assess and predict the performance of concurrency control protocols are of interest to many IT players, including application designers, developers and system administrators. The analysis and proper understanding of the impact of these protocols on system performance require quantitative approaches. Analytical modeling is a practical approach for building cost-effective computer system performance models, enabling us to quantitatively describe the complex dynamics characterizing these systems. In this dissertation we present analytical performance models of concurrency control protocols. We deal with both traditional transaction processing systems, such as database management systems, and emerging ones, such as transactional memories. The analysis focuses on widely used protocols, providing detailed performance models and validation studies. In addition, we propose new modeling approaches, which also broaden the scope of our study towards a more realistic, application-oriented performance analysis.
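
    As a taste of what such analytical models look like, the sketch below evaluates a classic first-order approximation of data contention, in which two transactions each accessing k of D items conflict with probability about k*k/D. The numbers are illustrative; the dissertation's models are far more detailed and protocol-specific.

        # First-order analytical model of data contention: two transactions
        # that each access k of D items conflict with probability roughly
        # k*k/D (for k << D), so against n-1 concurrent peers
        #     P(conflict) ~= 1 - (1 - k*k/D) ** (n - 1)
        # This textbook approximation is only illustrative of the genre.
        def conflict_probability(n, k, D):
            pairwise = (k * k) / D          # chance of colliding with one peer
            return 1.0 - (1.0 - pairwise) ** (n - 1)

        for n in (2, 8, 32, 128):
            p = conflict_probability(n, k=10, D=100_000)
            print(f"{n:4d} concurrent txns -> P(conflict) = {p:.3f}")
        # Contention grows quickly with concurrency even when per-pair conflict
        # rates are low, which is why concurrency control dominates at scale.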