
    uFLIP: Understanding Flash IO Patterns

    Does the advent of flash devices constitute a radical change for secondary storage? How should database systems adapt to this new form of secondary storage? Before we can answer these questions, we need to fully understand the performance characteristics of flash devices. More specifically, we want to establish what kinds of IOs should be favored (or avoided) when designing algorithms and architectures for flash-based systems. In this paper, we focus on flash IO patterns, which capture the relevant distributions of IOs in time and space, and our goal is to quantify their performance. We define uFLIP, a benchmark for measuring the response time of flash IO patterns. We also present a benchmarking methodology that takes into account the particular characteristics of flash devices. Finally, we present the results obtained by measuring eleven flash devices, and derive a set of design hints that should drive the development of flash-based systems on current devices.
    Comment: CIDR 200
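
    The benchmark's core operation is simple to picture: issue IOs following a chosen pattern and time each one. The Java sketch below is not uFLIP itself, only an illustration of one pattern (random 4 KB reads); the path, IO size, and request count are arbitrary choices, and a real measurement would also need to bypass the OS page cache (e.g. via direct IO), which plain Java file APIs do not do.

```java
import java.io.RandomAccessFile;
import java.util.Random;

// Minimal sketch of timing one IO pattern: random 4 KB reads over a large
// file or raw device. uFLIP defines many more patterns (sequential reads,
// random writes, interleaved streams, ...); this only illustrates the idea.
// Caveat: without direct IO, the OS page cache will absorb many reads.
public class IoPatternProbe {
    public static void main(String[] args) throws Exception {
        String path = args[0];   // file or device to probe (assumed large)
        int ioSize = 4096;       // 4 KB per IO
        int count = 1000;        // number of measured IOs
        byte[] buf = new byte[ioSize];
        Random rnd = new Random(42);

        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            long blocks = f.length() / ioSize;
            long totalNanos = 0;
            for (int i = 0; i < count; i++) {
                long offset = Math.floorMod(rnd.nextLong(), blocks) * ioSize;
                long start = System.nanoTime();
                f.seek(offset);
                f.readFully(buf);
                totalNanos += System.nanoTime() - start;
            }
            System.out.printf("avg random-read latency: %.1f us%n",
                              totalNanos / (double) count / 1000.0);
        }
    }
}
```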

    A Cyclic Distributed Garbage Collector for Network Objects

    This paper presents an algorithm for distributed garbage collection and outlines its implementation within the Network Objects system. The algorithm is based on a reference listing scheme, which is augmented by partial tracing in order to collect distributed garbage cycles. Processes may be dynamically organised into groups, according to appropriate heuristics, to reclaim distributed garbage cycles. The algorithm places no overhead on local collectors and suspends local mutators only briefly. Partial tracing of the distributed graph involves only objects thought to be part of a garbage cycle: no collaboration with other processes is required. The algorithm offers considerable flexibility, allowing expediency and fault-tolerance to be traded against completeness.
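
    To make the reference-listing idea concrete, here is a hypothetical Java sketch of the owner-side bookkeeping: unlike a reference count, each exported object records which client processes hold references to it, so an entry can be removed precisely when that client drops its reference. All names are illustrative, the Network Objects implementation differs in detail, and the partial-tracing machinery for cycles is not shown.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of a reference *listing* (as opposed to reference counting): for
// every exported object, the owning process records WHICH client processes
// hold a reference, not just how many. Names are illustrative only.
class ReferenceList {
    // exported object id -> set of client process ids holding a reference
    private final Map<Long, Set<String>> clients = new HashMap<>();

    // A client informs the owner that it received a reference.
    void addClient(long objectId, String processId) {
        clients.computeIfAbsent(objectId, id -> new HashSet<>()).add(processId);
    }

    // A client informs the owner that it dropped its reference. When the
    // list becomes empty the object is only reachable locally and the local
    // collector may reclaim it; cross-process garbage CYCLES never empty
    // their lists this way, which is why partial tracing is needed.
    void removeClient(long objectId, String processId) {
        Set<String> s = clients.get(objectId);
        if (s != null) {
            s.remove(processId);
            if (s.isEmpty()) clients.remove(objectId);
        }
    }

    boolean remotelyReferenced(long objectId) {
        return clients.containsKey(objectId);
    }
}
```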

    Inverse software configuration management

    Software systems are playing an increasingly important role in almost every aspect of today’s society, such that they impact on our businesses, industry, leisure, health and safety. Many of these systems are extremely large and complex and depend upon the correct interaction of many hundreds or even thousands of heterogeneous components. Commensurate with this increased reliance on software is the need for high-quality products that meet customer expectations, perform reliably, and can be cost-effectively and safely maintained. Techniques such as software configuration management have proved to be invaluable during the development process to ensure that this is the case. However, there are a very large number of legacy systems which were not developed under controlled conditions, but which still need to be maintained due to the heavy investment incorporated within them. Such systems are characterised by extremely high program comprehension overheads and the probability that new errors will be introduced during the maintenance process, often with serious consequences. To address the issues concerning maintenance of legacy systems, this thesis has defined and developed a new process and associated maintenance model, Inverse Software Configuration Management (ISCM). This model centres on a layered approach to the program comprehension process through the definition of a number of software configuration abstractions. This information, together with the set of rules for reclaiming the information, is stored within an Extensible System Information Base (ESIB) via the definition of a Programming-in-the-Environment (PITE) language, the Inverse Configuration Description Language (ICDL). In order to assist the application of the ISCM process across a wide range of software applications and system architectures, the PISCES (Proforma Identification Scheme for Configurations of Existing Systems) method has been developed as a series of defined procedures and guidelines. To underpin the method and to offer a user-friendly interface to the process, a series of templates, the Proforma Increasing Complexity Series (PICS), has been developed. To enable the useful employment of these techniques on large-scale systems, the subject of automation has been addressed through the development of a flexible meta-CASE environment, the PISCES M4 (MultiMedia Maintenance Manager) system. Of particular interest within this environment is the provision of a multimedia user interface (MUI) to the maintenance process. As a means of evaluating the PISCES method and to provide feedback into the ISCM process, a number of practical applications have been modelled. In summary, this research has considered a number of concepts, some of which are innovative in themselves, others of which are used in an innovative manner. In combination, these concepts considerably advance the knowledge and understanding of the comprehension process during the maintenance of legacy software systems. A number of publications have already resulted from the research and several more are in preparation. Additionally, a number of areas for further study have been identified, some of which are already underway as funded research and development projects.

    A Big Data Analyzer for Large Trace Logs

    The current generation of Internet-based services is typically hosted on large data centers that take the form of warehouse-size structures housing tens of thousands of servers. The continued availability of a modern data center is the result of a complex orchestration among many internal and external actors, including computing hardware, multiple layers of intricate software, networking and storage devices, and electrical power and cooling plants. During the course of their operation, many of these components produce large amounts of data in the form of event and error logs that are essential not only for identifying and resolving problems but also for improving data center efficiency and management. Most of these activities would benefit significantly from data analytics techniques to exploit hidden statistical patterns and correlations that may be present in the data. The sheer volume of data to be analyzed makes uncovering these correlations and patterns a challenging task. This paper presents BiDAl, a prototype Java tool for log-data analysis that incorporates several Big Data technologies in order to simplify the task of extracting information from data traces produced by large clusters and server farms. BiDAl provides the user with several analysis languages (SQL, R and Hadoop MapReduce) and storage backends (HDFS and SQLite) that can be freely mixed and matched so that a custom tool for a specific task can be easily constructed. BiDAl has a modular architecture so that it can be extended with other backends and analysis languages in the future. In this paper we present the design of BiDAl and describe our experience using it to analyze publicly-available traces from Google data clusters, with the goal of building a realistic model of a complex data center.
    Comment: 26 pages, 10 figure
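
    The mix-and-match design can be pictured as two small interfaces, one per concern. The following Java sketch is illustrative only (the class and method names are not BiDAl's actual API): it pairs a toy in-memory backend with a toy analysis engine, where a real deployment would slot SQLite or HDFS behind one interface and SQL, R, or MapReduce behind the other.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a mix-and-match architecture: storage backends and
// analysis languages sit behind small common interfaces, so a pipeline can
// pair any backend with any engine. All names are illustrative, not BiDAl's.
interface StorageBackend {
    void importRecord(String table, String[] row);
    List<String[]> rows(String table);
}

interface AnalysisEngine {
    String run(StorageBackend data, String script);
}

// Toy in-memory backend standing in for SQLite or HDFS.
class InMemoryBackend implements StorageBackend {
    private final Map<String, List<String[]>> tables = new HashMap<>();
    public void importRecord(String table, String[] row) {
        tables.computeIfAbsent(table, t -> new ArrayList<>()).add(row);
    }
    public List<String[]> rows(String table) {
        return tables.getOrDefault(table, List.of());
    }
}

// Toy engine standing in for a real SQL/R/MapReduce runner: here the
// "script" is just a table name and the result is its row count.
class CountEngine implements AnalysisEngine {
    public String run(StorageBackend data, String script) {
        return script + ": " + data.rows(script).size() + " rows";
    }
}

public class Pipeline {
    public static void main(String[] args) {
        StorageBackend store = new InMemoryBackend();
        store.importRecord("usage", new String[]{"job1", "0.42"});
        store.importRecord("usage", new String[]{"job2", "0.17"});
        AnalysisEngine engine = new CountEngine();
        System.out.println(engine.run(store, "usage"));  // usage: 2 rows
    }
}
```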

    Speedy Transactions in Multicore In-Memory Databases

    Silo is a new in-memory database that achieves excellent performance and scalability on modern multicore machines. Silo was designed from the ground up to use system memory and caches efficiently. For instance, it avoids all centralized contention points, including centralized transaction ID assignment. Silo's key contribution is a commit protocol based on optimistic concurrency control that provides serializability while avoiding all shared-memory writes for records that were only read. Though this might seem to complicate the enforcement of a serial order, correct logging and recovery are provided by linking periodically-updated epochs with the commit protocol. Silo provides the same guarantees as any serializable database without unnecessary scalability bottlenecks or much additional latency. Silo achieves almost 700,000 transactions per second on a standard TPC-C workload mix on a 32-core machine, as well as near-linear scalability. Considered per core, this is several times higher than previously reported results.
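
    The heart of such a commit protocol can be sketched as a validate-then-install sequence. The Java code below is a heavily simplified illustration, not Silo's implementation: Silo packs a lock bit, epoch, and sequence number into a single per-record TID word and locks its write set in a global order, whereas this sketch uses a plain lock per record. The property it preserves is that validation only reads the versions of records in the read set, so read-only records are never written to.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Simplified sketch of optimistic concurrency control in the style the
// abstract describes: reads record the version they saw; commit locks the
// write set, re-checks the read set, then installs writes under a fresh
// transaction id (TID). Deadlock avoidance via a global lock order is elided.
class Record {
    volatile long version;   // stand-in for Silo's per-record TID word
    volatile Object value;
    final ReentrantLock lock = new ReentrantLock();
}

class Transaction {
    private final Map<Record, Long> readSet = new HashMap<>();   // version seen at read time
    private final Map<Record, Object> writeSet = new HashMap<>();

    Object read(Record r) {
        readSet.putIfAbsent(r, r.version);
        return writeSet.containsKey(r) ? writeSet.get(r) : r.value;
    }

    void write(Record r, Object v) { writeSet.put(r, v); }

    boolean commit(long newTid) {
        for (Record r : writeSet.keySet()) r.lock.lock();   // phase 1: lock writes
        try {
            // Phase 2: validate. A read-only record is merely re-checked,
            // never written to -- the source of the read scalability.
            for (Map.Entry<Record, Long> e : readSet.entrySet()) {
                if (e.getKey().version != e.getValue()) return false;  // abort
            }
            // Phase 3: install writes and publish the new TID.
            for (Map.Entry<Record, Object> e : writeSet.entrySet()) {
                e.getKey().value = e.getValue();
                e.getKey().version = newTid;
            }
            return true;
        } finally {
            for (Record r : writeSet.keySet()) r.lock.unlock();
        }
    }
}
```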

    Beltway: Getting Around Garbage Collection Gridlock

    We present the design and implementation of a new garbage collection framework that significantly generalizes existing copying collectors. The Beltway framework exploits and separates object age and incrementality. It groups objects in one or more increments on queues called belts, collects belts independently, and collects increments on a belt in first-in-first-out order. We show that Beltway configurations, selected by command line options, act and perform the same as semi-space, generational, and older-first collectors, and encompass all previous copying collectors of which we are aware. The increasing reliance on garbage collected languages such as Java requires that the collector perform well. We show that the generality of Beltway enables us to design and implement new collectors that are robust to variations in heap size and improve total execution time over the best generational copying collectors of which we are aware by up to 40%, and on average by 5 to 10%, for small to moderate heap sizes. New garbage collection algorithms are rare, and yet we define not just one, but a new family of collectors that subsumes previous work. This generality enables us to explore a larger design space and build better collectors.
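
    The belt/increment organization is easy to sketch. The Java code below illustrates the organizing data structure only, not the real collector: belts are FIFO queues of increments, a collection takes the oldest increment of a belt as its unit of work, and the survivor-promotion policy (here: copy to the next belt up, roughly a generational configuration) is the knob that makes the framework mimic different collectors. Names and the reachability test are placeholders.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative sketch of Beltway's organizing structure: each belt is a
// FIFO queue of increments, each increment groups objects of similar age,
// and increments on a belt are collected independently in FIFO order.
class Increment {
    final List<Object> objects = new ArrayList<>();  // objects allocated or copied here
}

class Belt {
    private final Deque<Increment> increments = new ArrayDeque<>();

    Increment newIncrement() {           // allocation targets the youngest increment
        Increment inc = new Increment();
        increments.addLast(inc);
        return inc;
    }

    Increment takeOldest() { return increments.pollFirst(); }  // FIFO collection order
    boolean isEmpty() { return increments.isEmpty(); }
}

class BeltwayHeap {
    final List<Belt> belts = new ArrayList<>();  // e.g. two belts ~ generational

    // Collect the oldest increment of belt b; survivors are copied into a
    // fresh increment on the next belt up (one possible configuration).
    void collect(int b) {
        Increment victim = belts.get(b).takeOldest();
        if (victim == null) return;
        Increment target = belts.get(Math.min(b + 1, belts.size() - 1)).newIncrement();
        for (Object o : victim.objects) {
            if (isLive(o)) target.objects.add(o);  // "copy" the survivor
        }
    }

    boolean isLive(Object o) { return true; }  // placeholder reachability test
}
```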