2,979 research outputs found

    Tupleware: Redefining Modern Analytics

    Full text link
    There is a fundamental discrepancy between the targeted and actual users of current analytics frameworks. Most systems are designed for the data and infrastructure of the Googles and Facebooks of the world---petabytes of data distributed across large cloud deployments consisting of thousands of cheap commodity machines. Yet, the vast majority of users operate clusters ranging from a few to a few dozen nodes, analyze relatively small datasets of up to a few terabytes, and perform primarily compute-intensive operations. Targeting these users fundamentally changes the way we should build analytics systems. This paper describes the design of Tupleware, a new system specifically aimed at the challenges faced by the typical user. Tupleware's architecture brings together ideas from the database, compiler, and programming languages communities to create a powerful end-to-end solution for data analysis. We propose novel techniques that consider the data, computations, and hardware together to achieve maximum performance on a case-by-case basis. Our experimental evaluation quantifies the impact of our novel techniques and shows orders of magnitude performance improvement over alternative systems

    The C Object System: Using C as a High-Level Object-Oriented Language

    Full text link
    The C Object System (Cos) is a small C library which implements high-level concepts available in Clos, Objc and other object-oriented programming languages: uniform object model (class, meta-class and property-metaclass), generic functions, multi-methods, delegation, properties, exceptions, contracts and closures. Cos relies on the programmable capabilities of the C programming language to extend its syntax and to implement the aforementioned concepts as first-class objects. Cos aims at satisfying several general principles like simplicity, extensibility, reusability, efficiency and portability which are rarely met in a single programming language. Its design is tuned to provide efficient and portable implementation of message multi-dispatch and message multi-forwarding which are the heart of code extensibility and reusability. With COS features in hand, software should become as flexible and extensible as with scripting languages and as efficient and portable as expected with C programming. Likewise, Cos concepts should significantly simplify adaptive and aspect-oriented programming as well as distributed and service-oriented computingComment: 18

    Prototyping Formal System Models with Active Objects

    Full text link
    We propose active object languages as a development tool for formal system models of distributed systems. Additionally to a formalization based on a term rewriting system, we use established Software Engineering concepts, including software product lines and object orientation that come with extensive tool support. We illustrate our modeling approach by prototyping a weak memory model. The resulting executable model is modular and has clear interfaces between communicating participants through object-oriented modeling. Relaxations of the basic memory model are expressed as self-contained variants of a software product line. As a modeling language we use the formal active object language ABS which comes with an extensive tool set. This permits rapid formalization of core ideas, early validity checks in terms of formal invariant proofs, and debugging support by executing test runs. Hence, our approach supports the prototyping of formal system models with early feedback.Comment: In Proceedings ICE 2018, arXiv:1810.0205

    Exploring Caching for Efficient Collection Operations

    Get PDF
    Many useful programs operate on collection types. Extensive libraries are available in many programming languages, such as the C++ Standard Template Library, which make programming with collections convenient. Extending programming languages to provide collection queries as first class constructs in the language would not only allow programmers to write queries explicitly in their programs but it would also allow compilers to leverage the wealth of experience available from the database domain to optimize such queries. This paper describes an approach to reducing the run time of programs involving explicit collection queries by leveraging a cache to store previously computed results. We propose caching the results of join (sub)queries which allows queries that miss the cache entirely to be answered partially from the cache thereby improving the query execution time. We also describe an effective cache policy to determine which join (sub)queries to cache. the cache is maintained incrementally, when the underlying collections change, and use of the cache space is optimized by a cache replacement policy. © 2011 IEEE
    • …
    corecore