2,979 research outputs found
Tupleware: Redefining Modern Analytics
There is a fundamental discrepancy between the targeted and actual users of
current analytics frameworks. Most systems are designed for the data and
infrastructure of the Googles and Facebooks of the world---petabytes of data
distributed across large cloud deployments consisting of thousands of cheap
commodity machines. Yet, the vast majority of users operate clusters ranging
from a few to a few dozen nodes, analyze relatively small datasets of up to a
few terabytes, and perform primarily compute-intensive operations. Targeting
these users fundamentally changes the way we should build analytics systems.
This paper describes the design of Tupleware, a new system specifically aimed
at the challenges faced by the typical user. Tupleware's architecture brings
together ideas from the database, compiler, and programming languages
communities to create a powerful end-to-end solution for data analysis. We
propose novel techniques that consider the data, computations, and hardware
together to achieve maximum performance on a case-by-case basis. Our
experimental evaluation quantifies the impact of our novel techniques and shows
orders of magnitude performance improvement over alternative systems
The C Object System: Using C as a High-Level Object-Oriented Language
The C Object System (Cos) is a small C library which implements high-level
concepts available in Clos, Objc and other object-oriented programming
languages: uniform object model (class, meta-class and property-metaclass),
generic functions, multi-methods, delegation, properties, exceptions, contracts
and closures. Cos relies on the programmable capabilities of the C programming
language to extend its syntax and to implement the aforementioned concepts as
first-class objects. Cos aims at satisfying several general principles like
simplicity, extensibility, reusability, efficiency and portability which are
rarely met in a single programming language. Its design is tuned to provide
efficient and portable implementation of message multi-dispatch and message
multi-forwarding which are the heart of code extensibility and reusability.
With COS features in hand, software should become as flexible and extensible as
with scripting languages and as efficient and portable as expected with C
programming. Likewise, Cos concepts should significantly simplify adaptive and
aspect-oriented programming as well as distributed and service-oriented
computingComment: 18
Prototyping Formal System Models with Active Objects
We propose active object languages as a development tool for formal system
models of distributed systems. Additionally to a formalization based on a term
rewriting system, we use established Software Engineering concepts, including
software product lines and object orientation that come with extensive tool
support. We illustrate our modeling approach by prototyping a weak memory
model. The resulting executable model is modular and has clear interfaces
between communicating participants through object-oriented modeling.
Relaxations of the basic memory model are expressed as self-contained variants
of a software product line. As a modeling language we use the formal active
object language ABS which comes with an extensive tool set. This permits rapid
formalization of core ideas, early validity checks in terms of formal invariant
proofs, and debugging support by executing test runs. Hence, our approach
supports the prototyping of formal system models with early feedback.Comment: In Proceedings ICE 2018, arXiv:1810.0205
Exploring Caching for Efficient Collection Operations
Many useful programs operate on collection types. Extensive libraries are available in many programming languages, such as the C++ Standard Template Library, which make programming with collections convenient. Extending programming languages to provide collection queries as first class constructs in the language would not only allow programmers to write queries explicitly in their programs but it would also allow compilers to leverage the wealth of experience available from the database domain to optimize such queries. This paper describes an approach to reducing the run time of programs involving explicit collection queries by leveraging a cache to store previously computed results. We propose caching the results of join (sub)queries which allows queries that miss the cache entirely to be answered partially from the cache thereby improving the query execution time. We also describe an effective cache policy to determine which join (sub)queries to cache. the cache is maintained incrementally, when the underlying collections change, and use of the cache space is optimized by a cache replacement policy. © 2011 IEEE
- …