1,178 research outputs found

    Distributed Concurrent Persistent Languages: An Experimental Design and Implementation

    A universal persistent object store is a logical space of persistent objects whose localities span over machines reachable over networks. It provides a conceptual framework in which, on one hand, the distribution of data is transparent to application programmers and, on the other, store semantics of conventional languages is preserved. This means the manipulation of persistent objects on remote machines is both syntactically and semantically the same as in the case of local data. Consequently, many aspects of distributed programming in which computation tasks cooperate over different processors and different stores can be addressed within the confines of persistent programming. The work reported in this thesis is a logical generalization of the notion of persistence in the context of distribution. The concept of a universal persistent store is founded upon a universal addressing mechanism which augments existing addressing mechanisms. The universal addressing mechanism is realized based upon remote pointers which although containing more locality information than ordinary pointers, do not require architectural changes. Moreover, these remote pointers are transparent to the programmers. A language, Distributed PS-algol, is designed to experiment with this idea. The novel features of the language include: lightweight processes with a flavour of distribution, mutexes as the store-based synchronization primitive, and a remote procedure call mechanism as the message-based interprocess communication mechanism. Furthermore, the advantages of shared store programming and network architecture are obtained with the introduction of the programming concept of locality in an unobtrusive manner. A characteristic of the underlying addressing mechanism is that data are never copied to satisfy remote demands except where efficiency can be attained without compromising the semantics of data. A remote store operation model is described to effect remote updates. 
It is argued that such a choice is the most natural given that remote store operations resemble remote procedure calls

    Integrating Naming and Addressing of Persistent data in Programming Language and Operating System Contexts

    There exist a number of desirable transparencies in distributed computing, viz., name transparency: having a uniform way of naming entities in the system, regardless of their type or physical makeup; location transparency: having a uniform way of addressing entities, regardless of their physical location; representation transparency: having a uniform way of representing data, which simplifies sharing data between applications written in different high-level languages and running on different hardware architectures (interoperability); and finally invocation transparency: having a uniform way of invoking operations on entities. The advent of persistency in programming language contexts has created a need for the integration of these four important concepts, viz., naming, addressing, representation and manipulation of data in programming language and operating system contexts. This paper attempts to address the first three transparencies, postponing the fourth to a later paper. First, we make up a list of things that are needed to construct a persistent programming environment and relate this list to existing persistent object models, revealing their inadequacies. We then describe a new model which merges programming language and operating system naming contexts into a global name space which, while enforcing uniformity through the use of globally unique names, still allows the application of personal nicknames. Furthermore, we explain how persistent data is stored and retrieved using a client/server model of interaction, and how it could be acted upon correctly, through the concept of typed data. We conclude by checking how well our model scores on the wish list, listing the current status and future directions for research.

    The Architecture of a Worldwide Distributed System


    Architectural Principles for Database Systems on Storage-Class Memory

    Database systems have long been optimized to hide the higher latency of storage media, yielding complex persistence mechanisms. With the advent of large DRAM capacities, it became possible to keep a full copy of the data in DRAM. Systems that leverage this possibility, such as main-memory databases, keep two copies of the data in two different formats: one in main memory and the other one in storage. The two copies are kept synchronized using snapshotting and logging. This main-memory-centric architecture yields nearly two orders of magnitude faster analytical processing than traditional, disk-centric ones. The rise of Big Data emphasized the importance of such systems with an ever-increasing need for more main memory. However, DRAM is hitting its scalability limits: It is intrinsically hard to further increase its density. Storage-Class Memory (SCM) is a group of novel memory technologies that promise to alleviate DRAM’s scalability limits. They combine the non-volatility, density, and economic characteristics of storage media with the byte-addressability and a latency close to that of DRAM. Therefore, SCM can serve as persistent main memory, thereby bridging the gap between main memory and storage. In this dissertation, we explore the impact of SCM as persistent main memory on database systems. Assuming a hybrid SCM-DRAM hardware architecture, we propose a novel software architecture for database systems that places primary data in SCM and directly operates on it, eliminating the need for explicit IO. This architecture yields many benefits: First, it obviates the need to reload data from storage to main memory during recovery, as data is discovered and accessed directly in SCM. Second, it allows replacing the traditional logging infrastructure by fine-grained, cheap micro-logging at data-structure level. Third, secondary data can be stored in DRAM and reconstructed during recovery. 
Fourth, system runtime information can be stored in SCM to improve recovery time. Finally, the system may retain and continue in-flight transactions in case of system failures. However, SCM is no panacea as it raises unprecedented programming challenges. Given its byte-addressability and low latency, processors can access, read, modify, and persist data in SCM using load/store instructions at a CPU cache line granularity. The path from CPU registers to SCM is long and mostly volatile, including store buffers and CPU caches, leaving the programmer with little control over when data is persisted. Therefore, there is a need to enforce the order and durability of SCM writes using persistence primitives, such as cache line flushing instructions. This in turn creates new failure scenarios, such as missing or misplaced persistence primitives. We devise several building blocks to overcome these challenges. First, we identify the programming challenges of SCM and present a sound programming model that solves them. Then, we tackle memory management, as the first required building block to build a database system, by designing a highly scalable SCM allocator, named PAllocator, that fulfills the versatile needs of database systems. Thereafter, we propose the FPTree, a highly scalable hybrid SCM-DRAM persistent B+-Tree that bridges the gap between the performance of transient and persistent B+-Trees. Using these building blocks, we realize our envisioned database architecture in SOFORT, a hybrid SCM-DRAM columnar transactional engine. We propose an SCM-optimized MVCC scheme that eliminates write-ahead logging from the critical path of transactions. Since SCM-resident data is near-instantly available upon recovery, the new recovery bottleneck is rebuilding DRAM-based data. 
To alleviate this bottleneck, we propose a novel recovery technique that achieves nearly instant responsiveness of the database by accepting queries right after recovering SCM-based data, while rebuilding DRAM-based data in the background. Additionally, SCM brings new failure scenarios that existing testing tools cannot detect. Hence, we propose an online testing framework that is able to automatically simulate power failures and detect missing or misplaced persistence primitives. Finally, our proposed building blocks can serve to build more complex systems, paving the way for future database systems on SCM.

    Supporting persistent C++ objects in a distributed storage system

    Technical report. We have designed and implemented a C++ object layer for Khazana, a distributed persistent storage system that exports a flat shared address space as its basic abstraction. The C++ layer described herein lets programmers use familiar C++ idioms to allocate, manipulate, and deallocate persistent shared data structures. It handles the tedious details involved in accessing this shared data, replicating it, maintaining consistency, converting data representations between persistent and in-memory representations, associating type information including methods with objects, etc. To support the C++ object layer on top of Khazana's flat storage abstraction, we have developed a language-specific preprocessor that generates support code to manage the user-specified persistent C++ structures. We describe the design of the C++ object layer and the compiler and runtime mechanisms needed to support it.

    An Architecture for the Compilation of Persistent Polymorphic Reflective Higher-Order Languages

    Persistent Application Systems are potentially very large and long-lived application systems which use information technology: computers, communications, networks, software and databases. They are vital to the organisations that depend on them and have to be adaptable to organisational and technological changes and evolvable without serious interruption of service. Persistent Programming Languages are a promising technology that facilitate the task of incrementally building and maintaining persistent application systems. This thesis identifies a number of technical challenges in making persistent programming languages scalable, with adequate performance and sufficient longevity and in amortising costs by providing general services. A new architecture to support the compilation of long-lived, large-scale applications is proposed. This architecture comprises an intermediate language to be used by front-ends, high-level and machine independent optimisers, low-level optimisers and code generators of target machine code. The intermediate target language, TPL, has been designed to allow compiler writers to utilise common technology for several different orthogonally persistent higher-order reflective languages. The goal is to reuse optimisation and code-generation or interpretation technology with a variety of front-ends. A subsidiary goal is to provide an experimental framework for those investigating optimisation and code generation. TPL has a simple, clean type system and will support orthogonally persistent, reflective, higher-order, polymorphic languages. TPL allows code generation and the abstraction over details of the underlying software and hardware layers. An experiment to build a prototype of the proposed architecture was designed, developed and evaluated. The experimental work includes a language processor and examples of its use are presented in this dissertation. 
The design space was covered by describing the implications of the goals of supporting the class of languages anticipated while ensuring long-term persistence of data and programs, and sufficient efficiency. For each of the goals, the design decisions were evaluated in the face of the results.

    SWI-Prolog and the Web

    Where Prolog is commonly seen as a component in a Web application that is either embedded or communicates using a proprietary protocol, we propose an architecture where Prolog communicates with other components in a Web application using the standard HTTP protocol. By avoiding embedding in external Web servers, development and deployment become much easier. To support this architecture, in addition to the transfer protocol, we must also support parsing, representing and generating the key Web document types such as HTML, XML and RDF. This paper motivates the design decisions in the libraries and extensions to Prolog for handling Web documents and protocols. The design has been guided by the requirement to handle large documents efficiently. The described libraries support a wide range of Web applications ranging from HTML and XML documents to Semantic Web RDF processing. To appear in Theory and Practice of Logic Programming (TPLP). Comment: 31 pages, 24 figures and 2 tables.