6,920 research outputs found

    Persistent Memory Programming Abstractions in Context of Concurrent Applications

    Full text link
    The advent of non-volatile memory (NVM) technologies like PCM, STT, memristors and Fe-RAM is believed to enhance the system performance by getting rid of the traditional memory hierarchy by reducing the gap between memory and storage. This memory technology is considered to have the performance like that of DRAM and persistence like that of disks. Thus, it would also provide significant performance benefits for big data applications by allowing in-memory processing of large data with the lowest latency to persistence. Leveraging the performance benefits of this memory-centric computing technology through traditional memory programming is not trivial and the challenges aggravate for parallel/concurrent applications. To this end, several programming abstractions have been proposed like NVthreads, Mnemosyne and intel's NVML. However, deciding upon a programming abstraction which is easier to program and at the same time ensures the consistency and balances various software and architectural trade-offs is openly debatable and active area of research for NVM community. We study the NVthreads, Mnemosyne and NVML libraries by building a concurrent and persistent set and open addressed hash-table data structure application. In this process, we explore and report various tradeoffs and hidden costs involved in building concurrent applications for persistence in terms of achieving efficiency, consistency and ease of programming with these NVM programming abstractions. Eventually, we evaluate the performance of the set and hash-table data structure applications. We observe that NVML is easiest to program with but is least efficient and Mnemosyne is most performance friendly but involves significant programming efforts to build concurrent and persistent applications.Comment: Accepted in HiPC SRS 201

    A Peer-to-Peer Middleware Framework for Resilient Persistent Programming

    Get PDF
    The persistent programming systems of the 1980s offered a programming model that integrated computation and long-term storage. In these systems, reliable applications could be engineered without requiring the programmer to write translation code to manage the transfer of data to and from non-volatile storage. More importantly, it simplified the programmer's conceptual model of an application, and avoided the many coherency problems that result from multiple cached copies of the same information. Although technically innovative, persistent languages were not widely adopted, perhaps due in part to their closed-world model. Each persistent store was located on a single host, and there were no flexible mechanisms for communication or transfer of data between separate stores. Here we re-open the work on persistence and combine it with modern peer-to-peer techniques in order to provide support for orthogonal persistence in resilient and potentially long-running distributed applications. Our vision is of an infrastructure within which an application can be developed and distributed with minimal modification, whereupon the application becomes resilient to certain failure modes. If a node, or the connection to it, fails during execution of the application, the objects are re-instantiated from distributed replicas, without their reference holders being aware of the failure. Furthermore, we believe that this can be achieved within a spectrum of application programmer intervention, ranging from minimal to totally prescriptive, as desired. The same mechanisms encompass an orthogonally persistent programming model. We outline our approach to implementing this vision, and describe current progress.Comment: Submitted to EuroSys 200

    Dataflow development of medium-grained parallel software

    Get PDF
    PhD ThesisIn the 1980s, multiple-processor computers (multiprocessors) based on conven- tional processing elements emerged as a popular solution to the continuing demand for ever-greater computing power. These machines offer a general-purpose parallel processing platform on which the size of program units which can be efficiently executed in parallel - the "grain size" - is smaller than that offered by distributed computing environments, though greater than that of some more specialised architectures. However, programming to exploit this medium-grained parallelism remains difficult. Concurrent execution is inherently complex, yet there is a lack of programming tools to support parallel programming activities such as program design, implementation, debugging, performance tuning and so on. In helping to manage complexity in sequential programming, visual tools have often been used to great effect, which suggests one approach towards the goal of making parallel programming less difficult. This thesis examines the possibilities which the dataflow paradigm has to offer as the basis for a set of visual parallel programming tools, and presents a dataflow notation designed as a framework for medium-grained parallel programming. The implementation of this notation as a programming language is discussed, and its suitability for the medium-grained level is examinedScience and Engineering Research Council of Great Britain EC ERASMUS schem

    Fine-Grain Checkpointing with In-Cache-Line Logging

    Full text link
    Non-Volatile Memory offers the possibility of implementing high-performance, durable data structures. However, achieving performance comparable to well-designed data structures in non-persistent (transient) memory is difficult, primarily because of the cost of ensuring the order in which memory writes reach NVM. Often, this requires flushing data to NVM and waiting a full memory round-trip time. In this paper, we introduce two new techniques: Fine-Grained Checkpointing, which ensures a consistent, quickly recoverable data structure in NVM after a system failure, and In-Cache-Line Logging, an undo-logging technique that enables recovery of earlier state without requiring cache-line flushes in the normal case. We implemented these techniques in the Masstree data structure, making it persistent and demonstrating the ease of applying them to a highly optimized system and their low (5.9-15.4\%) runtime overhead cost.Comment: In 2019 Architectural Support for Programming Languages and Operating Systems (ASPLOS 19), April 13, 2019, Providence, RI, US

    Open Programming Language Interpreters

    Get PDF
    Context: This paper presents the concept of open programming language interpreters and the implementation of a framework-level metaobject protocol (MOP) to support them. Inquiry: We address the problem of dynamic interpreter adaptation to tailor the interpreter's behavior on the task to be solved and to introduce new features to fulfill unforeseen requirements. Many languages provide a MOP that to some degree supports reflection. However, MOPs are typically language-specific, their reflective functionality is often restricted, and the adaptation and application logic are often mixed which hardens the understanding and maintenance of the source code. Our system overcomes these limitations. Approach: We designed and implemented a system to support open programming language interpreters. The prototype implementation is integrated in the Neverlang framework. The system exposes the structure, behavior and the runtime state of any Neverlang-based interpreter with the ability to modify it. Knowledge: Our system provides a complete control over interpreter's structure, behavior and its runtime state. The approach is applicable to every Neverlang-based interpreter. Adaptation code can potentially be reused across different language implementations. Grounding: Having a prototype implementation we focused on feasibility evaluation. The paper shows that our approach well addresses problems commonly found in the research literature. We have a demonstrative video and examples that illustrate our approach on dynamic software adaptation, aspect-oriented programming, debugging and context-aware interpreters. Importance: To our knowledge, our paper presents the first reflective approach targeting a general framework for language development. Our system provides full reflective support for free to any Neverlang-based interpreter. We are not aware of any prior application of open implementations to programming language interpreters in the sense defined in this paper. Rather than substituting other approaches, we believe our system can be used as a complementary technique in situations where other approaches present serious limitations

    Memory Subsystems for Security, Consistency, and Scalability

    Get PDF
    In response to the continuous demand for the ability to process ever larger datasets, as well as discoveries in next-generation memory technologies, researchers have been vigorously studying memory-driven computing architectures that shall allow data-intensive applications to access enormous amounts of pooled non-volatile memory. As applications continue to interact with increasing amounts of components and datasets, existing systems struggle to eÿciently enforce the principle of least privilege for security. While non-volatile memory can retain data even after a power loss and allow for large main memory capacity, programmers have to bear the burdens of maintaining the consistency of program memory for fault tolerance as well as handling huge datasets with traditional yet expensive memory management interfaces for scalability. Today’s computer systems have become too sophisticated for existing memory subsystems to handle many design requirements. In this dissertation, we introduce three memory subsystems to address challenges in terms of security, consistency, and scalability. Specifcally, we propose SMVs to provide threads with fne-grained control over access privileges for a partially shared address space for security, NVthreads to allow programmers to easily leverage nonvolatile memory with automatic persistence for consistency, and PetaMem to enable memory-centric applications to freely access memory beyond the traditional process boundary with support for memory isolation and crash recovery for security, consistency, and scalability

    Enabling preemptive multiprogramming on GPUs

    Get PDF
    GPUs are being increasingly adopted as compute accelerators in many domains, spanning environments from mobile systems to cloud computing. These systems are usually running multiple applications, from one or several users. However GPUs do not provide the support for resource sharing traditionally expected in these scenarios. Thus, such systems are unable to provide key multiprogrammed workload requirements, such as responsiveness, fairness or quality of service. In this paper, we propose a set of hardware extensions that allow GPUs to efficiently support multiprogrammed GPU workloads. We argue for preemptive multitasking and design two preemption mechanisms that can be used to implement GPU scheduling policies. We extend the architecture to allow concurrent execution of GPU kernels from different user processes and implement a scheduling policy that dynamically distributes the GPU cores among concurrently running kernels, according to their priorities. We extend the NVIDIA GK110 (Kepler) like GPU architecture with our proposals and evaluate them on a set of multiprogrammed workloads with up to eight concurrent processes. Our proposals improve execution time of high-priority processes by 15.6x, the average application turnaround time between 1.5x to 2x, and system fairness up to 3.4x.We would like to thank the anonymous reviewers, Alexan- der Veidenbaum, Carlos Villavieja, Lluis Vilanova, Lluc Al- varez, and Marc Jorda on their comments and help improving our work and this paper. This work is supported by Euro- pean Commission through TERAFLUX (FP7-249013), Mont- Blanc (FP7-288777), and RoMoL (GA-321253) projects, NVIDIA through the CUDA Center of Excellence program, Spanish Government through Programa Severo Ochoa (SEV-2011-0067) and Spanish Ministry of Science and Technology through TIN2007-60625 and TIN2012-34557 projects.Peer ReviewedPostprint (author’s final draft
    corecore