mxkernel: a novel system software stack for data processing on modern hardware

Authors: Mühlig Jan; Müller Michael; Spinczyk Olaf; Teubner Jens
Publication date: 06/10/2020

Emerging hardware platforms are characterized by large degrees of parallelism, complex memory hierarchies, and increasing hardware heterogeneity. Their theoretical peak data processing performance can only be unleashed if the different pieces of systems software collaborate much more closely and if their traditional dependencies and interfaces are redesigned. We have developed the key concepts and a prototype implementation of a novel system software stack named mxkernel. For mxkernel, efficient large-scale data processing capabilities are a primary design goal. To achieve this, heterogeneity and parallelism become first-class citizens and deep memory hierarchies are considered from the very beginning. Instead of a classical “thread” model, mxkernel provides a simpler control flow abstraction: mxtasks model closed units of work, for which mxkernel will guarantee the required execution semantics, such as exclusive access to a specific object in memory. They can also be a very elegant abstraction for heterogeneity and resource sharing. Furthermore, mxtasks are annotated with metadata, such as code variants (to support heterogeneity), memory access behavior (to improve cache efficiency and support memory hierarchies), or dependencies between mxtasks (to improve scheduling and avoid synchronization cost). With precisely the required metadata available, mxkernel can provide a lightweight, yet highly efficient form of resource management, even across applications, operating system, and database. Based on the mxkernel prototype we present preliminary results from this ambitious undertaking. We argue that threads are an ill-suited control flow abstraction for our modern computer architectures and that a task-based execution model is to be favored.

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung
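
The abstract above describes tasks that carry metadata about the object they access, so that the runtime can guarantee exclusive access without application-level locking. The following is only a minimal sketch of that idea, not the mxkernel API: the names `Task`, `Scheduler`, `spawn`, and `drain` are hypothetical, and the workers are simulated in a single thread to keep the example deterministic.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Hashable


@dataclass
class Task:
    """A closed unit of work annotated with the object it accesses (hypothetical)."""
    target: Hashable         # metadata: which object the task touches
    run: Callable[[], None]  # the work itself


class Scheduler:
    """Toy scheduler: tasks with the same target always land in the same
    worker queue, so each object is only ever touched by one worker."""

    def __init__(self, workers: int) -> None:
        self.queues = [deque() for _ in range(workers)]

    def worker_for(self, target: Hashable) -> int:
        # Assumption: a plain hash decides placement; a real runtime could
        # also use memory locality or declared dependencies.
        return hash(target) % len(self.queues)

    def spawn(self, task: Task) -> None:
        self.queues[self.worker_for(task.target)].append(task)

    def drain(self) -> None:
        # Simulate the workers round-robin until every queue is empty.
        while any(self.queues):
            for queue in self.queues:
                if queue:
                    queue.popleft().run()


counters = {"a": 0, "b": 0}


def increment(key: str) -> Callable[[], None]:
    def work() -> None:
        counters[key] += 1
    return work


scheduler = Scheduler(workers=4)
for _ in range(100):
    scheduler.spawn(Task("a", increment("a")))
    scheduler.spawn(Task("b", increment("b")))
scheduler.drain()
print(counters)
```

Because placement depends only on the annotated target, two tasks that touch the same object can never run on different workers, which is the property that would let a real multi-threaded runtime skip per-object locks.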

Concurrent Access Algorithms for Different Data Structures: A Research Review

Authors: Dr. Pushpa Rani Suri; Ms. Ranjeet Kaur
Publication venue: Global Journals Inc. (US)
Publication date: 14/05/2014

Algorithms for concurrent data structures have gained attention in recent years as multi-core processors have become ubiquitous. Several features of shared-memory multiprocessors make concurrent data structures significantly more difficult to design and to verify as correct than their sequential counterparts. The primary source of this additional difficulty is concurrency. This paper provides an overview of some concurrent access algorithms for different data structures.

Global Journal of Computer Science and Technology (GJCST)

MxTasks: a novel processing model to support data processing on modern hardware

Author: Mühlig Jan
Publication date: 01/01/2023

The hardware landscape has changed rapidly in recent years. Modern hardware in today's servers is characterized by many CPU cores, multiple sockets, and vast amounts of main memory structured in NUMA hierarchies. In order to benefit from these highly parallel systems, the software has to adapt and actively engage with newly available features. However, the processing models forming the foundation for many performance-oriented applications have remained essentially unchanged. Threads, which serve as the central processing abstractions, can be considered a "black box" that hardly allows any transparency between the application and the system underneath. On the one hand, applications are aware of the knowledge that could assist the system in optimizing the execution, such as accessed data objects and access patterns. On the other hand, the limited opportunities for information exchange cause operating systems to make assumptions about the applications' intentions to optimize their execution, e.g., for local data access. Applications, on the contrary, implement optimizations tailored to specific situations, such as sophisticated synchronization mechanisms and hardware-conscious data structures. This work presents MxTasking, a task-based runtime environment that assists the design of data structures and applications for contemporary hardware. MxTasking rethinks the interfaces between performance-oriented applications and the execution substrate, streamlining the information exchange between both layers. By breaking patterns of processing models designed with past generations of hardware in mind, MxTasking creates novel opportunities to manage resources in a hardware- and application-conscious way. Accordingly, we question the granularity of "conventional" threads and show that fine-granular MxTasks are a viable abstraction unit for characterizing and optimizing the execution in a general way. 
Using various demonstrators in the context of database management systems, we illustrate the practical benefits and explore how challenges like memory access latencies and error-prone synchronization of concurrency can be addressed straightforwardly and effectively.

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

Cache craftiness for fast multicore key-value storage

Authors: Kohler Eddie; Mao Yandong; Morris Robert Tappan
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/04/2012

We present Masstree, a fast key-value database designed for SMP machines. Masstree keeps all data in memory. Its main data structure is a trie-like concatenation of B+-trees, each of which handles a fixed-length slice of a variable-length key. This structure effectively handles arbitrary-length, possibly binary keys, including keys with long shared prefixes. B+-tree fanout was chosen to minimize total DRAM delay when descending the tree and prefetching each tree node. Lookups use optimistic concurrency control, a read-copy-update-like technique, and do not write shared data structures; updates lock only affected nodes. Logging and checkpointing provide consistency and durability. Though some of these ideas appear elsewhere, Masstree is the first to combine them. We discuss design variants and their consequences. On a 16-core machine, with logging enabled and queries arriving over a network, Masstree executes more than six million simple queries per second. This performance is comparable to that of memcached, a non-persistent hash table server, and higher (often much higher) than that of VoltDB, MongoDB, and Redis. Funding: National Science Foundation (U.S.) (Award 0834415); National Science Foundation (U.S.) (Award 0915164); Quanta Computer (Firm).

DSpace@MIT

Crossref

Harvard University - DASH
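
The Masstree abstract above describes a trie-like concatenation of trees in which each layer handles one fixed-length slice of a variable-length key. The sketch below illustrates only that layering, under loud assumptions: a Python `dict` stands in for each per-layer B+-tree, the slice length `SLICE_LEN` of 8 bytes is assumed (the abstract only says "fixed-length"), and the names `Layer`, `put`, and `get` are hypothetical. Concurrency control, logging, and prefetching are left out entirely.

```python
SLICE_LEN = 8  # assumed slice length in bytes


class Layer:
    """One trie layer; plain dicts stand in for the per-layer B+-tree."""

    def __init__(self) -> None:
        self.values = {}    # keys that end in this layer: final slice -> value
        self.children = {}  # keys that continue: full slice -> next Layer


def put(root: Layer, key: bytes, value) -> None:
    layer = root
    # Consume full slices while more key bytes remain after them.
    while len(key) > SLICE_LEN:
        layer = layer.children.setdefault(key[:SLICE_LEN], Layer())
        key = key[SLICE_LEN:]
    layer.values[key] = value


def get(root: Layer, key: bytes, default=None):
    layer = root
    while len(key) > SLICE_LEN:
        layer = layer.children.get(key[:SLICE_LEN])
        if layer is None:
            return default
        key = key[SLICE_LEN:]
    return layer.values.get(key, default)


root = Layer()
put(root, b"user:0001:name", "Ada")
put(root, b"user:0001:mail", "ada@example.org")
put(root, b"user", "short key")
print(get(root, b"user:0001:mail"))
```

Both long keys share the first slice `b"user:000"`, so they descend through a single child layer and differ only in the second slice; this is how long shared prefixes are stored once rather than per key.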

Transactional support for adaptive indexing

Authors: Graefe G.; Halim F.; Idreos S. (Stratos); Kuno H.; Manegold S. (Stefan); Sa J.N. (Joao) de; Seeger B.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2014

Adaptive indexing initializes and optimizes indexes incrementally, as a side effect of query processing. The goal is to achieve the benefits of indexes while hiding or minimizing the costs of index creation. However, index-optimizing side effects seem to turn read-only queries into update transactions that might, for example, create lock contention. This paper studies concurrency contr

CWI's Institutional Repository
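
The adaptive-indexing abstract above notes that index refinement happens as a side effect of queries, which makes a logically read-only query modify shared state. One well-known form of adaptive indexing is database cracking; the sketch below assumes that variant for illustration and is not taken from the paper. The class `CrackedColumn` and its methods are hypothetical names, and no locking or transactional support is shown.

```python
import bisect


class CrackedColumn:
    """A column that is partially reordered by each range query."""

    def __init__(self, values) -> None:
        self.data = list(values)
        self.pivots = []     # sorted crack values
        self.positions = []  # positions[i]: index of first element >= pivots[i]

    def _crack(self, pivot) -> int:
        """Partition so values < pivot precede values >= pivot; return the split index."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked here, nothing to move
        # Only the piece between the neighbouring cracks needs reordering.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.positions) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        split = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return values v with low <= v < high, refining the index as a side effect."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


values = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3]
column = CrackedColumn(values)
first = column.range_query(5, 13)
second = column.range_query(3, 10)
print(sorted(first), sorted(second), column.pivots)
```

Note that `range_query` writes to `self.data`, `self.pivots`, and `self.positions` even though the caller only reads. With concurrent queries these writes would need coordination, which is the contention concern the abstract raises.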

Coarse-Grained, Fine-Grained, and Lock-Free Concurrency Approaches for Self-Balancing B-Tree

Author: Jorgensen Edward R., II
Publication venue: Digital Scholarship@UNLV
Publication date: 01/08/2019

This dissertation examines the concurrency approaches for a standard, unmodified B-Tree, which is one of the more complex data structures. This includes the coarse-grained locking, fine-grained locking, and lock-free approaches. The basic industry-standard coarse-grained approach is used as a baseline for comparison to the more advanced fine-grained and lock-free approaches. The fine-grained approach is explored and algorithms are presented for the fine-grained B-Tree insertion and deletion. The lock-free approach is addressed and an algorithm for a lock-free B-Tree insertion is provided. The issues associated with a lock-free deletion are discussed. Comparison trade-offs are presented and discussed. As a final part of this effort, specific testing processes are discussed and presented.

University of Nevada, Las Vegas Repository
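
The difference between coarse-grained and fine-grained locking named in the abstract above can be shown on a much simpler structure than a B-Tree. The sketch below deliberately swaps in a keyed counter map: one variant guards everything with a single lock, the other uses one lock per stripe of keys. It is not the dissertation's B-Tree algorithm, and all class and function names are hypothetical.

```python
import threading


class CoarseCounterMap:
    """Coarse-grained: a single lock guards the whole structure."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts = {}

    def increment(self, key) -> None:
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1

    def get(self, key) -> int:
        with self._lock:
            return self._counts.get(key, 0)


class StripedCounterMap:
    """Fine-grained: one lock per stripe, so unrelated keys rarely contend."""

    def __init__(self, stripes: int = 16) -> None:
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._counts = [{} for _ in range(stripes)]

    def _stripe(self, key) -> int:
        return hash(key) % len(self._locks)

    def increment(self, key) -> None:
        s = self._stripe(key)
        with self._locks[s]:
            self._counts[s][key] = self._counts[s].get(key, 0) + 1

    def get(self, key) -> int:
        s = self._stripe(key)
        with self._locks[s]:
            return self._counts[s].get(key, 0)


def hammer(counter_map, threads: int = 4, per_thread: int = 1000) -> None:
    """Run several threads that each increment keys 0..7 in turn."""
    def work() -> None:
        for i in range(per_thread):
            counter_map.increment(i % 8)

    pool = [threading.Thread(target=work) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()


coarse = CoarseCounterMap()
striped = StripedCounterMap()
hammer(coarse)
hammer(striped)
print(coarse.get(0), striped.get(0))
```

Both variants produce the same counts; they differ only in how much of the structure a writer blocks. In a B-Tree the fine-grained unit would be a node rather than a stripe, and the locking protocol is considerably more involved than shown here.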