77 research outputs found
Practical Fine-grained Privilege Separation in Multithreaded Applications
An inherent security limitation with the classic multithreaded programming
model is that all the threads share the same address space and, therefore, are
implicitly assumed to be mutually trusted. This assumption, however, does not
take into consideration of many modern multithreaded applications that involve
multiple principals which do not fully trust each other. It remains challenging
to retrofit the classic multithreaded programming model so that the security
and privilege separation in multi-principal applications can be resolved.
This paper proposes ARBITER, a run-time system and a set of security
primitives, aimed at fine-grained and data-centric privilege separation in
multithreaded applications. While enforcing effective isolation among
principals, ARBITER still allows flexible sharing and communication between
threads so that the multithreaded programming paradigm can be preserved. To
realize controlled sharing in a fine-grained manner, we created a novel
abstraction named ARBITER Secure Memory Segment (ASMS) and corresponding OS
support. Programmers express security policies by labeling data and principals
via ARBITER's API following a unified model. We ported a widely-used, in-memory
database application (memcached) to ARBITER system, changing only around 100
LOC. Experiments indicate that only an average runtime overhead of 5.6% is
induced to this security enhanced version of application
Memory Subsystems for Security, Consistency, and Scalability
In response to the continuous demand for the ability to process ever larger datasets, as well as discoveries in next-generation memory technologies, researchers have been vigorously studying memory-driven computing architectures that shall allow data-intensive applications to access enormous amounts of pooled non-volatile memory. As applications continue to interact with increasing amounts of components and datasets, existing systems struggle to eĂżciently enforce the principle of least privilege for security. While non-volatile memory can retain data even after a power loss and allow for large main memory capacity, programmers have to bear the burdens of maintaining the consistency of program memory for fault tolerance as well as handling huge datasets with traditional yet expensive memory management interfaces for scalability. Todayâs computer systems have become too sophisticated for existing memory subsystems to handle many design requirements. In this dissertation, we introduce three memory subsystems to address challenges in terms of security, consistency, and scalability. Specifcally, we propose SMVs to provide threads with fne-grained control over access privileges for a partially shared address space for security, NVthreads to allow programmers to easily leverage nonvolatile memory with automatic persistence for consistency, and PetaMem to enable memory-centric applications to freely access memory beyond the traditional process boundary with support for memory isolation and crash recovery for security, consistency, and scalability
State-of-the-Art in Parallel Computing with R
R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets. Methodological advances drive increased use of simulations. A common approach is to use parallel computing. This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing them on their state of development, the parallel technology used, as well as on usability, acceptance, and performance. Two packages (snow, Rmpi) stand out as particularly useful for general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems four different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high performance computing with R. Example code is available in the appendix
Incremental parallel and distributed systems
Incremental computation strives for efficient successive runs of applications by re-executing only those parts of the computation that are affected by a given input change instead of recomputing everything from scratch. To realize the benefits of incremental computation, researchers and practitioners are developing new systems where the application programmer can provide an efficient update mechanism for changing application data. Unfortunately, most of the existing solutions are limiting because they not only depart from existing programming models, but also require programmers to devise an incremental update mechanism (or a dynamic algorithm) on a per-application basis.
In this thesis, we present incremental parallel and distributed systems that enable existing real-world applications to automatically benefit from efficient incremental updates. Our approach neither requires departure from current models of programming, nor the design and implementation of dynamic algorithms.
To achieve these goals, we have designed and built the following incremental systems: (i) Incoop â a system for incremental MapReduce computation; (ii) Shredder â a GPU-accelerated system for incremental storage; (iii) Slider â a stream processing platform for incremental sliding window analytics; and (iv) iThreads â a threading library for parallel incremental computation. Our experience with these systems shows that significant performance can be achieved for existing applications without requiring any additional effort from programmers.Inkrementelle Berechnungen ermoÌglichen die effizientere AusfuÌhrung aufeinanderfolgender Anwendungsaufrufe, indem nur die Teilbereiche der Anwendung erneut ausgefuÌrt werden, die von den AÌnderungen der Eingabedaten betroffen sind. Dieses Berechnungsverfahren steht dem konventionellen und vollstaÌndig neu berechnenden Verfahren gegenuÌber. Um den Vorteil inkrementeller Berechnungen auszunutzen, entwickeln sowohl Wissenschaft als auch Industrie neue Systeme, bei denen der Anwendungsprogrammierer den effizienten Aktualisierungsmechanismus fuÌr die AÌnderung der Anwendungsdaten bereitstellt. Bedauerlicherweise lassen sich existierende LoÌsungen meist nur eingeschraÌnkt anwenden, da sie das konventionelle Programmierungsmodel beibehalten und dadurch die erneute Entwicklung vom Programmierer des inkrementellen Aktualisierungsmechanismus (oder einen dynamischen Algorithmus) fuÌr jede Anwendung verlangen.
Diese Doktorarbeit stellt inkrementelle Parallele- und Verteiltesysteme vor, die es existierenden Real-World-Anwendungen ermoÌglichen vom Vorteil der inkre- mentellen Berechnung automatisch zu profitieren. Unser Ansatz erfordert weder eine Abkehr von gegenwaÌrtigen Programmiermodellen, noch Design und Implementierung von anwendungsspezifischen dynamischen Algorithmen.
Um dieses Ziel zu erreichen, haben wir die folgenden Systeme zur inkrementellen parallelen und verteilten Berechnung entworfen und implementiert: (i) Incoop â ein System fuÌr inkrementelle Map-Reduce-Programme; (ii) Shredder â ein GPU- beschleunigtes System zur inkrementellen Speicherung; (iii) Slider â eine Plat- tform zur Batch-basierten Streamverarbeitung via inkrementeller Sliding-Window- Berechnung; und (iv) iThreads â eine Threading-Bibliothek zur parallelen inkre- mentellen Berechnung. Unsere Erfahrungen mit diesen Systemen zeigen, dass unsere Methoden sehr gute Performanz liefern koÌnnen, und dies ohne weiteren Aufwand des Programmierers
PYDAC: A DISTRIBUTED RUNTIME SYSTEM AND PROGRAMMING MODEL FOR A HETEROGENEOUS MANY-CORE ARCHITECTURE
Heterogeneous many-core architectures that consist of big, fast cores and small, energy-efficient cores are very promising for future high-performance computing (HPC) systems. These architectures offer a good balance between single-threaded perfor- mance and multithreaded throughput. Such systems impose challenges on the design of programming model and runtime system. Specifically, these challenges include (a) how to fully utilize the chipâs performance, (b) how to manage heterogeneous, un- reliable hardware resources, and (c) how to generate and manage a large amount of parallel tasks.
This dissertation proposes and evaluates a Python-based programming framework called PyDac. PyDac supports a two-level programming model. At the high level, a programmer creates a very large number of tasks, using the divide-and-conquer strategy. At the low level, tasks are written in imperative programming style. The runtime system seamlessly manages the parallel tasks, system resilience, and inter- task communication with architecture support. PyDac has been implemented on both an field-programmable gate array (FPGA) emulation of an unconventional het- erogeneous architecture and a conventional multicore microprocessor. To evaluate the performance, resilience, and programmability of the proposed system, several micro-benchmarks were developed. We found that (a) the PyDac abstracts away task communication and achieves programmability, (b) the micro-benchmarks are scalable on the hardware prototype, but (predictably) serial operation limits some micro-benchmarks, and (c) the degree of protection versus speed could be varied in redundant threading that is transparent to programmers
Incremental parallel and distributed systems
Incremental computation strives for efficient successive runs of applications by re-executing only those parts of the computation that are affected by a given input change instead of recomputing everything from scratch. To realize the benefits of incremental computation, researchers and practitioners are developing new systems where the application programmer can provide an efficient update mechanism for changing application data. Unfortunately, most of the existing solutions are limiting because they not only depart from existing programming models, but also require programmers to devise an incremental update mechanism (or a dynamic algorithm) on a per-application basis.
In this thesis, we present incremental parallel and distributed systems that enable existing real-world applications to automatically benefit from efficient incremental updates. Our approach neither requires departure from current models of programming, nor the design and implementation of dynamic algorithms.
To achieve these goals, we have designed and built the following incremental systems: (i) Incoop â a system for incremental MapReduce computation; (ii) Shredder â a GPU-accelerated system for incremental storage; (iii) Slider â a stream processing platform for incremental sliding window analytics; and (iv) iThreads â a threading library for parallel incremental computation. Our experience with these systems shows that significant performance can be achieved for existing applications without requiring any additional effort from programmers.Inkrementelle Berechnungen ermoÌglichen die effizientere AusfuÌhrung aufeinanderfolgender Anwendungsaufrufe, indem nur die Teilbereiche der Anwendung erneut ausgefuÌrt werden, die von den AÌnderungen der Eingabedaten betroffen sind. Dieses Berechnungsverfahren steht dem konventionellen und vollstaÌndig neu berechnenden Verfahren gegenuÌber. Um den Vorteil inkrementeller Berechnungen auszunutzen, entwickeln sowohl Wissenschaft als auch Industrie neue Systeme, bei denen der Anwendungsprogrammierer den effizienten Aktualisierungsmechanismus fuÌr die AÌnderung der Anwendungsdaten bereitstellt. Bedauerlicherweise lassen sich existierende LoÌsungen meist nur eingeschraÌnkt anwenden, da sie das konventionelle Programmierungsmodel beibehalten und dadurch die erneute Entwicklung vom Programmierer des inkrementellen Aktualisierungsmechanismus (oder einen dynamischen Algorithmus) fuÌr jede Anwendung verlangen.
Diese Doktorarbeit stellt inkrementelle Parallele- und Verteiltesysteme vor, die es existierenden Real-World-Anwendungen ermoÌglichen vom Vorteil der inkre- mentellen Berechnung automatisch zu profitieren. Unser Ansatz erfordert weder eine Abkehr von gegenwaÌrtigen Programmiermodellen, noch Design und Implementierung von anwendungsspezifischen dynamischen Algorithmen.
Um dieses Ziel zu erreichen, haben wir die folgenden Systeme zur inkrementellen parallelen und verteilten Berechnung entworfen und implementiert: (i) Incoop â ein System fuÌr inkrementelle Map-Reduce-Programme; (ii) Shredder â ein GPU- beschleunigtes System zur inkrementellen Speicherung; (iii) Slider â eine Plat- tform zur Batch-basierten Streamverarbeitung via inkrementeller Sliding-Window- Berechnung; und (iv) iThreads â eine Threading-Bibliothek zur parallelen inkre- mentellen Berechnung. Unsere Erfahrungen mit diesen Systemen zeigen, dass unsere Methoden sehr gute Performanz liefern koÌnnen, und dies ohne weiteren Aufwand des Programmierers
Investigating tools and techniques for improving software performance on multiprocessor computer systems
The availability of modern commodity multicore processors and multiprocessor computer systems has resulted in the widespread adoption of parallel computers in a variety of environments, ranging from the home to workstation and server environments in particular. Unfortunately, parallel programming is harder and requires more expertise than the traditional sequential programming model. The variety of tools and parallel programming models available to the programmer further complicates the issue. The primary goal of this research was to identify and describe a selection of parallel programming tools and techniques to aid novice parallel programmers in the process of developing efficient parallel C/C++ programs for the Linux platform. This was achieved by highlighting and describing the key concepts and hardware factors that affect parallel programming, providing a brief survey of commonly available software development tools and parallel programming models and libraries, and presenting structured approaches to software performance tuning and parallel programming. Finally, the performance of several parallel programming models and libraries was investigated, along with the programming effort required to implement solutions using the respective models. A quantitative research methodology was applied to the investigation of the performance and programming effort associated with the selected parallel programming models and libraries, which included automatic parallelisation by the compiler, Boost Threads, Cilk Plus, OpenMP, POSIX threads (Pthreads), and Threading Building Blocks (TBB). Additionally, the performance of the GNU C/C++ and Intel C/C++ compilers was examined. The results revealed that the choice of parallel programming model or library is dependent on the type of problem being solved and that there is no overall best choice for all classes of problem. However, the results also indicate that parallel programming models with higher levels of abstraction require less programming effort and provide similar performance compared to explicit threading models. The principle conclusion was that the problem analysis and parallel design are an important factor in the selection of the parallel programming model and tools, but that models with higher levels of abstractions, such as OpenMP and Threading Building Blocks, are favoured
Portable TCP/IP server design
There are a number of known architectural patterns for TCP/IP server design. I present a survey of design choices based on some of the most common of these patterns. I have demonstrated, with working code samples, that most of these architectural patterns are readily portable between UNIX and Windows NT platforms without necessarily incurring significant performance penalties.ComputingM. Sc. (Computer Science
- âŠ