2,901 research outputs found
EbbRT: a framework for building per-application library operating systems
Efficient use of high speed hardware requires operating system components be customized to the application work- load. Our general purpose operating systems are ill-suited for this task. We present EbbRT, a framework for constructing per-application library operating systems for cloud applications. The primary objective of EbbRT is to enable high-performance in a tractable and maintainable fashion. This paper describes the design and implementation of EbbRT, and evaluates its ability to improve the performance of common cloud applications. The evaluation of the EbbRT prototype demonstrates memcached, run within a VM, can outperform memcached run on an unvirtualized Linux. The prototype evaluation also demonstrates an 14% performance improvement of a V8 JavaScript engine benchmark, and a node.js webserver that achieves a 50% reduction in 99th percentile latency compared to it run on Linux
EbbRT: a customizable operating system for cloud applications
Efficient use of hardware requires operating system components be customized to the application workload. Our general purpose operating systems are ill-suited for this task. We present Genesis, a new operating system that enables per-application customizations for cloud applications. Genesis achieves this through a novel heterogeneous distributed structure, a partitioned object model, and an event-driven execution environment. This paper describes the design and prototype implementation of Genesis, and evaluates its ability to improve the performance of common cloud applications. The evaluation of the Genesis prototype demonstrates memcached, run within a VM, can outperform memcached run on an unvirtualized Linux. The prototype evaluation also demonstrates an 14% performance improvement of a V8 JavaScript engine benchmark, and a node.js webserver that achieves a 50% reduction in 99th percentile latency compared to it run on Linux
Emulating and evaluating hybrid memory for managed languages on NUMA hardware
Non-volatile memory (NVM) has the potential to become a mainstream memory technology and challenge DRAM. Researchers evaluating the speed, endurance, and abstractions of hybrid memories with DRAM and NVM typically use simulation, making it easy to evaluate the impact of different hardware technologies and parameters. Simulation is, however, extremely slow, limiting the applications and datasets in the evaluation. Simulation also precludes critical workloads, especially those written in managed languages such as Java and C#. Good methodology embraces a variety of techniques for evaluating new ideas, expanding the experimental scope, and uncovering new insights.
This paper introduces a platform to emulate hybrid memory for managed languages using commodity NUMA servers. Emulation complements simulation but offers richer software experimentation. We use a thread-local socket to emulate DRAM and a remote socket to emulate NVM. We use standard C library routines to allocate heap memory on the DRAM and NVM sockets for use with explicit memory management or garbage collection. We evaluate the emulator using various configurations of write-rationing garbage collectors that improve NVM lifetimes by limiting writes to NVM, using 15 applications and various datasets and workload configurations. We show emulation and simulation confirm each other's trends in terms of writes to NVM for different software configurations, increasing our confidence in predicting future system effects. Emulation brings novel insights, such as the non-linear effects of multi-programmed workloads on NVM writes, and that Java applications write significantly more than their C++ equivalents. We make our software infrastructure publicly available to advance the evaluation of novel memory management schemes on hybrid memories
CUP: Comprehensive User-Space Protection for C/C++
Memory corruption vulnerabilities in C/C++ applications enable attackers to
execute code, change data, and leak information. Current memory sanitizers do
no provide comprehensive coverage of a program's data. In particular, existing
tools focus primarily on heap allocations with limited support for stack
allocations and globals. Additionally, existing tools focus on the main
executable with limited support for system libraries. Further, they suffer from
both false positives and false negatives.
We present Comprehensive User-Space Protection for C/C++, CUP, an LLVM
sanitizer that provides complete spatial and probabilistic temporal memory
safety for C/C++ program on 64-bit architectures (with a prototype
implementation for x86_64). CUP uses a hybrid metadata scheme that supports all
program data including globals, heap, or stack and maintains the ABI. Compared
to existing approaches with the NIST Juliet test suite, CUP reduces false
negatives by 10x (0.1%) compared to the state of the art LLVM sanitizers, and
produces no false positives. CUP instruments all user-space code, including
libc and other system libraries, removing them from the trusted code base
Large-Scale Distributed Internet-based Discovery Mechanism for Dynamic Spectrum Allocation
Scarcity of frequencies and the demand for more bandwidth is likely to
increase the need for devices that utilize the available frequencies more
efficiently. Radios must be able to dynamically find other users of the
frequency bands and adapt so that they are not interfered, even if they use
different radio protocols. As transmitters far away may cause as much
interference as a transmitter located nearby, this mechanism can not be based
on location alone. Central databases can be used for this purpose, but require
expensive infrastructure and planning to scale. In this paper, we propose a
decentralized protocol and architecture for discovering radio devices over the
Internet. The protocol has low resource requirements, making it suitable for
implementation on limited platforms. We evaluate the protocol through
simulation in network topologies with up to 2.3 million nodes, including
topologies generated from population patterns in Norway. The protocol has also
been implemented as proof-of-concept in real Wi-Fi routers.Comment: Accepted for publication at IEEE DySPAN 201
On the Impact of Memory Allocation on High-Performance Query Processing
Somewhat surprisingly, the behavior of analytical query engines is crucially
affected by the dynamic memory allocator used. Memory allocators highly
influence performance, scalability, memory efficiency and memory fairness to
other processes. In this work, we provide the first comprehensive experimental
analysis on the impact of memory allocation for high-performance query engines.
We test five state-of-the-art dynamic memory allocators and discuss their
strengths and weaknesses within our DBMS. The right allocator can increase the
performance of TPC-DS (SF 100) by 2.7x on a 4-socket Intel Xeon server
Picking a CHERI Allocator: Security and Performance Considerations
Several open-source memory allocators have been ported to CHERI, a hardware
capability platform. In this paper we examine the security and performance of
these allocators when run under CheriBSD on Arm's experimental Morello
platform. We introduce a number of security attacks and show that all but one
allocator are vulnerable to some of the attacks - including the default
CheriBSD allocator. We then show that while some forms of allocator performance
are meaningful, comparing the performance of hybrid and pure capability (i.e.
'running in non-CHERI vs. running in CHERI modes') allocators does not appear
to be meaningful. Although we do not fully understand the reasons for this, it
seems to be at least as much due to factors such as immature compiler
toolchains as it is due to the effects of capabilities on hardware
- …