138 research outputs found
Scalable Storage for Digital Libraries
I propose a storage system optimised for digital libraries. Its key features are its heterogeneous scalability; its integration and exploitation of rich semantic metadata associated with digital objects; its use of a name space; and its aggressive performance optimisation in the digital library domain
Recommended from our members
Building Distributed Systems with Non-Volatile Main Memories and RDMA Networks
High-performance, byte-addressable non-volatile main memories (NVMMs) allow application developers to combine storage and memory into a single layer. These high-performance storage systems would be especially useful in large-scale data center environments where data is distributed and replicated across multiple servers.Unfortunately, existing approaches of providing remote storage access rest on the assumption that storage is slow, so the cost of the software and protocols is acceptable. Such assumption no longer holds for the fast NVMM. As a result, taking full advantage of NVMMs’ potential will require changes in system software and networking protocol. This thesis focuses on accessing remote NVMM efficiently using remote direct memory access (RDMA) network. RDMA enables a client to directly access memory on a remote machine without involving its local CPU.This thesis first presents Mojim, a system that provides replicated, reliable, and highly-available NVMM as an operating system service. Applications can access data in Mojim using normal load and store instructions while controlling when and how updates propagate to replicas using system calls. Our evaluation shows Mojim adds little overhead to the un-replicated system and provides 0.4x to 2.7x the throughput of the un-replicated system.This thesis then presents Orion, a distributed file system designed from for NVMM and RDMA networks. Traditional distributed file systems are designed for slower hard drives. These slower media incentivizes complex optimizations (e.g., queuing, striping, and batching) around disk accesses. Orion combines file system functions and network operations into a single layer. It provides low latency metadata accesses and outperforms existing distributed file systems by a large margin.Finally, an NVMM application can map files backed by an NVMM file system into its address space, and accesses them using CPU instructions. In this case, RDMA and NVMM file systems introduce duplication of effort on permissions, naming, and address translation. We introduce two changes to the existing RDMA protocol: the file memory region (FileMR) and range based address translation. By eliminating redundant translations, FileMR minimizes the number of translations done at the NIC, reducing the load on the NIC’s translation cache and resulting in application performance improvement by 1.8x - 2.0x
Recommended from our members
Building Reliable Software for Persistent Memory
Persistent memory (PMEM) technologies preserve data across power cycles and provide performance comparable to DRAM. In emerging computer systems, PMEM will operate on the main memory bus, becoming byte-addressable and cache-coherent. One key feature enabled by persistent memory is to allow software directly accessing durable data using the CPU’s load/store instructions, even from the user-space.However, building reliable software for persistent memory faces new challenges from two aspects: crash consistency and fault tolerance. Maintaining crash consistency requires the ability to recover data integrity in the event of system crashes. Using load/store instructions to access durable data introduces a new programming paradigm, that is prone to new types of programming errors. Fault tolerance involves detecting and recovering from persistent memory errors, including memory media errors and scribbles from software bugs. With direct access, file systems and user-space applications have to explicitly manage these errors, instead of relying on convenient functions from lower I/O stacks.We identify unique challenges in improving reliability for PMEM-based software and propose solutions. The thesis first introduces NOVA-Fortis, a fault-tolerant PMEM file system incorporating replication, checksums, and parity for protecting the file system’s metadata and the user’s file data. NOVA-Fortis is both fast and resilient in the face of corruption due to media errors and software bugs.NOVA-Fortis only protects file data via the read() and write() system calls. When an application memory-maps a PMEM file, NOVA-Fortis has to disable file data protection because mmap() leaves the file system unaware of updates made to the file. For protecting memory-mapped PMEM data, we present Pangolin, a fault-tolerant persistent object library to protect an application’s objects from persistent memory errors.Writing programs to ensure crash consistency in PMEM remains challenging. Recovery bugs arise as a new type of programming error, preventing a post-crash PMEM file from recovering to a consistent state. Thus, we design two debugging tools for persistent memory programming: PmemConjurer and PmemSanitizer. PmemConjurer is a static analyzer using symbolic execution to find recovery bugs without running a compiled program. PmemSanitizer contains compiler instrumentation and run-time recovery bug analysis, compensating PmemConjurer with multi-threading support and store reordering tests
How general-purpose can a GPU be?
The use of graphics processing units (GPUs) in general-purpose computation (GPGPU) is a growing field. GPU instruction sets, while implementing a graphics pipeline, draw from a range of single instruction multiple datastream (SIMD) architectures characteristic of the heyday of supercomputers. Yet only one of these SIMD instruction sets has been of application on a wide enough range of problems to survive the era when the full range of supercomputer design variants was being explored: vector instructions. Supercomputers covered a range of exotic designs such as hypercubes and the Connection Machine (Fox, 1989). The latter is likely the source of the snide comment by Cray: it had thousands of relatively low-speed CPUs (Tucker & Robertson, 1988). Since Cray won, why are we not basing our ideas on his designs (Cray Inc., 2004), rather than those of the losers? The Top 500 supercomputer list is dominated by general-purpose CPUs, and nothing like the Connection Machine that headed the list in 1993 still exists
Cautiously Optimistic Program Analyses for Secure and Reliable Software
Modern computer systems still have various security and reliability vulnerabilities. Well-known dynamic analyses solutions can mitigate them using runtime monitors that serve as lifeguards. But the additional work in enforcing these security and safety properties incurs exorbitant performance costs, and such tools are rarely used in practice. Our work addresses this problem by constructing a novel technique- Cautiously Optimistic Program Analysis (COPA).
COPA is optimistic- it infers likely program invariants from dynamic observations, and assumes them in its static reasoning to precisely identify and elide wasteful runtime monitors. The resulting system is fast, but also ensures soundness by recovering to a conservatively optimized analysis when a likely invariant rarely fails at runtime. COPA is also cautious- by carefully restricting optimizations to only safe elisions, the recovery is greatly simplified. It avoids unbounded rollbacks upon recovery, thereby enabling analysis for live production software.
We demonstrate the effectiveness of Cautiously Optimistic Program Analyses in three areas:
Information-Flow Tracking (IFT) can help prevent security breaches and information leaks. But they are rarely used in practice due to their high performance overhead (>500% for web/email servers). COPA dramatically reduces this cost by eliding wasteful IFT monitors to make it practical (9% overhead, 4x speedup).
Automatic Garbage Collection (GC) in managed languages (e.g. Java) simplifies programming tasks while ensuring memory safety. However, there is no correct GC for weakly-typed languages (e.g. C/C++), and manual memory management is prone to errors that have been exploited in high profile attacks. We develop the first sound GC for C/C++, and use COPA to optimize its performance (16% overhead).
Sequential Consistency (SC) provides intuitive semantics to concurrent programs that simplifies reasoning for their correctness. However, ensuring SC behavior on commodity hardware remains expensive. We use COPA to ensure SC for Java at the language-level efficiently, and significantly reduce its cost (from 24% down to 5% on x86).
COPA provides a way to realize strong software security, reliability and semantic guarantees at practical costs.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/170027/1/subarno_1.pd
Pretenuring for Java
Pretenuring is a technique for reducing copying costs in garbage collectors. When pretenuring, the allocator places long-lived objects into regions that the garbage collector will rarely, if ever, collect. We extend previous work on profiling-driven pretenuring as follows. (1) We develop a collector-neutral approach to obtaining object lifetime profile information. We show that our collection of Java programs exhibits a very high degree of homogeneity of object lifetimes at each allocation site. This result is robust with respect to different inputs, and is similar to previous work on ML, but is in contrast to C programs, which require dynamic call chain context information to extract homogeneous lifetimes. Call-site homogeneity considerably simplifies the implementation of pretenuring and makes it more efficient. (2) Our pretenuring advice is neutral with respect to the collector algorithm, and we use it to improve two quite different garbage collectors: a traditional generational collector and an older-first collector. The system is also novel because it classifies and allocates objects into 3 categories: we allocate immortal objects into a permanent region that the collector will never consider, long-lived objects into a region in which the collector placed survivors of the most recent collection, and shortlived objects into the nursery, i.e., the default region. (3) We evaluate pretenuring on Java programs. Our simulation results show that pretenuring significantly reduces collector copying for generational and older-first collectors. 1
Leveraging Gate-Level Properties to Identify Hardware Timing Channels
Abstract—Modern embedded computing systems such as med-ical devices, airplanes, and automobiles continue to dominate some of the most critical aspects of our lives. In such systems, the movement of information throughout a device must be tightly controlled to prevent violations of privacy or integrity. Unfortunately, bounding the flow of information can often present a significant challenge, as information can flow through channels that are difficult to detect, such as timing channels. As has been demonstrated by recent research in hardware security, information flow tracking techniques deployed at the hardware or gate level show promise at identifying these “timing flows ” but provide no formal statements about this claim nor mechanisms for separating out timing information from other types of flows. In this paper, we first prove that gate-level information flow tracking can in fact detect timing flows. In addition, we work to identify these timing flows separately from other flows by presenting a framework for identifying a different type of flow that we call functional flows. By using this framework to either confirm or rule out the existence of such flows, we leverage the previous work in hardware information flow tracking to effectively isolate timing flows. To show the effectiveness of this model, we demonstrate its usage on three practical examples: a shared bus (I2C), a cache in a MIPS-based processor, and an RSA encryption core, all of which were written in Verilog/VHDL and then simulated in a variety of scenarios. In each scenario, we demonstrate how our framework can be used to identify timing and functional flows and also analyze our model’s overhead
Formalizing Stack Safety as a Security Property
The term stack safety is used to describe a variety of compiler, runtime, and hardware mechanisms for protecting stack memory. Unlike “the heap,” the ISA-level stack does not correspond to a single high-level language concept: different compilers use it in different ways to support procedural and functional abstraction mechanisms from a wide range of languages. This protean nature makes it difficult to nail down what it means to correctly enforce stack safety
- …