183 research outputs found
Transcending POSIX: The End of an Era?
In this article, we provide a holistic view of the Portable Operating System Interface (POSIX) abstractions by a systematic review of their historical evolution. We discuss some of the key factors that drove the evolution and identify the pitfalls that make them infeasible when building modern applications.Peer reviewe
Improving I/O performance through an in-kernel disk simulator
This paper presents two mechanisms that can significantly improve the I/O performance of both hard and solid-state drives for read operations: KDSim and REDCAP. KDSim is an in-kernel disk simulator that provides a framework for simultaneously simulating the performance obtained by different I/O system mechanisms and algorithms, and for dynamically turning them on and off, or selecting between different options or policies, to improve the overall system performance. REDCAP is a RAM-based disk cache that effectively enlarges the built-in cache present in disk drives. By using KDSim, this cache is dynamically activated/deactivated according to the throughput achieved. Results show that, by using KDSim and REDCAP together, a system can improve its I/O performance up to 88% for workloads with some spatial locality on both hard and solid-state drives, while it achieves the same performance as a ‘regular system’ for workloads with random or sequential access patterns.Peer ReviewedPostprint (author's final draft
EFFECTIVE GROUPING FOR ENERGY AND PERFORMANCE: CONSTRUCTION OF ADAPTIVE, SUSTAINABLE, AND MAINTAINABLE DATA STORAGE
The performance gap between processors and storage systems has been increasingly critical overthe years. Yet the performance disparity remains, and further, storage energy consumption israpidly becoming a new critical problem. While smarter caching and predictive techniques domuch to alleviate this disparity, the problem persists, and data storage remains a growing contributorto latency and energy consumption.Attempts have been made at data layout maintenance, or intelligent physical placement ofdata, yet in practice, basic heuristics remain predominant. Problems that early studies soughtto solve via layout strategies were proven to be NP-Hard, and data layout maintenance todayremains more art than science. With unknown potential and a domain inherently full of uncertainty,layout maintenance persists as an area largely untapped by modern systems. But uncertainty inworkloads does not imply randomness; access patterns have exhibited repeatable, stable behavior.Predictive information can be gathered, analyzed, and exploited to improve data layouts. Ourgoal is a dynamic, robust, sustainable predictive engine, aimed at improving existing layouts byreplicating data at the storage device level.We present a comprehensive discussion of the design and construction of such a predictive engine,including workload evaluation, where we present and evaluate classical workloads as well asour own highly detailed traces collected over an extended period. We demonstrate significant gainsthrough an initial static grouping mechanism, and compare against an optimal grouping method ofour own construction, and further show significant improvement over competing techniques. We also explore and illustrate the challenges faced when moving from static to dynamic (i.e. online)grouping, and provide motivation and solutions for addressing these challenges. These challengesinclude metadata storage, appropriate predictive collocation, online performance, and physicalplacement. We reduced the metadata needed by several orders of magnitude, reducing the requiredvolume from more than 14% of total storage down to less than 12%. We also demonstrate how ourcollocation strategies outperform competing techniques. Finally, we present our complete modeland evaluate a prototype implementation against real hardware. This model was demonstrated tobe capable of reducing device-level accesses by up to 65%
Recommended from our members
Building Distributed Systems with Non-Volatile Main Memories and RDMA Networks
High-performance, byte-addressable non-volatile main memories (NVMMs) allow application developers to combine storage and memory into a single layer. These high-performance storage systems would be especially useful in large-scale data center environments where data is distributed and replicated across multiple servers.Unfortunately, existing approaches of providing remote storage access rest on the assumption that storage is slow, so the cost of the software and protocols is acceptable. Such assumption no longer holds for the fast NVMM. As a result, taking full advantage of NVMMs’ potential will require changes in system software and networking protocol. This thesis focuses on accessing remote NVMM efficiently using remote direct memory access (RDMA) network. RDMA enables a client to directly access memory on a remote machine without involving its local CPU.This thesis first presents Mojim, a system that provides replicated, reliable, and highly-available NVMM as an operating system service. Applications can access data in Mojim using normal load and store instructions while controlling when and how updates propagate to replicas using system calls. Our evaluation shows Mojim adds little overhead to the un-replicated system and provides 0.4x to 2.7x the throughput of the un-replicated system.This thesis then presents Orion, a distributed file system designed from for NVMM and RDMA networks. Traditional distributed file systems are designed for slower hard drives. These slower media incentivizes complex optimizations (e.g., queuing, striping, and batching) around disk accesses. Orion combines file system functions and network operations into a single layer. It provides low latency metadata accesses and outperforms existing distributed file systems by a large margin.Finally, an NVMM application can map files backed by an NVMM file system into its address space, and accesses them using CPU instructions. In this case, RDMA and NVMM file systems introduce duplication of effort on permissions, naming, and address translation. We introduce two changes to the existing RDMA protocol: the file memory region (FileMR) and range based address translation. By eliminating redundant translations, FileMR minimizes the number of translations done at the NIC, reducing the load on the NIC’s translation cache and resulting in application performance improvement by 1.8x - 2.0x
Study and optimization of the memory management in Memcached
Over the years the Internet has become more popular than ever and web applications
like Facebook and Twitter are gaining more users. This results in generation of more and
more data by the users which has to be efficiently managed, because access speed is an
important factor nowadays, a user will not wait no more than three seconds for a web
page to load before abandoning the site. In-memory key-value stores like Memcached
and Redis are used to speed up web applications by speeding up access to the data by
decreasing the number of accesses to the slower data storage’s. The first implementation
of Memcached, in the LiveJournal’s website, showed that by using 28 instances of Memcached
on ten unique hosts, caching the most popular 30GB of data can achieve a hit rate
around 92%, reducing the number of accesses to the database and reducing the response
time considerably.
Not all objects in cache take the same time to recompute, so this research is going to
study and present a new cost aware memory management that is easy to integrate in a
key-value store, with this approach being implemented in Memcached. The new memory
management and cache will give some priority to key-value pairs that take longer to be
recomputed. Instead of replacing Memcached’s replacement structure and its policy, we
simply add a new segment in each structure that is capable of storing the more costly
key-value pairs. Apart from this new segment in each replacement structure, we created
a new dynamic cost-aware rebalancing policy in Memcached, giving more memory to
store more costly key-value pairs.
With the implementations of our approaches, we were able to offer a prototype that
can be used to research the cost on the caching systems performance. In addition, we
were able to improve in certain scenarios the access latency of the user and the total
recomputation cost of the key-value stored in the system
Off, un nuevo enfoque en la construcción de Sistemas Operativos Distribuidos
La extensión alcanzada por las redes de ordenadores ha provocado la aparición y extensión de Sistemas
Operativos Distribuidos (SSOODD) [158], como alternativa de futuro a los Sistemas Operativos centralizados.
El enfoque más común para la construcción de Sistemas Operativos Distribuidos [158] consiste en la realización de servicios distribuidos sobre un μkernel centralizado encargado de operar en cada uno de los nodos de la red (μkernel servicios distribuidos). Esta estrategia entorpece la distribución del sistema y conduce a sistemas no adaptables debido a que los μkernels empleados suministran
abstracciones alejadas del hardware.
Por otro lado, los Sistemas Operativos Distribuidos todavÃa no han alcanzando niveles de transparencia,
flexibilidad, adaptabilidad y aprovechamiento de recursos comparables a los existentes en
Sistemas Centralizados [160]. Conviene pues explorar otras alternativas, en la construcción de dichos
sistemas, que permitan paliar esta situación.
Esta tesis propone un enfoque radicalmente diferente en la construcción de Sistemas Operativos
Distribuidos: distribuir el sistema justo desde el nivel inferior haciendo uso de un μkernel distribuido
soportando abstracciones próximas al hardware. Nuestro enfoque podrÃa resumirse con la frase
Construyamos Sistemas Operativos basados en un μkernel distribuido en lugar de construir
Sistemas Operativos Distribuidos basados en un μkernel.
Afirmamos que con el enfoque propuesto (μkernel distribuido adaptable servicios) se podrÃan
conseguir importantes ventajas [148] con respecto al enfoque habitual (μkernel servicios distribuidos):
más transparencia, mejor aprovechamiento de los recursos, sistemas m´as flexibles y mayores
cotas de adaptabilidad; sistemas más eficientes, fiables y escalables
Recommended from our members
Enhancing Usability and Explainability of Data Systems
The recent growth of data science expanded its reach to an ever-growing user base of nonexperts, increasing the need for usability, understandability, and explainability in these systems. Enhancing usability makes data systems accessible to people with different skills and backgrounds alike, leading to democratization of data systems. Furthermore, proper understanding of data and data-driven systems is necessary for the users to trust the function of the systems that learn from data. Finally, data systems should be transparent: when a data system behaves unexpectedly or malfunctions, the users deserve proper explanation of what caused the observed incident. Unfortunately, most existing data systems offer limited usability and support for explanations: these systems are usable only by experts with sound technical skills, and even expert users are hindered by the lack of transparency into the systems\u27 inner workings and functions. The aim of my thesis is to bridge the usability gap between nonexpert users and complex data systems, aid all sort of users, including the expert ones, in data and system understanding, and provide explanations that help reason about unexpected outcomes involving data systems. Specifically, my thesis has the following three goals: (1) enhancing usability of data systems for nonexperts, (2) enable data understanding that can assist users in a variety of tasks such as achieving trust in data-driven machine learning, gaining data understanding, and data cleaning, and (3) explaining causes of unexpected outcomes involving data and data systems.
For enhancing usability, we focus on example-driven user intent discovery. We develop systems based on example-driven interactions in two different settings: querying relational databases and personalized document summarization. Towards data understanding, we develop a new data-profiling primitive that can characterize tuples for which a machine-learned model is likely to produce untrustworthy predictions. We also develop an explanation framework to explain causes of such untrustworthy predictions. Additionally, this new data-profiling primitive enables interactive data cleaning. Finally, we develop two explanation frameworks, tailored to provide explanations in debugging data system components, including the data itself. The explanation frameworks focus on explaining the root cause of a concurrent application\u27s intermittent failure and exposing issues in the data that cause a data-driven system to malfunction
XML-aware data synchronization for mobile devices
In everyday life, and when using computer systems in particular, it is sometimes the case that a logical datum is replicated into multiple copies, such as when we send a document by electronic mail, or inform interested parties of a new address of residence. If the datum for some reason changes, we would then also like the changes to be reflected in the copies. The problem of keeping the copies up-to-date with respect to each other is studied under the heading of data synchronization.
In this thesis, we address data synchronization for mobile devices with limited energy resources and limited connectivity to the Internet, such as mobile phones. The importance of data synchronization is emphasized here, as it becomes infeasible to communicate continuously and in high volumes about the current state of each copy. The established conventions of the Internet and mobile computing environments on such matters as storage interfaces and data formats define an overall system architecture, into which we as seamlessly as possible want to incorporate our proposal. By focusing on interoperability we lower the threshold for utilizing our research in practice.
We present a comprehensive approach to data synchronization for mobile devices that is optimistic and state-based, and which targets opaque and XML files on a standard file system. We consider how to use the available connectivity in an economical manner, and so that existing sources of data on the Internet can be utilized. We focus on XML synchronization, where we identify an opportunity to utilize the structure of the data the format exposes. Specifically, we present an algorithm for merging concurrent changes to XML documents which supports subtree moves, an efficient heuristic algorithm for computing tree-level changes between two XML documents, and an overall architecture and algorithms to support the use of lazily instantiated XML documents. Our data synchronization approach is evaluated quantitatively in several experiments, as well as qualitatively by constructing applications that build on top of the approach. One of our applications is an editor that processes 1 GB XML files on a mobile phone
Faculty Publications and Creative Works 1997
One of the ways we recognize our faculty at the University of New Mexico is through this annual publication which highlights our faculty\u27s scholarly and creative activities and achievements and serves as a compendium of UNM faculty efforts during the 1997 calendar year. Faculty Publications and Creative Works strives to illustrate the depth and breadth of research activities performed throughout our University\u27s laboratories, studios and classrooms. We believe that the communication of individual research is a significant method of sharing concepts and thoughts and ultimately inspiring the birth of new of ideas. In support of this, UNM faculty during 1997 produced over 2,770 works, including 2,398 scholarly papers and articles, 72 books, 63 book chapters, 82 reviews, 151 creative works and 4 patents. We are proud of the accomplishments of our faculty which are in part reflected in this book, which illustrates the diversity of intellectual pursuits in support of research and education at the University of New Mexico. Nasir Ahmed Interim Associate Provost for Research and Dean of Graduate Studie
- …