Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads
LSM-trees are widely adopted as the storage backend of key-value stores.
However, optimizing the system performance under dynamic workloads has not been
sufficiently studied or evaluated in previous work. To fill the gap, we present
RusKey, a key-value store with the following new features: (1) RusKey is a
first attempt to orchestrate LSM-tree structures online to enable robust
performance under the context of dynamic workloads; (2) RusKey is the first
study to use Reinforcement Learning (RL) to guide LSM-tree transformations; (3)
RusKey includes a new LSM-tree design, named FLSM-tree, for an efficient
transition between different compaction policies -- the bottleneck of dynamic
key-value stores. We justify the superiority of the new design with theoretical
analysis; (4) RusKey requires no prior workload knowledge for system
adjustment, in contrast to state-of-the-art techniques. Experiments show that
RusKey exhibits strong performance robustness in diverse workloads, achieving
up to 4x better end-to-end performance than the RocksDB system under various
settings. Comment: 25 pages, 13 figures
Detailed empirical studies of student information storing in the context of distributed design team-based project work
This paper presents the findings of six empirical case studies investigating the information stored by engineering design students in distributed team-based Global Design Projects. The aim is to understand better how students store distributed design information in order to prepare them for work in today's international and global context. This paper outlines the descriptive element of the work, the qualitative and quantitative research methods used, and the results. It discusses the issues around the emergent themes of information storing, information storing systems, information storing patterns, and information strategy; makes recommendations; and establishes the need for more prescriptive measures to support distributed design information management. This work will also be of great value to industry.
Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization
Protecting vast quantities of data poses a daunting challenge for the growing
number of organizations that collect, stockpile, and monetize it. The ability
to distinguish data that is actually needed from data collected "just in case"
would help these organizations to limit the latter's exposure to attack. A
natural approach might be to monitor data use and retain only the working-set
of in-use data in accessible storage; unused data can be evicted to a highly
protected store. However, many of today's big data applications rely on machine
learning (ML) workloads that are periodically retrained by accessing, and thus
exposing to attack, the entire data store. Training set minimization methods,
such as count featurization, are often used to limit the data needed to train
ML workloads to improve performance or scalability. We present Pyramid, a
limited-exposure data management system that builds upon count featurization to
enhance data protection. As such, Pyramid uniquely introduces both the idea and
proof-of-concept for leveraging training set minimization methods to instill
rigor and selectivity into big data management. We integrated Pyramid into
Spark Velox, a framework for ML-based targeting and personalization. We
evaluate it on three applications and show that Pyramid approaches
state-of-the-art models while training on less than 1% of the raw data.
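To make the count featurization idea in the abstract above concrete, here is a minimal Python sketch. It is not Pyramid's implementation: the `CountTable` class, its method names, and the binary-label setup are assumptions made for illustration. The sketch replaces a raw categorical value with the number of times it was observed and the fraction of those observations that carried a positive label, so a model can train on these compact statistics instead of the full raw history.

```python
from collections import defaultdict


class CountTable:
    """Hypothetical per-feature table of label counts (illustrative only)."""

    def __init__(self):
        # Maps a feature value to [negative_count, positive_count].
        self.counts = defaultdict(lambda: [0, 0])

    def update(self, value, label):
        """Record one historical observation; label must be 0 or 1."""
        self.counts[value][label] += 1

    def featurize(self, value):
        """Return [total observations, positive rate] for a feature value."""
        neg, pos = self.counts.get(value, (0, 0))
        total = neg + pos
        rate = pos / total if total else 0.0
        return [total, rate]


# Build counts from historical (value, label) pairs, then featurize new values.
table = CountTable()
for value, label in [("ad_1", 1), ("ad_1", 0), ("ad_1", 1), ("ad_2", 0)]:
    table.update(value, label)

features_seen = table.featurize("ad_1")
features_unseen = table.featurize("ad_9")
```

Only the count table needs to stay in accessible storage under this scheme; the raw historical records behind it can be moved elsewhere, which is the property the abstract relies on.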
Self-management for large-scale distributed systems
Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems. This research was motivated by the increasing complexity of computing systems and their management.
In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed the following four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. We present results of our research on addressing these challenges. Niche implements the autonomic computing architecture, proposed by IBM, in a fully decentralized way. Niche supports a network-transparent view of the system architecture, simplifying the design of distributed self-management. Niche provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers.
In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time-consuming and error-prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach is based on replicating a management element using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store supporting multiple consistency levels that is based on a peer-to-peer network. The store enables a tradeoff between high availability and data consistency. Using a majority avoids potential drawbacks of master-based consistency control, namely a single point of failure and a potential performance bottleneck.
In the third part, we investigate self-management for Cloud-based storage systems with a focus on elasticity control using elements of control theory and machine learning. We have conducted research on a number of different designs of an elasticity controller, including a State-Space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using a state-space model that makes it possible to trade off performance for cost. We describe the steps in designing an elasticity controller. We continue by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control
- …
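The majority-based key-value store mentioned in the thesis abstract above can be illustrated with a small Python sketch. This is a generic majority-quorum scheme, not the thesis's algorithm: the `Replica` and `MajorityStore` classes, the version counter, and the `down` parameter that simulates unreachable replicas are all assumptions made for illustration. Every read and write contacts a majority of replicas, so any two operations overlap in at least one replica and no single master is needed.

```python
class Replica:
    """Hypothetical in-memory replica holding versioned values."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def read(self, key):
        return self.data.get(key, (0, None))

    def write(self, key, version, value):
        # Keep only the newest version seen for each key.
        if version > self.read(key)[0]:
            self.data[key] = (version, value)


class MajorityStore:
    """Illustrative client that reads and writes through a majority quorum."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.quorum = len(replicas) // 2 + 1

    def _reachable(self, down):
        # 'down' holds indexes of replicas simulated as unreachable.
        alive = [r for i, r in enumerate(self.replicas) if i not in down]
        if len(alive) < self.quorum:
            raise RuntimeError("no majority reachable")
        return alive[: self.quorum]

    def put(self, key, value, down=()):
        quorum = self._reachable(down)
        # Pick a version newer than anything the quorum has seen.
        version = max(r.read(key)[0] for r in quorum) + 1
        for replica in quorum:
            replica.write(key, version, value)

    def get(self, key, down=()):
        quorum = self._reachable(down)
        # The highest version in any majority includes the latest write.
        return max((r.read(key) for r in quorum), key=lambda pair: pair[0])[1]


store = MajorityStore([Replica() for _ in range(3)])
store.put("k", "v1", down={2})      # replica 2 misses the first write
first = store.get("k", down={0})    # still sees v1 through replica 1
store.put("k", "v2", down={0})      # replica 0 misses the second write
second = store.get("k", down={1})   # sees v2 through replica 2
```

The sketch ignores concurrent writers, which could choose the same version number; a real store would need a tie-breaker such as a writer identifier, and weaker consistency levels could be modelled by contacting fewer replicas on reads.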