138,931 research outputs found
On I/O Performance and Cost Efficiency of Cloud Storage: A Client\u27s Perspective
Cloud storage has gained increasing popularity in the past few years. In cloud storage, data are stored in the service providerâs data centers; users access data via the network and pay the fees based on the service usage. For such a new storage model, our prior wisdom and optimization schemes on conventional storage may not remain valid nor applicable to the emerging cloud storage.
In this dissertation, we focus on understanding and optimizing the I/O performance and cost efficiency of cloud storage from a clientâs perspective. We first conduct a comprehensive study to gain insight into the I/O performance behaviors of cloud storage from the client side. Through extensive experiments, we have obtained several critical findings and useful implications for system optimization. We then design a client cache framework, called Pacaca, to further improve end-to-end performance of cloud storage. Pacaca seamlessly integrates parallelized prefetching and cost-aware caching by utilizing the parallelism potential and object correlations of cloud storage. In addition to improving system performance, we have also made efforts to reduce the monetary cost of using cloud storage services by proposing a latency- and cost-aware client caching scheme, called GDS-LC, which can achieve two optimization goals for using cloud storage services: low access latency and low monetary cost. Our experimental results show that our proposed client-side solutions significantly outperform traditional methods. Our study contributes to inspiring the community to reconsider system optimization methods in the cloud environment, especially for the purpose of integrating cloud storage into the current storage stack as a primary storage layer
Dynamic re-optimization techniques for stream processing engines and object stores
Large scale data storage and processing systems are strongly motivated by the need to store and analyze massive datasets. The complexity of a large class of these systems is rooted in their distributed nature, extreme scale, need for real-time response, and streaming nature. The use of these systems on multi-tenant, cloud environments with potential resource interference necessitates fine-grained monitoring and control. In this dissertation, we present efficient, dynamic techniques for re-optimizing stream-processing systems and transactional object-storage systems.^ In the context of stream-processing systems, we present VAYU, a per-topology controller. VAYU uses novel methods and protocols for dynamic, network-aware tuple-routing in the dataflow. We show that the feedback-driven controller in VAYU helps achieve high pipeline throughput over long execution periods, as it dynamically detects and diagnoses any pipeline-bottlenecks. We present novel heuristics to optimize overlays for group communication operations in the streaming model.^ In the context of object-storage systems, we present M-Lock, a novel lock-localization service for distributed transaction protocols on scale-out object stores to increase transaction throughput. Lock localization refers to dynamic migration and partitioning of locks across nodes in the scale-out store to reduce cross-partition acquisition of locks. The service leverages the observed object-access patterns to achieve lock-clustering and deliver high performance. We also present TransMR, a framework that uses distributed, transactional object stores to orchestrate and execute asynchronous components in amorphous data-parallel applications on scale-out architectures
A Peer-to-Peer Middleware Framework for Resilient Persistent Programming
The persistent programming systems of the 1980s offered a programming model
that integrated computation and long-term storage. In these systems, reliable
applications could be engineered without requiring the programmer to write
translation code to manage the transfer of data to and from non-volatile
storage. More importantly, it simplified the programmer's conceptual model of
an application, and avoided the many coherency problems that result from
multiple cached copies of the same information. Although technically
innovative, persistent languages were not widely adopted, perhaps due in part
to their closed-world model. Each persistent store was located on a single
host, and there were no flexible mechanisms for communication or transfer of
data between separate stores. Here we re-open the work on persistence and
combine it with modern peer-to-peer techniques in order to provide support for
orthogonal persistence in resilient and potentially long-running distributed
applications. Our vision is of an infrastructure within which an application
can be developed and distributed with minimal modification, whereupon the
application becomes resilient to certain failure modes. If a node, or the
connection to it, fails during execution of the application, the objects are
re-instantiated from distributed replicas, without their reference holders
being aware of the failure. Furthermore, we believe that this can be achieved
within a spectrum of application programmer intervention, ranging from minimal
to totally prescriptive, as desired. The same mechanisms encompass an
orthogonally persistent programming model. We outline our approach to
implementing this vision, and describe current progress.Comment: Submitted to EuroSys 200
Context Aware Computing for The Internet of Things: A Survey
As we are moving towards the Internet of Things (IoT), the number of sensors
deployed around the world is growing at a rapid pace. Market research has shown
a significant growth of sensor deployments over the past decade and has
predicted a significant increment of the growth rate in the future. These
sensors continuously generate enormous amounts of data. However, in order to
add value to raw sensor data we need to understand it. Collection, modelling,
reasoning, and distribution of context in relation to sensor data plays
critical role in this challenge. Context-aware computing has proven to be
successful in understanding sensor data. In this paper, we survey context
awareness from an IoT perspective. We present the necessary background by
introducing the IoT paradigm and context-aware fundamentals at the beginning.
Then we provide an in-depth analysis of context life cycle. We evaluate a
subset of projects (50) which represent the majority of research and commercial
solutions proposed in the field of context-aware computing conducted over the
last decade (2001-2011) based on our own taxonomy. Finally, based on our
evaluation, we highlight the lessons to be learnt from the past and some
possible directions for future research. The survey addresses a broad range of
techniques, methods, models, functionalities, systems, applications, and
middleware solutions related to context awareness and IoT. Our goal is not only
to analyse, compare and consolidate past research work but also to appreciate
their findings and discuss their applicability towards the IoT.Comment: IEEE Communications Surveys & Tutorials Journal, 201
The simplicity project: easing the burden of using complex and heterogeneous ICT devices and services
As of today, to exploit the variety of different "services", users need to configure each of their devices by using different procedures and need to explicitly select among heterogeneous access technologies and protocols. In addition to that, users are authenticated and charged by different means. The lack of implicit human computer interaction, context-awareness and standardisation places an enormous burden of complexity on the shoulders of the final users. The IST-Simplicity project aims at leveraging such problems by: i) automatically creating and customizing a user communication space; ii) adapting services to user terminal characteristics and to users preferences; iii) orchestrating network capabilities. The aim of this paper is to present the technical framework of the IST-Simplicity project. This paper is a thorough analysis and qualitative evaluation of the different technologies, standards and works presented in the literature related to the Simplicity system to be developed
On Achieving Diversity in the Presence of Outliers in Participatory Camera Sensor Networks
This paper addresses the problem of collection and
delivery of a representative subset of pictures, in participatory camera networks, to maximize coverage when a significant portion of the pictures may be redundant or irrelevant. Consider, for example, a rescue mission where volunteers and survivors of a large-scale disaster scout a wide area to capture pictures of
damage in distressed neighborhoods, using handheld cameras, and report them to a rescue station. In this participatory camera network, a significant amount of pictures may be redundant (i.e., similar pictures may be reported by many) or irrelevant (i.e., may
not document an event of interest). Given this pool of pictures, we aim to build a protocol to store and deliver a smaller subset of pictures, among all those taken, that minimizes redundancy and eliminates irrelevant objects and outliers. While previous work addressed removal of redundancy alone, doing so in the presence of outliers is tricky, because outliers, by their very nature, are different from other objects, causing redundancy minimizing algorithms to favor their inclusion, which is at odds with the goal of finding a representative subset. To eliminate both outliers and redundancy at the same time, two seemingly opposite objectives must be met together. The contribution of this
paper lies in a new prioritization technique (and its in-network
implementation) that minimizes redundancy among delivered
pictures, while also reducing outliers.unpublishedis peer reviewe
- âŠ