3,624 research outputs found
Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments
Data centres that use consumer-grade disks drives and distributed
peer-to-peer systems are unreliable environments to archive data without enough
redundancy. Most redundancy schemes are not completely effective for providing
high availability, durability and integrity in the long-term. We propose alpha
entanglement codes, a mechanism that creates a virtual layer of highly
interconnected storage devices to propagate redundant information across a
large scale storage system. Our motivation is to design flexible and practical
erasure codes with high fault-tolerance to improve data durability and
availability even in catastrophic scenarios. By flexible and practical, we mean
code settings that can be adapted to future requirements and practical
implementations with reasonable trade-offs between security, resource usage and
performance. The codes have three parameters. Alpha increases storage overhead
linearly but increases the possible paths to recover data exponentially. Two
other parameters increase fault-tolerance even further without the need of
additional storage. As a result, an entangled storage system can provide high
availability, durability and offer additional integrity: it is more difficult
to modify data undetectably. We evaluate how several redundancy schemes perform
in unreliable environments and show that alpha entanglement codes are flexible
and practical codes. Remarkably, they excel at code locality, hence, they
reduce repair costs and become less dependent on storage locations with poor
availability. Our solution outperforms Reed-Solomon codes in many disaster
recovery scenarios.Comment: The publication has 12 pages and 13 figures. This work was partially
supported by Swiss National Science Foundation SNSF Doc.Mobility 162014, 2018
48th Annual IEEE/IFIP International Conference on Dependable Systems and
Networks (DSN
The Making of Cloud Applications An Empirical Study on Software Development for the Cloud
Cloud computing is gaining more and more traction as a deployment and
provisioning model for software. While a large body of research already covers
how to optimally operate a cloud system, we still lack insights into how
professional software engineers actually use clouds, and how the cloud impacts
development practices. This paper reports on the first systematic study on how
software developers build applications in the cloud. We conducted a
mixed-method study, consisting of qualitative interviews of 25 professional
developers and a quantitative survey with 294 responses. Our results show that
adopting the cloud has a profound impact throughout the software development
process, as well as on how developers utilize tools and data in their daily
work. Among other things, we found that (1) developers need better means to
anticipate runtime problems and rigorously define metrics for improved fault
localization and (2) the cloud offers an abundance of operational data,
however, developers still often rely on their experience and intuition rather
than utilizing metrics. From our findings, we extracted a set of guidelines for
cloud development and identified challenges for researchers and tool vendors
SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine
Client-side logic and storage are increasingly used in web and mobile
applications to improve response time and availability. Current approaches tend
to be ad-hoc and poorly integrated with the server-side logic. We present a
principled approach to integrate client- and server-side storage. We support
mergeable and strongly consistent transactions that target either client or
server replicas and provide access to causally-consistent snapshots
efficiently. In the presence of infrastructure faults, a client-assisted
failover solution allows client execution to resume immediately and seamlessly
access consistent snapshots without waiting. We implement this approach in
SwiftCloud, the first transactional system to bring geo-replication all the way
to the client machine. Example applications show that our programming model is
useful across a range of application areas. Our experimental evaluation shows
that SwiftCloud provides better fault tolerance and at the same time can
improve both latency and throughput by up to an order of magnitude, compared to
classical geo-replication techniques
Stocator: A High Performance Object Store Connector for Spark
We present Stocator, a high performance object store connector for Apache
Spark, that takes advantage of object store semantics. Previous connectors have
assumed file system semantics, in particular, achieving fault tolerance and
allowing speculative execution by creating temporary files to avoid
interference between worker threads executing the same task and then renaming
these files. Rename is not a native object store operation; not only is it not
atomic, but it is implemented using a costly copy operation and a delete.
Instead our connector leverages the inherent atomicity of object creation, and
by avoiding the rename paradigm it greatly decreases the number of operations
on the object store as well as enabling a much simpler approach to dealing with
the eventually consistent semantics typical of object stores. We have
implemented Stocator and shared it in open source. Performance testing shows
that it is as much as 18 times faster for write intensive workloads and
performs as much as 30 times fewer operations on the object store than the
legacy Hadoop connectors, reducing costs both for the client and the object
storage service provider
- …