Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges given the variety of application areas and domains that this technology promises to serve. Typically, the fundamental design decisions in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution for a specific real-world problem, big data systems are no exception. As far as the storage aspect of any big data system is concerned, the primary facet is the storage infrastructure, and NoSQL appears to be the technology that best fulfills its requirements. However, every big data application has different data characteristics, and thus its data fits a different data model. This paper presents a feature and use-case analysis and comparison of the four main data models, namely document-oriented, key-value, graph, and wide-column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria that a developer must consider when making a choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings, and possible use cases of the available big data file formats for Hadoop, which is the foundation of most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.
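To make the data-model distinction concrete, here is a minimal sketch (using a hypothetical user record, not an example from the paper) of how the same entity maps onto the four data models compared above; which representation fits best depends on the application's access patterns:

```python
# Hypothetical example: one user record expressed under the four NoSQL data models.

# Key-value: an opaque value behind a key; fast lookups, no server-side querying of fields.
kv_store = {"user:42": '{"name": "Ada", "follows": [7, 9]}'}

# Document-oriented: the same record as a nested, queryable document.
document = {"_id": 42, "name": "Ada", "follows": [7, 9], "prefs": {"theme": "dark"}}

# Wide-column: a row key plus column families, with sparse columns per row.
wide_column = {
    "row_key": "42",
    "profile": {"name": "Ada"},
    "social": {"follows:7": "", "follows:9": ""},
}

# Graph: entities as nodes and relationships as first-class edges.
graph_edges = [("user:42", "FOLLOWS", "user:7"), ("user:42", "FOLLOWS", "user:9")]
```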
Towards Scaling Blockchain Systems via Sharding
Existing blockchain systems scale poorly because of their distributed
consensus protocols. Current attempts at improving blockchain scalability are
limited to cryptocurrency. Scaling blockchain systems under general workloads
(i.e., non-cryptocurrency applications) remains an open question. In this work,
we take a principled approach to apply sharding, which is a well-studied and
proven technique to scale out databases, to blockchain systems in order to
improve their transaction throughput at scale. This is challenging, however,
due to the fundamental difference in failure models between databases and
blockchain. To achieve our goal, we first enhance the performance of Byzantine consensus protocols, thereby improving the throughput of individual shards.
Next, we design an efficient shard formation protocol that leverages a trusted
random beacon to securely assign nodes to shards. We rely on trusted hardware, namely Intel SGX, to achieve high performance for both the consensus and shard formation protocols. Third, we design a general distributed transaction
protocol that ensures safety and liveness even when transaction coordinators
are malicious. Finally, we conduct an extensive evaluation of our design both
on a local cluster and on Google Cloud Platform. The results show that our
consensus and shard formation protocols outperform state-of-the-art solutions
at scale. More importantly, our sharded blockchain reaches a high throughput
that can handle Visa-level workloads, and is the largest ever reported in a
realistic environment.
Comment: This is an updated version of the "Chain of Trust: Can Trusted Hardware Help Scaling Blockchains?" paper. This version is to appear in SIGMOD 2019.
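As a rough illustration of how a random beacon can drive unbiased shard assignment (a minimal sketch under assumed parameters; the paper's actual shard formation protocol additionally relies on Intel SGX and is more involved):

```python
import hashlib

def assign_shards(node_ids, beacon, num_shards):
    """Assign each node to a shard by hashing its ID together with an
    epoch-specific random beacon value, so no node can choose its own shard."""
    shards = {s: [] for s in range(num_shards)}
    for node_id in node_ids:
        digest = hashlib.sha256(f"{beacon}:{node_id}".encode()).digest()
        shard = int.from_bytes(digest[:8], "big") % num_shards
        shards[shard].append(node_id)
    return shards

# Example: 8 nodes, a fresh beacon value for this epoch, 2 shards.
print(assign_shards([f"node{i}" for i in range(8)], beacon="epoch-42-rand", num_shards=2))
```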
Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
Deep learning recommendation models (DLRMs) are used across many
business-critical services at Facebook and are the single largest AI
application in terms of infrastructure demand in its data centers. In this
paper we discuss the SW/HW co-designed solution for high-performance
distributed training of large-scale DLRMs. We introduce a high-performance
scalable software stack based on PyTorch and pair it with the new evolution of the Zion platform, namely ZionEX. We demonstrate the capability to train very large DLRMs with up to 12 trillion parameters and show that we can attain a 40X speedup
in terms of time to solution over previous systems. We achieve this by (i) designing the ZionEX platform with a dedicated scale-out network, provisioned with high bandwidth, optimal topology, and efficient transport; (ii) implementing an optimized PyTorch-based training stack supporting both model and data parallelism; (iii) developing sharding algorithms capable of hierarchically partitioning the embedding tables along the row and column dimensions and load balancing them across multiple workers; (iv) adding high-performance core operators while retaining the flexibility to support optimizers with fully deterministic updates; and (v) leveraging reduced-precision communications, a multi-level memory hierarchy (HBM+DDR+SSD), and pipelining. Furthermore, we
develop and briefly comment on distributed data ingestion and other supporting
services that are required for robust and efficient end-to-end training in production environments.
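As a rough sketch of hierarchical embedding-table partitioning along row and column dimensions (the shard counts, round-robin placement, and helper function below are assumptions for illustration, not the paper's load-balancing algorithm):

```python
# Hypothetical sketch: split an embedding table across workers by rows
# (contiguous ID ranges) and by columns (slices of the embedding dimension),
# then place the resulting shards round-robin across workers to spread load.
def shard_embedding_table(num_rows, emb_dim, row_shards, col_shards, num_workers):
    placements = []  # list of (row_range, col_range, worker)
    shard_id = 0
    rows_per = (num_rows + row_shards - 1) // row_shards
    cols_per = (emb_dim + col_shards - 1) // col_shards
    for r in range(row_shards):
        for c in range(col_shards):
            row_range = (r * rows_per, min((r + 1) * rows_per, num_rows))
            col_range = (c * cols_per, min((c + 1) * cols_per, emb_dim))
            placements.append((row_range, col_range, shard_id % num_workers))
            shard_id += 1
    return placements

# Example: a 1M x 128 table split into 4 row shards x 2 column shards over 8 workers.
for rows, cols, worker in shard_embedding_table(1_000_000, 128, 4, 2, 8):
    print(f"rows {rows}, cols {cols} -> worker {worker}")
```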
State-Compute Replication: Parallelizing High-Speed Stateful Packet Processing
With the slowdown of Moore's law, CPU-oriented packet processing in software
will be significantly outpaced by emerging line speeds of network interface
cards (NICs). Single-core packet-processing throughput has saturated.
We consider the problem of high-speed packet processing with multiple CPU
cores. The key challenge is state: memory that multiple packets must read and
update. The prevailing method to scale throughput with multiple cores involves state sharding: processing all packets that update the same state (i.e., the same flow) at the same core. However, given the heavy-tailed nature of realistic flow size distributions, this method will be untenable in the near future, since total throughput is severely limited by single-core performance.
This paper introduces state-compute replication, a principle to scale the
throughput of a single stateful flow across multiple cores using replication.
Our design leverages a packet history sequencer running on a NIC or
top-of-the-rack switch to enable multiple cores to update state without
explicit synchronization. Our experiments with realistic data center and
wide-area Internet traces show that state-compute replication can scale total
packet-processing throughput linearly with cores, deterministically and
independent of flow size distributions, across a range of realistic
packet-processing programs.
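A toy sketch of the state-compute replication principle (the Sequencer and Core classes and the byte-counter state below are hypothetical; the paper's sequencer runs on a NIC or top-of-rack switch and is far more general):

```python
from collections import defaultdict

class Sequencer:
    """Hypothetical packet-history sequencer: records per-flow packet history
    and hands each core the updates that core has not yet applied."""
    def __init__(self, num_cores):
        self.history = defaultdict(list)                      # flow -> packets seen so far
        self.applied = defaultdict(lambda: [0] * num_cores)   # flow -> per-core replay cursor

    def dispatch(self, flow, pkt, core):
        self.history[flow].append(pkt)
        start = self.applied[flow][core]
        delta = self.history[flow][start:]                    # packets this core has missed
        self.applied[flow][core] = len(self.history[flow])
        return delta

class Core:
    """Each core keeps a private replica of per-flow state (here, a byte counter)
    and catches it up by replaying the missed history, without shared locks."""
    def __init__(self):
        self.state = defaultdict(int)

    def process(self, flow, delta):
        for pkt in delta:
            self.state[flow] += pkt["len"]                    # replay the missed updates
        return self.state[flow]

# Example: packets of one heavy flow spread across two cores; both replicas stay consistent.
seq, cores = Sequencer(num_cores=2), [Core(), Core()]
for i, core_id in enumerate([0, 1, 1, 0]):
    delta = seq.dispatch("flowA", {"len": 100 + i}, core_id)
    print(f"core {core_id} sees state {cores[core_id].process('flowA', delta)}")
```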