Resource Allocation Strategies for In-Network Stream Processing
In this paper we consider the operator mapping problem for in-network stream
processing applications. In-network stream processing consists in applying a
tree of operators in steady-state to multiple data objects that are continually
updated at various locations on a network. Examples of in-network stream
processing include the processing of data in a sensor network, or of continuous
queries on distributed relational databases. We study the operator mapping
problem in a "constructive" scenario, i.e., a scenario in which one builds a
platform dedicated to the application by purchasing processing servers with
various costs and capabilities. The objective is to minimize the cost of the
platform while ensuring that the application achieves a minimum steady-state
throughput. The first contribution of this paper is the formalization of a set
of relevant operator-placement problems as linear programs, and a proof that
even simple versions of the problem are NP-complete. Our second contribution is
the design of several polynomial time heuristics, which are evaluated via
extensive simulations and compared to theoretical bounds for optimal solutions.
BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures
We introduce BriskStream, an in-memory data stream processing system (DSPS)
specifically designed for modern shared-memory multicore architectures.
BriskStream's key contribution is an execution plan optimization paradigm,
namely RLAS, which takes relative-location (i.e., NUMA distance) of each pair
of producer-consumer operators into consideration. We propose a branch and
bound based approach with three heuristics to resolve the resulting nontrivial
optimization problem. The experimental evaluations demonstrate that BriskStream
yields much higher throughput and better scalability than existing DSPSs on
multi-core architectures when processing different types of workloads.
Comment: To appear in SIGMOD'1
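The branch-and-bound idea mentioned in this abstract can be illustrated with a toy placement problem. The sketch below is not RLAS and does not reproduce BriskStream's cost model or its three heuristics. It assumes a simplified objective (minimize the sum of stream rate times NUMA distance over producer-consumer edges) and a hypothetical per-socket capacity. All names (`branch_and_bound`, `edges`, `dist`, `capacity`) are illustrative.

```python
def branch_and_bound(n_ops, edges, dist, capacity):
    """Place operators 0..n_ops-1 on sockets, minimizing communication cost.

    edges:    list of (producer, consumer, rate) with non-negative rates
    dist:     dist[a][b] is the (non-negative) NUMA distance between sockets
    capacity: maximum number of operators per socket (assumed constraint)
    """
    n_sockets = len(dist)
    best = {"cost": float("inf"), "assign": None}
    assign = [None] * n_ops
    used = [0] * n_sockets

    def partial_cost(placed):
        # Cost of edges whose two endpoints are already placed. Because all
        # terms are non-negative, this is a lower bound for any completion.
        return sum(rate * dist[assign[u]][assign[v]]
                   for u, v, rate in edges if u < placed and v < placed)

    def dfs(op):
        bound = partial_cost(op)
        if bound >= best["cost"]:
            return  # prune: this branch cannot beat the incumbent
        if op == n_ops:
            best["cost"], best["assign"] = bound, assign.copy()
            return
        for socket in range(n_sockets):
            if used[socket] < capacity:
                assign[op] = socket
                used[socket] += 1
                dfs(op + 1)
                used[socket] -= 1
                assign[op] = None

    dfs(0)
    return best["cost"], best["assign"]


if __name__ == "__main__":
    # A four-operator pipeline on two sockets, two operators per socket.
    pipeline = [(0, 1, 10), (1, 2, 1), (2, 3, 10)]
    numa = [[0, 1], [1, 0]]
    print(branch_and_bound(4, pipeline, numa, capacity=2))
```

In this toy instance the cheapest plan keeps the two high-rate edges inside a socket and lets only the low-rate edge cross sockets. A realistic optimizer would need a tighter bound and a throughput-based objective.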
SQPR: Stream Query Planning with Reuse
When users submit new queries to a distributed stream processing system (DSPS), a query planner must allocate physical resources, such as CPU cores, memory and network bandwidth, from a set of hosts to queries. Allocation decisions must provide the correct mix of resources required by queries, while achieving an efficient overall allocation to scale in the number of admitted queries. By exploiting overlap between queries and reusing partial results, a query planner can conserve resources but has to carry out more complex planning decisions. In this paper, we describe SQPR, a query planner that targets DSPSs in data centre environments with heterogeneous resources. SQPR models query admission, allocation and reuse as a single constrained optimisation problem and solves an approximate version to achieve scalability. It prevents individual resources from becoming bottlenecks by re-planning past allocation decisions and supports different allocation objectives. As our experimental evaluation in comparison with a state-of-the-art planner shows, SQPR makes efficient resource allocation decisions, even with a high utilisation of resources, with acceptable overheads.
Optimal Embedding of Functions for In-Network Computation: Complexity Analysis and Algorithms
We consider optimal distributed computation of a given function of
distributed data. The input (data) nodes and the sink node that receives the
function form a connected network that is described by an undirected weighted
network graph. The algorithm to compute the given function is described by a
weighted directed acyclic graph and is called the computation graph. An
embedding defines the computation communication sequence that obtains the
function at the sink. Two kinds of optimal embeddings are sought: the embedding
that (1) minimizes the delay in obtaining the function at the sink, and (2) minimizes
the cost of one instance of computation of the function. This abstraction is motivated
by three applications: in-network computation over sensor networks, operator
placement in distributed databases, and module placement in distributed
computing.
We first show that obtaining minimum-delay and minimum-cost embeddings are
both NP-complete problems and that cost minimization is actually MAX SNP-hard.
Next, we consider specific forms of the computation graph for which polynomial
time solutions are possible. When the computation graph is a tree, a polynomial
time algorithm to obtain the minimum delay embedding is described. Next, for
the case when the function is described by a layered graph we describe an
algorithm that obtains the minimum cost embedding in polynomial time. This
algorithm can also be used to obtain an approximation for delay minimization.
We then consider bounded treewidth computation graphs and give an algorithm to
obtain the minimum cost embedding in polynomial time.
Data Provenance and Management in Radio Astronomy: A Stream Computing Approach
New approaches for data provenance and data management (DPDM) are required
for mega science projects like the Square Kilometer Array, characterized by
extremely large data volume and intense data rates, therefore demanding
innovative and highly efficient computational paradigms. In this context, we
explore a stream-computing approach with the emphasis on the use of
accelerators. In particular, we make use of a new generation of high
performance stream-based parallelization middleware known as InfoSphere
Streams. Its viability for managing and ensuring interoperability and integrity
of signal processing data pipelines is demonstrated in radio astronomy. IBM
InfoSphere Streams embraces the stream-computing paradigm. It is a shift from
conventional data mining techniques (involving analysis of existing data from
databases) towards real-time analytic processing. We discuss using InfoSphere
Streams for effective DPDM in radio astronomy and propose a way in which
InfoSphere Streams can be utilized for large antennae arrays. We present a
case-study: the InfoSphere Streams implementation of an autocorrelating
spectrometer, and using this example we discuss the advantages of the
stream-computing approach and the utilization of hardware accelerators.
Tolerating Correlated Failures in Massively Parallel Stream Processing Engines
Fault-tolerance techniques for stream processing engines can be categorized
into passive and active approaches. A typical passive approach periodically
checkpoints a processing task's runtime states and can recover a failed task by
restoring its runtime state using its latest checkpoint. On the other hand, an
active approach usually employs backup nodes to run replicated tasks. Upon
failure, the active replica can take over the processing of the failed task
with minimal latency. However, both approaches have their own inadequacies in
Massively Parallel Stream Processing Engines (MPSPE). The passive approach
incurs a long recovery latency especially when a number of correlated nodes
fail simultaneously, while the active approach requires extra replication
resources. In this paper, we propose a new fault-tolerance framework, which is
Passive and Partially Active (PPA). In a PPA scheme, the passive approach is
applied to all tasks while only a selected set of tasks will be actively
replicated. The number of actively replicated tasks depends on the available
resources. If tasks without active replicas fail, tentative outputs will be
generated before the completion of the recovery process. We also propose
effective and efficient algorithms to optimize a partially active replication
plan to maximize the quality of tentative outputs. We implemented PPA on top of
Storm, an open-source MPSPE, and conducted extensive experiments using both real
and synthetic datasets to verify the effectiveness of our approach.
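The passive approach described in this abstract (periodic checkpoints, then recovery from the latest checkpoint) can be sketched as follows. This is a minimal illustration, not the PPA framework or Storm's API. The class `CountingTask`, the in-memory `store` standing in for stable storage, and the assumption that the input since the last checkpoint can be replayed from a log are all hypothetical.

```python
import copy


class CountingTask:
    """A toy stateful task that counts words and checkpoints periodically."""

    def __init__(self, task_id, store, checkpoint_every=3):
        self.task_id = task_id
        self.store = store              # stands in for stable storage (assumed)
        self.checkpoint_every = checkpoint_every
        self.state = {}
        self.processed = 0              # number of tuples reflected in state

    def process(self, word):
        self.state[word] = self.state.get(word, 0) + 1
        self.processed += 1
        if self.processed % self.checkpoint_every == 0:
            # Snapshot the runtime state together with the input offset.
            self.store[self.task_id] = (copy.deepcopy(self.state), self.processed)

    def crash(self):
        # Simulate losing all in-memory state.
        self.state = None
        self.processed = 0

    def recover(self, log):
        # Restore the latest checkpoint, then replay the tuples that arrived
        # after it. Replay from a log is an assumption of this sketch.
        state, offset = self.store.get(self.task_id, ({}, 0))
        self.state = copy.deepcopy(state)
        self.processed = offset
        for word in log[offset:]:
            self.process(word)


if __name__ == "__main__":
    words = ["a", "b", "a", "c", "a", "b", "c"]
    task = CountingTask("t1", store={})
    for w in words:
        task.process(w)
    task.crash()
    task.recover(words)
    print(task.state)
```

The recovery latency in this sketch grows with the number of tuples to replay, which is the cost the abstract attributes to the passive approach. An active replica would avoid the replay at the price of running a second copy of the task.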
When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Processing
Carefully balancing load in distributed stream processing systems has a
fundamental impact on execution latency and throughput. Load balancing is
challenging because real-world workloads are skewed: some tuples in the stream
are associated with keys that are significantly more frequent than others. Skew
is remarkably more problematic in large deployments: more workers imply fewer
keys per worker, so it becomes harder to "average out" the cost of hot keys
with cold keys.
We propose a novel load balancing technique that uses a heavy hitter
algorithm to efficiently identify the hottest keys in the stream. These hot
keys are assigned to d choices to ensure a balanced load, where d is
tuned automatically to minimize the memory and computation cost of operator
replication. The technique works online and does not require the use of routing
tables. Our extensive evaluation shows that our technique can balance
real-world workloads on large deployments, and improve both throughput and
latency over the previous state-of-the-art when deployed on Apache Storm.
Comment: 12 pages, 14 Figures, this paper is accepted and will be published at
ICDE 201
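The combination of heavy-hitter detection and multiple choices for hot keys can be sketched as below. This is an illustration under stated assumptions, not the paper's algorithm. It uses the Misra-Gries summary as the heavy-hitter algorithm (the abstract does not say which one is used), and it keeps the number of choices for hot keys fixed rather than tuning it automatically. The names `Router`, `d_hot`, `threshold` and `warmup` are hypothetical. Candidate workers are derived from hash functions, so no routing table is kept.

```python
import hashlib


def _hash(key, seed, n_workers):
    """Deterministically map (seed, key) to a worker index."""
    digest = hashlib.md5(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_workers


class MisraGries:
    """Approximate frequency summary that tracks at most k - 1 keys."""

    def __init__(self, k):
        self.k = k
        self.counts = {}
        self.n = 0

    def add(self, key):
        self.n += 1
        if key in self.counts:
            self.counts[key] += 1
        elif len(self.counts) < self.k - 1:
            self.counts[key] = 1
        else:
            # Summary is full: decrement every counter, drop those at zero.
            for other in list(self.counts):
                self.counts[other] -= 1
                if self.counts[other] == 0:
                    del self.counts[other]

    def estimate(self, key):
        # Never overestimates; underestimates by at most n / k.
        return self.counts.get(key, 0)


class Router:
    """Send each tuple to the least-loaded of its candidate workers."""

    def __init__(self, n_workers, d_hot=8, k=50, threshold=0.05, warmup=100):
        self.n_workers = n_workers
        self.d_hot = d_hot            # choices for hot keys (fixed here)
        self.threshold = threshold    # frequency share that marks a key as hot
        self.warmup = warmup          # tuples to observe before trusting the summary
        self.sketch = MisraGries(k)
        self.load = [0] * n_workers

    def route(self, key):
        self.sketch.add(key)
        hot = (self.sketch.n >= self.warmup and
               self.sketch.estimate(key) >= self.threshold * self.sketch.n)
        d = self.d_hot if hot else 2  # cold keys keep two choices
        candidates = {_hash(key, seed, self.n_workers) for seed in range(d)}
        worker = min(candidates, key=lambda w: (self.load[w], w))
        self.load[worker] += 1
        return worker


if __name__ == "__main__":
    # Synthetic skewed stream: one key carries half of all tuples.
    stream = ["hot" if i % 2 == 0 else f"k{i % 1000}" for i in range(10000)]
    router = Router(n_workers=20)
    for key in stream:
        router.route(key)
    print(max(router.load), min(router.load))
```

With only two choices, a key that carries half of the stream forces at least a quarter of all tuples onto a single worker. Giving hot keys more candidates spreads that load, at the cost of replicating the key's operator state on more workers.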