New techniques to lower the tail latency in stream processing systems

Du, Guangxiang

Authors: Guangxiang Du
Publication date: 1 May 2016

Abstract

Over the past decade, the demand for real-time processing of huge amounts of streaming data has emerged and grown rapidly. Apache Storm, Apache Flink, Samza, and many other stream processing frameworks have been proposed and implemented to meet this need. Although much effort has gone into reducing the average latency of stream processing systems, shortening their tail latency has received little attention. This thesis presents a series of novel techniques for reducing the tail latency in stream processing systems like Apache Storm. Concretely, we present three mechanisms: (1) adaptive timeout coupled with selective replay to catch straggler tuples; (2) shared queues among different tasks of the same operator to reduce overall queueing delay; and (3) latency feedback-based load balancing, intended to handle heterogeneous scenarios. We have implemented these techniques in Apache Storm, and present experimental results using sets of micro-benchmarks as well as two topologies from Yahoo! Inc. Our results show improvements in tail latency ranging from 2% to 72.9%.
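
To make mechanisms (1) and (3) more concrete, here is a minimal Python sketch. It is not the thesis's implementation, and the abstract does not specify the actual policies. Everything in it is an assumption made for illustration: the function names, the nearest-rank quantile rule, the slack factor, and the inverse-latency weighting. It does not use the Apache Storm API.

```python
import math


def adaptive_timeout(recent_latencies_ms, quantile=0.95, slack=1.5, floor_ms=1.0):
    """Derive a timeout from recently observed tuple latencies.

    Hypothetical policy (an assumption, not taken from the thesis): take a
    high quantile of the recent window and scale it by a slack factor, so
    the timeout follows the current workload instead of being fixed.
    """
    if not recent_latencies_ms:
        return floor_ms
    ordered = sorted(recent_latencies_ms)
    # Nearest-rank quantile: the smallest sample with at least
    # `quantile` of the window at or below it.
    rank = max(1, math.ceil(quantile * len(ordered)))
    return max(floor_ms, slack * ordered[rank - 1])


def select_for_replay(pending_age_ms, timeout_ms):
    """Return ids of in-flight tuples older than the timeout (the stragglers).

    Only these tuples are replayed; tuples still within the timeout are left alone.
    """
    return sorted(tid for tid, age in pending_age_ms.items() if age > timeout_ms)


def feedback_weights(avg_latency_ms):
    """Compute routing weights inversely proportional to reported latency.

    Hypothetical rule (an assumption): a task that reports twice the
    latency receives half the share of new tuples. Latencies must be > 0.
    """
    inverse = {task: 1.0 / latency for task, latency in avg_latency_ms.items()}
    total = sum(inverse.values())
    return {task: weight / total for task, weight in inverse.items()}
```

In this sketch, a caller would recompute `adaptive_timeout` periodically from a sliding window of latencies, pass the result to `select_for_replay` to pick straggler tuples, and use `feedback_weights` to bias tuple routing toward the faster tasks of an operator.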

Available Versions

Illinois Digital Environment for Access to Learning and Scholarship Repository

oai:www.ideals.illinois.edu:21...

Last time updated on 11/06/2018