Taming Big Data By Streaming

Yang, Lin

Authors: Lin Yang
Publication date: 22 May 2018
Publisher: The Busan Gyeongnam Mathematical Society

Abstract

Data streams have emerged as a natural computational model for many big data processing applications. In this model, an algorithm has access to only a limited amount of memory and can make only a single pass (or a few passes) over the data, yet must produce a sufficiently accurate answer for some objective function of the dataset. The model captures a variety of real-world applications and motivates new scalable tools for solving important problems in the big data era. This dissertation focuses on the following two aspects of the streaming model. 1. Understanding the capability of the streaming model. For a vector aggregation stream, i.e., when the stream is a sequence of updates to an underlying