SURGE: Continuous Detection of Bursty Regions Over a Stream of Spatial Objects
With the proliferation of mobile devices and location-based services, the
continuous generation of massive volumes of streaming spatial objects (i.e.,
geo-tagged data) opens up new opportunities to address real-world problems by
analyzing them. In this paper, we present a novel continuous bursty region
detection problem that aims to continuously detect a bursty region of a given
size in a specified geographical area from a stream of spatial objects.
Specifically, a bursty region shows the maximum spike in the number of spatial
objects in a given time window. The problem is useful in addressing several
real-world challenges, such as the surge pricing problem in online transportation
and disease outbreak detection. To solve the problem, we propose an exact
solution and two approximate solutions; the approximation ratio is expressed
in terms of the burst score and depends on a parameter that controls
the burst score. We further extend these solutions to support
detection of top-k bursty regions. Extensive experiments with real-world data
are conducted to demonstrate the efficiency and effectiveness of our solutions.
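To make the notion of a "spike in the number of spatial objects" concrete, here is a minimal sketch that compares object counts in a current and a past time window. It is a simplification and not the paper's algorithm: it assumes a fixed grid of square cells (the paper allows a region of a given size anywhere in the area), and it uses the plain count increase as a stand-in for the burst score. The function name `burstiest_cell` and the `cell_size` parameter are hypothetical.

```python
from collections import Counter


def burstiest_cell(current, past, cell_size):
    """Return (cell, increase) for the grid cell whose object count grew the most.

    `current` and `past` are lists of (x, y) points from two time windows.
    Assumption: a fixed grid and a raw count difference stand in for the
    paper's region and burst score definitions.
    """
    def cell_of(point):
        x, y = point
        return (int(x // cell_size), int(y // cell_size))

    cur = Counter(cell_of(p) for p in current)
    old = Counter(cell_of(p) for p in past)

    # Only cells that contain at least one current object are candidates.
    best = max(cur, key=lambda c: cur[c] - old[c], default=None)
    if best is None:
        return None
    return best, cur[best] - old[best]


current_window = [(0.5, 0.5), (0.7, 0.2), (0.1, 0.9), (3.2, 3.1)]
past_window = [(0.5, 0.5), (3.0, 3.0), (3.5, 3.5)]
result = burstiest_cell(current_window, past_window, cell_size=1.0)
```

A continuous version would update the two counters as objects enter and leave the windows instead of recounting from scratch.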
Efficient Summing over Sliding Windows
This paper considers the problem of maintaining statistic aggregates over the
last W elements of a data stream. First, the problem of counting the number of
1's in the last W bits of a binary stream is considered. A lower bound of
$\Omega(1/\epsilon + \log W)$ memory bits for $W\epsilon$-additive
approximations is derived. This is followed by an algorithm whose memory
consumption is $O(1/\epsilon + \log W)$ bits, indicating that the algorithm is
optimal and that the bound is tight. Next, the more general problem of
maintaining a sum of the last W integers, each in the range of $\{0, 1, \ldots, R\}$, is
addressed. The paper shows that approximating the sum within an additive error
of $RW\epsilon$ can also be done using $\Theta(1/\epsilon + \log W)$ bits for
$\epsilon = \Omega(1/W)$. For $\epsilon = o(1/W)$, we present a succinct
algorithm which uses $B(1 + o(1))$ bits, where $B = \Theta(W \log(1/(W\epsilon)))$ is
the derived lower bound. We show that all lower bounds generalize to randomized
algorithms as well. All algorithms process new elements and answer queries in
$O(1)$ worst-case time.
Comment: A shorter version appears in SWAT 201
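For orientation, the exact version of the problem (the sum of the last W integers, with constant-time updates and queries) can be sketched with a circular buffer. This is only a baseline under stated assumptions, not the paper's algorithm: it stores every element of the window, so its memory grows linearly with W, which is the cost the approximate algorithms in the abstract are designed to avoid. The class name `SlidingWindowSum` is hypothetical.

```python
class SlidingWindowSum:
    """Exact sum of the last `window` values using a circular buffer.

    Baseline only: it keeps all `window` values in memory, unlike the
    space-efficient approximations described in the abstract.
    """

    def __init__(self, window):
        if window <= 0:
            raise ValueError("window must be positive")
        self.buf = [0] * window   # slots not yet written count as zero
        self.pos = 0              # index of the oldest value in the window
        self.total = 0

    def add(self, value):
        # Replace the oldest value and adjust the running total in O(1).
        self.total += value - self.buf[self.pos]
        self.buf[self.pos] = value
        self.pos = (self.pos + 1) % len(self.buf)

    def query(self):
        return self.total


sws = SlidingWindowSum(3)
for v in (1, 2, 3):
    sws.add(v)
first = sws.query()    # window holds 1, 2, 3
sws.add(4)
second = sws.query()   # window holds 2, 3, 4
```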
KV-match: A Subsequence Matching Approach Supporting Normalization and Time Warping [Extended Version]
The volume of time series data has exploded due to the popularity of new
applications, such as data center management and IoT. Subsequence matching is a
fundamental task in mining time series data. Existing index-based approaches
consider only raw subsequence matching (RSM) and do not support subsequence
normalization. UCR Suite can deal with the normalized subsequence matching (NSM)
problem, but it needs to scan the full time series. In this paper, we propose a novel
problem, named the constrained normalized subsequence matching (cNSM) problem,
which adds some constraints to the NSM problem. The cNSM problem provides a knob to
flexibly control the degree of offset shifting and amplitude scaling, which
enables users to build the index to process the query. We propose a new index
structure, KV-index, and the matching algorithm, KV-match. With a single index,
our approach can support both RSM and cNSM problems under either ED or DTW
distance. KV-index is a key-value structure, which can be easily implemented on
local files or HBase tables. To support queries of arbitrary length, we
extend KV-match to a variant that utilizes multiple varied-length
indexes to process the query. We conduct extensive experiments on synthetic and
real-world datasets. The results verify the effectiveness and efficiency of our
approach.
Comment: 13 pages
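As a point of reference for the NSM problem mentioned above, the following sketch shows normalized subsequence matching done by a full scan under Euclidean distance. It is not KV-match and uses no index: every window of the series is z-normalized and compared with the z-normalized query, which is the scan cost the abstract says an index should avoid. The function names `z_normalize` and `nsm_scan` and the threshold name `epsilon` are hypothetical.

```python
import math


def z_normalize(seq):
    """Shift a sequence to zero mean and scale it to unit standard deviation."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / n)
    if std == 0:
        return [0.0] * n  # constant window: no amplitude to scale
    return [(v - mean) / std for v in seq]


def nsm_scan(series, query, epsilon):
    """Return start offsets whose z-normalized window lies within
    Euclidean distance `epsilon` of the z-normalized query (full scan)."""
    m = len(query)
    q = z_normalize(query)
    matches = []
    for i in range(len(series) - m + 1):
        w = z_normalize(series[i:i + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(w, q)))
        if dist <= epsilon:
            matches.append(i)
    return matches


series = [5, 1, 2, 3, 9, 10, 20, 30, 2]
matches = nsm_scan(series, query=[1, 2, 3], epsilon=1e-6)
```

In this example the window `[10, 20, 30]` matches the query `[1, 2, 3]` even though its offset and amplitude differ greatly. Unconstrained normalization permits matches of this kind, and the cNSM constraints described above are meant to limit how much offset shifting and amplitude scaling is allowed.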