Processing count queries over event streams at multiple time granularities
Management and analysis of streaming data has become crucial with its applications in web, sensor data, network traffic data, and the stock market. Data streams consist mostly of numeric data, but what is more interesting is the events derived from the numerical data that need to be monitored. The events obtained from streaming data form event streams. Event streams have properties similar to data streams, i.e., they are seen only once, in a fixed order, as a continuous stream. Events appearing in the event stream have time stamps associated with them at a certain time granularity, such as second, minute, or hour. One frequently asked type of query over event streams is the count query, i.e., the frequency of an event occurrence over time. Count queries can be answered over event streams easily; however, users may ask queries over different time granularities as well. For example, a broker may ask how many times a stock increased in the same time frame, where the time frames specified could be hour, day, or both. This is crucial especially in the case of event streams where only a window of an event stream is available at a certain time instead of the whole stream. In this paper, we propose a technique for predicting the frequencies of event occurrences in event streams at multiple time granularities. The proposed approximation method efficiently estimates the count of events with high accuracy in an event stream at any time granularity by examining the distance distributions of event occurrences. The proposed method has been implemented and tested on different real data sets, and the results obtained are presented to show its effectiveness.
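As a point of reference for the abstract above, the following minimal sketch computes exact counts at several granularities when the whole stream is available. It is not the paper's approximation method, which estimates counts from distance distributions. It assumes that an event "occurs" in a time tick if at least one occurrence falls inside it; the granularity table and the function name are hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical granularities, expressed as seconds per time tick.
GRANULARITIES: Dict[str, int] = {"second": 1, "minute": 60, "hour": 3600}


def count_at_granularities(timestamps: Iterable[int]) -> Dict[str, int]:
    """Count the distinct time ticks that contain at least one occurrence.

    Assumption: timestamps are integer seconds, and an event is counted
    once per tick regardless of how often it fires inside that tick.
    """
    ticks = {name: set() for name in GRANULARITIES}
    for ts in timestamps:
        for name, width in GRANULARITIES.items():
            # Integer division maps a timestamp to its tick index.
            ticks[name].add(ts // width)
    return {name: len(seen) for name, seen in ticks.items()}


# Occurrence times of one event, in seconds since the start of the stream.
occurrences = [0, 1, 59, 60, 3599, 3600, 7300]
counts = count_at_granularities(occurrences)
print(counts)
```

The exact computation needs every occurrence, which is what becomes unavailable when only a window of the stream can be kept; that is the gap an approximation method is meant to fill.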
Discovering human activities from binary data in smart homes
With rapid developments in sensing technology, data mining, and machine learning for human health monitoring, it has become possible to monitor personal motion and vital signs in a manner that minimizes disruption of an individual’s daily routine and assists individuals who have difficulty living independently at home. A primary difficulty that researchers confront is acquiring an adequate amount of labeled data for model training and validation purposes. Activity discovery therefore addresses the problem of unavailable activity labels by using approaches based on sequence mining and clustering. In this paper, we introduce an unsupervised method for discovering activities from a network of motion detectors in a smart home setting. First, we present an intra-day clustering algorithm to find frequent sequential patterns within a day. As a second step, we present an inter-day clustering algorithm to find the common frequent patterns between days. Furthermore, we refine the patterns to have more compressed and defined cluster characterizations. Finally, we track the occurrences of various regular routines to monitor the functional health in an individual’s patterns and lifestyle. We evaluate our methods on two public data sets captured in real-life settings from two apartments during seven-month and three-month periods.
Heuristic Approaches for Generating Local Process Models through Log Projections
Local Process Model (LPM) discovery is focused on the mining of a set of
process models where each model describes the behavior represented in the event
log only partially, i.e. subsets of possible events are taken into account to
create so-called local process models. Often such smaller models provide
valuable insights into the behavior of the process, especially when no adequate
and comprehensible single overall process model exists that is able to describe
the traces of the process from start to end. The practical application of LPM
discovery is however hindered by computational issues in the case of logs with
many activities (problems may already occur when there are more than 17 unique
activities). In this paper, we explore three heuristics to discover subsets of
activities that lead to useful log projections with the goal of speeding up LPM
discovery considerably while still finding high-quality LPMs. We found that a
Markov clustering approach to create projection sets results in the largest
improvement of execution time, with discovered LPMs still being better than
with the use of randomly generated activity sets of the same size. Another
heuristic, based on log entropy, yields a more moderate speedup, but enables
the discovery of higher quality LPMs. The third heuristic, based on the
relative information gain, shows unstable performance: for some data sets the
speedup and LPM quality are higher than with the log entropy based method,
while for other data sets there is no speedup at all.
Comment: paper accepted and to appear in the proceedings of the IEEE Symposium
on Computational Intelligence and Data Mining (CIDM), special session on
Process Mining, part of the Symposium Series on Computational Intelligence
(SSCI
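The log projections mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an event log represented as a list of traces, each trace being a list of activity names, and it assumes that traces left empty by the projection are dropped. The function name is hypothetical, and the sketch does not show how the heuristics select the activity subsets.

```python
from typing import Iterable, List


def project_log(log: List[List[str]], activities: Iterable[str]) -> List[List[str]]:
    """Project an event log onto a subset of activities.

    Each trace keeps only the events whose activity is in the subset,
    in their original order. Assumption: empty traces are discarded.
    """
    keep = set(activities)
    projected = [[a for a in trace if a in keep] for trace in log]
    return [trace for trace in projected if trace]


log = [["a", "b", "c"], ["b", "d"], ["d"]]
projection = project_log(log, {"a", "b"})
print(projection)
```

Discovery then runs on each smaller projected log instead of the full log, which is where the speedup described in the abstract would come from.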
The automatic detection of patterns in people's movements
Bibliography: leaves 102-105
A Causality-Aware Pattern Mining Scheme for Group Activity Recognition in a Pervasive Sensor Space
Human activity recognition (HAR) is a key challenge in pervasive computing
and its solutions have been presented based on various disciplines.
Specifically, for HAR in a smart space without privacy and accessibility
issues, data streams generated by deployed pervasive sensors are leveraged. In
this paper, we focus on group activities, in which a group of users performs a
collaborative task without user identification, and propose an efficient group
activity recognition scheme that extracts causality patterns from pervasive
sensor event sequences generated by a group of users, achieving recognition
accuracy as good as that of the state-of-the-art graphical model. To filter out
irrelevant noise events from a given data stream, a set of rules is leveraged
to highlight causally related events. Then, a pattern-tree algorithm extracts
frequent causal patterns by means of a growing tree structure. Based on the
extracted patterns, a weighted sum-based pattern matching algorithm computes
the likelihoods of stored group activities to the given test event sequence by
means of matched event pattern counts for group activity recognition. We
evaluate the proposed scheme using the data collected from our testbed and
CASAS datasets where users perform their tasks on a daily basis and validate
its effectiveness in a real environment. Experiment results show that the
proposed scheme achieves higher recognition accuracy than the existing schemes,
with a small amount of runtime overhead.
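The weighted sum-based matching step described in the abstract above might look roughly like the following sketch. The abstract does not give the details, so everything here is an assumption: patterns are treated as contiguous event subsequences, each stored pattern carries a hypothetical weight, and an activity's score is the weighted sum of its pattern match counts.

```python
from typing import Dict, Sequence, Tuple

Pattern = Tuple[str, ...]


def count_matches(pattern: Pattern, events: Sequence[str]) -> int:
    """Count contiguous occurrences of pattern in events (overlaps allowed)."""
    n = len(pattern)
    return sum(
        1 for i in range(len(events) - n + 1) if tuple(events[i:i + n]) == pattern
    )


def score_activities(
    stored: Dict[str, Dict[Pattern, float]], events: Sequence[str]
) -> Dict[str, float]:
    """Score each stored activity as the weighted sum of its match counts."""
    return {
        activity: sum(w * count_matches(p, events) for p, w in patterns.items())
        for activity, patterns in stored.items()
    }


# Hypothetical stored patterns and weights; not taken from the paper.
stored = {
    "meeting": {("door", "chair"): 2.0, ("chair", "projector"): 1.0},
    "cooking": {("fridge", "stove"): 2.0},
}
events = ["door", "chair", "projector", "door", "chair"]
scores = score_activities(stored, events)
best = max(scores, key=scores.get)
print(scores, best)
```

The highest-scoring activity is taken as the recognized one; a real system would also need a rule for ties and for sequences that match no stored pattern.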