2 research outputs found

    Activity monitoring using topic models

    Get PDF
    Activity monitoring is the task of continual observation of a stream of events which necessitates the immediate detection of anomalies based on a short window of data. For many types of categorical data, such as zip codes and phone numbers, thousands of unique attribute values lead to a sparse frequency vector. This vector is then unlikely to be similar to the frequency vector obtained from the training set collected from a longer period of time. In this work, using topic models, we present a method for dimensionality reduction which can detect anomalous windows of categorical data with a low rate of false positives. We apply nonparametric Bayesian topic models to address the variable nature of data, which allows for updating the model parameters during the continual observation to capture gradual changes of the user behavior. Our experiments on several real-life datasets show that our proposed model outperforms state-of-the-art methods for activity monitoring in categorical data with large domains of attribute values
    corecore