On The Spatiotemporal Burstiness of Terms
Thousands of documents are made available to the users via the web on a daily
basis. One of the most extensively studied problems in the context of such
document streams is burst identification. Given a term t, a burst is generally
exhibited when an unusually high frequency is observed for t. While spatial and
temporal burstiness have been studied individually in the past, our work is the
first to simultaneously track and measure spatiotemporal term burstiness. In
addition, we use the mined burstiness information toward an efficient
document-search engine: given a user's query of terms, our engine returns a
ranked list of documents discussing influential events with a strong
spatiotemporal impact. We demonstrate the efficiency of our methods with an
extensive experimental evaluation on real and synthetic datasets.
Comment: VLDB201
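The notion of a burst described above (an unusually high frequency of a term, tracked per place and per time interval) can be illustrated with a minimal sketch. It is not the paper's method: the z-score test, the `threshold` value, and the region names and counts are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def bursty_cells(counts, threshold=2.0):
    """Return (region, time_index, z) for cells whose count is unusually high.

    counts: dict mapping a region name to a list of per-interval frequencies
            of one term.
    threshold: hypothetical z-score cutoff, not taken from the paper.
    """
    found = []
    for region, series in counts.items():
        mu = mean(series)
        sigma = pstdev(series)
        if sigma == 0:
            continue  # flat series: no interval stands out
        for t, c in enumerate(series):
            z = (c - mu) / sigma
            if z >= threshold:
                found.append((region, t, z))
    return found


# Made-up frequencies of one term in two regions over eight intervals.
counts = {
    "athens": [2, 3, 2, 3, 2, 30, 3, 2],
    "oslo": [1, 1, 1, 1, 1, 1, 1, 1],
}
bursts = bursty_cells(counts)
for region, t, z in bursts:
    print(f"burst in {region} at interval {t} (z = {z:.2f})")
```

Scoring each region against its own history keeps a busy region from masking a burst in a quiet one; a real system would likely need a more robust baseline than the plain mean.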
A Scalable Framework for Spatiotemporal Analysis of Location-based Social Media Data
In the past several years, social media (e.g., Twitter and Facebook) has
experienced a spectacular rise in popularity and has become a ubiquitous
medium for content sharing and social networking. With the widespread adoption of
mobile devices and location-based services, social media typically allows users
to share whereabouts of daily activities (e.g., check-ins and taking photos),
and thus strengthens the roles of social media as a proxy to understand human
behaviors and complex social dynamics in geographic spaces. Unlike conventional
spatiotemporal data, this new modality of data is dynamic, massive, and
typically represented as streams of unstructured media (e.g., texts and photos),
which pose fundamental representation, modeling and computational challenges to
conventional spatiotemporal analysis and geographic information science. In
this paper, we describe a scalable computational framework to harness massive
location-based social media data for efficient and systematic spatiotemporal
data analysis. Within this framework, the concept of space-time trajectories
(or paths) is applied to represent activity profiles of social media users. A
hierarchical spatiotemporal data model, namely a spatiotemporal data cube
model, is developed based on collections of space-time trajectories to
represent the collective dynamics of social media users across aggregation
boundaries at multiple spatiotemporal scales. The framework is implemented
based upon a public data stream of Twitter feeds posted on the continent of
North America. To demonstrate the advantages and performance of this framework,
an interactive flow mapping interface (including both single-source and
multiple-source flow mapping) is developed to allow real-time, interactive
visual exploration of movement dynamics in massive location-based social media
at multiple scales.
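As a rough illustration of aggregating location-based posts at multiple spatiotemporal scales, the sketch below bins individual geotagged posts into grid cells and time bins. It is a simplification under stated assumptions: it counts posts rather than space-time trajectories, and the function names, cell sizes, and sample coordinates are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def cube_cell(lat, lon, ts, cell_deg, bin_seconds):
    """Map one post to a (lat index, lon index, time index) cube cell."""
    return (
        math.floor(lat / cell_deg),
        math.floor(lon / cell_deg),
        ts // bin_seconds,
    )


def build_cube(posts, cell_deg, bin_seconds):
    """Count posts per cell; posts are (lat, lon, unix_seconds) tuples."""
    return Counter(
        cube_cell(lat, lon, ts, cell_deg, bin_seconds) for lat, lon, ts in posts
    )


# Made-up posts: three near New York, one near Los Angeles.
posts = [
    (40.71, -74.00, 0),
    (40.73, -73.99, 1800),
    (40.72, -74.01, 4000),
    (34.05, -118.24, 100),
]

# Fine scale: 1-degree cells, hourly bins.
fine = build_cube(posts, cell_deg=1.0, bin_seconds=3600)
# Coarse scale: 10-degree cells, daily bins.
coarse = build_cube(posts, cell_deg=10.0, bin_seconds=86400)

print(fine)
print(coarse)
```

The same posts fall into separate cells at the fine scale and merge at the coarse scale, which is the roll-up behaviour a data cube is meant to provide.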
Applications of Data Mining Techniques for Vehicular Ad hoc Networks
Due to the recent advances in vehicular ad hoc networks (VANETs), smart
applications have been incorporating the data generated from these networks to
provide quality-of-life services. In this paper, we propose a taxonomy of
data mining techniques that have been applied in this domain in addition to a
classification of these techniques. Our contribution is to highlight the
research methodologies in the literature and allow for comparing among them
using different characteristics. The proposed taxonomy covers elementary data
mining techniques such as preprocessing, outlier detection, clustering, and
classification of data. In addition, it covers centralized, distributed,
offline, and online techniques from the literature.
Spatio-Temporal Data Mining: A Survey of Problems and Methods
Large volumes of spatio-temporal data are increasingly collected and studied
in diverse domains, including climate science, social sciences, neuroscience,
epidemiology, transportation, mobile health, and Earth sciences.
Spatio-temporal data differs from relational data, for which computational
approaches have been developed in the data mining community for multiple decades, in
that both spatial and temporal attributes are available in addition to the
actual measurements/attributes. The presence of these attributes introduces
additional challenges that need to be dealt with. Approaches for mining
spatio-temporal data have been studied for over a decade in the data mining
community. In this article we present a broad survey of this relatively young
field of spatio-temporal data mining. We discuss different types of
spatio-temporal data and the relevant data mining questions that arise in the
context of analyzing each of these datasets. Based on the nature of the data
mining problem studied, we classify literature on spatio-temporal data mining
into six major categories: clustering, predictive learning, change detection,
frequent pattern mining, anomaly detection, and relationship mining. We discuss
the various forms of spatio-temporal data mining problems in each of these
categories.
Comment: Accepted for publication at ACM Computing Survey
Demand Forecasting from Spatiotemporal Data with Graph Networks and Temporal-Guided Embedding
Short-term demand forecasting models commonly combine convolutional and
recurrent layers to extract complex spatiotemporal patterns in data. Long-term
histories are also used to consider periodicity and seasonality patterns as
time series data. In this study, we propose an efficient architecture,
Temporal-Guided Network (TGNet), which utilizes graph networks and
temporal-guided embedding. Graph networks, used instead of convolutional layers,
extract features that are invariant to permutations of adjacent regions.
Temporal-guided embedding explicitly learns temporal contexts from training
data and is substituted for the input of long-term histories from days/weeks
ago. TGNet learns an autoregressive model, conditioned on temporal contexts of
forecasting targets from temporal-guided embedding. Finally, our model achieves
performance competitive with other baselines on three real-world spatiotemporal
demand datasets, while the number of trainable parameters is about 20
times smaller than that of a state-of-the-art baseline. We also show that
temporal-guided embedding learns temporal contexts as intended and that TGNet
forecasts robustly even in atypical event situations.
Comment: NeurIPS 2018 Workshop on Modeling and Decision-Making in the
Spatiotemporal Domai
A Survey on Content-Aware Video Analysis for Sports
Sports data analysis is becoming increasingly large-scale, diversified, and
shared, but difficulty persists in rapidly accessing the most crucial
information. Previous surveys have focused on the methodologies of sports video
analysis from the spatiotemporal viewpoint instead of a content-based
viewpoint, and few of these studies have considered semantics. This study
develops a deeper interpretation of content-aware sports video analysis by
examining the insight offered by research into the structure of content under
different scenarios. On the basis of this insight, we provide an overview of
the themes particularly relevant to the research on content-aware systems for
broadcast sports. Specifically, we focus on the video content analysis
techniques applied in sportscasts over the past decade from the perspectives of
fundamentals and general review, a content hierarchical model, and trends and
challenges. Content-aware analysis methods are discussed with respect to
object-, event-, and context-oriented groups. In each group, the gap between
sensation and content excitement must be bridged using proper strategies. In
this regard, a content-aware approach is required to determine user demands.
Finally, the paper summarizes the future trends and challenges for sports video
analysis. We believe that our findings can advance the field of research on
content-aware video analysis for broadcast sports.
Comment: Accepted for publication in IEEE Transactions on Circuits and Systems
for Video Technology (TCSVT
Eigenspace Method for Spatiotemporal Hotspot Detection
Hotspot detection aims at identifying subgroups in the observations that are
unexpected with respect to some baseline information. For instance, in
disease surveillance, the purpose is to detect sub-regions in spatiotemporal
space where the count of reported diseases (e.g., cancer) is higher than
expected with respect to the population. The state-of-the-art method for this
kind of problem is the Space-Time Scan Statistics (STScan), which exhaustively
searches the whole space through a sliding window looking for significant
spatiotemporal clusters. STScan makes some restrictive assumptions about the
distribution of data, the shape of the hotspots and the quality of data, which
can be unrealistic for some nontraditional data sources. A novel methodology
called EigenSpot is proposed which, instead of exhaustively searching the
space, tracks the changes in a space-time correlation structure. Not only is
the new approach much more computationally efficient, but it also makes no
assumption about the data distribution, hotspot shape or the data quality. The
principal idea is that with the joint combination of abnormal elements in the
principal spatial and the temporal singular vectors, the location of hotspots
in the spatiotemporal space can be approximated. A comprehensive experimental
evaluation on both simulated and real data sets reveals the effectiveness of
the proposed method.
Comment: To appear in Expert Systems Journa
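The principal idea stated above, combining abnormal elements of the principal spatial and temporal singular vectors, can be sketched as follows. This is a loose illustration only: the abstract does not say how abnormal elements are identified or how the baseline is used, so the z-score test, the `threshold` value, the power-iteration routine, and the synthetic count matrix are all assumptions.

```python
from statistics import mean, pstdev


def leading_singular_vectors(m, iters=100):
    """Approximate the principal left/right singular vectors by power iteration.

    m is a regions-by-time matrix (list of rows) with at least one
    non-zero entry.
    """
    rows, cols = len(m), len(m[0])
    v = [1.0] * cols
    for _ in range(iters):
        u = [sum(m[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm = sum(x * x for x in u) ** 0.5
        u = [x / norm for x in u]
        v = [sum(m[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return u, v


def abnormal(vec, threshold=1.5):
    """Indices whose value deviates strongly from the rest (assumed test)."""
    mu, sigma = mean(vec), pstdev(vec)
    if sigma == 0:
        return []
    return [i for i, x in enumerate(vec) if abs(x - mu) / sigma > threshold]


# Synthetic counts: 6 regions x 8 time steps, with an injected hotspot
# in region 2 at time steps 3 and 4.
counts = [[1.0] * 8 for _ in range(6)]
counts[2][3] = counts[2][4] = 10.0

u, v = leading_singular_vectors(counts)
# Candidate hotspot cells: abnormal regions crossed with abnormal time steps.
hotspots = [(r, t) for r in abnormal(u) for t in abnormal(v)]
print(hotspots)
```

Only two vectors are inspected instead of every space-time window, which is where the claimed efficiency over an exhaustive scan would come from.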
Inferring Neuronal Network Connectivity using Time-constrained Episodes
Discovering frequent episodes in event sequences is an interesting data
mining task. In this paper, we argue that this framework is very effective for
analyzing multi-neuronal spike train data. Analyzing spike train data is an
important problem in neuroscience though there are no data mining approaches
reported for this. Motivated by this application, we introduce different
temporal constraints on the occurrences of episodes. We present algorithms for
discovering frequent episodes under temporal constraints. Through simulations,
we show that our method is very effective for analyzing spike train data for
unearthing underlying connectivity patterns.
Comment: 9 pages. See also http://neural-code.cs.vt.edu
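To make the idea of a temporal constraint on episode occurrences concrete, the sketch below counts occurrences of the simplest serial episode, one neuron firing followed by another within a delay window. The abstract does not describe the paper's algorithms, so the greedy matching, the function name, and the spike data here are assumptions for illustration.

```python
def count_delayed_pairs(events, a, b, min_delay, max_delay):
    """Count spikes of `a` followed by a spike of `b` within the delay window.

    events: list of (time, neuron) tuples sorted by time.
    Each `b` spike is matched to at most one `a` spike.
    """
    b_times = [t for t, n in events if n == b]
    nxt = 0  # index of the first b spike not yet used or discarded
    count = 0
    for t, n in events:
        if n != a:
            continue
        # Discard b spikes that are too early for this a spike; since a spikes
        # arrive in time order, they are too early for every later one as well.
        while nxt < len(b_times) and b_times[nxt] < t + min_delay:
            nxt += 1
        if nxt < len(b_times) and b_times[nxt] <= t + max_delay:
            count += 1
            nxt += 1
    return count


# Made-up spike train: (time in ms, neuron label).
spikes = [
    (0, "A"), (5, "B"), (10, "A"), (12, "C"),
    (15, "B"), (20, "A"), (40, "B"), (50, "B"),
]

ab = count_delayed_pairs(spikes, "A", "B", min_delay=4, max_delay=6)
ac = count_delayed_pairs(spikes, "A", "C", min_delay=4, max_delay=6)
print(ab, ac)
```

A pair that recurs often within a narrow delay window is the kind of evidence that could suggest a connection from the first neuron to the second; longer episodes would need a more general counting scheme.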
A Data as a Service (DaaS) Model for GPU-based Data Analytics
Cloud-based services with resources to be provisioned for consumers are
increasingly the norm, especially with respect to Big data, spatiotemporal data
mining and application services that impose a user's agreed Quality of Service
(QoS) rules or Service Level Agreement (SLA). Considering the pervasive nature
of data centers and cloud systems, there is a need for real-time analytics of
these systems that accounts for cost, utility and energy. This work presents an overlay
model of GPU system for Data As A Service (DaaS) to give a real-time data
analysis of network data, customers, investors and users' data from the
datacenters or cloud system. Using a modeled layer to define a learning
protocol and system, we give a custom, profitable system for DaaS on GPU. The
GPU-enabled pre-processing and initial operations of the clustering model
analysis are promising, as shown in the results. We examine the model on
real-world data sets to model a big data set or spatiotemporal data mining
services. We also produce results of our model with clustering, neural
networks' Self-organizing feature maps (SOFM or SOM) to produce a distribution
of the clustering for DaaS model. The experimental results thus far show a
promising model that could enhance SLA- and/or QoS-based DaaS.
Comment: Accepted, 23 December 2017, by the IEEE IFIP NTMS Workshop on Big
Data and Emerging Trends WBD-ET 2018; it was later withdrawn because of
funding issues. An extended/enhanced version will be published in future
dates in related journal
Data-driven root-cause analysis for distributed system anomalies
Modern distributed cyber-physical systems encounter a large variety of
anomalies and in many cases, they are vulnerable to catastrophic fault
propagation scenarios due to strong connectivity among the sub-systems. In this
regard, root-cause analysis becomes highly intractable due to complex fault
propagation mechanisms in combination with diverse operating modes. This paper
presents a new data-driven framework for root-cause analysis for addressing
such issues. The framework is based on a spatiotemporal feature extraction
scheme for distributed cyber-physical systems built on the concept of symbolic
dynamics for discovering and representing causal interactions among subsystems
of a complex system. We present two approaches for root-cause analysis, namely
the sequential state switching (based on the free energy concept of a
Restricted Boltzmann Machine, RBM) and artificial anomaly association (a
multi-class classification framework using deep neural networks, DNN).
Synthetic data from cases with failed pattern(s) and an anomalous node are
simulated to validate the proposed approaches, whose performance is then
compared with that of vector autoregressive (VAR) model-based root-cause analysis.
A real dataset based on the Tennessee Eastman process (TEP) is also used for
validation. The results show that: (1) both approaches can obtain
high accuracy in root-cause analysis and successfully handle multiple nominal
operation modes, and (2) the proposed tool-chain is shown to be scalable while
maintaining high accuracy.
Comment: 6 pages, 3 figure