On The Spatiotemporal Burstiness of Terms

Author: Gunopulos Dimitrios
Lappas Theodoros
Tsotras Vassilis J.
Vieira Marcos R.
Publication date: 30/05/2012

Thousands of documents are made available to users via the web on a daily basis. One of the most extensively studied problems in the context of such document streams is burst identification. Given a term t, a burst is generally exhibited when an unusually high frequency is observed for t. While spatial and temporal burstiness have been studied individually in the past, our work is the first to simultaneously track and measure spatiotemporal term burstiness. In addition, we use the mined burstiness information toward an efficient document-search engine: given a user's query of terms, our engine returns a ranked list of documents discussing influential events with a strong spatiotemporal impact. We demonstrate the efficiency of our methods with an extensive experimental evaluation on real and synthetic datasets. Comment: VLDB201

arXiv.org e-Print Archive
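
The notion of a spatiotemporal burst in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a hypothetical input of per-region, per-time-slot counts for a single term t, and scores each cell by how far its observed count exceeds an independence baseline built from the region and time-slot totals. The function names and the sample data are illustrative only.

```python
from collections import defaultdict


def burst_scores(observations):
    """Score each (region, slot) cell for one term.

    observations: iterable of (region, time_slot, count) tuples.
    Returns {(region, time_slot): observed - expected}, where expected is the
    count the cell would receive if the term's total frequency were spread
    according to the region and time-slot marginals (an assumed baseline).
    """
    cell = defaultdict(float)
    by_region = defaultdict(float)
    by_slot = defaultdict(float)
    total = 0.0
    for region, slot, count in observations:
        cell[(region, slot)] += count
        by_region[region] += count
        by_slot[slot] += count
        total += count
    if total == 0:
        return {}
    scores = {}
    for region, region_total in by_region.items():
        for slot, slot_total in by_slot.items():
            expected = region_total * slot_total / total
            scores[(region, slot)] = cell.get((region, slot), 0.0) - expected
    return scores


def top_bursts(observations, k=3):
    """Return the k cells with the largest positive deviation from the baseline."""
    scores = burst_scores(observations)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical counts of one term in two cities over three time slots.
obs = [
    ("Athens", 0, 2), ("Athens", 1, 3), ("Athens", 2, 40),
    ("Boston", 0, 3), ("Boston", 1, 2), ("Boston", 2, 4),
]
best_cell, best_score = top_bursts(obs, k=1)[0]
print(best_cell, round(best_score, 3))
```

A search engine of the kind the abstract describes could, under the same assumptions, rank a document higher when its terms fall into cells with large scores.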

A Scalable Framework for Spatiotemporal Analysis of Location-based Social Media Data

Author: Cao Guofeng
Hwang Myunghwa
Padmanabhan Anand
Soltani Kiumars
Wang Shaowen
Zhang Zhenhua
Publication date: 07/09/2014

In the past several years, social media (e.g., Twitter and Facebook) has risen spectacularly in popularity and become a ubiquitous medium for content sharing and social networking. With the widespread adoption of mobile devices and location-based services, social media typically allows users to share the whereabouts of daily activities (e.g., check-ins and photos), which strengthens its role as a proxy for understanding human behaviors and complex social dynamics in geographic spaces. Unlike conventional spatiotemporal data, this new modality of data is dynamic, massive, and typically represented as streams of unstructured media (e.g., texts and photos), which poses fundamental representation, modeling, and computational challenges to conventional spatiotemporal analysis and geographic information science. In this paper, we describe a scalable computational framework to harness massive location-based social media data for efficient and systematic spatiotemporal data analysis. Within this framework, the concept of space-time trajectories (or paths) is applied to represent activity profiles of social media users. A hierarchical spatiotemporal data model, namely a spatiotemporal data cube model, is developed based on collections of space-time trajectories to represent the collective dynamics of social media users across aggregation boundaries at multiple spatiotemporal scales. The framework is implemented on a public data stream of Twitter feeds posted on the continent of North America. To demonstrate the advantages and performance of this framework, an interactive flow mapping interface (including both single-source and multiple-source flow mapping) is developed to allow real-time, interactive visual exploration of movement dynamics in massive location-based social media at multiple scales.

arXiv.org e-Print Archive
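
The spatiotemporal data cube mentioned in the abstract above can be pictured as counts of posts binned by location and time, then rolled up to coarser scales. The following is a minimal sketch under stated assumptions, not the framework's implementation: posts are hypothetical (longitude, latitude, Unix time) tuples, cells are square in degrees, and the roll-up merges space and time by the same factor purely for brevity.

```python
from collections import Counter


def cube_cell(lon, lat, t, cell_deg, cell_secs):
    """Map a geotagged, timestamped post to an integer (x, y, t) cube cell."""
    return (int(lon // cell_deg), int(lat // cell_deg), int(t // cell_secs))


def build_cube(posts, cell_deg, cell_secs):
    """Count posts per cell; posts is an iterable of (lon, lat, unix_time)."""
    return Counter(
        cube_cell(lon, lat, t, cell_deg, cell_secs) for lon, lat, t in posts
    )


def roll_up(cube, factor):
    """Aggregate to a coarser level by merging `factor` cells along each axis."""
    coarse = Counter()
    for (x, y, t), n in cube.items():
        coarse[(x // factor, y // factor, t // factor)] += n
    return coarse


# Hypothetical posts: two close together in space and time, two elsewhere.
posts = [
    (-88.2, 40.1, 0),
    (-88.3, 40.2, 1800),
    (-87.6, 41.8, 3600),
    (-87.7, 41.9, 90000),
]
fine = build_cube(posts, cell_deg=0.5, cell_secs=3600)
coarse = roll_up(fine, factor=2)
print(fine)
print(coarse)
```

Floor division keeps negative longitudes in consistent cells, and every post is counted exactly once at each level, so totals are preserved across scales.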

Applications of Data Mining Techniques for Vehicular Ad hoc Networks

Author: Samarah Samer
Zamil Mohammed AL
Publication date: 19/06/2018

Due to recent advances in vehicular ad hoc networks (VANETs), smart applications have been incorporating the data generated by these networks to provide quality-of-life services. In this paper, we propose a taxonomy of the data mining techniques that have been applied in this domain, together with a classification of these techniques. Our contribution is to highlight the research methodologies in the literature and to allow them to be compared using different characteristics. The proposed taxonomy covers elementary data mining techniques such as preprocessing, outlier detection, clustering, and classification of data. In addition, it covers centralized, distributed, offline, and online techniques from the literature.

arXiv.org e-Print Archive

Spatio-Temporal Data Mining: A Survey of Problems and Methods

Author: Atluri Gowtham
Karpatne Anuj
Kumar Vipin
Publication date: 17/11/2017

Large volumes of spatio-temporal data are increasingly collected and studied in diverse domains, including climate science, social sciences, neuroscience, epidemiology, transportation, mobile health, and Earth sciences. Spatio-temporal data differs from relational data, for which computational approaches have been developed in the data mining community for multiple decades, in that both spatial and temporal attributes are available in addition to the actual measurements/attributes. The presence of these attributes introduces additional challenges that need to be dealt with. Approaches for mining spatio-temporal data have been studied for over a decade in the data mining community. In this article we present a broad survey of this relatively young field of spatio-temporal data mining. We discuss different types of spatio-temporal data and the relevant data mining questions that arise in the context of analyzing each of these datasets. Based on the nature of the data mining problem studied, we classify the literature on spatio-temporal data mining into six major categories: clustering, predictive learning, change detection, frequent pattern mining, anomaly detection, and relationship mining. We discuss the various forms of spatio-temporal data mining problems in each of these categories. Comment: Accepted for publication at ACM Computing Surveys

arXiv.org e-Print Archive

Demand Forecasting from Spatiotemporal Data with Graph Networks and Temporal-Guided Embedding

Author: Cheon Yeongjae
Jung Suehun
Kim Dongil
Lee Doyup
You Seungil
Publication date: 07/10/2019

Short-term demand forecasting models commonly combine convolutional and recurrent layers to extract complex spatiotemporal patterns in data. Long-term histories are also used to account for periodicity and seasonality patterns in time series data. In this study, we propose an efficient architecture, Temporal-Guided Network (TGNet), which utilizes graph networks and temporal-guided embedding. Graph networks, used in place of convolutional layers, extract features that are invariant to permutations of adjacent regions. Temporal-guided embedding explicitly learns temporal contexts from training data and is substituted for the input of long-term histories from days/weeks ago. TGNet learns an autoregressive model, conditioned on temporal contexts of forecasting targets from temporal-guided embedding. Finally, our model achieves performance competitive with other baselines on three real-world spatiotemporal demand datasets, while the number of trainable parameters is about 20 times smaller than that of a state-of-the-art baseline. We also show that temporal-guided embedding learns temporal contexts as intended and that TGNet has robust forecasting performance even in atypical event situations. Comment: NeurIPS 2018 Workshop on Modeling and Decision-Making in the Spatiotemporal Domain

arXiv.org e-Print Archive

A Survey on Content-Aware Video Analysis for Sports

Author: Shih Huang-Chia
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/03/2017

Sports data analysis is becoming increasingly large-scale, diversified, and shared, but difficulty persists in rapidly accessing the most crucial information. Previous surveys have focused on the methodologies of sports video analysis from the spatiotemporal viewpoint instead of a content-based viewpoint, and few of these studies have considered semantics. This study develops a deeper interpretation of content-aware sports video analysis by examining the insight offered by research into the structure of content under different scenarios. On the basis of this insight, we provide an overview of the themes particularly relevant to the research on content-aware systems for broadcast sports. Specifically, we focus on the video content analysis techniques applied in sportscasts over the past decade from the perspectives of fundamentals and general review, a content hierarchical model, and trends and challenges. Content-aware analysis methods are discussed with respect to object-, event-, and context-oriented groups. In each group, the gap between sensation and content excitement must be bridged using proper strategies. In this regard, a content-aware approach is required to determine user demands. Finally, the paper summarizes the future trends and challenges for sports video analysis. We believe that our findings can advance the field of research on content-aware video analysis for broadcast sports. Comment: Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

arXiv.org e-Print Archive

Eigenspace Method for Spatiotemporal Hotspot Detection

Author: Fanaee-T Hadi
Gama João
Publication venue: 'Wiley'
Publication date: 13/06/2014

Hotspot detection aims at identifying subgroups in the observations that are unexpected with respect to some baseline information. For instance, in disease surveillance, the purpose is to detect sub-regions in spatiotemporal space where the count of reported diseases (e.g., cancer) is higher than expected with respect to the population. The state-of-the-art method for this kind of problem is the Space-Time Scan Statistics (STScan), which exhaustively searches the whole space through a sliding window, looking for significant spatiotemporal clusters. STScan makes some restrictive assumptions about the distribution of the data, the shape of the hotspots, and the quality of the data, which can be unrealistic for some nontraditional data sources. A novel methodology called EigenSpot is proposed which, instead of performing an exhaustive search over the space, tracks the changes in a space-time correlation structure. Not only is the new approach much more computationally efficient, but it also makes no assumptions about the data distribution, hotspot shape, or data quality. The principal idea is that, with the joint combination of abnormal elements in the principal spatial and temporal singular vectors, the location of hotspots in the spatiotemporal space can be approximated. A comprehensive experimental evaluation on both simulated and real data sets reveals the effectiveness of the proposed method. Comment: To appear in Expert Systems Journal

arXiv.org e-Print Archive
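
The principal idea in the abstract above, combining abnormal elements of the principal spatial and temporal singular vectors, can be sketched roughly as follows. This is an illustrative approximation and not the published EigenSpot procedure: it assumes a hypothetical regions-by-time matrix of case counts and a matching baseline matrix, obtains the leading singular vectors by plain power iteration, and flags elements with a simple z-score rule whose threshold is an arbitrary choice.

```python
import math


def principal_singular_vectors(matrix, iterations=200):
    """Leading left/right singular vectors of a list-of-lists matrix.

    Uses power iteration; with non-negative counts and a positive start
    vector, both returned vectors stay non-negative, so signs are comparable.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    u = [0.0] * rows
    for _ in range(iterations):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / norm_u for x in u]
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_v = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / norm_v for x in v]
    return u, v


def abnormal_indices(case_vec, base_vec, z_threshold=1.5):
    """Indices where the case vector exceeds the baseline by an unusual margin."""
    diff = [c - b for c, b in zip(case_vec, base_vec)]
    mean = sum(diff) / len(diff)
    std = math.sqrt(sum((d - mean) ** 2 for d in diff) / len(diff)) or 1.0
    return [i for i, d in enumerate(diff) if (d - mean) / std > z_threshold]


def approximate_hotspot(cases, baseline, z_threshold=1.5):
    """Join abnormal spatial and temporal components into (region, time) cells."""
    case_u, case_v = principal_singular_vectors(cases)
    base_u, base_v = principal_singular_vectors(baseline)
    regions = abnormal_indices(case_u, base_u, z_threshold)
    times = abnormal_indices(case_v, base_v, z_threshold)
    return [(r, t) for r in regions for t in times]


# Hypothetical data: 6 regions x 6 time slots, one inflated cell at (2, 4).
baseline = [[10.0] * 6 for _ in range(6)]
cases = [row[:] for row in baseline]
cases[2][4] = 60.0
hotspot = approximate_hotspot(cases, baseline)
print(hotspot)
```

No window is slid over the space here: the work is two rank-one decompositions plus two one-dimensional scans, which is the kind of saving over an exhaustive search that the abstract points to.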

Inferring Neuronal Network Connectivity using Time-constrained Episodes

Author: Patnaik Debprakash
Sastry P. S.
Unnikrishnan K. P.
Publication date: 26/09/2007

Discovering frequent episodes in event sequences is an interesting data mining task. In this paper, we argue that this framework is very effective for analyzing multi-neuronal spike train data. Analyzing spike train data is an important problem in neuroscience, though there are no data mining approaches reported for this. Motivated by this application, we introduce different temporal constraints on the occurrences of episodes. We present algorithms for discovering frequent episodes under temporal constraints. Through simulations, we show that our method is very effective for analyzing spike train data for unearthing underlying connectivity patterns. Comment: 9 pages. See also http://neural-code.cs.vt.edu

arXiv.org e-Print Archive
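
To make the idea of episodes under temporal constraints in the abstract above concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a hypothetical spike train given as (time, neuron_label) pairs, treats an episode as an ordered list of labels, requires each consecutive delay to lie within [min_gap, max_gap], and counts non-overlapped occurrences greedily, which may undercount in some sequences.

```python
def find_occurrence(events, episode, start_idx, min_gap, max_gap):
    """Return event indices of one occurrence starting at or after start_idx.

    events must be sorted by time; returns None when no occurrence exists.
    """
    def extend(chain, pos):
        if pos == len(episode):
            return chain
        last_time = events[chain[-1]][0]
        for j in range(chain[-1] + 1, len(events)):
            t, label = events[j]
            gap = t - last_time
            if gap > max_gap:
                break  # later events are only further away in time
            if label == episode[pos] and gap >= min_gap:
                found = extend(chain + [j], pos + 1)
                if found:
                    return found
        return None

    for i in range(start_idx, len(events)):
        if events[i][1] == episode[0]:
            found = extend([i], 1)
            if found:
                return found
    return None


def count_non_overlapped(events, episode, min_gap, max_gap):
    """Greedily count occurrences that do not share any stretch of the sequence."""
    events = sorted(events)
    count, start = 0, 0
    while True:
        occurrence = find_occurrence(events, episode, start, min_gap, max_gap)
        if occurrence is None:
            return count
        count += 1
        start = occurrence[-1] + 1  # resume after the occurrence just counted


# Hypothetical spike train: neuron A fires, then B, then C, with varying delays.
spikes = [
    (0, "A"), (3, "B"), (5, "C"),
    (10, "A"), (11, "B"),
    (20, "A"), (24, "B"), (27, "C"),
]
frequency = count_non_overlapped(spikes, ["A", "B", "C"], min_gap=2, max_gap=5)
print(frequency)
```

An episode whose count exceeds a chosen frequency threshold would then be a candidate for an underlying A to B to C connection; the delay window is what lets implausibly fast or slow firing chains be excluded.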

A Data as a Service (DaaS) Model for GPU-based Data Analytics

Author: Abe John Olorunfemi
Ustundaug Burak Berk
Publication date: 05/02/2018

Cloud-based services with resources to be provisioned for consumers are increasingly the norm, especially with respect to big data, spatiotemporal data mining, and application services that enforce user-agreed Quality of Service (QoS) rules or Service Level Agreements (SLAs). Considering the pervasive nature of data centers and cloud systems, there is a need for real-time analytics of these systems that considers cost, utility, and energy. This work presents an overlay model of a GPU system for Data as a Service (DaaS) to provide real-time analysis of network data and of customers', investors', and users' data from data centers or cloud systems. Using a modeled layer to define a learning protocol and system, we give a custom, profitable system for DaaS on GPU. The GPU-enabled pre-processing and initial operations of the clustering model analysis are promising, as shown in the results. We examine the model on real-world data sets to model big data or spatiotemporal data mining services. We also report results of our model with clustering based on neural-network self-organizing feature maps (SOFM or SOM), producing a distribution of the clusters for the DaaS model. The experimental results thus far show a promising model that could enhance SLA- and/or QoS-based DaaS. Comment: Accepted, 23 December 2017, by the IEEE IFIP NTMS Workshop on Big Data and Emerging Trends WBD-ET 2018; it was later withdrawn because of funding issues. An extended/enhanced version will be published at a future date in a related journal

arXiv.org e-Print Archive

Data-driven root-cause analysis for distributed system anomalies

Author: Liu Chao
Lore Kin Gwn
Sarkar Soumik
Publication date: 30/05/2018

Modern distributed cyber-physical systems encounter a large variety of anomalies and in many cases, they are vulnerable to catastrophic fault propagation scenarios due to strong connectivity among the sub-systems. In this regard, root-cause analysis becomes highly intractable due to complex fault propagation mechanisms in combination with diverse operating modes. This paper presents a new data-driven framework for root-cause analysis for addressing such issues. The framework is based on a spatiotemporal feature extraction scheme for distributed cyber-physical systems built on the concept of symbolic dynamics for discovering and representing causal interactions among subsystems of a complex system. We present two approaches for root-cause analysis, namely the sequential state switching (S^3, based on free energy concept of a Restricted Boltzmann Machine, RBM) and artificial anomaly association (A^3, a multi-class classification framework using deep neural networks, DNN). Synthetic data from cases with failed pattern(s) and anomalous node are simulated to validate the proposed approaches, then compared with the performance of vector autoregressive (VAR) model-based root-cause analysis. Real dataset based on Tennessee Eastman process (TEP) is also used for validation. The results show that: (1) S^3 and A^3 approaches can obtain high accuracy in root-cause analysis and successfully handle multiple nominal operation modes, and (2) the proposed tool-chain is shown to be scalable while maintaining high accuracy. Comment: 6 pages, 3 figures

arXiv.org e-Print Archive