
    Functional Brain Imaging with Multi-Objective Multi-Modal Evolutionary Optimization

    Functional brain imaging is a source of spatio-temporal data mining problems. A new framework hybridizing multi-objective and multi-modal optimization is proposed to formalize these data mining problems, which are then addressed through Evolutionary Computation (EC). The merits of EC for spatio-temporal data mining are demonstrated as the approach facilitates the modelling of the experts' requirements, and flexibly accommodates their changing goals

    Challenging Issues of Spatio-Temporal Data Mining

    The spatio-temporal database (STDB) has received considerable attention during the past few years, due to the emergence of numerous applications (e.g., flight control systems, weather forecasting, mobile computing, etc.) that demand efficient management of moving objects. These applications record objects' geographical locations (sometimes also shapes) at various timestamps and support queries that explore their historical and future (predictive) behaviors. The STDB significantly extends the traditional spatial database, which deals with only stationary data and hence is inapplicable to moving objects, whose dynamic behavior requires re-investigation of numerous topics including data modeling, indexes, and the related query algorithms. In many application areas, huge amounts of data are generated, explicitly or implicitly containing spatial or spatiotemporal information. However, the ability to analyze these data remains inadequate, and the need for adapted data mining tools becomes a major challenge. In this paper, we present the challenging issues of spatio-temporal data mining. Keywords: database, data mining, spatial, temporal, spatio-temporal

    Privacy preserving distributed spatio-temporal data mining

    Time-stamped location information is regarded as spatio-temporal data due to its time and space dimensions and, by its nature, is highly vulnerable to misuse. Privacy issues related to the collection, use and distribution of individuals’ location information are the main obstacles impeding knowledge discovery in spatio-temporal data. Suppressing identifiers from the data does not suffice, since movement trajectories can easily be linked to individuals using publicly available information such as home or work addresses. Another solution could be to employ existing privacy preserving data mining techniques. However, these techniques are not suitable, since time-stamped location observations of an object are not plain, independent attributes of this object. Therefore, new privacy preserving data mining techniques are required to handle spatio-temporal data specifically. In this thesis, we propose a privacy preserving data mining technique and two preprocessing steps for data mining related to privacy preservation in spatio-temporal datasets: (1) distributed clustering, (2) centralized anonymization and (3) distributed anonymization. We also provide a security and efficiency analysis of our algorithms, which shows that, under reasonable conditions, achieving privacy preservation with minimal sensitive information leakage is possible for data mining purposes

    Spatio-Temporal Data Mining: From Big Data to Patterns

    Technological advances in data acquisition make it possible to better monitor dynamic phenomena in various domains, including the environment. The collected data are increasingly complex: spatial, temporal, heterogeneous and multi-scale. Exploiting these data requires new data analysis and knowledge discovery methods. In that context, approaches aimed at discovering spatio-temporal patterns are particularly relevant. This paper focuses on spatio-temporal data and associated data mining methods

    Ensuring location diversity in privacy preserving spatio-temporal data mining

    The rise of mobile technologies in the last decade has led to vast amounts of location information generated by individuals. From the knowledge discovery point of view, this data is quite valuable as it has commercial value, but the inherent personal information in the data raises privacy concerns. There exist many algorithms in the literature to satisfy the privacy requirements of individuals by generalizing, perturbing, and suppressing data. The algorithms that try to ensure a level of indistinguishability between trajectories in the dataset fail when there is not enough diversity among the sensitive locations visited by those users. We propose an approach that ensures location diversity, named (c,p)-confidentiality, which bounds the probability of visiting a sensitive location given the background knowledge of the adversary. Instead of grouping the trajectories, we anonymize the underlying map structure. We explain our algorithm and show the performance of our approach. We also compare the performance of our algorithm with an existing technique and show that location diversity can be satisfied efficiently

    Deep Learning-Based Spatio-Temporal Data Mining Using Multi-Source Geospatial Data

    With the rapid development of various geospatial technologies including remote sensing, mobile devices, and the Global Positioning System (GPS), spatio-temporal data are abundantly available nowadays. Extracting valuable knowledge from spatio-temporal data is of crucial importance for many real-world applications such as intelligent transportation, social services, and intelligent distribution. With the fast increase in the amount and resolution of spatio-temporal data, traditional data mining methods are becoming obsolete. In recent years, deep learning models such as the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) have made promising achievements in many fields owing to their strong ability for automated feature extraction, and have been broadly used in different spatio-temporal data mining tasks. Many methods have been developed and more diverse data have been collected in recent decades; however, the existing methods face challenges from multi-source geospatial data. This thesis investigates four efficient techniques in different scenarios for spatio-temporal data mining that take advantage of multi-source geospatial data to overcome the limitations of traditional data mining methods. The study approaches spatio-temporal data mining from four different perspectives. Firstly, a multi-elemental geolocation inference method is proposed to predict the location of tweets without geo-tags. Secondly, an optimization model is proposed to detect multiple Areas-of-Interest (AOIs) simultaneously and solve the multi-AOIs detection problem. Thirdly, a multi-task Res-U-Net model with an attention mechanism is developed for the extraction of the building roofs and the whole building shapes from remote sensing images; an offset vector method is then used to detect the footprints of high-rise buildings based on the boundaries of the corresponding building roofs and shapes. Lastly, a novel decoder fusion model is introduced to extract interior road networks from remote sensing images and GPS trajectory data. This method is effective for multi-source data mining. The four proposed methods use different techniques for spatio-temporal data mining to improve the detection performance. Numerous experiments show that the techniques developed in this thesis can detect ground features efficiently and effectively and overcome the limitations of conventional algorithms. The studies demonstrate that exploiting spatial information from multi-source geospatial data can improve the detection accuracy in comparison with single-source geospatial data

    A Framework for Spatio-Temporal Data Analysis and Hypothesis Exploration

    We present a general framework for pattern discovery and hypothesis exploration in spatio-temporal data sets that is based on delay-embedding. This is a remarkable method of nonlinear time-series analysis that allows the full phase-space behaviour of a system to be reconstructed from only a single observable (accessible variable). Recent extensions to the theory that focus on a probabilistic interpretation extend its scope and allow practical application to noisy, uncertain and high-dimensional systems. The framework uses these extensions to aid alignment of spatio-temporal sub-models (hypotheses) to empirical data - for example satellite images plus remote-sensing - and to explore modifications consistent with this alignment. The novel aspect of the work is a mechanism for linking global and local dynamics using a holistic spatio-temporal feedback loop. An example framework is devised for an urban based application, transit centric developments, and its utility is demonstrated with real data

    A Graph-structured Dataset for Wikipedia Research

    Wikipedia is a rich and invaluable source of information. Its central place on the Web makes it a particularly interesting object of study for scientists. Researchers from different domains have used various complex datasets related to Wikipedia to study language, social behavior, knowledge organization, and network theory. While being a scientific treasure, the large size of the dataset hinders pre-processing and may be a challenging obstacle for potential new studies. This issue is particularly acute in scientific domains where researchers may lack technical and data-processing expertise. On the one hand, Wikipedia dumps are large, which makes the parsing and extraction of relevant information cumbersome. On the other hand, the API is straightforward to use but restricted to a relatively small number of requests. The middle ground is the mesoscopic scale, where researchers need a subset of Wikipedia ranging from thousands to hundreds of thousands of pages, but no efficient solution exists at this scale. In this work, we propose an efficient data structure to make requests and access subnetworks of Wikipedia pages and categories. We provide convenient tools for accessing and filtering viewership statistics or "pagecounts" of Wikipedia web pages. The dataset organization leverages principles of graph databases, which allow rapid and intuitive access to subgraphs of Wikipedia articles and categories. The dataset and deployment guidelines are available on the LTS2 website https://lts2.epfl.ch/Datasets/Wikipedia/