24 research outputs found

    Parallel wavelet transform for spatio-temporal outlier detection in large meteorological data

    Abstract. This paper describes a state-of-the-art parallel data mining solution that employs wavelet analysis for scalable outlier detection in large, complex spatio-temporal data. The algorithm has been implemented on a multiprocessor architecture and evaluated on real-world meteorological data. On a high-performance architecture, our solution can process massive and complex spatial data in reasonable time and yields improved prediction.
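
The abstract does not say which wavelet or threshold the authors use. As a rough, single-process illustration of wavelet-based outlier flagging, the sketch below assumes a one-level Haar transform and a median-absolute-deviation threshold; the function names, the factor `k` and the sample readings are all hypothetical.

```python
from statistics import median


def haar_detail(series):
    """One level of the Haar transform: scaled pairwise differences."""
    return [(series[i] - series[i + 1]) / 2 ** 0.5
            for i in range(0, len(series) - 1, 2)]


def flag_outlier_pairs(series, k=5.0):
    """Return start indices of sample pairs whose detail coefficient is extreme.

    The threshold k * MAD (median absolute deviation) is an illustrative
    choice, not one taken from the paper.
    """
    details = haar_detail(series)
    center = median(details)
    mad = median(abs(d - center) for d in details)
    if mad == 0:
        # Degenerate case: any coefficient that differs at all stands out.
        return [2 * i for i, d in enumerate(details) if d != center]
    return [2 * i for i, d in enumerate(details)
            if abs(d - center) > k * mad]


# Hypothetical hourly temperatures with one spike at index 7.
temps = [10.0, 10.2, 10.1, 10.4, 10.3, 10.5, 10.4, 25.0, 10.6, 10.7, 10.9, 10.8]
outlier_pairs = flag_outlier_pairs(temps)
```

Each detail coefficient depends on only two neighbouring samples, so the pairs could be split across processors independently; how the paper actually partitions the work is not stated in the abstract.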

    Towards a digital mine: a spatial database for accessing historical geospatial data on mining and related activities

    A Research Report submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in partial fulfilment of the requirements for the degree of Master of Science. Johannesburg, 2016. Countries around the world are recognising the importance of geospatial data in answering questions about spatially varying industries such as mining, both ongoing and discontinued. This is increasingly evident as countries such as Canada, Australia, and the United Kingdom work towards establishing Abandoned Mine Lands (AML) inventories. However, the growing need for data on mining activities has not been matched by an increase in the availability of such data. The aim of this research is therefore to design a database for accessing historical and current geospatial data that can support research, environmental management efforts, and decision making at all levels. A user needs survey was conducted using two sampling methods, convenience sampling and snowball sampling. Convenience sampling was used mostly with members of the Wits Digital Mine Project (WDMP) group, and snowball sampling with respondents from institutions and organisations outside the university. The data were then categorised to simplify analysis and to allow them to be evaluated on the same basis. An evaluation of the data collected showed that although the WDMP required different types of data (spatial and non-spatial), the data feed into each other, so a central repository in which to store them is important. Further investigation showed that there is a wealth of data on current mining activities but much less on historical mining activities. Although data on mining activities exist, access to them is hindered by factors such as copyright infringements, data costs, and discrepancies in the data request process.
The outcome of this research is a physical PostgreSQL database (PostGIS) and one mounted on an online platform (GeoServer). The databases can be viewed in PostgreSQL using SELECT statements, visualised through a connection with QGIS, or accessed on GeoServer. The database is expected to be of use to at least all members of the WDMP and the stakeholders involved in the project. It can be used for baseline studies and as a basis for the framework used to analyse, remedy, and predict future challenges in the mining industry. Moreover, the database can act as a central repository for all data produced by the WDMP.

    A query processing system for very large spatial databases using a new map algebra

    Abstract: In this thesis we introduce a query processing approach for spatial databases and explain the main concepts we defined and developed: a spatial algebra and a graph-based approach used in the optimizer.
The spatial algebra was defined to express queries and transformation rules during the different steps of query optimization. To cover a wide variety of potential applications, we tried to make the algebra as complete as possible. The algebra treats spatial data as maps of spatial objects. The algebraic operators act on maps and produce new maps. Aggregate functions act on maps and objects and produce objects or basic values (characters, numbers, etc.). The optimizer receives the query as an algebraic expression and produces one efficient QEP (Query Evaluation Plan) in two consecutive stages: QEG (Query Evaluation Graph) generation and QEP generation. In QEG generation we construct a graph equivalent to the algebraic expression and then apply graph transformation rules to produce one efficient QEG. In QEP generation we take the efficient QEG, perform predicate ordering and approximation, and then generate the efficient QEP. The QEP is a set of consecutive phases that must be executed in the specified order. Each phase consists of one or more primitive operations, and all primitive operations in the same phase can be executed in parallel. We implemented the optimizer, a random spatial query generator, and a simulated spatial database. The query generator produces random queries for testing the optimizer. The simulated spatial database is a set of functions that simulate primitive spatial operations and return the cost of the corresponding primitive operation according to the input parameters. We submitted randomly generated queries to the optimizer and passed the generated QEPs to the spatial database simulator. We used the experimental results to discuss the optimizer's characteristics and performance. The optimizer was designed for databases with a very large number of spatial objects; nevertheless, most of the concepts we used can be applied to all spatial information systems. (Abstract abridged by UMI.)
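
The phase structure of a QEP described above can be pictured with a small sketch. It assumes the plan is available as a dependency map between primitive operations and groups them level by level; this grouping rule, the function name `plan_phases` and the operation names are illustrative assumptions, not the thesis's actual QEP generator.

```python
def plan_phases(deps):
    """Group operations into ordered phases.

    deps maps each operation name to the set of operations whose output it
    needs. An operation is placed in the earliest phase that follows all of
    its inputs, so operations sharing a phase do not depend on each other.
    """
    remaining = {op: set(inputs) for op, inputs in deps.items()}
    phases = []
    while remaining:
        ready = sorted(op for op, inputs in remaining.items() if not inputs)
        if not ready:
            raise ValueError("cyclic or unresolved dependency between operations")
        phases.append(ready)
        for op in ready:
            del remaining[op]
        for inputs in remaining.values():
            inputs.difference_update(ready)
    return phases


# Hypothetical primitive operations for one query.
query_ops = {
    "scan_rivers": set(),
    "scan_cities": set(),
    "buffer_rivers": {"scan_rivers"},
    "intersect": {"buffer_rivers", "scan_cities"},
    "count": {"intersect"},
}
phases = plan_phases(query_ops)
```

Here the two scans land in the first phase and could run in parallel, while the later operations each wait for the phase before them.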

    Portable High-Performance Indexing for Vector Product Format Spatial Databases

    Geo-spatial databases have an overall performance problem because of their complexity and large size. For this reason, many researchers seek new ways to improve the overall performance of geo-spatial databases. Typically, these research efforts are focused on complex indexing structures and query processing methods to capture the relationships between the individual features of fully-functional geo-spatial databases. Visualization applications, such as combat simulators and mission planning tools, suffer from the general performance problems associated with geo-spatial databases. This research focuses on building a high-performance geo-spatial database for visualization applications. The main approach is to simplify the complex data model and to index it with high-performance indexing structures. Complex features are reduced to simple primitives, then indexed using a combination of a disk-based array and B+-trees. Test results show that there is a significant performance improvement gained by the new data model and indexing schema for low to medium zoom levels. For high zoom levels, there is a performance drop due to the indexing schema's overhead.
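
To make the "array plus ordered index" idea concrete, here is a minimal in-memory sketch. A Python list stands in for the disk-based array and a sorted key list searched with `bisect` stands in for the B+-tree; the `tile_id` key, the record layout and the sample values are assumptions made for illustration and are not taken from the thesis.

```python
from bisect import bisect_left, bisect_right

# Stand-in for the disk-based array: fixed-position primitive records.
# Each record is (tile_id, kind, coordinates); all values are made up.
primitives = [
    (12, "line", [(0, 0), (1, 1)]),
    (7, "point", [(3, 4)]),
    (12, "point", [(1, 2)]),
    (9, "line", [(5, 5), (6, 7)]),
]

# Stand-in for the B+-tree: (key, array position) pairs kept in key order,
# which is what the tree's linked leaf level provides.
index = sorted((rec[0], pos) for pos, rec in enumerate(primitives))
keys = [key for key, _ in index]


def range_lookup(low, high):
    """Return primitives whose tile_id lies in [low, high]."""
    start = bisect_left(keys, low)
    stop = bisect_right(keys, high)
    return [primitives[pos] for _, pos in index[start:stop]]


hits = range_lookup(9, 12)
```

A range query touches only the matching slice of the ordered index and then reads records by position, which is the access pattern the combination of array and tree is meant to support.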

    A Heterogeneous High Performance Computing Framework For Ill-Structured Spatial Join Processing

    Spatial join processing over two large layers of polygonal datasets, frequently used to detect cross-layer polygon pairs (CPPs) that satisfy a join predicate, faces challenges common to ill-structured sparse problems, namely identifying the few intersecting cross-layer edges out of the quadratic universe. The algorithm engineering challenge is compounded by the GPGPU SIMT architecture. Spatial join involves a lightweight filter phase, typically an overlap test over minimum bounding rectangles (MBRs) that discards the majority of CPPs, followed by a refinement phase that rigorously tests the join predicate over the edges of the surviving CPPs. In this dissertation, we develop new techniques (algorithms, data structures, I/O, load balancing, and system implementation) to accelerate two-phase spatial join processing. We present a new filtering technique, called Common MBR Filter (CMF), which changes the overall characteristics of spatial join algorithms so that the refinement phase is no longer the computational bottleneck. CMF is based on the insight that intersecting cross-layer edges must lie within the rectangular intersection of the MBRs of a CPP, its common MBR (CMBR). We also address a key limitation of CMF for the class of spatial datasets with either large or dense active CMBRs by means of an extended CMF, called CMF-grid, which combines CMBR and grid techniques by embedding a uniform grid over the CMBR of each CPP, with suitably engineered sizes for different CPPs. To show the efficiency of CMF-based filters, extensive mathematical and experimental analysis is provided. Then, two GPU-based spatial join systems are proposed, one for each CMF version, each with four components: 1) sort-based MBR filter, 2) CMF/CMF-grid, 3) point-in-polygon test, and 4) edge-intersection test. The systems show a speedup of two orders of magnitude over the optimized sequential GEOS C++ library.
Furthermore, we present a distributed system of heterogeneous compute nodes that exploits GPU-CPU computing to scale up the computation. A load balancing model based on Integer Linear Programming (ILP) is formulated for this system. We also provide three heuristic algorithms to approximate the ILP. Finally, we develop the MPI-cuda-GIS system based on this heterogeneous computing model by integrating our CUDA-based GPU system into a newly designed distributed framework based on the Message Passing Interface (MPI). Experimental results show good scalability and performance of the MPI-cuda-GIS system.
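
The filtering idea behind CMF can be sketched on the CPU in a few lines. The version below is one conservative reading of the abstract, not the dissertation's GPU implementation: it computes the common MBR of a polygon pair and keeps only the edges whose own bounding box overlaps it. All function names and the sample polygons are hypothetical.

```python
def mbr(polygon):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a vertex list."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def common_mbr(a, b):
    """Intersection of two MBRs, or None when they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin > xmax or ymin > ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def edges_touching(polygon, box):
    """Edges whose own bounding box overlaps the given rectangle."""
    n = len(polygon)
    edges = [(polygon[i], polygon[(i + 1) % n]) for i in range(n)]
    return [e for e in edges if common_mbr(mbr(e), box) is not None]


def candidate_edges(poly_a, poly_b):
    """Filter step: keep only edges that can take part in an intersection."""
    box = common_mbr(mbr(poly_a), mbr(poly_b))
    if box is None:
        return [], []  # the pair is discarded by the plain MBR test
    return edges_touching(poly_a, box), edges_touching(poly_b, box)


# Two hypothetical squares from different layers, overlapping in one corner.
square_a = [(0, 0), (4, 0), (4, 4), (0, 4)]
square_b = [(3, 3), (7, 3), (7, 7), (3, 7)]
edges_a, edges_b = candidate_edges(square_a, square_b)
```

For these squares, half of the edges on each side are dropped before any exact edge-intersection test would run, which is the kind of reduction the refinement phase benefits from.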

    Um sistema de informação espaço-temporal para objectos móveis

    Integrated master's dissertation in Communications Engineering. The rapid technological evolution of both Geographic Information Systems (GIS) and location technologies (GPS, Wi-Fi, RFID) has contributed to a large increase in spatial data collection. Because of this proliferation and the large amounts of data collected, databases and appropriate mechanisms for storing and analysing the data are necessary. In the past, the design of information systems for moving objects was often based on approaches (applications) restricted to specific location technologies, leading to a wide range of data models, database models, and functionalities. To overcome this proliferation of models, this project proposes a spatio-temporal information system that is independent of the application domain and that abstracts away the positioning technology used for data collection. The aim is to develop a system for storing, analysing, and visualising data on moving objects that can represent and store data with spatial characteristics and perform spatial analysis on them. The system is named STAR and incorporates three main components: spatio-temporal databases, GIS, and spatio-temporal data. The spatio-temporal database that was designed stores the spatio-temporal data as well as the geometry of the space in which the movement occurs. This geometry is represented by points, lines, and/or polygons. The GIS supports the analysis and visualisation of the data. In this project the spatio-temporal data are referred to as moving objects, whose movement can be analysed in STAR.
The results obtained so far are promising in demonstrating how the system is capable of storing positioning data in different formats, and in applying analyses, visualisations, and different representations to the analysed data.