10 research outputs found

    Towards intelligent geo-database support for earth system observation: Improving the preparation and analysis of big spatio-temporal raster data

    The European COPERNICUS program provides an unprecedented breakthrough in the broad use and application of satellite remote sensing data. Maintained on a sustainable basis, the COPERNICUS system is operated on a free-and-open data policy. Its guaranteed availability in the long term attracts a broader community to remote sensing applications. In general, the increasing amount of satellite remote sensing data opens the door to the diverse and advanced analysis of this data for earth system science. However, the preparation of the data for dedicated processing is still inefficient, as it requires time-consuming operator interaction based on advanced technical skills. Thus, the scientists involved have to spend significant parts of the available project budget on data preparation rather than on science. In addition, the analysis of the rich content of the remote sensing data requires new concepts for better extraction of promising structures and signals as an effective basis for further analysis. In this paper we propose approaches to improve the preparation of satellite remote sensing data by means of a geo-database, so that the time needed and the errors possibly introduced by human interaction are minimized. In addition, we recommend improving data quality and the analysis of the data by incorporating Artificial Intelligence methods. A use case for data preparation and analysis is presented for earth surface deformation analysis in the Upper Rhine Valley, Germany, based on Persistent Scatterer Interferometric Synthetic Aperture Radar data. Finally, we give an outlook on our future research.

    Sustainable system design for gridded, spatio-temporal, agroecosystem forecasting models

    Decision support systems able to capitalize on publicly available high resolution datasets have become increasingly valuable to agroecosystem, hydrologic and urban system stakeholders. In this paper we address the common agroecosystem modeling problem of weather-based risk forecasting. We compare storage system designs for an expandable crop disease forecasting system that relies on multiple gridded weather forecast inputs to artificial neural network disease risk models. A traditional relational database management system (PostgreSQL), a NoSQL database system (MongoDB) and a scientific file format version (netCDF) of a single crop disease risk modeling system in one region of the country, for potato late blight in the US Great Lakes region, were designed and compared for speed. To test expandability, another crop disease risk modeling system, for modeling risk of economically significant deoxynivalenol (eDON) accumulation due to Fusarium head blight of barley in the northern US Great Plains, was also created in the three formats. Speeds for the three types of systems were fairly similar. Expandability, which is becoming highly desirable in agroecosystem model design, differed based on designer priorities.

    TOWARDS AN INTELLIGENT PLATFORM FOR BIG 3D GEOSPATIAL DATA MANAGEMENT

    The use of intelligent technologies within 3D geospatial data analysis and management will decidedly open the door towards efficiency, cost transparency, and on-time schedules in planning processes. Furthermore, the mission of smart cities as a future option of urban development can lead to an environment that provides high-quality life along stable structures. However, neither geospatial information systems nor building information modelling systems seem to be well prepared for this new development. After a review of current approaches and a discussion of their limitations, we present our approach on the way to an intelligent platform for the management and analysis of big 3D geospatial data, focusing on infrastructure projects such as metro or railway track planning. Three challenges are presented, focusing on the management of big geospatial data with existing geo-database management systems, the integration of heterogeneous data, and the 3D visualization for database query formulation and query results. The approach for the development of a platform for big geospatial data analysis is discussed. Finally, we give an outlook on our future research supporting intelligent 3D city applications in the United Arab Emirates.

    IMPROVING DATA QUALITY AND MANAGEMENT FOR REMOTE SENSING ANALYSIS: USE-CASES AND EMERGING RESEARCH QUESTIONS

    During the last decades satellite remote sensing has become an emerging technology producing big data for various application fields every day. However, data quality checking as well as the long-term management of data and models are still issues to be improved. They are indispensable to guarantee smooth data integration and the reproducibility of data analysis such as that carried out by machine learning models. In this paper we clarify the emerging need to improve data quality and the management of data and models in a geospatial database management system before and during data analysis. In different use cases, various processes of data preparation and quality checking, integration of data across different scales and reference systems, efficient data and model management, and advanced data analysis are presented in detail. Motivated by these use cases, we then discuss emerging research questions concerning data preparation and data quality checking, data management, model management and data integration. Finally, conclusions drawn from the paper are presented and an outlook on future research work is given.

    Views in graph databases: a multifocus approach for heterogeneous data (Visões em bancos de dados de grafos : uma abordagem multifoco para dados heterogêneos)

    Advisor: Claudia Maria Bauzer Medeiros. Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação. Abstract: Scientific research has become data-intensive and data-dependent. This new research paradigm requires sophisticated computer science techniques and technologies to support the life cycle of scientific data and collaboration among scientists from distinct areas. A major requirement is that researchers working in data-intensive interdisciplinary teams demand construction of multiple perspectives of the world, built over the same datasets. Present solutions cover a wide range of aspects, from the design of interoperability standards to the use of non-relational database management systems. None of these efforts, however, adequately meets the need for multiple perspectives, which are called foci in the thesis.
    Basically, a focus is designed and built to cater to a research group (even within a single project) that needs to deal with a subset of data of interest, under multiple aggregation/generalization levels. The definition and creation of a focus are complex tasks that require mechanisms and engines to manipulate multiple representations of the same real world phenomenon. This PhD research aims to provide multiple foci over heterogeneous data. To meet this challenge, we deal with four research problems. The first two were (1) choosing an appropriate data management paradigm; and (2) eliciting multifocus requirements. Our work towards solving these problems led us to choose graph databases to answer (1) and the concept of views in relational databases for (2). However, there is no consensual data model for graph databases, and views are seldom discussed in this context. Thus, research problems (3) and (4) are: (3) specifying an adequate graph data model and (4) defining a framework to handle views on graph databases.
    Our research on these problems results in the main contributions of this thesis: (i) to present the case for the use of graph databases in multifocus research as a persistence layer - a schemaless and relationship-driven type of database that provides a full understanding of data connections; (ii) to define views for graph databases to support the need for multiple foci, considering graph data manipulation, graph algorithms and traversal tasks; (iii) to propose a property graph data model (PGDM) to fill the gap left by the absence of a full-fledged data model for graphs; (iv) to specify and implement a framework, named Graph-Kaleidoscope, that supports views over graph databases; and (v) to validate our framework for real world applications in two domains - biodiversity and environmental resources - typical examples of multidisciplinary research that involve the analysis of interactions of phenomena using heterogeneous data. Doctorate. Computer Science. Doctor of Computer Science.

    Geospatial Inference and Management of Utility Infrastructure Networks

    Ph.D. Thesis. Modern cities consist of spatially and temporally complex networks that connect urban infrastructure assets to the buildings they service. Critical infrastructure networks include transport, electricity, water supply, waste water and gas, all of which play a key role in the functioning of modern cities. Understanding network spatial connectivity, resource flow, dependencies and interdependencies is essential for infrastructure planning, management, and assessment of system robustness and resilience. However, there is a sparsity of fine spatial scale data from which such understanding can be derived or inferred. Often data is held within commercially sensitive organisations and may be incomplete topologically and/or spatially. Thus, there is an urgent need to develop new approaches to the integrated inference, management and analysis of complex utility infrastructure networks. Such approaches should allow utility network connectivity to be represented at high granularity in a spatially explicit manner, employing methods of data and information management to ensure they are scalable and generic. This thesis presents the development of such an approach, one that employs a geospatial ontology to formally define the key entities, attributes and relationships of fine spatial scale utility infrastructure networks. This ontology is used as the conceptual framework for the development of a suite of algorithms that allow the heuristic inference of the spatial layout of utility infrastructure networks for any urban conurbation within the UK. This is demonstrated via several case studies where the electricity feeder network between substations and buildings is generated for several different cities within the UK. Validation against the known network for the city of Newcastle upon Tyne indicates that the network can be inferred to high levels of accuracy (about 90%).
    Moreover, the algorithm is shown to be transferable to the inference and integration of other utility infrastructure networks (gas, water supply, waste water, and new road layouts). The representation, management and analysis of such spatially complex and large utility networks is, however, a major challenge. The efficient storage, management and analysis of such spatial networks is explored via a comparison of a traditional RDBMS approach (PgRouting within Postgres), a spatial database (PostGIS) and a NoSQL graph database (Neo4j), as well as a bespoke hybrid spatial-graph framework (a combination of PostGIS and Neo4j). A suite of comparison tests of data writing, data reading and complex network analysis demonstrated significant performance benefits from the NoSQL graph database approach for data reads (around 210% faster) and network analysis (between 420 and 1170% faster). However, this came at the expense of data writing, which was found to be between 135 and 150% slower. MISTRAL project, School of Engineering at Newcastle University.

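    Deep learning and big data in digital cartography. Creating artificial intelligences for processing orthophotographs and three-dimensional geographic information systems (Deep learning y big data en cartografía digital. Creación de inteligencias artificiales para el tratamiento de ortofotografías y sistemas de información geográfica tridimensionales)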

    Unpublished doctoral thesis read at the Universidad Autónoma de Madrid, Facultad de Filosofía y Letras, Departamento de Geografía. Date of defense: 16-07-202

    Evaluation of Data Management Systems for Geospatial Big Data

    Big Data encompasses the collection, management, processing and analysis of huge amounts of data that vary in type and change with high frequency. The data in Big Data often has a positional component as an important part of it, in forms such as postal address, Internet Protocol (IP) address and geographical location. If the positional components in Big Data are extensively used in storage, retrieval, analysis, processing, visualization and knowledge discovery (geospatial Big Data), the Big Data systems need certain types of techniques and algorithms for management, analytics and sharing. This paper describes the concept of geospatial Big Data management with a focus on using typical and modern database management systems. The typical and modern types of databases for the management of geospatial Big Data are then evaluated based on storage model, query languages, handling of connected data, distribution models and schema evolution. As the results of the evaluations and benchmarks in this paper illustrate, there is no single solution for efficient management of geospatial Big Data, and in order to utilize the unique characteristics of geospatial Big Data (such as topological, directional and distance relationships), a polyglot geospatial data persistence system is needed.