364 research outputs found

    GeoTriples: a Tool for Publishing Geospatial Data as RDF Graphs Using R2RML Mappings

    In this paper we present GeoTriples, a tool that transforms Earth Observation data and geospatial data into RDF graphs by using and extending the R2RML mapping language to deal with the specificities of geospatial data. GeoTriples is a semi-automated tool that transforms geospatial information into RDF following state-of-the-art vocabularies like GeoSPARQL and stSPARQL, but at the same time it is not tightly coupled to a specific vocabulary

    Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)

    Database management systems (DBMSs) carefully optimize complex multi-join queries to avoid expensive disk I/O. As servers today feature tens or hundreds of gigabytes of RAM, a significant fraction of many analytic databases becomes memory-resident. Even after careful tuning for an in-memory environment, a linear disk I/O model such as the one implemented in PostgreSQL may make query response time predictions that are up to 2X slower than the optimal multi-join query plan over memory-resident data. This paper introduces a memory I/O cost model to identify good evaluation strategies for complex query plans with multiple hash-based equi-joins over memory-resident data. The proposed cost model is carefully validated for accuracy using three different systems, including an Amazon EC2 instance, to control for hardware-specific differences. Prior work in parallel query evaluation has advocated right-deep and bushy trees for multi-join queries due to their greater parallelization and pipelining potential. A surprising finding is that the conventional wisdom from shared-nothing disk-based systems does not directly apply to the modern shared-everything memory hierarchy. As corroborated by our model, the performance gap between the optimal left-deep and right-deep query plan can grow to about 10X as the number of joins in the query increases. Comment: 15 pages, 8 figures, extended version of the paper to appear in SoCC'1

    GeoTriples: Transforming geospatial data into RDF graphs using R2RML and RML mappings

    A lot of geospatial data has become available at no charge in many countries recently. Geospatial data that is currently made available by government agencies usually does not follow the linked data paradigm. In the few cases where government agencies do follow the linked data paradigm (e.g., Ordnance Survey in the United Kingdom), specialized scripts have been used for transforming geospatial data into RDF. In this paper we present the open source tool GeoTriples, which generates and processes extended R2RML and RML mappings that transform geospatial data from many input formats into RDF. GeoTriples allows the transformation of geospatial data stored in raw files (shapefiles, CSV, KML, XML, GML and GeoJSON) and spatially-enabled RDBMS (PostGIS and MonetDB) into RDF graphs using well-known vocabularies like GeoSPARQL and stSPARQL, but without being tightly coupled to a specific vocabulary. GeoTriples has been developed in the European projects LEO and Melodies and has been used to transform many geospatial data sources into linked data. We study the performance of GeoTriples experimentally using large publicly available geospatial datasets, and show that GeoTriples is very efficient and scalable, especially when its mapping processor is implemented using Apache Hadoop

    Data Vaults: a Database Welcome to Scientific File Repositories

    Efficient management and exploration of high-volume scientific file repositories have become pivotal for advancement in science. We propose to demonstrate the Data Vault, an extension of the database system architecture that transparently opens scientific file repositories for efficient in-database processing and exploration. The Data Vault facilitates science data analysis using high-level declarative languages, such as the traditional SQL and the novel array-oriented SciQL. Data of interest are loaded from the attached repository in a just-in-time manner without need for up-front data ingestion. The demo is built around concrete implementations of the Data Vault for two scientific use cases: seismic time series and Earth observation images. The seismic Data Vault uses the queries submitted by the audience to illustrate the internals of Data Vault functioning by revealing the mechanisms of dynamic query plan generation and on-demand external data ingestion. The image Data Vault shows an application view from the perspective of data mining researchers

    The Repeatability Experiment of SIGMOD 2008

    SIGMOD 2008 was the first database conference that offered to test submitters' programs against their data to verify the experiments published. This paper discusses the rationale for this effort, the community's reaction, our experiences, and advice for future similar efforts

    Thinking Big in a Small World - Efficient Query Execution on Small-Scale SMPs

    Many techniques developed for parallel database systems were focused on large-scale, often prototypical, hardware platforms. Therefore, most results cannot easily be transferred to widely available workstation clusters such as multiprocessor workstations. In this paper we address exploitation of pipelining parallelism in query processing on small multiprocessor environments. We present DTE/R, a strategy for executing pipelining segments of arbitrary length by replicating the segment's operator. Therefore, DTE/R avoids static processor-to-operator assignment of conventional processing techniques. Consequently, DTE/R achieves automatic load-balancing and skew-handling. Furthermore, DTE/R outperforms conventional pipelining execution techniques substantially

    Managing big, linked, and open earth-observation data: Using the TELEIOS/LEO software stack

    Big Earth-observation (EO) data that are made freely available by space agencies come from various archives. Therefore, users trying to develop an application need to search within these archives, discover the needed data, and integrate them into their application. In this article, we argue that if EO data are published using the linked data paradigm, then data discovery, data integration, and development of applications become easier. We present the life cycle of big, linked, and open EO data and show how to support its various stages using the software stack developed by the European Union (EU) research projects TELEIOS and the Linked Open EO Data for Precision Farming (LEO). We also show how this stack of tools can be used to implement an operational wildfire-monitoring service

    Wildfire monitoring via the integration of remote sensing with innovative information technologies

    In the Institute for Space Applications and Remote Sensing of the National Observatory of Athens (ISARS/NOA) volumes of Earth Observation images of different spectral and spatial resolutions are being processed on a systematic basis to derive thematic products that cover a wide spectrum of applications during and after wildfire crises, from fire detection and fire-front propagation monitoring, to damage assessment in the affected areas. The processed satellite imagery is combined with auxiliary geo-information layers, including land use/land cover, administrative boundaries, road and rail network, points of interest, and meteorological data to generate and validate added-value fire-related products. The service portfolio has become available to institutional End Users with a mandate to act on natural disasters and that have activated Emergency Support Services at a European level in the framework of the operational GMES projects SAFER and LinkER. Towards the goal of delivering integrated services for fire monitoring and management, ISARS/NOA employs observational capacities which include the operation of MSG/SEVIRI and NOAA/AVHRR receiving stations, NOA's in-situ monitoring networks for capturing meteorological parameters to generate weather forecasts, and datasets originating from the European Space Agency and third party satellite operators. The qualified operational activity of ISARS/NOA in the domain of wildfires management is highly enhanced by the integration of state-of-the-art Information Technologies that have become available in the framework of the TELEIOS (EC/ICT) project. TELEIOS aims at the development of fully automatic processing chains reliant on a) the effective storing and management of the large amount of EO and GIS data, b) the post-processing refinement of the fire products using semantics, and c) the creation of thematic maps and added-value services. The first objective is achieved with the use of advanced Array Database technologies, such as MonetDB, to enable efficiency in accessing large archives of image data and metadata in a fully transparent way, without worrying about their format, size, and location, as well as efficiency in processing such data using state-of-the-art implementations of image processing algorithms expressed in a high-level Scientific Query Language (SciQL). The product refinement is realized through the application of update operations that incorporate human evidence and human logic, with semantic content extracted from thematic information coming from auxiliary geo-information layers and sources, for considerably reducing the number of false alarms in fire detection, and improving the credibility of the burnt area assessment. The third objective is approached via the combination of the derived fire-products with Linked Geospatial Data, structured accordingly and freely available on the web, using Semantic Web technologies. These technologies are built on top of a robust and modular computational environment, to facilitate several wildfire applications to run efficiently, such as real-time fire detection, fire-front propagation monitoring, rapid burnt area mapping, post-crisis detailed burnt scar mapping, and time series analysis of burnt areas. The approach adopted allows ISARS/NOA to routinely serve requests from the end-user community, irrespective of the area of interest and its extent, the observation time period, or the data volume involved, granting the opportunity to combine innovative IT solutions with remote sensing techniques and

    Operational Wildfire Monitoring and Disaster Management Support Using State-of-the-art EO and Information Technologies

    Fires have been one of the main driving forces in the evolution of plants and ecosystems, determining the current structure and composition of landscapes. However, significant alterations in the fire regime have occurred in recent decades, primarily as a result of socioeconomic changes, dramatically increasing the catastrophic impacts of wildfires, as reflected in the increase during the 20th century of both the number of fires and the annual area burnt. Therefore, the establishment of a permanent robust fire monitoring system is of paramount importance to implement an effective environmental management policy. Such an integrated system has been developed in the Institute for Space Applications and Remote Sensing of the National Observatory of Athens (ISARS/NOA). Volumes of Earth Observation images of different spectral and spatial resolutions are being processed on a systematic basis to derive thematic products that cover a wide spectrum of applications during and after wildfire crises, from fire detection and fire-front propagation monitoring, to damage assessment in the affected areas. The processed satellite imagery is combined with auxiliary geo-information layers and meteorological data to generate and validate added-value fire-related products. The service portfolio has become available to institutional End Users with a mandate to act on natural disasters in the framework of the operational GMES projects SAFER and LinkER addressing fire emergency response and emergency support needs for the entire European Union. Towards the goal of delivering integrated services for fire monitoring and management, ISARS/NOA employs observational capacities which include the operation of MSG/SEVIRI and NOAA/AVHRR receiving stations, NOA's in-situ monitoring networks for capturing meteorological parameters to generate weather forecasts, and datasets originating from the European Space Agency and third party satellite operators. The qualified operational activity of ISARS/NOA in the domain of wildfires management is highly enhanced by the integra

    Real-Time Wildfire Monitoring Using Scientific Database and Linked Data Technologies

    We present a real-time wildfire monitoring service that exploits satellite images and linked geospatial data to detect hotspots and monitor the evolution of fire fronts. The service makes heavy use of scientific database technologies (array databases, SciQL, data vaults) and linked data technologies (ontologies, linked geospatial data, stSPARQL) and is implemented on top of MonetDB and Strabon. The service is now operational at the National Observatory of Athens and has been used during the previous summer by emergency managers monitoring wildfires in Greece