15 research outputs found

    A Hadoop-Based Algorithm of Generating DEM Grid from Point Cloud Data


    Automatic Scaling Hadoop in the Cloud for Efficient Process of Big Geospatial Data

    Efficient processing of big geospatial data is crucial for tackling global and regional challenges such as climate change and natural disasters, but it is challenging not only because of the massive data volume but also because of the intrinsic complexity and high dimensionality of geospatial datasets. While traditional computing infrastructure does not scale well with the rapidly increasing data volume, Hadoop has attracted increasing attention in the geoscience community for handling big geospatial data. Recently, many studies have investigated adopting Hadoop for processing big geospatial data, but how to adjust computing resources to efficiently handle dynamic geoprocessing workloads has barely been explored. To bridge this gap, we propose a novel framework that automatically scales a Hadoop cluster in the cloud to allocate the right amount of computing resources for the current geoprocessing workload. The framework and auto-scaling algorithms are introduced, and a prototype system was developed to demonstrate the feasibility and efficiency of the proposed scaling mechanism, using Digital Elevation Model (DEM) interpolation as an example. Experimental results show that the auto-scaling framework can (1) significantly reduce computing resource utilization (by 80% in our example) while delivering performance similar to a full-powered cluster; and (2) effectively handle spikes in processing workload by automatically adding computing resources to ensure that processing finishes within an acceptable time. Such an auto-scaling approach provides a valuable reference for optimizing the performance of geospatial applications and addressing data- and computation-intensive challenges in GIScience in a more cost-efficient manner.
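
    To illustrate the kind of decision logic such a framework requires, the following is a minimal Python sketch of a threshold-based auto-scaling loop, assuming hypothetical metric callbacks (get_pending_tasks, get_avg_utilisation, scale_cluster, get_node_count); the thresholds and cooldown are illustrative values, not the authors' algorithm.

        # Illustrative threshold-based auto-scaler for a Hadoop cluster (assumed API).
        # Metric callbacks, thresholds, and the scale_cluster() hook are hypothetical.
        import time

        SCALE_UP_PENDING = 100     # pending map/reduce tasks that trigger scale-out
        SCALE_DOWN_UTIL  = 0.30    # average node utilisation that triggers scale-in
        MIN_NODES, MAX_NODES = 2, 32
        COOLDOWN_S = 300           # wait between scaling actions to avoid thrashing

        def autoscale(get_pending_tasks, get_avg_utilisation, scale_cluster, get_node_count):
            """Poll workload metrics and resize the cluster between MIN_NODES and MAX_NODES."""
            last_action = 0.0
            while True:
                pending = get_pending_tasks()      # e.g. from the resource manager's metrics
                util = get_avg_utilisation()       # e.g. mean container/CPU utilisation
                nodes = get_node_count()
                now = time.time()
                if now - last_action >= COOLDOWN_S:
                    if pending > SCALE_UP_PENDING and nodes < MAX_NODES:
                        scale_cluster(nodes + 1)   # request one more worker VM from the cloud
                        last_action = now
                    elif util < SCALE_DOWN_UTIL and nodes > MIN_NODES:
                        scale_cluster(nodes - 1)   # decommission an idle worker
                        last_action = now
                time.sleep(60)

    The cooldown guards against thrashing when the workload oscillates around a threshold.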

    Quality test of interpolation methods on steepness regions for the use in surface modelling

    Surface modelling is a widely used methodology in all kinds of earth-related interdisciplinary studies. Many interpolation methods are applied for model generation using measured points (samples) on the ground. The quality of the outcome of an interpolation method is highly related to the accuracy, quantity, and distribution of the selected samples reflecting the topography of the study area. This study examines the quality of four interpolation methods, namely Kriging, Modified Shepard’s, Inverse Distance Weighting, and Radial Basis Function, considering height differences between neighbouring stations. To check the quality of the height components derived with the different interpolation models, four artificial surfaces with sudden height changes were created within the study area. The standard deviations used to compare the quality of the interpolation models were computed from the differences between the height values at control points, taken as the true values, and the interpolated values at the same points.
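
    As a concrete illustration of the comparison described above, here is a minimal Python sketch of inverse distance weighting and of the residual spread at held-out control points; the toy surface, the power parameter, and the choice of control points are assumptions for illustration, not the study's configuration.

        # Minimal inverse-distance-weighting (IDW) interpolation and control-point check.
        # The sample surface, power parameter p, and control points are illustrative.
        import numpy as np

        def idw(xy_samples, z_samples, xy_targets, p=2.0, eps=1e-12):
            """Interpolate heights at xy_targets from scattered samples using IDW."""
            d = np.linalg.norm(xy_targets[:, None, :] - xy_samples[None, :, :], axis=2)
            w = 1.0 / (d ** p + eps)                 # closer samples get larger weights
            return (w @ z_samples) / w.sum(axis=1)

        # Toy data: a sloping surface sampled at random points.
        rng = np.random.default_rng(0)
        xy = rng.uniform(0, 100, size=(200, 2))
        z = 0.5 * xy[:, 0] + 0.2 * xy[:, 1]

        # Hold out control points, interpolate them, and report the residual spread.
        ctrl_xy, ctrl_z = xy[:20], z[:20]
        est = idw(xy[20:], z[20:], ctrl_xy)
        residuals = est - ctrl_z
        print("std. dev. of residuals at control points:", residuals.std(ddof=1))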

    Hadoop MapReduce tolerante a faltas bizantinas (Byzantine fault-tolerant Hadoop MapReduce)

    Master's thesis in Informatics, presented to the Universidade de Lisboa through the Faculdade de Ciências, 2011. MapReduce is often used to run critical jobs such as scientific data analysis. However, evidence in the literature shows that arbitrary faults do occur and can corrupt the results of MapReduce jobs. MapReduce runtimes such as Hadoop tolerate crash faults, but not arbitrary or Byzantine faults. This work presents a Byzantine fault-tolerant (BFT) MapReduce algorithm and a prototype built on Hadoop. An experimental evaluation shows that running a job with the implemented algorithm uses twice the resources of the original Hadoop, instead of the 3 or 4 times more that a direct application of common Byzantine fault-tolerance paradigms would require. This cost is believed to be acceptable for critical applications that require this level of fault tolerance.
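
    To make the cost argument concrete, here is a minimal Python sketch of the replicate-and-compare idea summarised in the abstract: each task runs on two replicas, output digests are compared, and extra replicas are launched only when the digests disagree. The run_task callback and the digest choice are hypothetical; this is not the thesis's actual algorithm.

        # Illustrative "replicate twice, vote, add replicas only on disagreement" scheme.
        # run_task() and the agreement threshold are hypothetical, not the thesis code.
        import hashlib
        from collections import Counter

        def digest(output: bytes) -> str:
            """Fingerprint a task's output so replicas can be compared cheaply."""
            return hashlib.sha256(output).hexdigest()

        def run_replicated(run_task, task_id, max_replicas=4):
            """Run a task on 2 replicas; add replicas only until 2 digests agree."""
            digests = []
            replica = 0
            while replica < max_replicas:
                digests.append(digest(run_task(task_id, replica)))
                replica += 1
                if replica >= 2:
                    winner, count = Counter(digests).most_common(1)[0]
                    if count >= 2:        # two matching outputs -> accept this result
                        return winner
            raise RuntimeError(f"no two replicas of task {task_id} agreed")

    In the common fault-free case only two replicas ever run, which matches the roughly twofold resource cost reported in the abstract.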

    Integración de Hadoop con planificadores batch (Integration of Hadoop with batch schedulers)

    A growing number of scientific applications, in domains such as bioinformatics and the geosciences, are written under the MapReduce model using open-source tools such as Apache Hadoop. This project arises from the need to integrate Hadoop into HPC environments so that applications developed under the MapReduce paradigm can be executed there. Two frameworks designed to ease this integration for developers are analysed: HoD and myHadoop. The project examines both the execution environments these frameworks can provide for MapReduce applications and the performance of Hadoop clusters created with HoD or myHadoop compared with a physical Hadoop cluster.
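
    To give a flavour of what such integration frameworks do conceptually, here is a minimal Python sketch that turns a batch scheduler's node allocation into a per-job Hadoop configuration (masters/workers files and an fs.defaultFS entry); the environment variable, file names, and port are simplified assumptions, not HoD's or myHadoop's actual implementation.

        # Illustrative per-job Hadoop configuration generated from a batch allocation.
        # The HADOOP_JOB_NODES variable, file names, and port are simplified assumptions.
        import os
        from pathlib import Path

        def write_hadoop_conf(conf_dir: str, nodes: list) -> None:
            """Write a minimal masters/workers layout for a transient Hadoop cluster."""
            conf = Path(conf_dir)
            conf.mkdir(parents=True, exist_ok=True)
            master, workers = nodes[0], nodes[1:] or nodes[:1]
            (conf / "masters").write_text(master + "\n")
            (conf / "workers").write_text("\n".join(workers) + "\n")
            (conf / "core-site.xml").write_text(
                "<configuration>\n"
                "  <property>\n"
                "    <name>fs.defaultFS</name>\n"
                f"    <value>hdfs://{master}:9000</value>\n"
                "  </property>\n"
                "</configuration>\n"
            )

        # Example: node list granted by the scheduler (normally parsed from the job environment).
        nodes = os.environ.get("HADOOP_JOB_NODES", "node01 node02 node03").split()
        write_hadoop_conf("./hadoop-conf-job", nodes)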

    A Novel Approach To Intelligent Navigation Of A Mobile Robot In A Dynamic And Cluttered Indoor Environment

    The need and rationale for improved solutions to indoor robot navigation are increasingly driven by the influx of domestic and industrial mobile robots into the market. This research has developed and implemented a novel navigation technique for a mobile robot operating in a cluttered and dynamic indoor environment. It divides the indoor navigation problem into three distinct but interrelated parts, namely localization, mapping and path planning. The localization part has been addressed using dead reckoning (odometry). A least squares numerical approach has been used to calibrate the odometer parameters to minimize the effect of systematic errors on performance, and an intermittent resetting technique, which employs RFID tags placed at known locations in the indoor environment in conjunction with door-markers, has been developed and implemented to mitigate the errors remaining after the calibration. A mapping technique that employs a laser measurement sensor as the main exteroceptive sensor has been developed and implemented for building a binary occupancy grid map of the environment. A-r-Star pathfinder, a new path planning algorithm that is capable of high performance in both cluttered and sparse environments, has been developed and implemented. Its properties, challenges, and solutions to those challenges have also been highlighted in this research. An incremental version of the A-r-Star has been developed to handle dynamic environments. Simulation experiments highlighting the properties and performance of the individual components have been developed and executed using MATLAB. A prototype world has been built using the Webots™ robotic prototyping and 3-D simulation software. An integrated version of the system comprising the localization, mapping and path planning techniques has been executed in this prototype workspace to produce validation results.
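
    As an illustration of the dead-reckoning component described above, here is a minimal Python sketch of a differential-drive odometry update from wheel encoder ticks; the wheel radius, track width, and encoder resolution are illustrative values, not the robot's calibrated parameters.

        # Differential-drive dead reckoning: integrate wheel encoder ticks into a pose.
        # Wheel radius, track width, and ticks-per-revolution are illustrative values.
        import math

        WHEEL_RADIUS = 0.05      # m
        TRACK_WIDTH  = 0.30      # m, distance between the wheels
        TICKS_PER_REV = 1000

        def update_pose(x, y, theta, dticks_left, dticks_right):
            """Advance the pose (x, y, theta) by one pair of encoder readings."""
            d_left  = 2 * math.pi * WHEEL_RADIUS * dticks_left  / TICKS_PER_REV
            d_right = 2 * math.pi * WHEEL_RADIUS * dticks_right / TICKS_PER_REV
            d_centre = (d_left + d_right) / 2.0          # distance travelled by the midpoint
            d_theta  = (d_right - d_left) / TRACK_WIDTH  # change in heading
            x += d_centre * math.cos(theta + d_theta / 2.0)
            y += d_centre * math.sin(theta + d_theta / 2.0)
            return x, y, (theta + d_theta) % (2 * math.pi)

        # Example: drive roughly straight for ten readings (slight rightward drift).
        pose = (0.0, 0.0, 0.0)
        for _ in range(10):
            pose = update_pose(*pose, 100, 102)
        print(pose)

    Systematic errors in these parameters accumulate over time, which is why the calibration and RFID-based resetting described above are needed.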

    Coastal management and adaptation: an integrated data-driven approach

    Coastal regions are some of the most exposed to environmental hazards, yet the coast is the preferred settlement site for a high percentage of the global population, and most major global cities are located on or near the coast. This research adopts a predominantly anthropocentric approach to the analysis of coastal risk and resilience, centred on the pervasive hazards of coastal flooding and erosion. Coastal management decision-making practices are shown to be reliant on access to current and accurate information. However, constraints have been imposed on information flows between scientists, policy makers and practitioners, due to a lack of awareness and utilisation of available data sources. This research seeks to tackle this issue by evaluating how innovations in the use of data and analytics can be applied to further the application of science within decision-making processes related to coastal risk adaptation. In achieving this aim, a range of research methodologies has been employed, and the progression of topics covered marks a shift from themes of risk to resilience. The work focuses on a case study region of East Anglia, UK, benefiting from the input of a partner organisation responsible for the region’s coasts: Coastal Partnership East. An initial review revealed how data can be utilised effectively within coastal decision-making practices, highlighting scope for the application of advanced Big Data techniques to the analysis of coastal datasets. The process of risk evaluation has been examined in detail, and the range of possibilities afforded by open source coastal datasets was revealed. Subsequently, open source coastal terrain and bathymetric point cloud datasets were identified for 14 sites within the case study area. These were then utilised within a practical application of a geomorphological change detection (GCD) method. This revealed how analysis of high spatial and temporal resolution point cloud data can accurately reveal and quantify physical coastal impacts. Additionally, the research reveals how data innovations can facilitate adaptation through insurance; more specifically, how the use of empirical evidence in the pricing of coastal flood insurance can result in both communication and distribution of risk. The various strands of knowledge generated throughout this study reveal how an extensive range of data types, sources, and advanced forms of analysis can together allow coastal resilience assessments to be founded on empirical evidence. This research serves to demonstrate how the application of advanced data-driven analytical processes can reduce the levels of uncertainty and subjectivity inherent in current coastal environmental management practices. Adoption of the methods presented within this research could further the possibilities for sustainable and resilient management of the incredibly valuable environmental resource which is the coast.
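
    As an illustration of the geomorphological change detection (GCD) step mentioned above, here is a minimal Python sketch of a DEM-of-difference calculation with a level-of-detection threshold; the toy grids, cell size, and threshold are assumptions for illustration, not the study's survey data.

        # DEM-of-difference change detection between two survey epochs (illustrative).
        # The grids, cell size, and level-of-detection threshold are assumed values.
        import numpy as np

        def dem_of_difference(dem_new, dem_old, cell_size, lod):
            """Return net volume change (m^3), counting only cells above the LoD."""
            dod = dem_new - dem_old                      # elevation change per cell
            significant = np.abs(dod) > lod              # suppress changes within survey error
            return float(dod[significant].sum() * cell_size**2)

        # Toy example: 1 m grid, 0.15 m level of detection, simulated erosion patch.
        rng = np.random.default_rng(1)
        old = rng.normal(10.0, 0.05, size=(100, 100))
        new = old.copy()
        new[40:60, 40:60] -= 0.5
        print("net volume change (m^3):", dem_of_difference(new, old, cell_size=1.0, lod=0.15))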

    GEOBIA 2016: Solutions and Synergies, 14-16 September 2016, University of Twente Faculty of Geo-Information and Earth Observation (ITC): open access e-book


    Geospatial Computing: Architectures and Algorithms for Mapping Applications

    Beginning with the MapTube website (1), which was launched in 2007 for crowd-sourcing maps, this project investigates approaches to exploratory Geographic Information Systems (GIS) using web-based mapping, or ‘web GIS’. Users can log in to upload their own maps and overlay different layers of GIS data sets. This work looks into the theory behind how web-based mapping systems function and whether their performance can be modelled and predicted. One of the important questions when dealing with different geospatial data sets is how they relate to one another. Internet data stores provide another source of information, which can be exploited if more generic geospatial data mining techniques are developed. The identification of similarities between thousands of maps is a GIS technique that can give structure to the overall fabric of the data, once the problems of scalability and comparison between different geographies are solved. Having run MapTube for nine years to crowd-source data, this marks a natural progression from visualisation of individual maps to wider questions about what additional knowledge can be discovered from the data collected. In the new ‘data science’ age, real-time data sets introduce a new challenge for web-based mapping applications. The mapping of real-time geospatial systems is technically challenging, but has the potential to show inter-dependencies as they emerge in the time series. Combined geospatial and temporal data mining of real-time sources can provide archives of transport and environmental data from which to accurately model the systems under investigation. By using techniques from machine learning, the models can be built directly from the real-time data stream. These models can then be used for analysis and experimentation, being derived directly from city data. This then leads to an analysis of the behaviours of the interacting systems. (1) The MapTube website: http://www.maptube.org
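
    As an illustration of the kind of map-to-map comparison the abstract alludes to, here is a minimal Python sketch that scores the similarity of two thematic maps after they have been resampled to a common grid; the prior alignment step and the choice of Pearson correlation are illustrative assumptions, not MapTube's method.

        # Similarity of two thematic maps on a common grid (illustrative, not MapTube's method).
        # Assumes both maps have already been resampled to the same raster shape.
        import numpy as np

        def map_similarity(grid_a, grid_b):
            """Pearson correlation of cell values, ignoring cells missing in either map."""
            valid = ~np.isnan(grid_a) & ~np.isnan(grid_b)
            a, b = grid_a[valid].ravel(), grid_b[valid].ravel()
            return float(np.corrcoef(a, b)[0, 1])

        # Toy example: a synthetic density surface compared with a noisy copy of itself.
        rng = np.random.default_rng(2)
        pop = rng.gamma(2.0, 50.0, size=(200, 200))
        noisy = pop + rng.normal(0, 10.0, size=pop.shape)
        print("similarity:", map_similarity(pop, noisy))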