383 research outputs found

    SQL versus NoSQL Databases for Geospatial Applications

    In recent years, we have witnessed an increasing availability of geolocated data, ranging from satellite images to user-generated content (e.g., tweets). This large amount of data is exploited by several cloud-based applications to deliver effective and customized services to end users. To provide a good user experience, low-latency responses are needed, both when data are retrieved and when they are provided. To achieve this goal, current geospatial applications need efficient and scalable geospatial databases, the choice of which has a high impact on the overall performance of the deployed applications. In this paper, we qualitatively compare four state-of-the-art SQL and NoSQL databases with geospatial features, and then analyze the performance of two of them, selecting those based on the Database-as-a-Service (DBaaS) model: Azure SQL Database and Azure DocumentDB (i.e., an SQL database versus a NoSQL one). The empirical evaluation, performed on a real use case related to an emergency management application, shows the pros and cons of both solutions.

    Evaluation of the Response Time of a Geoservice Using a Hybrid and Distributed Database

    Web mapping services provide information directly, not only to users but also to other software programs that can consume and produce information. One of the main challenges this type of service presents is improving its performance. Therefore, in this research, a new geoservice integrated into GeoServer was developed, called GeoToroTur, with an OWS implementation of vector layers that consumes information from a hybrid and distributed database implemented with PostgreSQL and MongoDB, using ToroDB for document replication. This geoservice was evaluated by executing geographic queries and descriptive attribute filter queries. Based on the results, we conclude that the response time of GeoToroTur is shorter than that of GeoServer.
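
    The two query types named in this abstract, geographic filters and descriptive attribute filters, can be illustrated with a minimal in-memory sketch. It is not the GeoToroTur implementation: the `Feature` class, the `filter_features` helper, and the bounding-box convention are all hypothetical names chosen for illustration, and a real geoservice would push these filters down to the database.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical convention: (min_lon, min_lat, max_lon, max_lat) in degrees.
BBox = Tuple[float, float, float, float]


@dataclass
class Feature:
    """A point feature with free-form descriptive attributes (illustrative only)."""
    lon: float
    lat: float
    attributes: Dict[str, str]


def in_bbox(feature: Feature, bbox: BBox) -> bool:
    """Geographic filter: True if the point lies inside the bounding box (inclusive)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= feature.lon <= max_lon and min_lat <= feature.lat <= max_lat


def filter_features(features: List[Feature], bbox: BBox, **attrs: str) -> List[Feature]:
    """Combine the geographic filter with exact-match attribute filters."""
    return [
        f for f in features
        if in_bbox(f, bbox)
        and all(f.attributes.get(key) == value for key, value in attrs.items())
    ]
```

    Calling `filter_features(features, bbox, type="hotel")` would keep only the features inside `bbox` whose `type` attribute equals `"hotel"`.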

    Benchmarking Scalability of NoSQL Databases for Geospatial Queries

    NoSQL databases provide an edge when it comes to dealing with big unstructured data. The flexibility, agility, and scalability offered by NoSQL databases become increasingly essential when dealing with geospatial data. The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of data that data stores must manage. Such characteristics of big spatial data have surpassed the capabilities and anticipated use cases of relational databases. Because an extensive collection of NoSQL databases is now available, it is vital for organizations to make an informed decision. NoSQL database benchmarks provide system architects, who shoulder a considerable burden in selecting the right technology for their data stores, with a vital starting point and source of information. The major utility of these benchmarks is reproducing experiments on similar experimental data, which can verify and optimize the process of selecting an optimal tool for data management needs in the early phases of development. The goal of this research is to develop a benchmark that can compare the performance of NoSQL databases for querying complex geospatial data. We have analyzed the throughput, latency, and runtime of MongoDB and Couchbase to identify the correct fit for our use case. In doing so, we have also demonstrated a systematic process that can be followed to make an optimal choice of data store. This benchmark can be extended easily to any NoSQL database that supports geospatial querying.
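
    The three metrics named in this abstract (throughput, latency, and runtime) can be derived from per-operation timings. The following is a minimal sketch under stated assumptions, not the benchmark described in the paper: `run_benchmark` and `summarize` are hypothetical helpers, the operation under test is any zero-argument callable, and the 95th percentile uses the nearest-rank method.

```python
import math
import time
from typing import Callable, Dict, List


def summarize(latencies_s: List[float], runtime_s: float) -> Dict[str, float]:
    """Derive throughput and latency statistics from raw timings (in seconds)."""
    ordered = sorted(latencies_s)
    # Nearest-rank 95th percentile: smallest value with at least 95% of samples at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "runtime_s": runtime_s,
        "throughput_ops_per_s": len(ordered) / runtime_s,
        "mean_latency_ms": 1000.0 * sum(ordered) / len(ordered),
        "p95_latency_ms": 1000.0 * ordered[rank - 1],
    }


def run_benchmark(operation: Callable[[], object], n_ops: int) -> Dict[str, float]:
    """Time n_ops sequential calls of `operation` and summarize the results."""
    latencies: List[float] = []
    start = time.perf_counter()
    for _ in range(n_ops):
        t0 = time.perf_counter()
        operation()  # e.g. one geospatial query against the data store under test
        latencies.append(time.perf_counter() - t0)
    runtime = time.perf_counter() - start
    return summarize(latencies, runtime)
```

    A single-threaded loop like this measures only sequential latency; frameworks such as YCSB additionally drive the store from many client threads to study scalability.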

    GeoYCSB: A Benchmark Framework for the Performance and Scalability Evaluation of Geospatial NoSQL Databases

    The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of spatial data that data stores have to manage. Traditional relational databases reveal limitations in handling such big geospatial data, mainly due to their rigid schema requirements and limited scalability. Numerous NoSQL databases have emerged and actively serve as alternative data stores for big spatial data. This study presents a framework, called GeoYCSB, developed for benchmarking NoSQL databases with geospatial workloads. To develop GeoYCSB, we extend YCSB, a de facto benchmark framework for NoSQL systems, by integrating into its architecture the new components necessary to support geospatial workloads. GeoYCSB supports both microbenchmarks and macrobenchmarks and facilitates the use of real datasets in both. It can be extended to evaluate any NoSQL database that supports spatial queries, using geospatial workloads performed on datasets of any geometric complexity. We use GeoYCSB to benchmark two leading document stores, MongoDB and Couchbase, and present the experimental results and analysis. Finally, we demonstrate the extensibility of GeoYCSB by including a new dataset consisting of complex geometries and using it to benchmark a system with a wide variety of geospatial queries: Apache Accumulo, a wide-column store, with the GeoMesa framework applied on top.

    Incorporating Census Data into a Geospatial Student Database

    The University of New Mexico (UNM) stores data on its students, faculty, and staff. The data is used to generate reports and fill in surveys for several local, statewide, and nationwide reporting entities. The reports convey statistical and analytical information such as the graduation rates, retention, performance, ethnicity, age, and gender of students. Furthermore, the Institute of Design and Innovation (IDI) and the Office of Institutional Analytics (OIA) at UNM use the data for various predictive studies aimed at improving student outcomes. This thesis proposes geospatial data as an additional layer of information for the data repository. It walks through the general steps involved in setting up a geospatial database using PostgreSQL and geospatial extensions including PostGIS, Tiger Geocoder, and Address Standardizer. With geospatial functionality incorporated into the data repository, the university can determine how far away students live, which amenities are close to students, and other geospatial features that describe students' journeys through college. To demonstrate how the university could exploit geospatial functionality, a dataset of UNM students is spatially joined to socioeconomic data from the United States Census Bureau. Various student-related geospatial queries are shown, along with how to set up a geospatial database.
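
    The "how far students live" question reduces to a distance computation between two geographic points. The thesis performs such queries in PostGIS; the sketch below swaps in a plain-Python haversine (great-circle) formula purely for illustration. The function names and the mean Earth radius constant are assumptions of this sketch, not taken from the thesis, and any coordinates passed in are hypothetical.

```python
import math

# Assumed mean Earth radius in kilometres (spherical approximation).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(home: tuple, campus: tuple, radius_km: float) -> bool:
    """True if a (lat, lon) home address lies within radius_km of the (lat, lon) campus."""
    return haversine_km(home[0], home[1], campus[0], campus[1]) <= radius_km
```

    A spatial database would typically answer the same question with an indexed distance predicate rather than a per-row Python call; the sketch only shows the underlying arithmetic.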

    NoSQL Databases in Kubernetes

    With the increasing popularity of deploying applications in containers, Kubernetes (K8s) has become one of the most widely accepted container orchestration systems. Kubernetes helps maintain containers smoothly and simplifies DevOps with powerful automation. It was originally developed as a tool to manage stateless microservices that run seamlessly in containers. The ephemeral nature of pods, the smallest deployable unit in Kubernetes, was well aligned with stateless applications, since destroying and recreating pods did not impact applications. Solutions for stateful workloads such as databases were needed in order to take advantage of K8s. This project explores this need, the associated challenges, and the available solutions for running databases in Kubernetes. Most current research focuses on SQL-like databases in K8s, even though the DNA of distributed NoSQL databases is more aligned with K8s. With no research being done on NoSQL databases, this project outlines the process of setting up two well-known NoSQL databases in K8s: MongoDB and Cassandra. The project also gives a representative view of the performance comparison between them using the YCSB benchmark. The project lays a foundation for the setup of these databases using K8s Operators and for their benchmarking. The goal of the project is to describe the advantages of running databases in K8s, provide developers with a clear path for setup, and provide insights into basic benchmark performance.