GeoYCSB: A Benchmark Framework for the Performance and Scalability Evaluation of Geospatial NoSQL Databases

Author: Hoang Yvonne, Kanwar Yuvraj Singh, Kim Suneuy, Yu Tsz Ting
Publication venue: 'Elsevier BV'
Publication date: 28/02/2023

The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of spatial data that data stores have to manage. Traditional relational databases reveal limitations in handling such big geospatial data, mainly due to their rigid schema requirements and limited scalability. Numerous NoSQL databases have emerged and actively serve as alternative data stores for big spatial data. This study presents a framework, called GeoYCSB, developed for benchmarking NoSQL databases with geospatial workloads. To develop GeoYCSB, we extend YCSB, a de facto benchmark framework for NoSQL systems, by integrating into its design architecture the new components necessary to support geospatial workloads. GeoYCSB supports both microbenchmarks and macrobenchmarks and facilitates the use of real datasets in both. It is extensible to evaluate any NoSQL database, provided it supports spatial queries, using geospatial workloads performed on datasets of any geometric complexity. We use GeoYCSB to benchmark two leading document stores, MongoDB and Couchbase, and present the experimental results and analysis. Finally, we demonstrate the extensibility of GeoYCSB by including a new dataset consisting of complex geometries and using it to benchmark a system with a wide variety of geospatial queries: Apache Accumulo, a wide-column store, with the GeoMesa framework applied on top.

SJSU ScholarWorks

A Review of Elastic Search: Performance Metrics and challenges

Author: Subhani Shaik, Nallamothu Naga Malleswara Rao
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/11/2017

The most important aspect of a search engine is the search. Elastic search is a highly scalable search engine that stores data in a structure optimized for language-based searches. Using Elastic search generates a large number of metrics. By using Elastic search to index millions of code repositories as well as critical event data, you can satisfy the search needs of millions of users while instantaneously providing strategic operational visions that help you iteratively improve customer service. In this paper we study the Elastic search performance metrics to watch, important Elastic search challenges, and how to deal with them. This should be helpful to anyone new to Elastic search, and also to experienced users who want a quick start in performance monitoring of Elastic search.

International Journal on Recent and Innovation Trends in Computing and Communication

A Framework for resource efficient profiling of spatial model performance

Author: Carlson Caleb
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2022

2022 Summer. Includes bibliographical references. We design models to understand phenomena, make predictions, and/or inform decision-making. This study targets models that encapsulate spatially evolving phenomena. Given a model M, our objective is to identify how well the model predicts across all geospatial extents. A modeler may expect these validations to occur at varying spatial resolutions (e.g., states, counties, towns, census tracts). Assessing a model with all available ground-truth data is infeasible due to the data volumes involved. We propose a framework to assess the performance of models at scale over diverse spatial data collections. Our methodology ensures orchestration of validation workloads while reducing memory strain, alleviating contention, enabling concurrency, and ensuring high throughput. We introduce the notion of a validation budget that represents an upper bound on the total number of observations that are used to assess the performance of models across spatial extents. The validation budget attempts to capture the distribution characteristics of observations and is informed by multiple sampling strategies. Our design allows us to decouple the validation from the underlying model-fitting libraries to interoperate with models designed using different libraries and analytical engines; our advanced research prototype currently supports Scikit-learn, PyTorch, and TensorFlow. We have conducted extensive benchmarks that demonstrate the suitability of our methodology.

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Scalable Persistent Storage for Erlang

Author: Chechina Natalia, Ghaffari Amir, Meredith Jon, Trinder Phil
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013

The many-core revolution makes scalability a key property. The RELEASE project aims to improve the scalability of Erlang on emergent commodity architectures with 100,000 cores. Such architectures require scalable and available persistent storage on up to 100 hosts. We enumerate the requirements for scalable and available persistent storage, and evaluate four popular Erlang DBMSs against these requirements. This analysis shows that Mnesia and CouchDB are not suitable persistent storage at our target scale, but Dynamo-like NoSQL DataBase Management Systems (DBMSs) such as Cassandra and Riak potentially are. We investigate the current scalability limits of the Riak 1.1.1 NoSQL DBMS in practice on a 100-node cluster. We establish scientifically, for the first time, the scalability limit of Riak as 60 nodes on the Kalkyl cluster, thereby confirming developer folklore. We show that resources like memory, disk, and network do not limit the scalability of Riak. By instrumenting Erlang/OTP and Riak libraries we identify a specific Riak functionality that limits scalability. We outline how later releases of Riak are refactored to eliminate the scalability bottlenecks. We conclude that Dynamo-style NoSQL DBMSs provide scalable and available persistent storage for Erlang in general, and for our RELEASE target architecture in particular.

Crossref

Enlighten