
    NOSQL design for analytical workloads: Variability matters

    Big Data has recently gained popularity and has strongly questioned relational databases as universal storage systems, especially in the presence of analytical workloads. As a result, co-relational alternatives, commonly known as NOSQL (Not Only SQL) databases, are extensively used for Big Data. As the primary focus of NOSQL is on performance, NOSQL databases are directly designed at the physical level, and consequently the resulting schema is tailored to the dataset and access patterns of the problem at hand. However, we believe that NOSQL design can also benefit from traditional design approaches. In this paper we present a method to design databases for analytical workloads. Starting from the conceptual model and adopting the classical 3-phase design used for relational databases, we propose a novel design method considering the new features brought by NOSQL and encompassing relational and co-relational design altogether. Peer Reviewed. Postprint (author's final draft).

    The potential of semantic paradigm in warehousing of big data

    Big data have analytical potential that was hard to realize with available technologies. After new storage paradigms intended for big data, such as NoSQL databases, emerged, traditional systems were pushed out of focus. Current research concentrates either on reconciling the two at different levels or on replacing one paradigm with the other. Similarly, the emergence of NoSQL databases has started to push traditional (relational) data warehouses out of the research and even practical focus. Data warehousing is known for its strict modelling process, which captures the essence of the business processes. For that reason, a mere integration to bridge the NoSQL gap is not enough. It is necessary to deal with this issue at a higher abstraction level during the modelling phase. NoSQL databases generally lack a clear, unambiguous schema, making their contents difficult to comprehend and their integration and analysis harder. This motivated the use of semantic web technologies to enrich NoSQL database contents with additional meaning and context. This paper reviews the application of semantics in data integration and data warehousing and analyses its potential for integrating NoSQL data and traditional data warehouses, with some focus on document stores. It also proposes future directions for the modelling phases of big data warehouses.

    Evaluating Riak Key Value Cluster for Big Data

    NoSQL databases have become an important alternative to traditional relational databases. These databases are designed for managing large, continuously and variably changing data sets. They are widely used in cloud databases and distributed systems. With NoSQL databases, static schemas and many other restrictions are avoided. In the era of big data, such databases provide scalable, high-availability solutions. Their key-value feature allows fast retrieval of data and the ability to store a lot of it. There are many kinds of NoSQL databases with varying performance. Therefore, comparing those different types of databases in terms of performance and verifying the relationship between performance and database type has become very important. In this paper, we test and evaluate the Riak key-value database for big data clusters using benchmark tools, where huge amounts of data are stored and retrieved in different sizes in a distributed database environment. Execution times of the NoSQL database over different types of workloads and different sizes of data are compared. The results show that the Riak key-value store is stable in execution time for both small and large amounts of data, and that throughput increases as the number of threads increases.

    Creating NoSQL Biological Databases with Ontologies for Query Relaxation

    The complexity of building biological databases is well known, and ontologies play an extremely important role in biological databases. However, much of the emphasis on the role of ontologies in biological databases has been on the construction of databases. In this paper, we explore a somewhat overlooked aspect of ontologies in biological databases, namely, how ontologies can be used to assist better database retrieval. In particular, we show how ontologies can be used to revise user-submitted queries for query relaxation. In addition, since our research is conducted in today's “big data” era, our investigation is centered on NoSQL databases, which serve as a kind of “representative” of big data. This paper contains two major parts: first we describe our methodology for building two NoSQL application databases (MongoDB and AllegroGraph) using the GO ontology, and then we discuss how to achieve query relaxation through the GO ontology. We report our experiments and show sample queries and results. Our research on query relaxation on NoSQL databases is complementary to existing work in big data and in biological databases and deserves further exploration.

    Benchmarking Scalability of NoSQL Databases for Geospatial Queries

    NoSQL databases provide an edge when it comes to dealing with big unstructured data. The flexibility, agility, and scalability offered by NoSQL databases become increasingly essential when dealing with geospatial data. The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of data that the data stores must manage. Such characteristics of big spatial data have surpassed the capability and anticipated use cases of relational databases. Because we can choose from an extensive collection of NoSQL databases these days, it becomes vital for organizations to make an informed decision. NoSQL database benchmarks provide system architects, who shoulder a considerable burden of selecting the right technology for their data stores, with a vital starting point and source of information. The major utility of these benchmarks is reproducing experiments on similar experimental data that can verify and optimize the process of selecting an optimum tool for data management needs in the early phases of development. The goal of this research is to develop a benchmark that can compare the performance of NoSQL databases for querying complex geospatial data. We have analyzed the throughput, latency, and runtime of MongoDB and Couchbase to identify the correct fit for our use case. In this way we have also demonstrated a systematic process that can be followed to make an optimum choice of datastore. This benchmark can be extended easily to any NoSQL database that supports geospatial querying.

    A Literature Review On NoSQL Database For Big Data Processing

    Objective: The aim of the present study was to review the literature on NoSQL databases for Big Data processing, including structural issues and real-time data mining techniques for extracting valuable information. Methods: We searched the Springer Link and IEEE Xplore online databases for articles published in English during the last seven years (between January 2011 and December 2017). We specifically searched for two keywords (“NoSQL” and “Big Data”) to find the articles. The inclusion criteria were articles on performance comparison of valuable-information processing in the field of Big Data through NoSQL databases. Results: Among the 18 selected articles, this review identified 8 articles that provided recommendations on NoSQL databases for specific areas, focusing on the value chain of Big Data; 5 articles that described performance comparisons of different NoSQL databases; 2 articles that presented the basic characteristics of NoSQL data models; 1 article that addressed storage with respect to cloud computing; and 2 articles that focused on NoSQL transactions. Conclusion: In this review, we presented NoSQL databases for Big Data processing, including their transactional and structural issues. Additionally, we highlight research directions and challenges in relation to Big Data processing. We therefore believe that the information contained in this review will greatly support and guide the progress of Big Data processing.

    Application of HADOOP to Store and Process Big Data Gathered from an Urban Water Distribution System

    Information technology has become an integral part of municipal water distribution systems (WDS). Various types of sensors, e.g., smart water meters, usually work in real-time mode, delivering a huge amount of data. Such big data must be stored in appropriate databases. Along with the development of data mining tools, the analysis of big data is very important for the management of WDS. The evaluation of NoSQL databases for water data is currently in its very early stages. In this paper, the Apache Hadoop platform is investigated with respect to a possible database solution based on NoSQL. We present comparative experiments evaluating the performance of the Hadoop and MySQL databases.

    Integrating NoSQL in the Classroom

    With the increasing popularity of big data, more and more organizations are turning to NoSQL databases as their preferred system for handling the unique demands of capturing and storing massive amounts of data. The likelihood that employees in organizations of all sizes will encounter NoSQL databases is growing every year. College students need to be exposed to this technology and begin to develop a functional understanding of how it works and how to use it. This paper offers a teaching case for college instructors to integrate NoSQL into their existing database courses.