130 research outputs found

    New Perspectives for NoSQL Database Design: A Systematic Review

    The use of NoSQL databases has increasingly become a trend in software development, mainly due to the expansion of Web 2.0 systems. However, despite the growing number of studies on the subject, there is still no standard for designing this type of database. This paper presents a systematic review looking for new trends in the strategies used in this context. The result of this process shows that there are still few methodologies for NoSQL database design, and that no existing design methodology is capable of working with polyglot persistence.

    Introducing polyglot-based data-flow awareness to time-series data stores

    The rising interest in extracting value from data has led to a broad proliferation of monitoring infrastructures, most notably composed of sensors, intended to collect this new oil. Gathering data has thus become fundamental for a great number of applications, such as predictive maintenance techniques or anomaly detection algorithms. However, before data can be refined into insights and knowledge, it has to be efficiently stored and prepared for later retrieval. As a consequence of this sensor and IoT boom, time-series databases (TSDBs), designed to manage sensor data, have become the fastest-growing database category since 2019. Here we propose a holistic approach intended to improve TSDB performance and efficiency. More precisely, we introduce and evaluate a novel polyglot-based approach aimed at tailoring the data store not only to time-series data, as is done conventionally, but also to the data flow itself, from ingestion to retrieval. In order to evaluate the approach, we materialize it in an alternative implementation of NagareDB, a resource-efficient time-series database based on MongoDB, in turn the most popular NoSQL storage solution. After implementing our approach into the database, we observe a global speed-up, solving queries up to 12 times faster than MongoDB's recently launched time-series capability, as well as generally outperforming InfluxDB, the most popular time-series database. Our polyglot-based, data-flow-aware solution can ingest data more than two times faster than MongoDB, InfluxDB, and NagareDB's original implementation, while using the same disk space as InfluxDB and half of that requested by MongoDB. This research was partly supported by the Spanish Ministry of Science and Innovation (contract PID2019-107255GB) and by the Generalitat de Catalunya (contract 2017-SGR-1414).
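
    The paper itself does not include code, but a minimal sketch of the ingestion-to-retrieval data flow it describes might look as follows, assuming MongoDB 5.0+ time-series collections accessed through pymongo; the database, collection, and field names are illustrative and not those of NagareDB.

```python
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["tsdb_demo"]  # illustrative database name

# Ingestion side: a write-optimized time-series collection, bucketed by
# MongoDB itself (available since MongoDB 5.0).
if "readings" not in db.list_collection_names():
    db.create_collection(
        "readings",
        timeseries={"timeField": "ts", "metaField": "sensor", "granularity": "seconds"},
    )

db.readings.insert_many([
    {"ts": datetime.now(timezone.utc), "sensor": {"id": "s1"}, "value": 21.7},
    {"ts": datetime.now(timezone.utc), "sensor": {"id": "s1"}, "value": 21.9},
])

# Retrieval side: a read-optimized, pre-aggregated collection that later
# queries can hit instead of the raw ingestion collection -- one possible
# "data-flow aware" step between ingestion and retrieval.
cursor = db.readings.aggregate([
    {"$group": {
        "_id": {"sensor": "$sensor.id",
                "hour": {"$dateTrunc": {"date": "$ts", "unit": "hour"}}},
        "avg_value": {"$avg": "$value"},
    }},
    {"$merge": {"into": "readings_hourly"}},
])
list(cursor)  # drain the cursor so $merge executes
```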

    NOSQL VS RDBMS - WHY THERE IS ROOM FOR BOTH

    The relational database, or RDBMS, has been the dominant model for database management since it was developed by Edgar Codd in 1970 (Shuxin and Indrakshi, 2005). However, a new database model called NoSQL is gaining significant attention in the enterprise. NoSQL databases are non-relational data stores that have been employed in massively scaled web site scenarios, where traditional relational database features matter less and the improved performance of retrieving relatively simple data sets matters most. The relational database model and the NoSQL database model are each good for specific applications. The problem the organization is trying to solve determines whether a NoSQL database model or a relational database model should be used. Some organizations may also choose to use a hybrid mix of NoSQL and relational databases.
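
    To make the "hybrid mix" concrete, here is a minimal, self-contained sketch in which transactional data lives in a relational store while a denormalized copy is kept for fast, simple reads; the in-process dict merely stands in for a real document or key-value store, and all names are illustrative.

```python
import json
import sqlite3

# Relational side: transactional order data where consistency matters.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
with conn:  # implicit transaction: commit on success, rollback on error
    conn.execute("INSERT INTO orders (customer, total) VALUES (?, ?)", ("alice", 42.0))

# Document side: a denormalized view optimized for simple, fast lookups.
# A real deployment would use a document or key-value store here.
document_store = {}
row = conn.execute(
    "SELECT id, customer, total FROM orders WHERE customer = ?", ("alice",)
).fetchone()
document_store[f"customer:{row[1]}"] = json.dumps(
    {"orders": [{"id": row[0], "total": row[2]}]}
)

print(document_store["customer:alice"])
```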

    Model-Driven Cloud Data Storage

    ISBN: 978-87-643-1014-6 - http://www2.imm.dtu.dk/conferences/ECMFA-2012/proceedings/
    The increasing adoption of the cloud computing paradigm has motivated a redefinition of traditional software development methods. In particular, data storage management has received a great deal of attention, due to a growing interest in the challenges and opportunities associated with the NoSQL movement. However, appropriate selection, administration and use of cloud storage implementations remain a highly technical endeavor, due to large differences in the way data is represented, stored and accessed by these systems. This position paper motivates the use of model-driven techniques to avoid dependencies between high-level data models and cloud storage implementations. In this way, developers depend only on high-level data models and rely on transformation procedures to deal with particular cloud storage details, such as different APIs and deployment providers, allowing them to target multiple cloud storage environments without modifying their core data models.
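
    The position paper argues for transforming a single high-level data model into multiple storage-specific targets. A toy sketch of that idea might look like the following; the model, the two target mappings, and all names are illustrative and not the paper's actual tooling.

```python
from dataclasses import dataclass, fields

# High-level, storage-agnostic data model (names are illustrative).
@dataclass
class SensorReading:
    sensor_id: str
    timestamp: float
    value: float

def to_document(model_cls) -> dict:
    """Transformation targeting a document store: one field per attribute."""
    return {f.name: getattr(f.type, "__name__", f.type) for f in fields(model_cls)}

def to_wide_column(model_cls, key_fields) -> dict:
    """Transformation targeting a wide-column store: a composite row key
    built from the key field names, plus the remaining attributes as columns."""
    cols = [f.name for f in fields(model_cls) if f.name not in key_fields]
    return {"row_key": "#".join(key_fields), "columns": cols}

print(to_document(SensorReading))
print(to_wide_column(SensorReading, ["sensor_id", "timestamp"]))
```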

    A holistic scalability strategy for time series databases following cascading polyglot persistence

    Time-series databases aim to handle large amounts of data in a fast way, both when introducing new data into the system and when retrieving it later on. However, depending on the scenario in which these databases participate, reducing the amount of requested resources becomes a further requirement. Following this goal, NagareDB and its Cascading Polyglot Persistence approach were born. They were not just intended to provide a fast time-series solution, but also to strike a good cost-efficiency balance. However, although they provided outstanding results, they lacked a natural way of scaling out in a cluster fashion. Consequently, monolithic deployments could extract the maximum value from the solution, but distributed ones had to rely on general scalability approaches. In this research, we propose a holistic approach specifically tailored to databases following Cascading Polyglot Persistence, in order to further maximize their inherent resource-saving goals. The proposed approach reduced the cluster size by 33% in a setup with just three ingestion nodes, and by up to 50% in a setup with 10 ingestion nodes. Moreover, the evaluation shows that our scaling method is able to provide efficient cluster growth, offering scalability speedups greater than 85% of a theoretically perfect scaling, while also ensuring data safety via data replication. This research was partly supported by Grant Agreement No. 857191, by the Spanish Ministry of Science and Innovation (contract PID2019-107255GB) and by the Generalitat de Catalunya (contract 2017-SGR-1414).
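
    As a small worked example of the scalability figure quoted above, scaling efficiency can be expressed as the measured speedup divided by the ideal linear speedup; the throughput numbers below are invented purely to illustrate the calculation.

```python
def scaling_efficiency(base_throughput: float, scaled_throughput: float,
                       base_nodes: int, scaled_nodes: int) -> float:
    """Measured speedup relative to ideal (linear) speedup."""
    measured_speedup = scaled_throughput / base_throughput
    ideal_speedup = scaled_nodes / base_nodes
    return measured_speedup / ideal_speedup

# E.g. tripling the ingestion nodes while throughput grows 2.6x gives ~87%
# efficiency, in the same range as the >85% reported in the abstract.
print(f"{scaling_efficiency(100_000, 260_000, 1, 3):.0%}")
```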

    Is Distributed Database Evaluation Cloud-Ready?

    The database landscape has significantly evolved over the last decade, as cloud computing makes it possible to run distributed databases on virtually unlimited cloud resources. Hence, the already non-trivial task of selecting and deploying a distributed database system becomes even more challenging. Database evaluation frameworks aim at easing this task by guiding the database selection and deployment decision. The evaluation of databases has evolved as well, moving the evaluation focus from performance to distribution aspects such as scalability and elasticity. This paper presents a cloud-centric analysis of distributed database evaluation frameworks based on evaluation tiers and framework requirements. It analyses eight well-adopted evaluation frameworks. The results point out that the evaluation tiers performance, scalability, elasticity and consistency are well supported, in contrast to resource selection and availability. Further, the analysed frameworks do not support cloud-centric requirements, but do support classic evaluation requirements.
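
    As an illustration of the kind of measurement such frameworks automate, here is a minimal sketch of the performance tier: run a fixed number of operations and report throughput and tail latency. The stub operation merely stands in for a real database request and is not taken from any of the analysed frameworks.

```python
import statistics
import time

def run_performance_tier(operation, requests: int = 1_000) -> dict:
    """Execute `operation` repeatedly and report throughput and p99 latency."""
    latencies = []
    start = time.perf_counter()
    for _ in range(requests):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "throughput_ops_s": requests / elapsed,
        "p99_latency_ms": 1000 * statistics.quantiles(latencies, n=100)[98],
    }

# Stub workload standing in for a database request.
print(run_performance_tier(lambda: sum(range(1_000))))
```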

    Icarus: Towards a Multistore Database System

    The last years have seen a vast diversification of the database market. In contrast to the "one-size-fits-all" paradigm according to which systems have been designed in the past, today's database management systems (DBMSs) are tuned for particular workloads. This has led to DBMSs optimized for high-performance, high-throughput read/write workloads in online transaction processing (OLTP), and to systems optimized for complex analytical queries (OLAP). However, this approach reaches a limit when systems have to deal with mixed workloads that are neither pure OLAP nor pure OLTP. In such cases, polystores are increasingly gaining popularity. Rather than supporting one single database paradigm and addressing one particular workload, polystores encompass several DBMSs that store data in different schemas and allow requests to be routed, at a per-query level, to the most appropriate system. In this paper, we introduce the polystore Icarus. In our evaluation, based on a workload that combines OLTP and OLAP elements, we show that Icarus is able to speed up queries by up to a factor of 3 by properly routing them to the best underlying DBMS.
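
    A per-query router of the kind the abstract describes could, in its simplest form, look like the following sketch; the keyword heuristic and backend stubs are illustrative and are not Icarus's actual routing logic.

```python
import re

# Crude heuristic: queries with aggregation or join keywords are treated as
# analytical (OLAP) and routed to a column store; everything else goes to the
# row store handling transactional (OLTP) requests.
OLAP_HINTS = re.compile(r"\b(GROUP BY|SUM|AVG|COUNT|JOIN)\b", re.IGNORECASE)

def route(query: str, backends: dict) -> str:
    """Dispatch the query to the most appropriate backend."""
    target = "olap" if OLAP_HINTS.search(query) else "oltp"
    return backends[target](query)

backends = {
    "oltp": lambda q: f"row store executes: {q}",
    "olap": lambda q: f"column store executes: {q}",
}

print(route("SELECT balance FROM accounts WHERE id = 42", backends))
print(route("SELECT region, SUM(total) FROM orders GROUP BY region", backends))
```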