10 research outputs found

    Which NoSQL Database? A Performance Overview

    NoSQL data stores are widely used to store and retrieve possibly large amounts of data, typically in a key-value format. There are many NoSQL types with different performance characteristics, so it is important to compare them and verify how performance relates to the database type. In this paper, we evaluate five of the most popular NoSQL databases: Cassandra, HBase, MongoDB, OrientDB and Redis. We compare those databases in terms of query performance, based on reads and updates, taking into consideration the typical workloads, as represented by the Yahoo! Cloud Serving Benchmark. This comparison allows users to choose the most appropriate database according to the specific mechanisms and application needs.

    LDBC Graphalytics: A Benchmark for Large-Scale Graph Analysis on Parallel and Distributed Platforms

    In this paper we introduce LDBC Graphalytics, a new industrial-grade benchmark for graph analysis platforms. It consists of six deterministic algorithms, standard datasets, synthetic dataset generators, and reference output, which together enable the objective comparison of graph analysis platforms. Its test harness produces deep metrics that quantify multiple kinds of system scalability, such as horizontal/vertical and weak/strong, and of robustness, such as failures and performance variability. The benchmark comes with open-source software for generating data and monitoring performance. We describe and analyze six implementations of the benchmark (three from the community, three from the industry), providing insights into the strengths and weaknesses of the platforms. Key to our contribution, vendors perform the tuning and benchmarking of their platforms.

    The LDBC Graphalytics Benchmark

    In this document, we describe LDBC Graphalytics, an industrial-grade benchmark for graph analysis platforms. The main goal of Graphalytics is to enable the fair and objective comparison of graph analysis platforms. Due to the diversity of bottlenecks and performance issues such platforms need to address, Graphalytics consists of a set of selected deterministic algorithms for full-graph analysis, standard graph datasets, synthetic dataset generators, and reference output for validation purposes. Its test harness produces deep metrics that quantify multiple kinds of system scalability, such as weak and strong, and of robustness, such as failures and performance variability. The benchmark also balances comprehensiveness with the runtime necessary to obtain the deep metrics. The benchmark comes with open-source software for generating performance data, for validating algorithm results, for monitoring and sharing performance data, and for obtaining the final benchmark result as a standard performance report.

    Katsaus NoSQL-tietokantojen suorituskykyyn (An Overview of the Performance of NoSQL Databases)

    This thesis examines the performance of NoSQL databases. It presents the key characteristics of NoSQL data models and compares the differences between NoSQL and relational data models. It also covers database distribution mechanisms and the CAP theorem, which defines the properties of distributed databases. The thesis looks at performance measurement and its metrics. Among measurement tools, the Yahoo! Cloud Serving Benchmark (YCSB), developed for comparing key-value databases, is presented in more detail. The aim of the work is to compare the performance of NoSQL data models and the databases based on them, drawing on studies, publications and articles on the subject. The study yielded information on the performance of NoSQL data models and databases and on the factors that affect performance. The results can be used to help select the best-performing database for a given use case.

    Workload mix definition for benchmarking BPMN 2.0 Workflow Management Systems

    Nowadays, enterprises broadly use Workflow Management Systems (WfMSs) to design, deploy, execute, monitor and analyse their automated business processes. Through the years, WfMSs evolved into platforms that deliver complex service oriented applications. In this regard, they need to satisfy enterprise-grade performance requirements, such as dependability and scalability. With the ever-growing number of WfMSs that are currently available in the market, companies are called to choose which product is optimal for their requirements and business models. Benchmarking is an established practice used to compare alternative products and leverages the continuous improvement of technology by setting a clear target in measuring and assessing performance. In particular, for service oriented WfMSs there is not yet a widely accepted standard benchmark available, even if workflow modelling languages such as Web Services Business Process Execution Language (WS-BPEL) and Business Process Model and Notation 2.0 (BPMN 2.0) have been adopted as the de-facto standards. A possible explanation for this deficiency can be given by the inherent architectural complexity of WfMSs and the very large number of parameters affecting their performance. However, the need for a standard benchmark for WfMSs is frequently affirmed by the literature. The goal of the BenchFlow approach is to propose a framework towards the first standard benchmark for assessing and comparing the performance of BPMN 2.0 WfMSs. To this end, the approach addresses a set of challenges spanning from logistic challenges, which are related to the collection of a representative set of usage scenarios, to technical challenges, which concern the specific characteristics of a WfMS. This work focuses on a subset of these challenges dealing with the definition of a representative set of process models and corresponding data that will be given as an input to the benchmark.
This set of representative process models and corresponding data is referred to as the workload mix of the benchmark. More particularly, we first prepare the theoretical background for defining a representative workload mix. This is accomplished through identification of the basic components of a workload model for WfMS benchmarks, as well as the investigation of the impact of the BPMN 2.0 language constructs on the WfMS’s performance, by means of introducing the first BPMN 2.0 micro-benchmark. We proceed by collecting real-world process models for the identification of a representative workload mix. The collection is then analysed with respect to its statistical characteristics and also with a novel algorithm that detects and extracts the reoccurring structural patterns of the collection. The extracted reoccurring structures are then used for generating synthetic process models that reflect the essence of the original collection. The introduced methods are brought together in a tool chain that supports the workload mix generation. As a final step, we applied the proposed methods to a real-world case study that is based on a collection of thousands of real-world process models and generates a representative workload mix to be used in a benchmark. The results show that the generated workload mix is successful in its application for stressing the WfMSs under test.

    Graph databases and their application to the Italian Business Register for efficient search of relationships among companies

    We studied and tested three of the major graph databases, and we compared them with a relational database. We worked on a dataset representing equity participations among companies, and we found that the strong points of graph databases are their purpose-designed storage techniques and their query languages. The main performance gains were obtained when heavily graph-structured situations are queried; for simpler situations and queries, a relational database performs equally well.