
    Performance Analysis of Blockchain Platforms

    Blockchain technologies have drawn massive attention worldwide over the past few years, mostly because of the rapid rise of cryptocurrencies such as Bitcoin, Ethereum, Ripple and many others. A blockchain, also known as distributed ledger technology, has demonstrated huge potential for saving time and costs. This open-source technology, which generates a decentralized public ledger of transactions, is widely appreciated for ensuring a high level of privacy through encryption, so that transaction details are shared only among the participants involved in the transactions. Blockchains are used not only for cryptocurrency but also by various companies to meet their business ends, such as efficient management of supply chains and logistics. The rise and fall of numerous cryptocurrencies based on blockchain technology have generated debate among tech giants and regulatory bodies. Various groups are working on standardizing blockchain technology. At the same time, numerous groups are actively developing and fine-tuning their own blockchain platforms. Platforms such as Ethereum, Hyperledger, Parity, etc. have their own pros and cons. This research focuses on the performance analysis of blockchain platforms, which gives a comparative understanding of these platforms.

    ENHANCING DATABASE PERFORMANCE IN A DSS ENVIRONMENT VIA QUERY CACHING

    A key element in all decision support systems is the availability of sufficiently good and timely data to support the decision-making process. Much research was, and is, devoted to data and information quality: attributes, assurance that quality data is used in the decision process, etc. In this paper we concentrate on a particular dimension of data availability and usage: the retrieval of data in a timely and decision-enhancing manner. We propose to augment the decision support databases with an adaptive and efficient query cache. The cache contains snapshots of the decision support database, each being the answer to a recently invoked query. A snapshot can be reused by the originating user, or a different user, at a later time, provided the use of cached data leads to savings over the use of a new query, and these savings exceed the cost of using stale data. The proposed scheme is conceptually different from conventional data replication schemes. In data replication schemes, the data items to be replicated and the protocols for concurrency control are defined at the system level. In our scheme, the cache is populated dynamically and the snapshots it contains are refreshed only if the cost of using stale information is higher than the cost of refreshing the snapshots. At the same time, users can still decide to refresh the stored snapshot, based on their own decision environment. Our scheme thus enhances the data retrieval process, while supporting more efficient data retrieval at both the user level and the data warehouse level.
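
    The reuse rule described in this abstract can be sketched as a simple cost comparison. This is only an illustration, not the authors' implementation: the names `Snapshot` and `should_reuse`, the cost parameters, and the assumption that the cost of staleness grows linearly with snapshot age are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """A cached query answer (hypothetical structure, not from the paper)."""
    query: str
    rows: list
    age_seconds: float  # time elapsed since the snapshot was taken


def should_reuse(snapshot: Snapshot,
                 requery_cost: float,
                 cache_read_cost: float,
                 staleness_cost_per_second: float) -> bool:
    """Reuse the snapshot only if the savings exceed the cost of stale data.

    Assumption: staleness cost is linear in the snapshot's age. The abstract
    does not specify the cost model, so this is a placeholder.
    """
    savings = requery_cost - cache_read_cost
    staleness_cost = snapshot.age_seconds * staleness_cost_per_second
    return savings > 0 and savings > staleness_cost
```

    Under these assumptions, a fresh snapshot of an expensive query is reused, while an old snapshot of the same query triggers a new query once its accumulated staleness cost overtakes the savings.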

    Managing Linguistic Data Summaries in Advanced P2P Applications

    As the amount of stored data increases, data localization techniques are no longer sufficient in P2P systems. A practical approach is to rely on compact database summaries rather than raw database records, whose access is costly in large P2P systems. In this chapter, we describe a solution for managing linguistic data summaries in advanced P2P applications that deal with semantically rich data. The produced summaries are synthetic, multidimensional views over relational tables. The novelty of this proposal lies in the twofold exploitation of summaries in distributed P2P systems. First, as semantic indexes, they support locating relevant nodes based on their data descriptions. Second, due to their intelligibility, these summaries can be directly queried and thus approximately answer a query without the need to explore the original data. The proposed solution consists first in defining a summary model for hierarchical P2P systems. Second, appropriate algorithms for summary creation and maintenance are presented. A query processing mechanism, which relies on summary querying, is then proposed to demonstrate the benefits that might be obtained from summary exploitation.
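
    The use of summaries as semantic indexes can be illustrated with a minimal sketch. It assumes, purely for illustration, that each peer's summary maps an attribute to the set of linguistic descriptors (such as "young") that occur in its data; the function name `relevant_peers` and this data layout are hypothetical and are not taken from the chapter.

```python
def relevant_peers(summaries: dict, query: dict) -> list:
    """Return the peers whose summary admits every descriptor in the query.

    summaries: peer id -> {attribute -> set of linguistic descriptors}
    query:     attribute -> required descriptor
    Both layouts are assumptions made for this sketch.
    """
    return sorted(
        peer
        for peer, summary in summaries.items()
        if all(descriptor in summary.get(attribute, set())
               for attribute, descriptor in query.items())
    )


# Hypothetical summaries for three peers.
summaries = {
    "peer_a": {"age": {"young", "adult"}, "bmi": {"normal"}},
    "peer_b": {"age": {"old"}, "bmi": {"normal", "high"}},
    "peer_c": {"age": {"young"}, "bmi": {"high"}},
}
matching = relevant_peers(summaries, {"age": "young", "bmi": "normal"})
```

    Only the peers in `matching` would need to be contacted; the others can be pruned from query routing without inspecting their raw records.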

    Enabling On-Demand Database Computing with MIT SuperCloud Database Management System

    The MIT SuperCloud database management system allows for rapid creation and flexible execution of a variety of the latest scientific databases, including Apache Accumulo and SciDB. It is designed to permit these databases to run on a High Performance Computing Cluster (HPCC) platform as seamlessly as any other HPCC job. It ensures the seamless migration of the databases to the resources assigned by the HPCC scheduler and centralized storage of the database files when not running. It also permits snapshotting of databases to allow researchers to experiment and push the limits of the technology without concerns for data or productivity loss if the database becomes unstable. Comment: 6 pages; accepted to IEEE High Performance Extreme Computing (HPEC) conference 2015. arXiv admin note: text overlap with arXiv:1406.492

    Summary Management in P2P Systems

    Sharing huge, massively distributed databases in P2P systems is inherently difficult. As the amount of stored data increases, data localization techniques are no longer sufficient. A practical approach is to rely on compact database summaries rather than raw database records, whose access is costly in large P2P systems. In this paper, we consider summaries that are synthetic, multidimensional views with two main virtues. First, they can be directly queried and used to approximately answer a query without exploring the original data. Second, as semantic indexes, they support locating relevant nodes based on data content. Our main contribution is to define a summary model for P2P systems, and the appropriate algorithms for summary management. Our performance evaluation shows that the cost of query routing is minimized, while incurring a low cost of summary maintenance.

    Approximate NN Queries on Streams with Guaranteed Error/performance Bounds


    ArrayBridge: Interweaving declarative array processing with high-performance computing

    Scientists are increasingly turning to datacenter-scale computers to produce and analyze massive arrays. Despite decades of database research that extols the virtues of declarative query processing, scientists still write, debug and parallelize imperative HPC kernels even for the most mundane queries. This impedance mismatch has been partly attributed to the cumbersome data loading process; in response, the database community has proposed in situ mechanisms to access data in scientific file formats. Scientists, however, desire more than a passive access method that reads arrays from files. This paper describes ArrayBridge, a bi-directional array view mechanism for scientific file formats that aims to make declarative array manipulations interoperable with imperative file-centric analyses. Our prototype implementation of ArrayBridge uses HDF5 as the underlying array storage library and seamlessly integrates into the SciDB open-source array database system. In addition to fast querying over external array objects, ArrayBridge produces arrays in the HDF5 file format just as easily as it can read from it. ArrayBridge also supports time travel queries from imperative kernels through the unmodified HDF5 API, and automatically deduplicates between array versions for space efficiency. Our extensive performance evaluation at NERSC, a large-scale scientific computing facility, shows that ArrayBridge exhibits statistically indistinguishable performance and I/O scalability compared to the native SciDB storage engine. Comment: 12 pages, 13 figures