
    Extending a multi-set relational algebra to a parallel environment

    Parallel database systems will very probably be the future for high-performance data-intensive applications. In the past decade, many parallel database systems have been developed, together with many languages and approaches to specify operations in these systems. A common background is still missing, however. This paper proposes an extended relational algebra for this purpose, based on the well-known standard relational algebra. The extended algebra provides both complete database manipulation language features and data distribution and process allocation primitives to describe parallelism. It is defined in terms of multi-sets of tuples to allow handling of duplicates and to obtain a close connection to the world of high-performance data processing. Due to its algebraic nature, the language is well suited for optimization and parallelization through expression rewriting. The proposed language can be used as a database manipulation language on its own, as has been done in the PRISMA parallel database project, or as a formal basis for other languages, such as SQL.

    Partout: A Distributed Engine for Efficient RDF Processing

    The increasing interest in Semantic Web technologies has led not only to a rapid growth of semantic data on the Web but also to an increasing number of backend applications, some of which already hold more than a trillion triples. Confronted with such huge amounts of data and their future growth, existing state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for efficient RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log, allocating the fragments to nodes in a cluster, and finding the optimal configuration. Partout can efficiently handle updates, and its query optimizer produces efficient query execution plans for ad-hoc SPARQL queries. Our experiments show the superiority of our approach to state-of-the-art approaches for partitioning and distributed SPARQL query processing.

    Optimal Control of Applications for Hybrid Cloud Services

    The development of cloud computing makes it possible to move Big Data into hybrid cloud services. This requires research into all processing systems and data structures in order to provide QoS. Because there are many bottlenecks, a monitoring and control system is required when performing a query. Models and optimization criteria for the design of systems in hybrid cloud infrastructures are created. This article presents the suggested approaches and the results of building them. Comment: 4 pages, Proc. conf. (not published). arXiv admin note: text overlap with arXiv:1402.146

    Security Optimization for Distributed Applications Oriented on Very Large Data Sets

    The paper presents the main characteristics of applications that work with very large data sets and the related security issues. The first section addresses the optimization process and how it is approached when dealing with security. The second section describes the concept of very large data set management, while the third section identifies and classifies the related risks. Finally, a security optimization schema is presented with a cost-efficiency analysis of its feasibility. Conclusions are drawn and future approaches are identified. Keywords: Security, Optimization, Very Large Data Sets, Distributed Applications

    Large-Scale Distributed Internet-based Discovery Mechanism for Dynamic Spectrum Allocation

    Scarcity of frequencies and the demand for more bandwidth are likely to increase the need for devices that utilize the available frequencies more efficiently. Radios must be able to dynamically find other users of the frequency bands and adapt so that they are not interfered with, even if they use different radio protocols. As transmitters far away may cause as much interference as a transmitter located nearby, this mechanism cannot be based on location alone. Central databases can be used for this purpose, but require expensive infrastructure and planning to scale. In this paper, we propose a decentralized protocol and architecture for discovering radio devices over the Internet. The protocol has low resource requirements, making it suitable for implementation on limited platforms. We evaluate the protocol through simulation in network topologies with up to 2.3 million nodes, including topologies generated from population patterns in Norway. The protocol has also been implemented as a proof of concept in real Wi-Fi routers. Comment: Accepted for publication at IEEE DySPAN 201