Extending a multi-set relational algebra to a parallel environment
Parallel database systems will very probably be the future for high-performance data-intensive applications. In the past decade, many parallel database systems have been developed, together with many languages and approaches to specify operations in these systems. A common background is still missing, however. This paper proposes an extended relational algebra for this purpose, based on the well-known standard relational algebra. The extended algebra provides both complete database manipulation language features, and data distribution and process allocation primitives to describe parallelism. It is defined in terms of multi-sets of tuples to allow handling of duplicates and to obtain a close connection to the world of high-performance data processing. Due to its algebraic nature, the language is well suited for optimization and parallelization through expression rewriting. The proposed language can be used as a database manipulation language on its own, as has been done in the PRISMA parallel database project, or as a formal basis for other languages, such as SQL.
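The multi-set idea in this abstract can be illustrated with a minimal sketch. It assumes a relation is modelled as a Python Counter that maps each tuple to its multiplicity. The function names (project, select, union_all, split_by_hash) and the sample data are hypothetical and are not the operators defined in the paper or used in PRISMA.

```python
from collections import Counter

# Assumption: a multi-set (bag) relation maps each tuple to how often it occurs.

def project(rel, indices):
    """Bag projection: keep the chosen columns and add up multiplicities."""
    out = Counter()
    for row, count in rel.items():
        out[tuple(row[i] for i in indices)] += count
    return out

def select(rel, predicate):
    """Bag selection: keep matching tuples with their multiplicities unchanged."""
    return Counter({row: n for row, n in rel.items() if predicate(row)})

def union_all(left, right):
    """Bag union: multiplicities of equal tuples are added."""
    return left + right

def split_by_hash(rel, column, n_fragments):
    """Hypothetical distribution primitive: hash-partition a bag on one column."""
    fragments = [Counter() for _ in range(n_fragments)]
    for row, count in rel.items():
        fragments[hash(row[column]) % n_fragments][row] += count
    return fragments

employees = Counter({("ann", "sales"): 1, ("bob", "sales"): 1, ("cy", "ops"): 1})

# Projection on the department column keeps the duplicate "sales" entry.
departments = project(employees, [1])

# Rewriting for parallelism: project each fragment, then merge with bag union.
parallel = Counter()
for fragment in split_by_hash(employees, 1, 3):
    parallel = union_all(parallel, project(fragment, [1]))
```

Under these assumptions, projecting each fragment and merging with bag union gives the same bag as projecting the whole relation, which is the kind of equivalence that expression rewriting relies on.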
Partout: A Distributed Engine for Efficient RDF Processing
The increasing interest in Semantic Web technologies has led not only to a
rapid growth of semantic data on the Web but also to an increasing number of
backend applications with already more than a trillion triples in some cases.
Confronted with such huge amounts of data and the future growth, existing
state-of-the-art systems for storing RDF and processing SPARQL queries are no
longer sufficient. In this paper, we introduce Partout, a distributed engine
for efficient RDF processing in a cluster of machines. We propose an effective
approach for fragmenting RDF data sets based on a query log, allocating the
fragments to nodes in a cluster, and finding the optimal configuration. Partout
can efficiently handle updates and its query optimizer produces efficient query
execution plans for ad-hoc SPARQL queries. Our experiments show the superiority
of our approach over state-of-the-art approaches for partitioning and distributed
SPARQL query processing.
Optimal Control of Applications for Hybrid Cloud Services
The development of cloud computing makes it possible to move Big Data into hybrid cloud
services. This requires research into all processing systems and data structures
in order to provide QoS. Because there are many bottlenecks, a monitoring and
control system is required when performing a query. Models and
optimization criteria for the design of systems in hybrid cloud
infrastructures are created. This article suggests approaches and presents the
results of this build.
Comment: 4 pages, Proc. conf. (not published). arXiv admin note: text overlap
with arXiv:1402.146
Security Optimization for Distributed Applications Oriented on Very Large Data Sets
The paper presents the main characteristics of applications that work with very large data sets and the related security issues. The first section addresses the optimization process and how it is approached when dealing with security. The second section describes the concept of very large data set management, while the third section identifies and classifies the related risks. Finally, a security optimization schema is presented with a cost-efficiency analysis of its feasibility. Conclusions are drawn and future approaches are identified.
Keywords: Security, Optimization, Very Large Data Sets, Distributed Applications
Large-Scale Distributed Internet-based Discovery Mechanism for Dynamic Spectrum Allocation
Scarcity of frequencies and the demand for more bandwidth are likely to
increase the need for devices that utilize the available frequencies more
efficiently. Radios must be able to dynamically find other users of the
frequency bands and adapt so that they do not suffer interference, even if they use
different radio protocols. As transmitters far away may cause as much
interference as a transmitter located nearby, this mechanism cannot be based
on location alone. Central databases can be used for this purpose, but require
expensive infrastructure and planning to scale. In this paper, we propose a
decentralized protocol and architecture for discovering radio devices over the
Internet. The protocol has low resource requirements, making it suitable for
implementation on limited platforms. We evaluate the protocol through
simulation in network topologies with up to 2.3 million nodes, including
topologies generated from population patterns in Norway. The protocol has also
been implemented as a proof of concept in real Wi-Fi routers.
Comment: Accepted for publication at IEEE DySPAN 201