49 research outputs found
Privacy Aware Parallel Computation of Skyline Sets Queries from Distributed Databases
A skyline query finds the objects in a given set that are not dominated by any other object. Skyline queries help us filter out unnecessary information efficiently and provide clues for various decision-making tasks. However, skyline queries cannot be used in a privacy-aware environment, because individual record values must be hidden even when no ID information is present. We therefore consider skyline sets queries. A skyline sets query returns the skyline sets from among all possible sets, each of which is composed of some objects in a database. With the growth of network infrastructure, data are increasingly stored in distributed databases. In this paper, we extend the idea to compute skyline sets queries in parallel from distributed databases without disclosing individual records to others. The proposed method uses an agent-based parallel computing framework that computes skyline sets queries efficiently and solves the privacy problems of skyline queries in a distributed environment. The computation of skyline sets is performed simultaneously in all databases, which increases parallelism and reduces the computation time.
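The dominance relation behind a plain skyline query can be illustrated with a minimal sketch. This is not the paper's agent-based, privacy-aware method for skyline sets; it is only a naive single-machine skyline over individual objects, assuming that smaller values are preferred in every attribute. The names `dominates` and `skyline` and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b, assuming smaller is better in every attribute.

    a dominates b when a is no worse than b in all attributes
    and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(objects: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Naive O(n^2) skyline: keep every object that no other object dominates."""
    return [p for p in objects if not any(dominates(q, p) for q in objects)]


# Hypothetical data: (price, distance to the beach) for five hotels.
hotels = [(50, 8), (60, 3), (70, 9), (55, 8), (40, 10)]
result = skyline(hotels)
print(result)
```

In this example, (70, 9) and (55, 8) are both dominated by (50, 8), so they are filtered out, while the remaining three hotels each represent a different trade-off between price and distance.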
Skyline queries in dynamic environments
Ph.D., Doctor of Philosophy
Airborne Directional Networking: Topology Control Protocol Design
This research identifies and evaluates the impact of several architectural design choices in relation to airborne networking in contested environments related to autonomous topology control. Using simulation, we evaluate topology reconfiguration effectiveness using classical performance metrics for different point-to-point communication architectures. Our attention is focused on the design choices that have the greatest impact on reliability, scalability, and performance. In this work, we discuss the impact of several practical considerations of airborne networking in contested environments related to autonomous topology control modeling. Using simulation, we derive multiple classical performance metrics to evaluate topology reconfiguration effectiveness for different point-to-point communication architecture attributes for the purpose of qualifying protocol design elements.
AoI-based Multicast Routing over Voronoi Overlays with Minimal Overhead
The increasingly pervasive and ubiquitous presence of devices at the edge of
the Internet is creating new scenarios for the emergence of novel services and
applications. This is particularly true for location- and context-aware
services. These services call for new decentralized, self-organizing
communication schemes that are able to face issues related to demanding
resource consumption constraints, while ensuring efficient locality-based
information dissemination and querying. Voronoi-based communication techniques
are among the most widely used solutions in this field. However, when used for
forwarding messages inside closed areas of the network (called Areas of
Interest, AoIs), these solutions generally require a significant overhead in
terms of redundant and/or unnecessary communications. This fact negatively
impacts both the devices' resource consumption levels, as well as the network
bandwidth usage. In order to eliminate all unnecessary communications, in this
paper we present the MABRAVO (Multicast Algorithm for Broadcast and Routing
over AoIs in Voronoi Overlays) protocol suite. MABRAVO forwards
information within an AoI in a Voronoi network using only local information,
reaching all the devices in the area, and using the lowest possible number of
messages, i.e., just one message for each node included in the AoI. The paper
presents the mathematical and algorithmic descriptions of MABRAVO, as well as
experimental findings of its performance, showing its ability to reduce
communication costs to the strict minimum required. Comment: Submitted to: IEEE Access; CodeOcean: DOI:10.24433/CO.1722184.v1;
code: https://github.com/michelealbano/mabrav
Efficient processing of similarity queries with applications
Today, a myriad of data sources, from the Internet to business operations to scientific instruments, produce large and different types of data. Many application scenarios, e.g., marketing analysis, sensor networks, and medical and biological applications, call for identifying and processing similarities in big data. As a result, it is imperative to develop new similarity query processing approaches and systems that scale from low-dimensional data to high-dimensional data, from a single machine to clusters of hundreds of machines, and from disk-based to memory-based processing. This dissertation introduces and studies several similarity-aware query operators, and analyzes and optimizes their performance.
The first contribution of this dissertation is an SQL-based Similarity Group-by operator (SGB, for short) that extends the semantics of the standard SQL Group-by operator to group data with similar but not necessarily equal values. We realize these SGB operators by extending the Standard SQL Group-by and introduce two new SGB operators for multi-dimensional data. We implement and test the new SGB operators and their algorithms inside an open-source centralized database server (PostgreSQL).
In the second contribution of this dissertation, we study how to efficiently process Hamming-distance-based similarity queries (Hamming-distance select and Hamming-distance join) that are crucial to many applications. We introduce a new index, termed the HA-Index, that speeds up distance comparisons and eliminates redundancies when performing the two flavors of Hamming distance range queries (namely, the selects and joins).
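The two query flavors named above can be sketched as a brute-force baseline. This is not the HA-Index, whose internals the abstract does not describe; it only shows what a Hamming-distance select and join compute, assuming the data are fixed-width binary codes stored as Python integers. The function names are hypothetical.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def hamming_select(codes: List[int], query: int, threshold: int) -> List[int]:
    """Range select: all codes within `threshold` bits of the query code."""
    return [c for c in codes if hamming_distance(c, query) <= threshold]


def hamming_join(left: List[int], right: List[int], threshold: int) -> List[Tuple[int, int]]:
    """Range join: all (l, r) pairs whose codes differ in at most `threshold` bits."""
    return [(l, r) for l in left for r in right
            if hamming_distance(l, r) <= threshold]


# Hypothetical 4-bit codes.
codes = [0b1010, 0b1000, 0b0101, 0b1011]
selected = hamming_select(codes, query=0b1010, threshold=1)
joined = hamming_join([0b000], [0b001, 0b111], threshold=1)
print(selected, joined)
```

Both functions compare every candidate against every query code, which is the redundant work that an index over the codes is meant to avoid.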
In the third and last contribution of this dissertation, we develop a system for similarity query processing and optimization in an in-memory and distributed setup for big spatial data. We propose a query scheduler and a distributed query optimizer that use a new cost model to optimize the cost of similarity query processing in this in-memory distributed setup. The scheduler and query optimizer generate query execution plans that minimize the effect of query skew. The query scheduler employs new spatial indexing techniques based on Bloom filters to forward queries to the appropriate local sites. The proposed query processing and optimization techniques are prototyped inside Spark, a distributed main-memory computation system.
Information Filtering of Large-Scale Databases Using Skyline Queries
Conventional SQL queries take exact input and produce a complete result set. However, with the massive increase in data volume across applications, the large result sets returned by traditional SQL queries are not well suited to helping users make effective decisions. Therefore, there is increasing interest in queries, such as top-k queries and skyline queries, that produce a more concise result set.
Top-k queries rely on the scores of the objects to evaluate their usefulness. In this type of query, users are required to define their own scoring function by combining their interests. Based on the user-defined scoring function, the system sorts the objects by their scores and outputs the top k objects in the ranking list as the result. However, requiring users to define a scoring function is a major drawback of top-k queries: in large data sets with many conflicting criteria, it is very difficult for users to define the scoring function themselves. …… Hiroshima University, Doctor of Engineering, doctoral
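The top-k procedure described in this abstract (score each object with a user-defined function, rank, and return the best k) can be sketched as follows. The scoring weights, field names, and data are hypothetical and chosen only for illustration.

```python
import heapq
from typing import Callable, Dict, List

Hotel = Dict[str, float]


def top_k(objects: List[dict], score: Callable[[dict], float], k: int) -> List[dict]:
    """Return the k objects with the highest score, best first."""
    return heapq.nlargest(k, objects, key=score)


# Hypothetical data set.
hotels = [
    {"name": "A", "price": 50, "rating": 4.0},
    {"name": "B", "price": 120, "rating": 4.8},
    {"name": "C", "price": 80, "rating": 3.5},
]


def user_score(h: dict) -> float:
    """Hypothetical user-defined scoring function: reward rating, penalize price."""
    return 0.7 * h["rating"] - 0.01 * h["price"]


best = top_k(hotels, user_score, k=2)
print([h["name"] for h in best])
```

The result depends entirely on the weights in `user_score`. Choosing such weights when criteria conflict is the difficulty the abstract points to, and it is what skyline queries avoid by not requiring a scoring function.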