GraphSE: An Encrypted Graph Database for Privacy-Preserving Social Search
In this paper, we propose GraphSE, an encrypted graph database for online
social network services to address massive data breaches. GraphSE preserves
the functionality of social search, a key enabler for quality social network
services, where social search queries are conducted on a large-scale social
graph while performing set and computational operations on user-generated
content. To enable efficient privacy-preserving social search, GraphSE
provides an encrypted structural data model to facilitate parallel and
encrypted graph data access. It is also designed to decompose complex social
search queries into atomic operations and realise them via interchangeable
protocols in a fast and scalable manner. We build GraphSE with various
queries supported in the Facebook graph search engine and implement a
full-fledged prototype. Extensive evaluations on Azure Cloud demonstrate that
GraphSE is practical for querying a social graph with a million users.
Comment: This is the full version of our AsiaCCS paper "GraphSE: An
Encrypted Graph Database for Privacy-Preserving Social Search". It includes
the security proof of the proposed scheme. If you want to cite our work,
please cite the conference version of i
The Potential of Learned Index Structures for Index Compression
Inverted indexes are vital in providing fast keyword-based search. For every
term in the document collection, a list of identifiers of documents in which
the term appears is stored, along with auxiliary information such as term
frequency, and position offsets. While very effective, inverted indexes have
large memory requirements for web-sized collections. Recently, the concept of
learned index structures was introduced, where machine learned models replace
common index structures such as B-tree-indexes, hash-indexes, and
bloom-filters. These learned index structures require less memory, and can be
computationally much faster than their traditional counterparts. In this paper,
we consider whether such models may be applied to conjunctive Boolean querying.
First, we investigate how a learned model can replace document postings of an
inverted index, and then evaluate the compromises such an approach might have.
Second, we evaluate the potential gains that can be achieved in terms of memory
requirements. Our work shows that learned models have great potential in
inverted indexing, and this direction seems to be a promising area for future
research.
Comment: Will appear in the proceedings of ADCS'1
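For reference, the conjunctive Boolean querying mentioned in the abstract above can be sketched over a conventional inverted index. This is only the classical baseline (sorted postings lists intersected by merging), not the learned model the paper studies; the function names and the toy documents are assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(a, b):
    """Merge-intersect two sorted postings lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def conjunctive_query(index, terms):
    """Return ids of documents that contain every query term (Boolean AND)."""
    postings = [index.get(term, []) for term in terms]
    if not postings:
        return []
    # Start from the shortest list so intermediate results stay small.
    postings.sort(key=len)
    result = postings[0]
    for other in postings[1:]:
        result = intersect(result, other)
        if not result:
            break
    return result


# Hypothetical toy collection, used only to exercise the sketch.
docs = [
    "learned index structures",
    "inverted index compression",
    "learned models for compression",
]
index = build_inverted_index(docs)
print(conjunctive_query(index, ["learned", "index"]))
```

The postings lists are the part of the structure whose memory footprint grows with the collection, which is why the abstract singles them out as the candidate for replacement by a learned model.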
Conditional Lower Bounds for Space/Time Tradeoffs
In recent years much effort has been concentrated towards achieving
polynomial time lower bounds on algorithms for solving various well-known
problems. A useful technique for showing such lower bounds is to prove them
conditionally based on well-studied hardness assumptions such as 3SUM, APSP,
SETH, etc. This line of research helps to obtain a better understanding of the
complexity inside P.
A related question asks to prove conditional space lower bounds on data
structures that are constructed to solve certain algorithmic tasks after an
initial preprocessing stage. This question has received little attention in
previous research even though it has potentially strong impact.
In this paper we address this question and show that surprisingly many of the
well-studied hard problems that are known to have conditional polynomial time
lower bounds are also hard with respect to space. This hardness is shown as a
tradeoff between the space consumed by the data structure and the time needed
to answer queries. The tradeoff may be either smooth or admit one or more
singularity points.
We reveal interesting connections between different space hardness
conjectures and present matching upper bounds. We also apply these hardness
conjectures to both static and dynamic problems and prove their conditional
space hardness.
We believe that this novel framework of polynomial space conjectures can play
an important role in expressing polynomial space lower bounds of many important
algorithmic problems. Moreover, it seems that it can also help in achieving a
better understanding of the hardness of their corresponding problems in terms
of time
On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems
Current matching approaches in pub/sub systems only allow conjunctive subscriptions. Arbitrary subscriptions have to be transformed into canonical expressions, e.g., DNFs, and then treated as several conjunctive subscriptions. This technique is known from database systems and allows us to apply more efficient filtering algorithms. Since pub/sub systems are the converse of traditional database systems, it is questionable whether filtering several canonical subscriptions is the most efficient and scalable way of dealing with arbitrary subscriptions. In this paper we show that our filtering approach supporting arbitrary Boolean subscriptions is more scalable and efficient than current matching algorithms that require subscriptions to be transformed into DNFs.
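To make the contrast in the abstract above concrete, here is a minimal sketch of evaluating an arbitrary Boolean subscription directly against an event, alongside a count of how many conjunctive subscriptions a naive DNF expansion would produce. It is not the authors' filtering algorithm; the tuple-based subscription encoding, the operator table, and the attribute names are all assumptions made for illustration.

```python
import operator
from math import prod

# Hypothetical comparison operators allowed in predicates.
OPS = {"==": operator.eq, "<": operator.lt, ">": operator.gt}


def matches(sub, event):
    """Evaluate a subscription tree directly against an event (a dict)."""
    kind = sub[0]
    if kind == "pred":
        _, attr, op, value = sub
        return attr in event and OPS[op](event[attr], value)
    if kind == "and":
        return all(matches(child, event) for child in sub[1:])
    if kind == "or":
        return any(matches(child, event) for child in sub[1:])
    if kind == "not":
        return not matches(sub[1], event)
    raise ValueError(f"unknown node type: {kind!r}")


def dnf_size(sub):
    """Count the conjunctions produced by naive DNF expansion.

    Only handles and/or/pred nodes; negation is left out of this sketch.
    """
    kind = sub[0]
    if kind == "pred":
        return 1
    if kind == "or":
        return sum(dnf_size(child) for child in sub[1:])
    if kind == "and":
        return prod(dnf_size(child) for child in sub[1:])
    raise ValueError(f"unsupported node type for DNF count: {kind!r}")


# (symbol == A or symbol == B) and (price < 10 or price > 100)
subscription = (
    "and",
    ("or", ("pred", "symbol", "==", "A"), ("pred", "symbol", "==", "B")),
    ("or", ("pred", "price", "<", 10), ("pred", "price", ">", 100)),
)

print(matches(subscription, {"symbol": "A", "price": 5}))
print(dnf_size(subscription))
```

Each additional disjunction nested under a conjunction multiplies the number of conjunctive subscriptions after expansion, which is the scalability concern the abstract raises about canonical forms.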
Effective retrieval and new indexing method for case based reasoning: Application in chemical process design
In this paper we try to improve the retrieval step of case-based reasoning (CBR) for preliminary design. This improvement deals with three major parts of our CBR system. First, in the preliminary design step, some uncertainties such as imprecise or unknown values remain in the description of the problem, because resolving them requires a deeper analysis. To deal with this issue, the description of the problem at hand is softened using fuzzy set theory. Features are described with a central value, a percentage of imprecision and a relation with respect to the central value. These additional data allow us to build a domain of possible values for each attribute. This representation affects the calculation of the similarity function, so the characteristic function is used to calculate the local similarity between two features. Second, we focus our attention on the main goal of the retrieval step in CBR: finding relevant cases for adaptation. In this second part, we discuss the assumption that similarity identifies the most appropriate case. We highlight that in some situations this classical similarity must be improved with further knowledge to facilitate case adaptation. To avoid failure during the adaptation step, we implement a method that couples the similarity measure with an adaptability measure, in order to approximate the utility of cases more accurately. The latter gives deeper information for the reuse of cases. In the last part, we present a generic indexing technique for the case base, and a new algorithm for searching for relevant cases in memory. The sphere indexing algorithm is a domain-independent index whose performance is equivalent to that of decision trees. Its main strength is that it puts the current problem at the center of the search area, avoiding boundary issues. All these points are discussed and exemplified through the preliminary design of a chemical engineering unit operation.
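The fuzzy feature description in the abstract above (central value, percentage of imprecision, relation to the central value) can be sketched as follows. The abstract does not give the exact characteristic function, so this sketch assumes a triangular one that equals 1 at the central value and falls linearly to 0 at the edge of the domain; the relation names "around", "above" and "below" and the class itself are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FuzzyFeature:
    central: float
    imprecision: float  # fraction of the central value, e.g. 0.5 for 50 %
    relation: str = "around"  # assumed relations: "around", "above", "below"

    def spread(self):
        """Half-width of the domain implied by the imprecision percentage."""
        return abs(self.central) * self.imprecision

    def domain(self):
        """Interval of possible values implied by the description."""
        s = self.spread()
        low = self.central - s if self.relation in ("around", "below") else self.central
        high = self.central + s if self.relation in ("around", "above") else self.central
        return low, high

    def membership(self, x):
        """Assumed triangular characteristic function, used as local similarity."""
        low, high = self.domain()
        if x < low or x > high:
            return 0.0
        s = self.spread()
        if s == 0:
            return 1.0  # crisp value: only the central value itself is in the domain
        return 1.0 - abs(x - self.central) / s


# Hypothetical target feature: "around 100, with 50 % imprecision".
target = FuzzyFeature(central=100.0, imprecision=0.5)
print(target.domain())
print(target.membership(125.0))
```

A stored case's feature value is then scored by `membership`, and the local scores would be aggregated into a global similarity; the aggregation is not described in the abstract and is left out here.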
A Backend Framework for the Efficient Management of Power System Measurements
Increased adoption and deployment of phasor measurement units (PMU) has
provided valuable fine-grained data over the grid. Analysis over these data can
provide insight into the health of the grid, thereby improving control over
operations. Realizing this data-driven control, however, requires validating,
processing and storing massive amounts of PMU data. This paper describes a PMU
data management system that supports input from multiple PMU data streams,
features an event-detection algorithm, and provides an efficient method for
retrieving archival data. The event-detection algorithm rapidly correlates
multiple PMU data streams, providing details on events occurring within the
power system. The event-detection algorithm feeds into a visualization
component, allowing operators to recognize events as they occur. The indexing
and data retrieval mechanism facilitates fast access to archived PMU data.
Using this method, we achieved over 30x speedup for queries with high
selectivity. With the development of these two components, we have developed a
system that allows efficient analysis of multiple time-aligned PMU data
streams.
Comment: Published in Electric Power Systems Research (2016), not available
ye