7,489 research outputs found

    GraphSE²: An Encrypted Graph Database for Privacy-Preserving Social Search

    In this paper, we propose GraphSE², an encrypted graph database for online social network services to address massive data breaches. GraphSE² preserves the functionality of social search, a key enabler for quality social network services, where social search queries are conducted on a large-scale social graph and at the same time perform set and computational operations on user-generated content. To enable efficient privacy-preserving social search, GraphSE² provides an encrypted structural data model to facilitate parallel and encrypted graph data access. It is also designed to decompose complex social search queries into atomic operations and realise them via interchangeable protocols in a fast and scalable manner. We build GraphSE² with various queries supported in the Facebook graph search engine and implement a full-fledged prototype. Extensive evaluations on Azure Cloud demonstrate that GraphSE² is practical for querying a social graph with a million users. Comment: This is the full version of our AsiaCCS paper "GraphSE²: An Encrypted Graph Database for Privacy-Preserving Social Search". It includes the security proof of the proposed scheme. If you want to cite our work, please cite the conference version of i

    The Potential of Learned Index Structures for Index Compression

    Inverted indexes are vital in providing fast keyword-based search. For every term in the document collection, a list of identifiers of documents in which the term appears is stored, along with auxiliary information such as term frequency and position offsets. While very effective, inverted indexes have large memory requirements for web-sized collections. Recently, the concept of learned index structures was introduced, where machine-learned models replace common index structures such as B-tree indexes, hash indexes, and Bloom filters. These learned index structures require less memory, and can be computationally much faster than their traditional counterparts. In this paper, we consider whether such models may be applied to conjunctive Boolean querying. First, we investigate how a learned model can replace document postings of an inverted index, and then evaluate the compromises such an approach might have. Second, we evaluate the potential gains that can be achieved in terms of memory requirements. Our work shows that learned models have great potential in inverted indexing, and this direction seems to be a promising area for future research. Comment: Will appear in the proceedings of ADCS'1
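
    As a point of reference for the abstract above, the following is a minimal sketch of a conventional (non-learned) inverted index answering a conjunctive Boolean query by intersecting sorted postings lists. It is not taken from the paper; the function names `build_inverted_index`, `intersect`, and `conjunctive_query` are hypothetical, and tokenisation is assumed to be simple whitespace splitting.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted list of document ids (its postings list).

    `docs` is a dict {doc_id: text}; tokenisation is a plain lowercase
    whitespace split, which is an assumption made for illustration only.
    """
    index = defaultdict(list)
    for doc_id in sorted(docs):  # visiting ids in order keeps postings sorted
        for term in set(docs[doc_id].lower().split()):
            index[term].append(doc_id)
    return dict(index)


def intersect(a, b):
    """Merge-style intersection of two sorted postings lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def conjunctive_query(index, terms):
    """Return ids of documents that contain every term in `terms`."""
    postings = [index.get(t.lower(), []) for t in terms]
    if not postings:
        return []
    postings.sort(key=len)  # start from the shortest list to shrink work early
    result = postings[0]
    for p in postings[1:]:
        result = intersect(result, p)
        if not result:
            break
    return result
```

    The postings lists stored per term are the part of the structure whose memory cost the abstract is concerned with.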

    Conditional Lower Bounds for Space/Time Tradeoffs

    In recent years much effort has been concentrated towards achieving polynomial time lower bounds on algorithms for solving various well-known problems. A useful technique for showing such lower bounds is to prove them conditionally based on well-studied hardness assumptions such as 3SUM, APSP, SETH, etc. This line of research helps to obtain a better understanding of the complexity inside P. A related question asks to prove conditional space lower bounds on data structures that are constructed to solve certain algorithmic tasks after an initial preprocessing stage. This question has received little attention in previous research even though it has potentially strong impact. In this paper we address this question and show that, surprisingly, many of the well-studied hard problems that are known to have conditional polynomial time lower bounds are also hard with respect to space. This hardness is shown as a tradeoff between the space consumed by the data structure and the time needed to answer queries. The tradeoff may be either smooth or admit one or more singularity points. We reveal interesting connections between different space hardness conjectures and present matching upper bounds. We also apply these hardness conjectures to both static and dynamic problems and prove their conditional space hardness. We believe that this novel framework of polynomial space conjectures can play an important role in expressing polynomial space lower bounds of many important algorithmic problems. Moreover, it seems that it can also help in achieving a better understanding of the hardness of their corresponding problems in terms of time.

    On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems

    Current matching approaches in pub/sub systems only allow conjunctive subscriptions. Arbitrary subscriptions have to be transformed into canonical expressions, e.g., DNFs, and need to be treated as several conjunctive subscriptions. This technique is known from database systems and allows us to apply more efficient filtering algorithms. Since pub/sub systems are the opposite of traditional database systems, it is questionable whether filtering several canonical subscriptions is the most efficient and scalable way of dealing with arbitrary subscriptions. In this paper we show that our filtering approach supporting arbitrary Boolean subscriptions is more scalable and efficient than current matching algorithms requiring transformations of subscriptions into DNFs.
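
    To illustrate why the DNF transformation mentioned above can be costly, here is a small sketch that evaluates a Boolean subscription tree directly against an event and, separately, expands the same tree into DNF. It is not the authors' filtering algorithm; the tuple-based subscription encoding and the names `matches` and `to_dnf` are assumptions made for illustration.

```python
import operator
from itertools import product

# Comparison operators allowed in a predicate (an illustrative subset).
OPS = {"==": operator.eq, "<": operator.lt, ">": operator.gt}

# A subscription is a nested tuple:
#   ("pred", attribute, op, value)   a single predicate
#   ("and", child, child, ...)       conjunction of sub-expressions
#   ("or", child, child, ...)        disjunction of sub-expressions


def matches(sub, event):
    """Evaluate an arbitrary Boolean subscription directly on an event dict."""
    kind = sub[0]
    if kind == "pred":
        _, attr, op, value = sub
        return attr in event and OPS[op](event[attr], value)
    if kind == "and":
        return all(matches(child, event) for child in sub[1:])
    if kind == "or":
        return any(matches(child, event) for child in sub[1:])
    raise ValueError(f"unknown node type: {kind!r}")


def to_dnf(sub):
    """Expand a subscription into DNF: a list of conjunctions of predicates."""
    kind = sub[0]
    if kind == "pred":
        return [[sub]]
    child_dnfs = [to_dnf(child) for child in sub[1:]]
    if kind == "or":
        # A disjunction simply concatenates the conjunctions of its children.
        return [conj for dnf in child_dnfs for conj in dnf]
    if kind == "and":
        # A conjunction takes one conjunction from every child (cross product).
        return [sum(combo, []) for combo in product(*child_dnfs)]
    raise ValueError(f"unknown node type: {kind!r}")
```

    A conjunction of three two-way disjunctions, for example, has 6 predicates in its original form but expands to 2 × 2 × 2 = 8 conjunctive subscriptions of 3 predicates each, and the growth is exponential in the number of such disjunctions.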

    Effective retrieval and new indexing method for case based reasoning: Application in chemical process design

    In this paper we try to improve the retrieval step of case-based reasoning (CBR) for preliminary design. This improvement deals with three major parts of our CBR system. First, in the preliminary design step, some uncertainties such as imprecise or unknown values remain in the description of the problem, because they need a deeper analysis to be removed. To deal with this issue, the problem description is softened using fuzzy set theory. Features are described with a central value, a percentage of imprecision, and a relation with respect to the central value. These additional data allow us to build a domain of possible values for each attribute. This representation affects the calculation of the similarity function, so the characteristic function is used to calculate the local similarity between two features. Second, we focus our attention on the main goal of the retrieval step in CBR: finding relevant cases for adaptation. In this second part, we discuss the assumption that similarity identifies the most appropriate case. We highlight that in some situations this classical similarity must be improved with further knowledge to facilitate case adaptation. To avoid failure during the adaptation step, we implement a method that couples the similarity measure with an adaptability measure, in order to approximate the utility of cases more accurately. The latter gives deeper information for the reuse of cases. In the last part, we present a generic indexing technique for the case base, and a new algorithm for searching for relevant cases in memory. The sphere indexing algorithm is a domain-independent index whose performance is equivalent to that of decision trees. Its main strength is that it puts the current problem at the center of the search area, avoiding boundary issues. All these points are discussed and exemplified through the preliminary design of a chemical engineering unit operation.

    A Backend Framework for the Efficient Management of Power System Measurements

    Increased adoption and deployment of phasor measurement units (PMUs) has provided valuable fine-grained data over the grid. Analysis of these data can provide insight into the health of the grid, thereby improving control over operations. Realizing this data-driven control, however, requires validating, processing and storing massive amounts of PMU data. This paper describes a PMU data management system that supports input from multiple PMU data streams, features an event-detection algorithm, and provides an efficient method for retrieving archival data. The event-detection algorithm rapidly correlates multiple PMU data streams, providing details on events occurring within the power system. The event-detection algorithm feeds into a visualization component, allowing operators to recognize events as they occur. The indexing and data retrieval mechanism facilitates fast access to archived PMU data. Using this method, we achieved over a 30x speedup for queries with high selectivity. With the development of these two components, we have developed a system that allows efficient analysis of multiple time-aligned PMU data streams. Comment: Published in Electric Power Systems Research (2016), not available ye
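
    The abstract does not say how the archive is indexed. Purely as an illustration of why an index helps selective queries, the sketch below keeps timestamped samples sorted and uses binary search to answer a time-range query without scanning the whole archive. The class name `TimeIndex` and the `(timestamp, value)` sample layout are assumptions, not the system described in the paper.

```python
from bisect import bisect_left, bisect_right


class TimeIndex:
    """Sorted-timestamp index over archived samples (illustrative only)."""

    def __init__(self, samples):
        # samples: iterable of (timestamp, value) pairs, in any order
        self._samples = sorted(samples, key=lambda s: s[0])
        self._times = [t for t, _ in self._samples]

    def range(self, start, end):
        """Return all samples with start <= timestamp <= end."""
        lo = bisect_left(self._times, start)   # first index with time >= start
        hi = bisect_right(self._times, end)    # first index with time > end
        return self._samples[lo:hi]
```

    A query that selects a narrow time window costs two binary searches plus the size of the result, rather than a pass over every archived sample.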