    A Framework for the Study of Query Decomposition for Heterogeneous Distributed Database Management Systems

    This paper presents a framework for the study of the query decomposition translation for heterogeneous record-oriented database management systems. This framework is based on the applied database logic representation of relational, hierarchical and network databases. The input to the query decomposition translation is the query graph, which is derived from the complex to basic, external to conceptual and logical optimization translations. Once the query graph is obtained, the objective of the query decomposition translation is to break up a query expressed in terms of the actual or conceptual databases into its component parts or subqueries and to find a strategy indicating the sequence of primitive or fundamental operations and their corresponding processing sites in the network necessary to answer the query. The query processing strategy is usually chosen so as to satisfy some performance criterion such as response time reduction. The choice of a query processing strategy is contingent on the successful estimation of intermediate result sizes after each primitive operation. The prequery decomposition translation, the query decomposition translation and the size estimation issues are presented through an example based on the current implementation of the Distributed Access View Integration Database (DAVID) currently being built at NASA's Goddard Space Flight Center (GSFC).
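The dependence of strategy choice on intermediate result size estimates can be illustrated with a minimal sketch. It is not taken from the paper or from DAVID: the relation statistics, the selectivity value, and the function names are hypothetical, and the join estimate is the common textbook formula that assumes uniformly distributed join values.

```python
def selection_size(rows, selectivity):
    """Estimated rows surviving a selection with the given selectivity (0..1)."""
    return rows * selectivity


def join_size(left_rows, right_rows, left_distinct, right_distinct):
    """Textbook equi-join estimate: |R| * |S| / max(V(R, a), V(S, a))."""
    return left_rows * right_rows / max(left_distinct, right_distinct)


# Hypothetical statistics for two relations stored at different sites.
EMP_ROWS, DEPT_ROWS = 10_000, 100
DEPT_ID_DISTINCT = 100   # assumed distinct join values on both sides
SELECTIVITY = 0.1        # assumed fraction of EMP rows passing the filter


def join_then_select():
    """Strategy 1: join first, filter afterwards."""
    joined = join_size(EMP_ROWS, DEPT_ROWS, DEPT_ID_DISTINCT, DEPT_ID_DISTINCT)
    selected = selection_size(joined, SELECTIVITY)
    return [joined, selected]


def select_then_join():
    """Strategy 2: filter first, then join the smaller intermediate result."""
    selected = selection_size(EMP_ROWS, SELECTIVITY)
    joined = join_size(selected, DEPT_ROWS, DEPT_ID_DISTINCT, DEPT_ID_DISTINCT)
    return [selected, joined]


# Compare strategies by the total estimated size of their intermediate results.
strategies = {
    "join_then_select": join_then_select(),
    "select_then_join": select_then_join(),
}
best = min(strategies, key=lambda name: sum(strategies[name]))
print(best, strategies[best])
```

With these assumed numbers the sketch prefers `select_then_join`, because filtering first shrinks the intermediate result that would have to be shipped between sites. A poor size estimate would change the ranking, which is why the estimation step matters.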

    An Ontology Connected to Several Data Repositories: Query Processing Steps

    The great expansion of communication networks has made available to users a huge number of heterogeneous and autonomous data repositories that present different structures/organizations, query languages and data semantics. In that context it is clear that new information retrieval techniques with a strategy that focuses on information content and semantics are needed. We propose to use domain specific Ontologies to capture the information content of such repositories whenever available. We describe such Ontologies using a system based on Description Logics. In this paper we present all the stages of the processing of a query formulated over an Ontology when the answer must be found in the underlying data repositories. Those stages make up a subpart of the global query processing strategy defined for a set of loosely-coupled Ontologies. We show first how the query is transformed into a semantically equivalent one and how inconsistent queries are detected. Then, we explain the test to verify if the query can be answered from the cache memory. Next, we present a set of heuristics used during the query decomposition process. Later on, we show how to optimize plans associated to subqueries that access the underlying data repositories and finally we illustrate how the answers retrieved from the repositories are correlated in order to generate the query answer.

    A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs

    Cyber security is one of the most significant technical challenges in current times. Detecting adversarial activities and preventing the theft of intellectual property and customer data are high priorities for corporations and government agencies around the world. Cyber defenders need to analyze massive-scale, high-resolution network flows to identify, categorize, and mitigate attacks involving networks spanning institutional and national boundaries. Many cyber attacks can be described as subgraph patterns, with prominent examples being insider infiltrations (path queries), denial of service (parallel paths) and malicious spreads (tree queries). This motivates us to explore subgraph matching on streaming graphs in a continuous setting. The novelty of our work lies in using the subgraph distributional statistics collected from the streaming graph to determine the query processing strategy. We introduce a "Lazy Search" algorithm where the search strategy is decided on a vertex-to-vertex basis depending on the likelihood of a match in the vertex neighborhood. We also propose a metric named "Relative Selectivity" that is used to select between different query processing strategies. Our experiments performed on real online news and network traffic streams and a synthetic social network benchmark demonstrate 10-100x speedups over selectivity-agnostic approaches. Comment: in 18th International Conference on Extending Database Technology (EDBT) (2015)

    Query processing of geometric objects with free form boundaries in spatial databases

    The increasing demand for the use of database systems as an integrating factor in CAD/CAM applications has necessitated the development of database systems with appropriate modelling and retrieval capabilities. One essential problem is the treatment of geometric data which has led to the development of spatial databases. Unfortunately, most proposals only deal with simple geometric objects like multidimensional points and rectangles. On the other hand, there has been a rapid development in the field of representing geometric objects with free form curves or surfaces, initiated by engineering applications such as mechanical engineering, aviation or astronautics. Therefore, we propose a concept for the realization of spatial retrieval operations on geometric objects with free form boundaries, such as B-spline or Bezier curves, which can easily be integrated in a database management system. The key concept is the encapsulation of geometric operations in a so-called query processor. First, this enables the definition of an interface allowing the integration into the data model and the definition of the query language of a database system for complex objects. Second, the approach allows the use of an arbitrary representation of the geometric objects. After a short description of the query processor, we propose some representations for free form objects determined by B-spline or Bezier curves. The goal of efficient query processing in a database environment is achieved using a combination of decomposition techniques and spatial access methods. Finally, we present some experimental results indicating that the performance of decomposition techniques is clearly superior to traditional query processing strategies for geometric objects with free form boundaries

    Efficient Subgraph Matching on Billion Node Graphs

    The ability to handle large scale graph data is crucial to an increasing number of applications. Much work has been dedicated to supporting basic graph operations such as subgraph matching, reachability, regular expression matching, etc. In many cases, graph indices are employed to speed up query processing. Typically, most indices require either super-linear indexing time or super-linear indexing space. Unfortunately, for very large graphs, super-linear approaches are almost always infeasible. In this paper, we study the problem of subgraph matching on billion-node graphs. We present a novel algorithm that supports efficient subgraph matching for graphs deployed on a distributed memory store. Instead of relying on super-linear indices, we use efficient graph exploration and massive parallel computing for query processing. Our experimental results demonstrate the feasibility of performing subgraph matching on web-scale graph data. Comment: VLDB201
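To make "subgraph matching by graph exploration" concrete, here is a minimal single-machine sketch. It is a generic backtracking matcher over labeled, undirected adjacency sets, not the paper's distributed algorithm; the function name `match_subgraph` and the toy graphs are assumptions made for illustration. It uses no precomputed index: candidates for each query vertex come from the neighborhood of an already matched vertex.

```python
def match_subgraph(data_adj, data_label, query_adj, query_label):
    """Return every mapping query vertex -> data vertex that preserves
    vertex labels and query edges (undirected graphs, adjacency sets)."""
    order = list(query_adj)          # fixed matching order for query vertices
    results = []

    def extend(mapping):
        if len(mapping) == len(order):
            results.append(dict(mapping))
            return
        q = order[len(mapping)]
        # Query neighbors of q that are already bound to data vertices.
        matched_nbrs = [n for n in query_adj[q] if n in mapping]
        if matched_nbrs:
            # Explore only the neighborhood of one matched neighbor.
            candidates = data_adj[mapping[matched_nbrs[0]]]
        else:
            candidates = data_adj.keys()
        for v in candidates:
            if v in mapping.values():
                continue                      # keep the mapping injective
            if data_label[v] != query_label[q]:
                continue                      # labels must agree
            if all(mapping[n] in data_adj[v] for n in matched_nbrs):
                mapping[q] = v
                extend(mapping)
                del mapping[q]                # backtrack

    extend({})
    return results


# Toy data graph: edges 1-2, 2-3, 1-3, 1-4.
data_adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
data_label = {1: "A", 2: "B", 3: "C", 4: "B"}

# Query: a labeled triangle a-b-c.
query_adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
query_label = {"a": "A", "b": "B", "c": "C"}

matches = match_subgraph(data_adj, data_label, query_adj, query_label)
print(matches)
```

On the toy graph the only match binds `a`, `b`, `c` to vertices 1, 2, 3. Vertex 4 carries the right label for `b` but has no neighbor labeled `C`, so that branch is abandoned during exploration.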

    Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy

    Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result, such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results, while satisfying the privacy guarantees. Previous work, notably \cite{LHR+10}, has suggested that with an appropriate strategy, processing a batch of correlated queries as a whole achieves considerably higher accuracy than answering them individually. However, to our knowledge there is currently no practical solution to find such a strategy for an arbitrary query batch; existing methods either return strategies of poor quality (often worse than naive methods) or require prohibitively expensive computations for even moderately large domains. Motivated by this, we propose low-rank mechanism (LRM), the first practical differentially private technique for answering batch linear queries with high accuracy. LRM works for both exact (i.e., ε-) and approximate (i.e., (ε, δ)-) differential privacy definitions. We derive the utility guarantees of LRM, and provide guidance on how to set the privacy parameters given the user's utility expectation. Extensive experiments using real data demonstrate that our proposed method consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins. Comment: ACM Transactions on Database Systems (ACM TODS). arXiv admin note: text overlap with arXiv:1212.230
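The idea of "injecting random noise into each query result" can be sketched with the standard Laplace mechanism applied directly to a batch of linear queries. This is the naive baseline that answers every query individually, not the low-rank mechanism itself. The workload matrix, the counts, and all function names are hypothetical, and the sketch assumes neighboring datasets differ by adding or removing one record, so one entry of the count vector changes by 1.

```python
import random


def l1_sensitivity(workload):
    """Largest column L1 norm of the workload matrix: the most the batch of
    answers can change when a single count changes by 1."""
    n_cols = len(workload[0])
    return max(sum(abs(row[j]) for row in workload) for j in range(n_cols))


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two independent
    exponential variables with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def answer_batch(workload, counts, epsilon, rng):
    """Answer every linear query in the batch under epsilon-differential
    privacy by adding Laplace noise of scale sensitivity / epsilon."""
    scale = l1_sensitivity(workload) / epsilon
    exact = [sum(w * c for w, c in zip(row, counts)) for row in workload]
    return [a + laplace_noise(scale, rng) for a in exact]


# Hypothetical workload: three overlapping range-count queries over 3 bins.
workload = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]
counts = [40, 25, 35]            # hypothetical histogram of the sensitive data

rng = random.Random(0)           # seeded only to make the demo repeatable
noisy = answer_batch(workload, counts, epsilon=1.0, rng=rng)
print(noisy)
```

Because the three queries overlap, the middle bin contributes to all of them and the sensitivity of the batch is 3, so every answer receives noise of scale 3/ε. That growth of noise with correlated queries is the inefficiency batch-aware strategies aim to reduce.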