2,518 research outputs found

    Efficient Parallel Translating Embedding For Knowledge Graphs

    Full text link
    Knowledge graph embedding aims to embed entities and relations of knowledge graphs into low-dimensional vector spaces. Translating embedding methods regard relations as the translation from head entities to tail entities, which achieve the state-of-the-art results among knowledge graph embedding methods. However, a major limitation of these methods is the time consuming training process, which may take several days or even weeks for large knowledge graphs, and result in great difficulty in practical applications. In this paper, we propose an efficient parallel framework for translating embedding methods, called ParTrans-X, which enables the methods to be paralleled without locks by utilizing the distinguished structures of knowledge graphs. Experiments on two datasets with three typical translating embedding methods, i.e., TransE [3], TransH [17], and a more efficient variant TransE- AdaGrad [10] validate that ParTrans-X can speed up the training process by more than an order of magnitude.Comment: WI 2017: 460-46

    Automating the transformation-based analysis of visual languages

    Full text link
    The final publication is available at Springer via http://dx.doi.org/10.1007/s00165-009-0114-yWe present a novel approach for the automatic generation of model-to-model transformations given a description of the operational semantics of the source language in the form of graph transformation rules. The approach is geared to the generation of transformations from Domain-Specific Visual Languages (DSVLs) into semantic domains with an explicit notion of transition, like for example Petri nets. The generated transformation is expressed in the form of operational triple graph grammar rules that transform the static information (initial model) and the dynamics (source rules and their execution control structure). We illustrate these techniques with a DSVL in the domain of production systems, for which we generate a transformation into Petri nets. We also tackle the description of timing aspects in graph transformation rules, and its analysis through their automatic translation into Time Petri netsWork sponsored by the Spanish Ministry of Science and Innovation, project METEORIC (TIN2008-02081/TIN) and by the Canadian Natural Sciences and Engineering Research Council (NSERC)

    Optimization of Regular Path Queries in Graph Databases

    Get PDF
    Regular path queries offer a powerful navigational mechanism in graph databases. Recently, there has been renewed interest in such queries in the context of the Semantic Web. The extension of SPARQL in version 1.1 with property paths offers a type of regular path query for RDF graph databases. While eminently useful, such queries are difficult to optimize and evaluate efficiently, however. We design and implement a cost-based optimizer we call Waveguide for SPARQL queries with property paths. Waveguide builds a query planwhich we call a waveplan (WP)which guides the query evaluation. There are numerous choices in the con- struction of a plan, and a number of optimization methods, so the space of plans for a query can be quite large. Execution costs of plans for the same query can vary by orders of magnitude with the best plan often offering excellent performance. A WPs costs can be estimated, which opens the way to cost-based optimization. We demonstrate that Waveguide properly subsumes existing techniques and that the new plans it adds are relevant. We analyze the effective plan space which is enabled by Waveguide and design an efficient enumerator for it. We implement a pro- totype of a Waveguide cost-based optimizer on top of an open-source relational RDF store. Finally, we perform a comprehensive performance study of the state of the art for evaluation of SPARQL property paths and demonstrate the significant performance gains that Waveguide offers

    Querying the web of data with low latency: high performance distributed SPARQL processing and benchmarking

    No full text
    The Web of Data extends the World Wide Web (WWW) in a way that applications can understand information and cooperate with humans on complex tasks. The basis of performing complex tasks is low latency queries over the Web of Data. The large scale and distributed nature of the Web of Data have negative impacts on several critical factors for efficient query processing, including fast data transmission between datasets, predictable data distribution and statistics that summarise and describe certain patterns in the data. Moreover, it is common on the Web of Data that the same resource is identified by multiple URIs. This phenomenon, named co-reference, potentially increases the complexity of query processing, and makes it even harder to obtain accurate statistics. With the aforementioned challenges, it is not clear whether it is possible to achieve efficient queries on the Web of Data on a large scale.In this thesis, we explore techniques to improve the efficiency of querying the Web of Data on a large scale. More specifically, we investigate two typical scenarios on the Web of Data, which are: 1) the scenario in which all datasets provide detailed statistics that are possibly available on a large scale, and 2) the scenario in which co-reference is taken into account, and datasetsā€™ statistics are not reliable. For each scenario we explore existing and novel optimisation techniques that are tailored for querying the Web of Data, as well as well developed techniques with careful adjustments.For the scenario with detailed statistics we provide a scheme that implements a statistics query optimisation approach that requires detailed statistics, and intensively exploits parallelism. We propose an efficient algorithm called Parallel Sub-query Identification () to increase the degree of parallelism. () breaks a SPARQL query into sub-queries that can be processed in parallel while not increasing network traffic. We combine with dynamic programming to produce query plans with both minimum costs and a fair degree of parallelism. Furthermore, we develop a mechanism that maximally exploits bandwidth and computing power of datasets. For the scenario having co-reference and without reliable statistics we provide a scheme that implements a dynamic query optimisation approach that takes co-reference into account, and utilises runtime statistics to elevate query efficiency even further. We propose a model called Virtual Graph to transform a query and all its co-referent siblings into a single query with pre-defined bindings. Virtual Graph reduces the large number of outgoing and incoming requests that is required to process co-referent queries individually. Moreover, Virtual Graph enables query optimisers to find the optimal plan with respect to all co-referent queries as a whole. () is used in this scheme as well but provides a higher degree of parallelism with the help of runtime statistics. A Minimum-Spanning-Tree-based algorithm is used in this scheme as a result of using runtime statistics. The same parallel execution mechanism used in the previous scenario is adopted here as well.In order to examine the effectiveness of our schemes in practice, we deploy the above approaches in two distributed SPARQL engines, LHD-s and LHD-d respectively. Both engines are implemented using a popular Java-based platform for building Semantic Web applications. They can be used as either standalone applications or integrated into existing systems that require quick response of Linked Data queries.We also propose a scalable and flexible benchmark, called Distributed SPARQL Evaluation Framework (DSEF), for evaluating optimisation approaches in the Web of Data. DSEF adopts a expandable virtual-machine-based structure and provides a set of efficient tools to help easily set up RDF networks of arbitrary sizes. We further investigate the proportion and distribution of co-reference in the real world, based on which DESF is able to simulate co-reference for given RDF datasets. DSEF bases its soundness in the usage of widely accepted assessment data and queries.By comparing both LHD-s and LHD-d with existing approaches using DSEF, we provide evidence that neither existing statistics provided by datasets nor cost estimation methods, are sufficiently accurate. On the other hand, dynamic optimisation using runtime statistics together with carefully tuned parallelism are promising for significantly reducing the latency of large scale queries on the Web of Data. We also demonstrate that () and Virtual Graph algorithms significantly increase query efficiency for queries with or without co-reference.In summary, the contributions of this these include: 1) proposing two schemes for improving query efficiency in two typical scenarios in the Web of Data; 2) providing implementations, named LHD-s and LHD-d, for the two schemes respectively; 3) proposing a scalable and flexible evaluation framework for distributed SPARQL engines called DSEF; and 4) showing evidence that runtime-statistics-based dynamic optimisation with parallelism are promising to reduce latency of Linked Data queries on a large scale

    Decentralized Knowledge Graphs on the Web

    Get PDF

    Paging and Registration in Cellular Networks: Jointly Optimal Policies and an Iterative Algorithm

    Full text link
    This paper explores optimization of paging and registration policies in cellular networks. Motion is modeled as a discrete-time Markov process, and minimization of the discounted, infinite-horizon average cost is addressed. The structure of jointly optimal paging and registration policies is investigated through the use of dynamic programming for partially observed Markov processes. It is shown that there exist policies with a certain simple form that are jointly optimal, though the dynamic programming approach does not directly provide an efficient method to find the policies. An iterative algorithm for policies with the simple form is proposed and investigated. The algorithm alternates between paging policy optimization and registration policy optimization. It finds a pair of individually optimal policies, but an example is given showing that the policies need not be jointly optimal. Majorization theory and Riesz's rearrangement inequality are used to show that jointly optimal paging and registration policies are given for symmetric or Gaussian random walk models by the nearest-location-first paging policy and distance threshold registration policies.Comment: 13 pages, submitted to IEEE Trans. Information Theor
    • ā€¦
    corecore