81 research outputs found

    Certified Context-Free Parsing: A formalisation of Valiant's Algorithm in Agda

    Valiant (1975) developed an algorithm for the recognition of context-free languages. As of today, it remains the algorithm with the best asymptotic complexity for this purpose. In this paper, we present an algebraic specification, implementation, and proof of correctness of a generalisation of Valiant's algorithm. The generalisation can be used for recognition, parsing, or generic calculation of the transitive closure of upper triangular matrices. The proof is certified by the Agda proof assistant. The certification is representative of state-of-the-art methods for specification and proofs in proof assistants based on type theory. As such, this paper can be read as a tutorial for the Agda system.

    Accelerating transitive closure of large-scale sparse graphs

    Finding the transitive closure of a graph is a fundamental graph problem: a second graph is obtained in which an edge exists between two nodes if and only if the original graph contains a path from one node to the other. The reachability matrix of a graph is its transitive closure. This thesis describes a novel approach that uses anti-sections to obtain the transitive closure of a graph. It also examines the advantages of this approach when implemented in parallel on a GPU using the Hornet graph data structure. Graph representations of real-world systems are typically sparse because connectivity between nodes is low. The anti-section approach is designed specifically to improve performance for large-scale sparse graphs. The NVIDIA Titan V GPU is used to execute the parallel anti-section implementations. The Dual-Round and Hash-Based implementations of the Anti-Section transitive closure approach provide a significant speedup over several parallel and sequential implementations.
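The definition in this abstract (the closure has an edge from u to v exactly when the graph has a path from u to v) can be illustrated with a minimal sketch. It uses the textbook Warshall algorithm on a dense Boolean adjacency matrix, which is not the anti-section approach of the thesis; the function name `transitive_closure` and the example graph are hypothetical.

```python
def transitive_closure(adj):
    """Return the reachability matrix of a directed graph.

    adj[i][j] is True when there is an edge i -> j.
    Textbook Warshall algorithm, O(n^3); NOT the anti-section method.
    """
    n = len(adj)
    reach = [row[:] for row in adj]  # copy so the input is left untouched
    for k in range(n):               # allow k as an intermediate node
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


# Hypothetical example: edges 0 -> 1 and 1 -> 2; node 3 is isolated.
graph = [
    [False, True, False, False],
    [False, False, True, False],
    [False, False, False, False],
    [False, False, False, False],
]
closure = transitive_closure(graph)
```

Here `closure[0][2]` becomes true because of the path 0 -> 1 -> 2, while the row and column of the isolated node stay empty. A dense cubic method like this is exactly what becomes impractical on large sparse graphs, which is the setting the thesis targets.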

    Graph Kernels

    We present a unified framework to study graph kernels, special cases of which include the random walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004; Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time complexity of kernel computation between unlabeled graphs with n vertices from O(n^6) to O(n^3). We find a spectral decomposition approach even more efficient when computing entire kernel matrices. For labeled graphs we develop conjugate gradient and fixed-point methods that take O(dn^3) time per iteration, where d is the size of the label set. By extending the necessary linear algebra to Reproducing Kernel Hilbert Spaces (RKHS) we obtain the same result for d-dimensional edge kernels, and O(n^4) in the infinite-dimensional case; on sparse graphs these algorithms only take O(n^2) time per iteration in all cases. Experiments on graphs from bioinformatics and other application domains show that these techniques can speed up computation of the kernel by an order of magnitude or more. We also show that certain rational kernels (Cortes et al., 2002, 2003, 2004) when specialized to graphs reduce to our random walk graph kernel. Finally, we relate our framework to R-convolution kernels (Haussler, 1999) and provide a kernel that is close to the optimal assignment kernel of Fröhlich et al. (2006) yet provably positive semi-definite.

    Quantum Algorithms for Matrix Products over Semirings

    In this paper we construct quantum algorithms for matrix products over several algebraic structures called semirings, including the (max,min)-matrix product, the distance matrix product and the Boolean matrix product. In particular, we obtain the following results. We construct a quantum algorithm computing the product of two n x n matrices over the (max,min) semiring with time complexity O(n^{2.473}). In comparison, the best known classical algorithm for the same problem, by Duan and Pettie, has complexity O(n^{2.687}). As an application, we obtain an O(n^{2.473})-time quantum algorithm for computing the all-pairs bottleneck paths of a graph with n vertices, while classically the best upper bound for this task is O(n^{2.687}), again by Duan and Pettie. We construct a quantum algorithm computing the L most significant bits of each entry of the distance product of two n x n matrices in time O(2^{0.64L} n^{2.46}). In comparison, prior to the present work, the best known classical algorithm for the same problem, by Vassilevska and Williams and Yuster, had complexity O(2^{L} n^{2.69}). Our techniques lead to further improvements for classical algorithms as well, reducing the classical complexity to O(2^{0.96L} n^{2.69}), which gives a sublinear dependency on 2^L. The above two algorithms are the first quantum algorithms that perform better than the Õ(n^{5/2})-time straightforward quantum algorithm based on quantum search for matrix multiplication over these semirings. We also consider the Boolean semiring, and construct a quantum algorithm computing the product of two n x n Boolean matrices that outperforms the best known classical algorithms for sparse matrices. For instance, if the input matrices have O(n^{1.686...}) non-zero entries, then our algorithm has time complexity O(n^{2.277}), while the best classical algorithm has complexity O(n^{2.373}). Comment: 19 pages
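To make the (max,min) product and its link to all-pairs bottleneck paths concrete, here is a minimal classical sketch. It uses the naive cubic product, not the quantum algorithm or the Duan and Pettie algorithm mentioned in the abstract; the function names, the convention that capacity 0 means "no edge", and the example capacities are assumptions made for illustration.

```python
import math


def maxmin_product(a, b):
    """Naive (max,min) product: c[i][j] = max over k of min(a[i][k], b[k][j])."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def bottleneck_paths(cap):
    """All-pairs bottleneck capacities by repeated (max,min) squaring.

    cap[i][j] is the edge capacity, with 0 meaning "no edge" (assumption).
    The diagonal is set to infinity so that shorter paths are kept
    when the matrix is squared.
    """
    n = len(cap)
    best = [
        [math.inf if i == j else cap[i][j] for j in range(n)]
        for i in range(n)
    ]
    length = 1                      # best covers paths of up to `length` edges
    while length < n - 1:
        best = maxmin_product(best, best)
        length *= 2
    return best


# Hypothetical capacities: 0 -> 1 (5), 1 -> 2 (4), 0 -> 2 (1).
capacities = [
    [0, 5, 1],
    [0, 0, 4],
    [0, 0, 0],
]
result = bottleneck_paths(capacities)
```

In this example the direct edge 0 -> 2 has capacity 1, but the route through node 1 has bottleneck min(5, 4) = 4, so `result[0][2]` is 4. Each squaring costs O(n^3) here; the complexities quoted in the abstract concern much faster ways of computing the same product.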

    Provenance à base de semi-anneaux pour les bases de données graphes

    The growing amount of data collected by sensors or generated by human interaction has led to an increasing use of graph databases, an efficient model for representing intricate data. Techniques to keep track of the history of computations applied to the data inside classical relational database systems are also topical because of their application to enforcing data protection regulations (e.g., GDPR). Our research work combines the two by considering a semiring-based provenance model for navigational queries over graph databases. We first present a comprehensive survey of semiring theory and its applications in different fields of computer science, geared towards their relevance for our context. From the richness of the literature, we notably obtain a lower bound for the complexity of the full provenance computation in our setting. In a second part, we focus on the model itself by introducing a toolkit of provenance-aware algorithms, each targeting specific properties of the semiring in use. We notably introduce a new method based on lattice theory that permits efficient provenance computation for complex graph queries. We propose an open-source implementation of the above-mentioned algorithms, and we conduct an experimental study over large real transportation networks, demonstrating the efficiency of our approach in practical scenarios. We finally consider how this framework is positioned compared to other provenance models such as the semiring-based Datalog provenance model. We make explicit how the methods we applied to graph databases can be extended to Datalog queries, and we show how they can be seen as an extension of the semi-naïve evaluation strategy. To leverage this fact, we extend the capabilities of Soufflé, a state-of-the-art Datalog solver, to design an efficient provenance-aware Datalog evaluator. Experimental results based on our open-source implementation show that this approach stays competitive with dedicated graph solutions, despite being more general. Finally, we discuss some research ideas for improving the model, and state open questions raised by our work.