18 research outputs found

    On Defining SPARQL with Boolean Tensor Algebra

    The Resource Description Framework (RDF) represents information as subject-predicate-object triples. These triples are commonly interpreted as a directed labelled graph. We propose an alternative approach, interpreting the data as a 3-way Boolean tensor. We show how SPARQL queries - the standard queries for RDF - can be expressed as elementary operations in Boolean algebra, giving us a complete re-interpretation of RDF and SPARQL. We show how the Boolean tensor interpretation allows for new optimizations and analyses of the complexity of SPARQL queries. For example, estimating the size of the results for different join queries becomes much simpler.
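
    The tensor view described in this abstract can be illustrated with a small sketch. It is not the paper's formalization: the toy vocabulary, the variable names, and the choice of NumPy are all assumptions made for illustration. It stores triples in a 3-way Boolean array, answers a single triple pattern by fixing the predicate mode, and evaluates a two-pattern join as a Boolean matrix product, which corresponds to projecting out the shared variable with distinct results.

```python
import numpy as np

# Hypothetical toy vocabulary; names and index order are assumptions.
entities = ["alice", "bob", "carol"]
predicates = ["knows", "worksWith"]
e = {name: i for i, name in enumerate(entities)}
p = {name: i for i, name in enumerate(predicates)}

triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksWith", "carol"),
]

# T[s, p, o] is True exactly when the triple (s, p, o) is in the data.
T = np.zeros((len(entities), len(predicates), len(entities)), dtype=bool)
for s, pred, o in triples:
    T[e[s], p[pred], e[o]] = True

# Triple pattern (?x knows ?y): fix the predicate mode to get a Boolean matrix.
knows = T[:, p["knows"], :]

# Join (?x knows ?z) . (?z knows ?y), keeping only ?x and ?y:
# a Boolean matrix product, i.e. an OR over ?z of ANDs.
two_hop = (knows.astype(int) @ knows.astype(int)) > 0

pairs = [(entities[i], entities[j]) for i, j in zip(*np.nonzero(two_hop))]
print(pairs)
```

    In this representation the number of distinct results of the projected join is simply the number of True entries in `two_hop`.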

    Clustering Boolean Tensors

    Tensor factorizations are computationally hard problems, and in particular, are often significantly harder than their matrix counterparts. In the case of Boolean tensor factorizations -- where the input tensor and all the factors are required to be binary and we use Boolean algebra -- much of that hardness comes from the possibility of overlapping components. Yet, in many applications we are perfectly happy to partition at least one of the modes. In this paper we investigate what consequences this partitioning has on the computational complexity of Boolean tensor factorizations and present a new algorithm for the resulting clustering problem. This algorithm can alternatively be seen as a particularly regularized clustering algorithm that can handle extremely high-dimensional observations. We analyse our algorithms with the goal of maximizing the similarity and argue that this is more meaningful than minimizing the dissimilarity. As a by-product we obtain a PTAS and an efficient 0.828-approximation algorithm for rank-1 binary factorizations. Our algorithm for Boolean tensor clustering achieves high scalability, high similarity, and good generalization to unseen data with both synthetic and real-world data sets.
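
    The partitioned-mode model and the similarity objective mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the function names `reconstruct` and `similarity`, the example factors, and the reading that each slice of the partitioned mode is represented by one rank-1 Boolean outer product are all assumptions.

```python
import numpy as np

def reconstruct(A, B, assignment):
    """Build a Boolean tensor whose l-th frontal slice is the rank-1
    Boolean outer product of columns A[:, c] and B[:, c], c = assignment[l].
    (Assumed model: the third mode is partitioned into clusters.)"""
    slices = [A[:, c][:, None] & B[:, c][None, :] for c in assignment]
    return np.stack(slices, axis=2)

def similarity(X, Y):
    """Number of entries on which two Boolean tensors agree."""
    return int(np.sum(X == Y))

# Hypothetical factors: 3 x 2 and 2 x 2, with two clusters.
A = np.array([[1, 0], [1, 0], [0, 1]], dtype=bool)
B = np.array([[1, 0], [0, 1]], dtype=bool)
assignment = [0, 1, 0]  # cluster index for each of the three slices

R = reconstruct(A, B, assignment)

# A data tensor that differs from the reconstruction in a single entry.
X = R.copy()
X[0, 1, 0] = ~X[0, 1, 0]

print(R.shape, similarity(X, R))
```

    Counting agreements rather than disagreements is one way to read "maximizing the similarity"; the two counts always sum to the number of tensor entries.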

    Preface

    7th International Conference on Similarity Search and Applications (SISAP). Los Cabos, México. 29-31 October 2014

    Structural building blocks in graph data : characterised by hyperbolic communities and uncovered by Boolean tensor clustering

    Graph data nowadays easily become so large that it is infeasible to study the underlying structures manually. Thus, computational methods are needed to uncover large-scale structural information. In this thesis, we present methods to understand and summarise large networks. We propose the hyperbolic community model to describe groups of more densely connected nodes within networks using very intuitive parameters. The model accounts for a frequent connectivity pattern in real data: a few community members are highly interconnected; most members mainly have ties to this core. Our model fits real data much better than previously proposed models. Our corresponding random graph generator, HyGen, creates graphs with realistic intra-community structure. Using the hyperbolic model, we conduct a large-scale study of the temporal evolution of communities on online question–answer sites. We observe that the user activity within a community is constant with respect to its size throughout its lifetime, and a small group of users is responsible for the majority of the social interactions. We propose an approach for Boolean tensor clustering. This special tensor factorisation is restricted to binary data and assumes that one of the tensor directions has only non-overlapping factors. These assumptions – valid for many real-world data, in particular time-evolving networks – enable the use of bitwise operators and lift much of the computational complexity from the task.
    Networks today are often so large and complex that manual analysis is not enough to understand them. Identifying underlying structures at large scale requires computational methods. Our model for hyperbolic communities describes the inner structure of tightly connected groups of nodes in networks using very intuitive parameters. It is based on the observation that often a small part of a group's nodes is tightly interconnected, while the majority of the group's members only have connections to this centre. Our model represents real data better than previous models. The corresponding random graph generator, HyGen, produces graphs with realistic intra-community structures. Using our model, we analyse the formation of communities in online question-and-answer networks. We observe that the activity of the members is constant over time relative to the size of the respective community. In addition, a small group of members is always responsible for the majority of the activity. We propose a method for Boolean tensor clustering. This special tensor factorisation is restricted to binary data, and we assume that there is no significant overlap of the factors along one direction of the tensor. These assumptions enable the use of bit operations, considerably reduce the computational effort, and fit well with what can be observed in real data.
    Max-Planck-Institut für Informati

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Foundations of Software Science and Computation Structures

    This open access book constitutes the proceedings of the 25th International Conference on Foundations of Software Science and Computation Structures, FOSSACS 2022, which was held during April 4-6, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 23 regular papers presented in this volume were carefully reviewed and selected from 77 submissions. They deal with research on theories and methods to support the analysis, integration, synthesis, transformation, and verification of programs and software systems.
