12 research outputs found

    Gray code order for Lyndon words

    At the 4th Conference on Combinatorics on Words, Christophe Reutenauer posed the question of whether the dual reflected order yields a Gray code on the Lyndon family. In this paper we give a positive answer. More precisely, we present an O(1)-average-time algorithm for generating length n binary pre-necklaces, necklaces and Lyndon words in Gray code order.

    Multi-threading a state-of-the-art maximum clique algorithm

    We present a threaded parallel adaptation of a state-of-the-art maximum clique algorithm for dense, computationally challenging graphs. We show that near-linear speedups are achievable in practice and that superlinear speedups are common. We include results for several previously unsolved benchmark problems

    Secret Sharing Schemes Based on Resilient Boolean Maps

    We introduce a linear code based on resilient maps on vector spaces over finite fields, give a basis of this code, and derive upper and lower bounds for its minimum distance. We then explain how the code can be used to build vector space secret sharing schemes and estimate the robustness of these schemes against cheaters.

    Learning Combinatorial Node Labeling Algorithms

    We present a graph neural network to learn graph coloring heuristics using reinforcement learning. Our learned deterministic heuristics give better solutions than classical degree-based greedy heuristics and take only seconds to evaluate on graphs with tens of thousands of vertices. As our approach is based on policy gradients, it also learns a probabilistic policy. These probabilistic policies outperform all greedy coloring baselines and a machine learning baseline. Our approach generalizes several previous machine-learning frameworks, which were applied to problems like minimum vertex cover. We also demonstrate that our approach outperforms two greedy heuristics on minimum vertex cover.
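    For orientation, a classical degree-based greedy coloring heuristic of the kind the abstract uses as a baseline can be sketched as follows. This is a minimal illustration, not the paper's learned method; the function name `greedy_coloring`, the adjacency-dict input format and the tie-breaking rule are assumptions made for the example.

```python
def greedy_coloring(adj):
    """Largest-degree-first greedy coloring.

    adj: dict mapping each vertex to the set of its neighbours
         (undirected graph, no self-loops). Hypothetical input format.
    Returns a dict vertex -> colour index (0, 1, 2, ...).
    """
    # Visit vertices in order of decreasing degree; ties broken by vertex id.
    order = sorted(adj, key=lambda v: (-len(adj[v]), v))
    color = {}
    for v in order:
        # Colours already taken by coloured neighbours of v.
        used = {color[u] for u in adj[v] if u in color}
        # Assign the smallest colour not used by any neighbour.
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# Example: a cycle on five vertices (odd cycle, so three colours are needed).
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = greedy_coloring(cycle5)
print(coloring)
```

    The heuristic always returns a proper coloring, but the number of colours depends on the visiting order, which is exactly the choice a learned policy would replace.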

    Replicable parallel branch and bound search

    Combinatorial branch and bound searches are a common technique for solving global optimisation and decision problems. Their performance often depends on good search order heuristics, refined over decades of algorithms research. Parallel search necessarily deviates from the sequential search order, sometimes dramatically and unpredictably, e.g. by distributing work at random. This can disrupt effective search order heuristics and lead to unexpected and highly variable parallel performance. The variability makes it hard to reason about the parallel performance of combinatorial searches. This paper presents a generic parallel branch and bound skeleton, implemented in Haskell, with replicable parallel performance. The skeleton aims to preserve the search order heuristic by distributing work in an ordered fashion, closely following the sequential search order. We demonstrate the generality of the approach by applying the skeleton to 40 instances of three combinatorial problems: Maximum Clique, 0/1 Knapsack and Travelling Salesperson. The overheads of our Haskell skeleton are reasonable, with slowdown factors of between 1.9 and 6.2 compared with a class-leading, dedicated, and highly optimised C++ Maximum Clique solver. We demonstrate scaling up to 200 cores of a Beowulf cluster, achieving speedups of 100x for several Maximum Clique instances. We demonstrate low variance of parallel performance across all instances of the three combinatorial problems and at all scales up to 200 cores, with median Relative Standard Deviation (RSD) below 2%. Parallel solvers that do not follow the sequential search order exhibit far higher variance, with median RSD exceeding 85% for Knapsack.
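    To make the role of the search order heuristic concrete, here is a minimal sequential branch and bound sketch for Maximum Clique, one of the three problems named above. It is written in Python rather than Haskell and is not the paper's skeleton; the name `max_clique`, the adjacency-dict input and the degree-based ordering are assumptions chosen for illustration.

```python
def max_clique(adj):
    """Sequential branch and bound for Maximum Clique.

    adj: dict mapping each vertex to the set of its neighbours
         (undirected graph, no self-loops). Hypothetical input format.
    Returns one maximum clique as a list of vertices.
    """
    best = []  # incumbent: largest clique found so far

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = clique[:]
        # Search order heuristic: branch on high-degree vertices first.
        for v in sorted(candidates, key=lambda u: (-len(adj[u]), u)):
            # Bound: even adding every remaining candidate cannot beat
            # the incumbent, so prune the rest of this subtree.
            if len(clique) + len(candidates) <= len(best):
                return
            clique.append(v)
            # Only common neighbours of the clique remain candidates.
            expand(clique, candidates & adj[v])
            clique.pop()
            # v has been fully explored at this node; drop it.
            candidates = candidates - {v}

    expand([], set(adj))
    return best


# Example: a complete graph on {0, 1, 2, 3} with a path 3 - 4 - 5 attached.
graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
print(sorted(max_clique(graph)))
```

    How early a good incumbent is found determines how much the bound can prune, which is why a parallel search that visits subtrees in a different order can behave very differently from the sequential one.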

    Certifying Solvers for Clique and Maximum Common (Connected) Subgraph Problems

    An algorithm is said to be certifying if it outputs, together with a solution to the problem it solves, a proof that this solution is correct. We explain how state-of-the-art maximum clique, maximum weighted clique, maximal clique enumeration and maximum common (connected) induced subgraph algorithms can be turned into certifying solvers by using pseudo-Boolean models and cutting planes proofs, and demonstrate that this approach can also handle reductions between problems. The generality of our results suggests that this method is ready for widespread adoption in solvers for combinatorial graph problems.

    Preserving user privacy in social media data processing

    Social media data is used for analytics, for example in science, by public authorities and in industry. Privacy is often treated as a secondary problem, yet both law and ethics demand that the privacy of social media users be protected. To prevent subsequent abuse, theft or public exposure of collected datasets, privacy-aware data processing is crucial. This dissertation presents a concept for processing social media data with users' privacy in mind. It features a data storage concept based on the cardinality estimator HyperLogLog: individual items cannot be extracted from the stored data; only the cardinality of items within a certain set can be estimated, and set operations can be run over multiple sets to extend the analytical range. Applying this method requires defining the scope of the result before any data is gathered. This prevents the data from being misused for other purposes at a later point in time and thus follows privacy-by-design principles. The work further shows methods to increase privacy through the implementation of abstraction layers. An included case study demonstrates that the presented methods are suitable for application in the field.

    Contents:
    1 Introduction: 1.1 Problem; 1.2 Research objectives; 1.3 Document structure
    2 Related work: 2.1 The notion of privacy; 2.2 Privacy by design; 2.3 Differential privacy; 2.4 Geoprivacy; 2.5 Probabilistic Data Structures
    3 Concept and methods: 3.1 Collateral data; 3.2 Disposable data; 3.3 Cardinality estimation; 3.4 Data precision; 3.5 Extendability; 3.6 Abstraction; 3.7 Time consideration
    4 Summary of publications: 4.1 HyperLogLog Introduction; 4.2 VOST Case Study; 4.3 Real-time Streaming; 4.4 Abstraction Layers; 4.5 VGIscience Book Chapter; 4.6 Supplementary Software Materials
    5 Discussion: 5.1 Prevent accidental data disclosure; 5.2 Feasibility in the field; 5.3 Adjustability for different use cases; 5.4 Limitations of HLL; 5.5 Security; 5.6 Outlook and further research
    6 Conclusion
    Appendix
    References
    Publication
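    The idea of storing only a cardinality sketch, and combining sketches with set operations, can be illustrated with a minimal HyperLogLog. This is a generic textbook-style sketch, not the dissertation's implementation; the class name `HyperLogLog`, the SHA-1 based hash, the precision `p=10` and the `user-…` identifiers are all assumptions made for the example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch with 2**p registers (illustrative only)."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p              # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash taken from the first 8 bytes of SHA-1 (an assumption;
        # any well-mixed 64-bit hash would do).
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                  # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)     # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw

    def union(self, other):
        # The sketch of a set union is the register-wise maximum.
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b)
                            for a, b in zip(self.registers, other.registers)]
        return merged


# Example: two overlapping sets of hypothetical user identifiers.
monday, tuesday = HyperLogLog(), HyperLogLog()
for i in range(6000):
    monday.add(f"user-{i}")
for i in range(4000, 10000):
    tuesday.add(f"user-{i}")
both_days = monday.union(tuesday)
print(round(both_days.estimate()))  # an estimate of the 10000 distinct users
```

    Only the register array is stored, so the original identifiers cannot be read back from the sketch; the estimate carries a relative standard error of roughly 1.04/sqrt(m), about 3% for the 1024 registers used here.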

    Efficient parameterized algorithms on structured graphs

    In classical complexity theory, the worst-case running times of algorithms are typically stated as a function of the input size alone. In parameterized complexity, the goal is to refine the analysis of the running time of an algorithm by additionally considering a parameter that measures some kind of structure in the input. A parameterized algorithm then exploits the structure described by the parameter and achieves a running time that is faster than the best general (unparameterized) algorithm for instances of low parameter value. In the first part of this thesis, we continue in this direction and investigate the influence of several parameters on the running times of well-known tractable problems. Several of the presented algorithms are adaptive, meaning that they match the running time of a best unparameterized algorithm for worst-case parameter values. Thus, an adaptive parameterized algorithm is asymptotically never worse than the best unparameterized algorithm, while it already outperforms the best general algorithm for slightly non-trivial parameter values. As illustrated in the first part of this thesis, for many problems there exist efficient parameterized algorithms with respect to multiple parameters, each describing a different kind of structure. In the second part of this thesis, we explore how to combine such homogeneous structures into more general, heterogeneous structures. Using algebraic expressions, we define new combined graph classes of heterogeneous structure in a clean and robust way. We showcase this for the heterogeneous merge of the parameters tree-depth and modular-width by presenting parameterized algorithms on such heterogeneous graph classes whose running times match the homogeneous cases throughout.

    Optimization Methods for Cluster Analysis in Network-based Data Mining

    This dissertation focuses on two optimization problems that arise in network-based data mining, concerning identification of basic community structures (clusters) in graphs: the maximum edge weight clique and maximum induced cluster subgraph problems. We propose a continuous quadratic formulation for the maximum edge weight clique problem, and establish the correspondence between its local optima and maximal cliques in the graph. Subsequently, we present a combinatorial branch-and-bound algorithm for this problem that takes advantage of a polynomial-time solvable nonconvex relaxation of the proposed formulation. We also introduce a linear-time-computable analytic upper bound on the clique number of a graph, as well as a new method of upper-bounding the maximum edge weight clique problem, which leads to another exact algorithm for this problem. For the maximum induced cluster subgraph problem, we present the results of a comprehensive polyhedral analysis. We derive several families of facet-defining valid inequalities for the IUC polytope associated with a graph. We also provide a complete description of this polytope for some special classes of graphs. We establish the computational complexity of the separation problems for most of the considered families of valid inequalities, and explore the effectiveness of employing the corresponding cutting planes in an integer (linear) programming framework for the maximum induced cluster subgraph problem.