    Space-Efficient Data Structures in the Word-RAM and Bitprobe Models

    This thesis studies data structures in the word-RAM and bitprobe models, with an emphasis on space efficiency. In the word-RAM model of computation the space cost of a data structure is measured in terms of the number of w-bit words stored in memory, and the cost of answering a query is measured in terms of the number of read, write, and arithmetic operations that must be performed. In the bitprobe model, like the word-RAM model, the space cost is measured in terms of the number of bits stored in memory, but the query cost is measured solely in terms of the number of bit accesses, or probes, that are performed. First, we examine the problem of succinctly representing a partially ordered set, or poset, in the word-RAM model with word size Theta(lg n) bits. A succinct representation of a combinatorial object is one that occupies space matching the information-theoretic lower bound to within lower-order terms. We show how to represent a poset on n vertices using a data structure that occupies n^2/4 + o(n^2) bits and can answer precedence (i.e., less-than) queries in constant time. Since the transitive closure of a directed acyclic graph is a poset, this implies that we can support reachability queries on an arbitrary directed graph in the same space bound. As far as we are aware, this is the first representation of an arbitrary directed graph that supports reachability queries in constant time and stores fewer than n choose 2 bits. We also consider several additional query operations. Second, we examine the problem of supporting range queries on strings of n characters (or, equivalently, arrays of n elements) in the word-RAM model with word size Theta(lg n) bits. We focus on the specific problem of answering range majority queries: given a range, report the character that is the majority among those in the range, if one exists. We show that these queries can be supported in constant time using a linear-space (in words) data structure.
We generalize this result in several directions, considering various frequency thresholds, geometric variants of the problem, and dynamism. These results are in stark contrast to recent work on the similar range mode problem, in which the query operation asks for the mode (i.e., most frequent) character in a given range. The current best data structures for the range mode problem take soft-Oh(n^(1/2)) time per query for linear-space data structures. Third, we examine the deterministic membership (or dictionary) problem in the bitprobe model. This problem asks us to store a set of n elements drawn from a universe [1,u] such that membership queries can always be answered in t bit probes. We present several new fully explicit results for this problem, in particular for the case when n = 2, answering an open problem posed by Radhakrishnan, Shah, and Shannigrahi [ESA 2010]. We also present a general strategy for the membership problem that can be used to solve many related fundamental problems, such as rank, counting, and emptiness queries. Finally, we conclude with a list of open problems and avenues for future work.
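
To make the range majority query concrete, here is a minimal Python sketch of a naive baseline. It is not the constant-time, linear-space structure the thesis describes: it scans the range on every query using Boyer-Moore voting followed by a verification pass. The function name `range_majority` and the inclusive index convention are assumptions made for illustration.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def range_majority(s: Sequence[T], i: int, j: int) -> Optional[T]:
    """Return the majority element of s[i..j] (inclusive), or None if none exists.

    Naive baseline: O(j - i + 1) time per query and no preprocessing.
    """
    candidate, count = None, 0
    # Pass 1: Boyer-Moore voting leaves the only possible majority as candidate.
    for k in range(i, j + 1):
        if count == 0:
            candidate, count = s[k], 1
        elif s[k] == candidate:
            count += 1
        else:
            count -= 1
    # Pass 2: confirm the candidate occurs in more than half of the positions.
    length = j - i + 1
    occurrences = sum(1 for k in range(i, j + 1) if s[k] == candidate)
    return candidate if 2 * occurrences > length else None
```

A query such as `range_majority("abaab", 0, 4)` inspects every character in the range, which is exactly the per-query cost that a preprocessed data structure is meant to avoid.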

    Efficient computation of rank probabilities in posets

    As the title indicates, the central theme of this work is the computation of rank probabilities of posets. Since the probability space consists of the set of all linear extensions of a given poset equipped with the uniform probability measure, we first develop algorithms to explore this probability space efficiently. We consider in particular the problem of counting the number of linear extensions and the ability to generate extensions uniformly at random. Algorithms based on the lattice of ideals representation of a poset are developed. Since a weak order extension of a poset can be regarded as an order on the equivalence classes of a partition of the given poset not contradicting the underlying order, and thus as a generalization of the concept of a linear extension, algorithms are developed to count and generate weak order extensions uniformly at random as well. However, in order to reduce the inherent complexity of the problem, the cardinalities of the equivalence classes are fixed a priori. Due to the exponential nature of these algorithms this approach is still not always feasible, in which case one must resort to approximate algorithms. It is well known that Markov chain Monte Carlo methods can be used to generate linear extensions uniformly at random, but no such approaches have been used to generate weak order extensions. Therefore, an algorithm that can be used to sample weak order extensions uniformly at random is introduced. A monotone assignment of labels to objects from a poset corresponds to the choice of a weak order extension of the poset. Since the random monotone assignment of such labels is a step in the generation process of random monotone data sets, the ability to generate random weak order extensions is of great importance. The contributions from this part therefore prove useful, for example, in the field of supervised classification, where there is a need for synthetic random monotone data sets.
The second part focuses on the ranking of the elements of a partially ordered set. Algorithms for the computation of the (mutual) rank probabilities that avoid having to enumerate all linear extensions are suggested and applied to a real-world data set containing pollution data of several regions in Baden-Württemberg (Germany). With the emergence of several initiatives aimed at protecting the environment, like the REACH (Registration, Evaluation, Authorisation and Restriction of Chemicals) project of the European Union, the need for objective methods to rank chemicals, regions, etc. on the basis of several criteria continues to grow. Additionally, an interesting relation between the mutual rank probabilities and the average rank probabilities is proven. The third and last part studies the transitivity properties of the mutual rank probabilities and the closely related linear extension majority cycles, or LEM cycles for short. The type of transitivity is translated into the cycle-transitivity framework, which has been tailor-made for characterizing transitivity of reciprocal relations, and is proven to be situated between strong stochastic transitivity and a new type of transitivity called delta*-transitivity. It is shown that the latter type is situated between strong stochastic transitivity and a kind of product transitivity. Furthermore, theoretical upper bounds for the minimum cutting level to avoid LEM cycles are found. Cutting levels for posets on up to 13 elements are obtained experimentally and a theoretical lower bound for the cutting level to avoid LEM cycles of length 4 is computed. The research presented in this work has been published in international peer-reviewed journals and has been presented at international conferences. A Java implementation of several of the algorithms presented in this work, as well as binary files containing all posets on up to 13 elements with LEM cycles, can be downloaded from the website http://www.kermit.ugent.be
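
The idea of counting linear extensions by working over ideals (down-closed subsets) rather than enumerating every extension can be sketched in Python as follows. This is an illustrative dynamic program under stated assumptions, not the Java implementation mentioned above: elements are assumed to be labelled 0..n-1, ideals are encoded as bitmasks, and the name `count_linear_extensions` is hypothetical.

```python
from functools import lru_cache
from typing import Sequence, Tuple


def count_linear_extensions(n: int, relations: Sequence[Tuple[int, int]]) -> int:
    """Count the linear extensions of a poset on elements 0..n-1.

    Each pair (a, b) in relations means a < b; the list need not be
    transitively closed. Memoising on ideals bounds the work by the number
    of ideals times n, instead of by the number of linear extensions.
    """
    # preds[b] is a bitmask of the elements that must be placed before b.
    preds = [0] * n
    for a, b in relations:
        preds[b] |= 1 << a
    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def completions(ideal: int) -> int:
        # Number of ways to extend the already placed ideal to all n elements.
        if ideal == full:
            return 1
        total = 0
        for x in range(n):
            bit = 1 << x
            # x may come next if it is unplaced and all its predecessors are placed.
            if not ideal & bit and preds[x] & ideal == preds[x]:
                total += completions(ideal | bit)
        return total

    return completions(0)
```

The number of ideals can itself be exponential in n (an antichain has 2^n of them), which is consistent with the remark above that exact approaches are not always feasible.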

    Static Value Analysis over C Programs

    Value-range analysis is a static analysis technique based on determining the values that a variable may take at a given program point. It can be used to prove the absence of run-time errors such as out-of-bounds array accesses. Since value-range analysis collects information about each program point, data-flow analysis can be used to implement it. The main goal of this thesis is to design and implement a working value-range analysis tool. The work begins with an introduction to the topic, an explanation of data-flow and value-range analyses, and a description of abstract interpretation, which provides the formal basis of the analyser. This is followed by an introduction to the Code Listener infrastructure, which was used to implement the analyser. The core of this work is the design, implementation, and testing of the analyser. The conclusion summarizes the experience gained and discusses possible future development of the tool.
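
As a rough illustration of what a value-range analysis tracks, the following Python sketch models an interval abstract domain of the kind commonly used in abstract interpretation. It is not the analyser described above; the class `Interval` and its methods `join`, `add`, `widen`, and `within` are hypothetical names chosen for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract value: every concrete value v is assumed to satisfy lo <= v <= hi."""

    lo: float
    hi: float

    def join(self, other: "Interval") -> "Interval":
        # Merge the facts arriving from two control-flow paths.
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def add(self, other: "Interval") -> "Interval":
        # Abstract counterpart of the concrete expression x + y.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def widen(self, newer: "Interval") -> "Interval":
        # Jump unstable bounds to infinity so loop analysis terminates.
        lo = self.lo if newer.lo >= self.lo else -math.inf
        hi = self.hi if newer.hi <= self.hi else math.inf
        return Interval(lo, hi)

    def within(self, lo: float, hi: float) -> bool:
        # True if every value in this interval lies in [lo, hi].
        return lo <= self.lo and self.hi <= hi
```

If an index is 0 on one branch and 5 on another, `Interval(0, 0).join(Interval(5, 5))` yields the range [0, 5], and `within(0, 7)` then shows that an access into an 8-element array is in bounds. Adding an offset in [0, 4] widens the range to [0, 9], so the same check fails and the access would be reported as potentially out of bounds.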

    Generalizations of comparability graphs

    2022 Summer. Includes bibliographical references. In rational decision-making models, transitivity of preferences is an important principle. In a transitive preference, one who prefers x to y and y to z must prefer x to z. Many preference relations, including total orders, weak orders, partial orders, and semiorders, are transitive. As transitive preferences in which not all pairs of elements need be comparable, partial orders have been studied extensively. In graph theory, a comparability graph is an undirected graph which connects all comparable elements in a partial order. A transitive orientation is an assignment of direction to every edge so that the resulting directed graph is transitive. A graph is transitive if it admits such an assignment. Comparability graphs are a class of graphs on which clique, coloring, and many other optimization problems can be solved by polynomial-time algorithms. They also have close connections with other classes of graphs, such as interval graphs, permutation graphs, and perfect graphs. In this dissertation, we define new measures of transitivity to generalize comparability graphs. We introduce the concept of double threshold digraphs together with a parameter λ, which we define as our degree of transitivity. We also define another measure of transitivity, β, as the longest directed path such that there is no edge from the first vertex to the last vertex. We present approximation algorithms and parameterized algorithms for optimization problems and demonstrate that they are efficient for "almost-transitive" preferences.
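
The definition of a transitive orientation can be checked directly on small graphs. The Python sketch below is a brute-force illustration only: it tries every orientation and is exponential in the number of edges, so it is not one of the polynomial algorithms referred to above and does not compute λ or β. The function names `is_transitive` and `has_transitive_orientation` are hypothetical.

```python
from itertools import product
from typing import Dict, Iterable, Set, Tuple

Arc = Tuple[int, int]


def is_transitive(arcs: Iterable[Arc]) -> bool:
    """True if arcs u->v and v->w always imply that the arc u->w is present."""
    arc_set = set(arcs)
    successors: Dict[int, Set[int]] = {}
    for u, v in arc_set:
        successors.setdefault(u, set()).add(v)
    return all(
        (u, w) in arc_set
        for u, v in arc_set
        for w in successors.get(v, ())
    )


def has_transitive_orientation(edges: Iterable[Arc]) -> bool:
    """Brute force: try all 2^m orientations of the m undirected edges."""
    edge_list = list(edges)
    for flips in product((False, True), repeat=len(edge_list)):
        arcs = [(v, u) if flip else (u, v) for (u, v), flip in zip(edge_list, flips)]
        if is_transitive(arcs):
            return True
    return False
```

On a 4-cycle the check succeeds, because the vertices can alternate between sources and sinks, whereas on a 5-cycle no such alternation exists and the check fails.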

    9th International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI 2021)

    Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at classification and knowledge discovery that can be used for many purposes in Artificial Intelligence (AI). The objective of the ninth edition of the FCA4AI workshop (see http://www.fca4ai.hse.ru/) is to investigate several issues such as: how FCA can support various AI activities (knowledge discovery, knowledge engineering, machine learning, data mining, information retrieval, recommendation...), how FCA can be extended in order to help AI researchers solve new and complex problems in their domains, and how FCA can play a role in current trends in AI such as explainable AI and fairness of algorithms in decision making. The workshop was held in co-location with IJCAI 2021, Montréal, Canada, August 28, 2021.

    Subject Index Volumes 1–200

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume
