410 research outputs found

    Markov-Chain-Based Heuristics for the Feedback Vertex Set Problem for Digraphs

    A feedback vertex set (FVS) of an undirected or directed graph G=(V, A) is a set F of vertices such that G-F is acyclic. The minimum feedback vertex set problem asks for an FVS of G of minimum cardinality, whereas the weighted minimum feedback vertex set problem consists of determining an FVS F of minimum weight w(F) given a real-valued weight function w. Both problems are NP-hard [Karp72]. Nevertheless, they have applications in many fields, so one is naturally interested in approximation algorithms. While most existing approximation algorithms for feedback vertex set problems rely only on local properties of G, this thesis explores strategies that use global information about G to determine good solutions. The pioneering work in this direction is due to Speckenmeyer [Speckenmeyer89], who demonstrated the use of Markov chains for determining low-cardinality FVSs. Based on his ideas, new approximation algorithms are developed for both the unweighted and the weighted minimum feedback vertex set problem for digraphs. According to the experimental results presented in this thesis, these new algorithms outperform all other existing approximation algorithms. An additional contribution, not related to Markov chains, is the identification of a new class of digraphs G=(V, A) that permit the determination of an optimum FVS in time O(|V|^4). This class strictly encompasses the completely contractible graphs [Levy/Low88].
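
    The definition of an FVS can be checked mechanically. The following is a minimal sketch, not the thesis's Markov-chain heuristic: it only tests whether a candidate set F leaves G-F acyclic, using Kahn's topological-sort procedure. The function name `is_feedback_vertex_set` and the example digraph are illustrative assumptions, and every arc endpoint is assumed to appear in `vertices`.

```python
from collections import deque


def is_feedback_vertex_set(vertices, arcs, candidate):
    """Return True if deleting `candidate` from the digraph leaves it acyclic.

    Hypothetical helper for illustration; assumes every arc endpoint
    is listed in `vertices`.
    """
    removed = set(candidate)
    remaining = [v for v in vertices if v not in removed]
    successors = {v: [] for v in remaining}
    indegree = {v: 0 for v in remaining}
    for tail, head in arcs:
        # Arcs touching a removed vertex disappear together with it.
        if tail in removed or head in removed:
            continue
        successors[tail].append(head)
        indegree[head] += 1

    # Kahn's algorithm: repeatedly peel off vertices with no incoming arcs.
    queue = deque(v for v in remaining if indegree[v] == 0)
    peeled = 0
    while queue:
        v = queue.popleft()
        peeled += 1
        for w in successors[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)

    # The remaining graph is acyclic exactly when every vertex was peeled.
    return peeled == len(remaining)


# Example digraph with two cycles: 1 -> 2 -> 3 -> 1 and 3 <-> 4.
vertices = [1, 2, 3, 4]
arcs = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 3)]
print(is_feedback_vertex_set(vertices, arcs, {3}))  # vertex 3 lies on both cycles
print(is_feedback_vertex_set(vertices, arcs, {1}))  # the cycle 3 <-> 4 survives
```

    In this example {3} is an FVS of size one, while {1} is not, because the cycle between 3 and 4 remains.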

    Scalable Learning of Bayesian Networks Using Feedback Arc Set-Based Heuristics

    Bayesian networks form an important class of probabilistic graphical models. They consist of a structure (a directed acyclic graph) expressing conditional independencies among random variables, as well as parameters (local probability distributions). As such, Bayesian networks are generative models encoding joint probability distributions in a compact form. The main difficulty in learning a Bayesian network comes from the structure itself: owing to the combinatorial nature of the acyclicity property, it is well known that the structure learning problem is NP-hard in general. Exact algorithms solving this problem exist: dynamic programming and integer linear programming are prime contenders when one seeks to recover the structure of small-to-medium-sized Bayesian networks from data. On the other hand, heuristics such as hill-climbing variants are commonly used when attempting to approximately learn the structure of larger networks with thousands of variables, although these heuristics typically lack theoretical guarantees and their performance in practice may become unreliable in large-scale learning. This thesis is concerned with the development of scalable methods for the Bayesian network structure learning problem that attempt to maintain a level of theoretical control. This was achieved via the use of related combinatorial problems, namely the maximum acyclic subgraph problem and its dual, the minimum feedback arc set problem. Although these problems are NP-hard themselves, they are significantly more tractable in practice. This thesis explores ways to map Bayesian network structure learning into maximum acyclic subgraph instances and to extract approximate solutions for the first problem from the solutions obtained for the second. Our research suggests that although increased scalability can be achieved this way, maintaining theoretical understanding with this approach is much more challenging. Furthermore, we found that learning the structure of Bayesian networks based on maximum acyclic subgraph/minimum feedback arc set may not be the go-to method in general, but we identified a setting, linear structural equation models, in which we could experimentally validate the benefits of this approach, leading to fast and scalable structure recovery with the ability to learn complex structures competitively with state-of-the-art baselines. Doctoral thesis.

    An Efficient Semi-Streaming PTAS for Tournament Feedback Arc Set with Few Passes

    We present the first semi-streaming polynomial-time approximation scheme (PTAS) for the minimum feedback arc set problem on directed tournaments in a small number of passes. Namely, we obtain a (1 + ε)-approximation in time O(poly(n) 2^{poly(1/ε)}), with p passes, in n^{1+1/p} · poly((log n)/ε) space. The only previous algorithm with this pass/space trade-off gave a 3-approximation (SODA, 2020), and other polynomial-time algorithms which achieved a (1 + ε)-approximation did so with quadratic memory or with a linear number of passes. We also present a new time/space trade-off for 1-pass algorithms that solve the tournament feedback arc set problem. This problem has several applications in machine learning such as creating linear classifiers and doing Bayesian inference. We also provide several additional algorithms and lower bounds for related streaming problems on directed graphs, which is a largely unexplored territory.

    Linear Orderings of Sparse Graphs

    The Linear Ordering problem consists in finding a total ordering of the vertices of a directed graph such that the number of backward arcs, i.e., arcs whose heads precede their tails in the ordering, is minimized. A minimum set of backward arcs corresponds to an optimal solution to the equivalent Feedback Arc Set problem and forms a minimum Cycle Cover. Linear Ordering and Feedback Arc Set are classic NP-hard optimization problems and have a wide range of applications. Whereas both problems have been studied intensively on dense graphs and tournaments, not much is known about their structure and properties on sparser graphs. There are also only a few approximation algorithms that give performance guarantees, especially for graphs with bounded vertex degree. This thesis fills this gap in multiple respects: We establish necessary conditions for a linear ordering (and thereby also for a feedback arc set) to be optimal, which provide new and fine-grained insights into the combinatorial structure of the problem. From these, we derive a framework for polynomial-time algorithms that construct linear orderings which adhere to one or more of these conditions. The analysis of the linear orderings produced by these algorithms is especially tailored to graphs with bounded vertex degrees of three and four and improves on previously known upper bounds. Furthermore, the set of necessary conditions is used to implement exact and fast algorithms for the Linear Ordering problem on sparse graphs. In an experimental evaluation, we finally show that the property-enforcing algorithms produce linear orderings that are very close to the optimum and that the exact representative delivers solutions in a timely manner also in practice. As an additional benefit, our results can be applied to the Acyclic Subgraph problem, which is the complementary problem to Feedback Arc Set, and provide insights into the dual problem of Feedback Arc Set, the Arc-Disjoint Cycles problem.
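
    The link between a linear ordering and a feedback arc set can be made concrete with a small sketch. It only evaluates a given ordering and does not reproduce any algorithm from the thesis. The function name `backward_arcs` and the example digraph are illustrative assumptions.

```python
def backward_arcs(order, arcs):
    """Return the arcs whose head precedes their tail in `order`.

    Hypothetical helper for illustration. Deleting the returned arcs leaves
    only arcs that point forward in the ordering, so the returned set is a
    feedback arc set (not necessarily a minimum one).
    """
    position = {v: i for i, v in enumerate(order)}
    return [(tail, head) for tail, head in arcs if position[head] < position[tail]]


# Example digraph: the cycle a -> b -> c -> a plus the arc c -> d.
arcs = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]

good_order = ["a", "b", "c", "d"]
bad_order = ["d", "c", "b", "a"]
print(backward_arcs(good_order, arcs))  # only the arc closing the cycle
print(len(backward_arcs(bad_order, arcs)))
```

    The first ordering has a single backward arc, which is optimal here because the digraph contains a cycle; the reversed ordering has three.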

    Parameterized Enumeration of Neighbour Strings and Kemeny Aggregations

    In this thesis, we consider approaches to enumeration problems in the parameterized complexity setting. We obtain competitive parameterized algorithms to enumerate all, as well as several of, the solutions for two related problems: Neighbour String and Kemeny Rank Aggregation. In both problems, the goal is to find a solution that is as close as possible to a set of inputs (strings and total orders, respectively) according to some distance measure. We also introduce a notion of enumerative kernels for which there is a bijection between solutions to the original instance and solutions to the kernel, and provide such a kernel for Kemeny Rank Aggregation, improving a previous kernel for the problem. We demonstrate how several of the algorithms and notions discussed in this thesis are extensible to a group of parameterized problems, improving published results for some other problems.
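
    To illustrate what enumerating all Kemeny aggregations means, here is a brute-force sketch. It is not the parameterized algorithm from the thesis: it tries every permutation, so it is only usable for a handful of candidates. It assumes the distance measure is the Kendall tau distance (the number of candidate pairs ranked in opposite order), which is the usual choice for Kemeny Rank Aggregation; the function names and the example votes are illustrative.

```python
from itertools import combinations, permutations


def kendall_tau(order_a, order_b):
    """Count the pairs of candidates that the two total orders rank oppositely."""
    pos_a = {x: i for i, x in enumerate(order_a)}
    pos_b = {x: i for i, x in enumerate(order_b)}
    return sum(
        1
        for x, y in combinations(order_a, 2)
        if (pos_a[x] < pos_a[y]) != (pos_b[x] < pos_b[y])
    )


def kemeny_aggregations(votes):
    """Return (best_score, all total orders attaining it) by exhaustive search.

    Hypothetical helper for illustration; assumes every vote ranks the same
    set of candidates.
    """
    best_score, best = None, []
    for perm in permutations(votes[0]):
        score = sum(kendall_tau(perm, vote) for vote in votes)
        if best_score is None or score < best_score:
            best_score, best = score, [perm]
        elif score == best_score:
            best.append(perm)
    return best_score, best


# A clear majority yields a single optimal ranking.
majority_votes = [("a", "b", "c"), ("a", "b", "c"), ("b", "a", "c")]
print(kemeny_aggregations(majority_votes))

# A cyclic profile yields several optimal rankings, which is where
# enumeration of all solutions becomes relevant.
cyclic_votes = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
print(kemeny_aggregations(cyclic_votes))
```

    The exhaustive loop runs over n! permutations, which is the cost that parameterized and kernel-based approaches aim to avoid.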