16 research outputs found

    Algorithms for Exact Structure Discovery in Bayesian Networks

    Bayesian networks are compact, flexible, and interpretable representations of a joint distribution. When the network structure is unknown but there are observational data at hand, one can try to learn the network structure. This is called structure discovery. This thesis contributes to two areas of structure discovery in Bayesian networks: space--time tradeoffs and learning ancestor relations. The fastest exact algorithms for structure discovery in Bayesian networks are based on dynamic programming and use excessive amounts of space. Motivated by the space usage, several schemes for trading space against time are presented. These schemes are presented in a general setting for a class of computational problems called permutation problems; structure discovery in Bayesian networks is seen as a challenging variant of the permutation problems. The main contribution in the area of the space--time tradeoffs is the partial order approach, in which the standard dynamic programming algorithm is extended to run over partial orders. In particular, a certain family of partial orders called parallel bucket orders is considered. A partial order scheme that provably yields an optimal space--time tradeoff within parallel bucket orders is presented. Also practical issues concerning parallel bucket orders are discussed. Learning ancestor relations, that is, directed paths between nodes, is motivated by the need for robust summaries of the network structures when there are unobserved nodes at work. Ancestor relations are nonmodular features and hence learning them is more difficult than modular features. A dynamic programming algorithm is presented for computing posterior probabilities of ancestor relations exactly. 
Empirical tests suggest that ancestor relations can be learned from observational data almost as accurately as arcs even in the presence of unobserved nodes.
Algorithms for exact structure learning in Bayesian networks. Bayesian networks are probabilistic models that can be used to describe relationships between variables. A Bayesian network consists of two parts: a structure and a conditional probability distribution associated with each variable. The structure, in turn, is a directed acyclic graph that describes the dependencies between the variables. When a Bayesian network that describes the phenomenon under study well is not known in advance, but observational data have been collected on the relevant variables, suitable algorithms can be used to try to find the network structure that fits the data as well as possible. The fastest exact structure learning algorithms are based on dynamic programming: they keep intermediate results in memory and thereby avoid performing the same computations several times. Although such methods are relatively fast, their drawback is high memory usage, which prevents learning the structure of large networks. The first part of the thesis deals with structure learning algorithms that balance time and memory usage. It presents methods with which the network structure can be learned efficiently while making use of all available space. The new method makes it possible to learn the structure of larger networks than before. The method is also generalized to solve, in addition to Bayesian network structure learning, so-called permutation problems, the best known of which is probably the travelling salesman problem. The latter part of the thesis deals with learning ancestor relations between variables. These relations are of interest because they give additional information about both direct and indirect cause-effect relationships among the variables. The thesis presents an algorithm for computing the probabilities of ancestor relations. The behaviour of the algorithm is studied in practice, and it is found that ancestor relations can be learned fairly well even when several unobserved variables affect the variables in the data.
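
The subset dynamic programming that this abstract refers to can be sketched roughly as follows. This is a minimal illustration, not the thesis's algorithm: it assumes a decomposable score supplied as a hypothetical callback `local_score(v, parents)`, and the function names are invented for the example. The table `best` ends up holding one entry per subset of variables, which is the exponential space usage that motivates the space-time tradeoffs.

```python
from itertools import combinations


def best_local(v, candidates, local_score):
    """Best local score of v over all parent sets drawn from `candidates`."""
    cands = sorted(candidates)
    return max(
        local_score(v, frozenset(ps))
        for k in range(len(cands) + 1)
        for ps in combinations(cands, k)
    )


def optimal_network_score(nodes, local_score):
    """Score of an optimal DAG, by dynamic programming over variable subsets.

    best[S] = max over v in S of best[S - {v}] + best_local(v, S - {v}),
    where v plays the role of the last variable (a sink) within S.
    """
    nodes = list(nodes)
    n = len(nodes)
    best = {0: 0.0}  # bitmask of a subset -> best score; 2^n entries in total
    for mask in range(1, 1 << n):
        members = [i for i in range(n) if mask >> i & 1]
        best[mask] = max(
            best[mask ^ (1 << i)]
            + best_local(
                nodes[i], {nodes[j] for j in members if j != i}, local_score
            )
            for i in members
        )
    return best[(1 << n) - 1]
```

With a toy score table in which `a` scores 5 given parent `c`, `b` scores 1 given parent `a`, `c` scores 2 given parents `a` and `b`, and everything else scores 0, the optimum is 6 (order `c`, `a`, `b`). The sketch recomputes `best_local` for every subset for brevity; practical implementations precompute these values.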

    Efficient Sampling and Counting of Graph Structures related to Chordal Graphs

    Counting problems aim to count the number of solutions for a given input, for example, counting the number of variable assignments that satisfy a Boolean formula. Sampling problems aim to produce a random object from a desired distribution, for example, producing a variable assignment drawn uniformly at random from all assignments that satisfy a Boolean formula. The problems of counting and sampling graph structures on different types of graphs have been studied for decades because of their importance in areas such as complexity theory and statistical physics. For many graph structures, such as independent sets and acyclic orientations, it is widely believed that no exact or approximate (with an arbitrarily small error) polynomial-time algorithms exist on general graphs. Therefore, the research community studies various types of graphs, aiming either to design a polynomial-time counting or sampling algorithm for such inputs, or to prove a corresponding inapproximability result. Chordal graphs have been studied widely in both AI and theoretical computer science, but their study from the counting perspective has been relatively limited. Previous works showed that some graph structures can be counted in polynomial time on chordal graphs even though counting them on general graphs is provably computationally hard. The main objective of this thesis is to design and analyze counting and sampling algorithms for several well-known graph structures, including independent sets and different types of graph orientations, on chordal graphs. Our contributions can be described from two perspectives: evaluating the performance of some well-known sampling techniques, such as Markov chain Monte Carlo, on chordal graphs; and showing that chordality does make those counting problems polynomial-time solvable.
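
As a small illustration of polynomial-time counting on a restricted graph class, the sketch below counts independent sets on a tree, which is the simplest kind of chordal graph. It is not the thesis's algorithm for general chordal graphs; the function name and the adjacency-dict input format are assumptions made for this example, and the input is assumed to be a connected tree.

```python
def count_independent_sets(adj, root=0):
    """Count independent sets (including the empty set) of a connected tree.

    `adj` maps each vertex to a list of its neighbours.
    """
    # Breadth-first traversal to fix a parent for every vertex.
    parent = {root: None}
    order = [root]
    for u in order:  # the list grows while we iterate over it
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                order.append(w)

    inc = {}  # inc[u]: sets in u's subtree that contain u
    exc = {}  # exc[u]: sets in u's subtree that do not contain u
    for u in reversed(order):  # children are processed before their parents
        inc[u] = 1
        exc[u] = 1
        for w in adj[u]:
            if parent[w] == u:
                inc[u] *= exc[w]           # u chosen, so child w must be excluded
                exc[u] *= inc[w] + exc[w]  # u not chosen, so child w is free
    return inc[root] + exc[root]
```

On the path 0-1-2, given as `{0: [1], 1: [0, 2], 2: [1]}`, this returns 5: the empty set, the three singletons, and {0, 2}.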

    Scalable Learning of Bayesian Networks Using Feedback Arc Set-Based Heuristics

    Bayesian networks form an important class of probabilistic graphical models. They consist of a structure (a directed acyclic graph) expressing conditional independencies among random variables, as well as parameters (local probability distributions). As such, Bayesian networks are generative models encoding joint probability distributions in a compact form. The main difficulty in learning a Bayesian network comes from the structure itself: owing to the combinatorial nature of the acyclicity property, it is no surprise that the structure learning problem is NP-hard in general. Exact algorithms solving this problem exist: dynamic programming and integer linear programming are prime contenders when one seeks to recover the structure of small-to-medium sized Bayesian networks from data. On the other hand, heuristics such as hill climbing variants are commonly used when attempting to approximately learn the structure of larger networks with thousands of variables, although these heuristics typically lack theoretical guarantees and their performance in practice may become unreliable in large-scale learning. This thesis is concerned with the development of scalable methods for the Bayesian network structure learning problem that attempt to maintain a level of theoretical control.
This was achieved via the use of related combinatorial problems, namely the maximum acyclic subgraph problem and its dual, the minimum feedback arc set problem. Although these problems are NP-hard themselves, they exhibit significantly better tractability in practice. This thesis explores ways to map Bayesian network structure learning into maximum acyclic subgraph instances and extract approximate solutions for the first problem, based on the solutions obtained for the second. Our research suggests that although increased scalability can be achieved this way, maintaining theoretical understanding based on this approach is much more challenging. Furthermore, we found that learning the structure of Bayesian networks based on maximum acyclic subgraph/minimum feedback arc set may not be the go-to method in general, but we identified a setting (linear structural equation models) in which we could experimentally validate the benefits of this approach, leading to fast and scalable structure recovery with the ability to learn complex structures in a way that is competitive with state-of-the-art baselines. Doctoral thesis.
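
To make the maximum acyclic subgraph / feedback arc set pairing concrete, here is a minimal sketch of the classic ordering-based 1/2-approximation. It is a textbook baseline, not the heuristics developed in the thesis; the function name and the `(tail, head, weight)` arc format are assumptions for this example, and weights are assumed non-negative with no self-loops.

```python
def acyclic_subgraph_half(nodes, weighted_arcs):
    """Return an acyclic arc subset with at least half of the total weight.

    Any linear order of the nodes splits the arcs into forward and backward
    arcs. Each group is acyclic on its own, so the heavier group carries at
    least half of the total weight; the arcs left out form a feedback arc set.
    """
    pos = {v: i for i, v in enumerate(nodes)}
    forward = [(u, v, w) for u, v, w in weighted_arcs if pos[u] < pos[v]]
    backward = [(u, v, w) for u, v, w in weighted_arcs if pos[u] > pos[v]]
    forward_weight = sum(w for _, _, w in forward)
    backward_weight = sum(w for _, _, w in backward)
    return forward if forward_weight >= backward_weight else backward
```

For the cycle `a -> b` (weight 1), `b -> c` (weight 1), `c -> a` (weight 3) with the order `a, b, c`, the backward group `[("c", "a", 3)]` is returned, and the two discarded arcs form a feedback arc set of weight 2.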

    Generalized belief change with imprecise probabilities and graphical models

    We provide a theoretical investigation of probabilistic belief revision in complex frameworks, under extended conditions of uncertainty, inconsistency and imprecision. We motivate our kinematical approach by specializing our discussion to probabilistic reasoning with graphical models, whose modular representation allows for efficient inference. Most results in this direction are derived from the work of Chan and Darwiche (2005), who first proved the inter-reducibility of virtual and probabilistic evidence. These forms of information, deeply distinct in their meaning, are extended to the conditional and imprecise frameworks, allowing further generalizations, e.g. to experts' qualitative assessments. Belief aggregation and iterated revision of a rational agent's beliefs are also explored.

    Correlation decay and decentralized optimization in graphical models

    Thesis (Ph.D.), Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 213-229) and index. Many models of optimization, statistics, social organizations and machine learning capture local dependencies by means of a network that describes the interconnections and interactions of different components. However, in most cases, optimization or inference on these models is hard due to the dimensionality of the networks. This is so even when using algorithms that take advantage of the underlying graphical structure. Approximate methods are therefore needed. The aim of this thesis is to study such large-scale systems, focusing on the question of how randomness affects the complexity of optimizing in a graph; of particular interest is the study of a phenomenon known as correlation decay, namely, the phenomenon where the influence of a node on another node of the network decreases quickly as the distance between them grows. In the first part of this thesis, we develop a new message-passing algorithm for optimization in graphical models. We formally prove a connection between the correlation decay property and (i) the near-optimality of this algorithm, as well as (ii) the decentralized nature of optimal solutions. In the context of discrete optimization with random costs, we develop a technique for establishing that a system exhibits correlation decay. We illustrate the applicability of the method by giving concrete results for the cases of uniform and Gaussian distributed cost coefficients in networks with bounded connectivity. In the second part, we pursue similar questions in a combinatorial optimization setting: we consider the problem of finding a maximum weight independent set in a bounded degree graph, when the node weights are i.i.d. random variables. Surprisingly, we discover that the problem becomes tractable for certain distributions.
Specifically, we construct a PTAS for the case of exponentially distributed weights and arbitrary graphs with degree at most 3, and obtain generalizations for higher degrees and different distributions. At the same time we prove that no PTAS exists for the case of exponentially distributed weights for graphs with sufficiently large but bounded degree, unless P=NP. Next, we shift our focus to graphical games, which are a game-theoretic analog of graphical models. We establish a connection between the problem of finding an approximate Nash equilibrium in a graphical game and the problem of optimization in graphical models. We use this connection to re-derive NashProp, a message-passing algorithm which computes Nash equilibria for graphical games on trees; we also suggest several new search algorithms for graphical games in general networks. Finally, we propose a definition of correlation decay in graphical games, and establish that the property holds in a restricted family of graphical games. The last part of the thesis is devoted to a particular application of graphical models and message-passing algorithms to the problem of early prediction of Alzheimer's disease. To this end, we develop a new measure of synchronicity between different parts of the brain, and apply it to electroencephalogram data. We show that the resulting prediction method outperforms a vast number of other EEG-based measures in the task of predicting the onset of Alzheimer's disease. By Théophane Weber. Ph.D.

    ISIPTA'07: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications


    27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany
