27 research outputs found

    New metrics and search algorithms for weighted causal DAGs

    Recovering causal relationships from data is an important problem. Using observational data, one can typically only recover causal graphs up to a Markov equivalence class, and additional assumptions or interventional data are needed for complete recovery. In this work, under some standard assumptions, we study causal graph discovery via adaptive interventions with node-dependent interventional costs. For this setting, we show that no algorithm can achieve an approximation guarantee that is asymptotically better than linear in the number of vertices with respect to the verification number, a well-established benchmark for adaptive search algorithms. Motivated by this negative result, we define a new benchmark that captures the worst-case interventional cost for any search algorithm. Furthermore, with respect to this new benchmark, we provide adaptive search algorithms that achieve logarithmic approximations under various settings: atomic interventions, bounded size interventions, and generalized cost objectives. Comment: Accepted into ICML 202
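    As an illustrative sketch (not the paper's algorithm), the cost-aware adaptive setting can be modeled as follows: each atomic intervention on a node pays that node's cost and, in this simplified model, reveals the orientation of every edge incident to it (ignoring additional orientations that Meek-rule propagation would give). All names below are hypothetical.

```python
def adaptive_search_cost(edges, cost, pick):
    """Simulate adaptive atomic interventions with node-dependent costs.

    edges: set of (u, v) pairs -- the skeleton of the hidden DAG.
    cost:  dict mapping each node to its interventional cost.
    pick:  strategy that, given the still-unoriented edges, picks a node.
    Returns the total cost paid until every edge is oriented.
    """
    unoriented = {frozenset(e) for e in edges}
    total = 0
    while unoriented:
        v = pick(unoriented)
        total += cost[v]
        # An atomic intervention on v reveals the direction of all
        # edges incident to v (simplified model).
        unoriented = {e for e in unoriented if v not in e}
    return total
```

A cost-greedy or separator-based choice of `pick` is where an actual search algorithm's cleverness would live.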

    Adaptivity Complexity for Causal Graph Discovery

    Causal discovery from interventional data is an important problem, where the task is to design an interventional strategy that learns the hidden ground truth causal graph G(V, E) on |V| = n nodes while minimizing the number of performed interventions. Most prior interventional strategies broadly fall into two categories: non-adaptive and adaptive. Non-adaptive strategies decide on a single fixed set of interventions to be performed, while adaptive strategies can decide which nodes to intervene on sequentially based on past interventions. While adaptive algorithms may use exponentially fewer interventions than their non-adaptive counterparts, there are practical concerns that constrain the amount of adaptivity allowed. Motivated by this trade-off, we study the problem of r-adaptivity, where the algorithm designer recovers the causal graph in a total of r sequential rounds while trying to minimize the total number of interventions. For this problem, we provide an r-adaptive algorithm that achieves an O(min{r, log n} · n^{1/min{r, log n}}) approximation with respect to the verification number, a well-known lower bound for adaptive algorithms. Furthermore, for every r, we show that our approximation is tight. Our definition of r-adaptivity interpolates nicely between the non-adaptive (r = 1) and fully adaptive (r = n) settings, where our approximation simplifies to O(n) and O(log n) respectively, matching the best-known approximation guarantees for both extremes. Our results also extend naturally to bounded size interventions. Comment: Accepted into UAI 202
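    To see how the stated bound interpolates between the two extremes, one can evaluate min{r, log n} · n^{1/min{r, log n}} numerically (a sketch; the constants hidden by the O(·) notation are ignored):

```python
import math

def r_adaptive_bound(n: int, r: int) -> float:
    """Approximation factor min{r, log2 n} * n^(1 / min{r, log2 n})."""
    m = min(r, math.log2(n))
    return m * n ** (1.0 / m)

# r = 1 recovers the non-adaptive O(n) guarantee, while any r >= log2 n
# flattens out at the fully adaptive O(log n) regime.
```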

    Verification and search algorithms for causal DAGs

    We study two problems related to recovering causal graphs from interventional data: (i) verification, where the task is to check if a purported causal graph is correct, and (ii) search, where the task is to recover the correct causal graph. For both, we wish to minimize the number of interventions performed. For the first problem, we give a characterization of a minimal-sized set of atomic interventions that is necessary and sufficient to check the correctness of a claimed causal graph. Our characterization uses the notion of covered edges, which enables us to obtain simple proofs and also easily reason about earlier known results. We also generalize our results to the settings of bounded size interventions and node-dependent interventional costs. For all the above settings, we provide the first known provable algorithms for efficiently computing (near-)optimal verifying sets on general graphs. For the second problem, we give a simple adaptive algorithm based on graph separators that produces an atomic intervention set which fully orients any essential graph while using O(log n) times the optimal number of interventions needed to verify (verifying size) the underlying DAG on n vertices. This approximation is tight, as any search algorithm on an essential line graph has a worst-case approximation ratio of Ω(log n) with respect to the verifying size. With bounded size interventions, each of size at most k, our algorithm gives an O(log n · log k) factor approximation. Our result is the first known algorithm that gives a non-trivial approximation guarantee to the verifying size on general unweighted graphs and with bounded size interventions.
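    Covered edges have a standard definition (due to Chickering): a directed edge u → v is covered exactly when Pa(v) = Pa(u) ∪ {u}, i.e. the endpoints share all parents other than u itself. A minimal self-contained check, with a hypothetical edge-set representation of the DAG:

```python
def parents(edges, v):
    """Parents of v in a DAG given as a set of (u, w) directed edges."""
    return {u for (u, w) in edges if w == v}

def is_covered(edges, u, v):
    """True iff the edge u -> v is covered: Pa(v) = Pa(u) | {u}."""
    return parents(edges, v) == parents(edges, u) | {u}
```

Reversing a covered edge yields a Markov-equivalent DAG, which is why these edges are the natural objects for reasoning about verification.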

    Learnability of Parameter-Bounded Bayes Nets

    Bayes nets are extensively used in practice to efficiently represent joint probability distributions over a set of random variables and to capture dependency relations. In a seminal paper, Chickering et al. (JMLR 2004) showed that given a distribution P, defined as the marginal distribution of a Bayes net, it is NP-hard to decide whether there is a parameter-bounded Bayes net that represents P. They called this problem LEARN. In this work, we extend the NP-hardness result of LEARN and prove the NP-hardness of a promise search variant of LEARN, whereby the Bayes net in question is guaranteed to exist and one is asked to find such a Bayes net. We complement our hardness result with a positive result about the sample complexity that is sufficient to recover a parameter-bounded Bayes net that is close (in TV distance) to a given distribution P that is represented by some parameter-bounded Bayes net, generalizing a degree-bounded sample complexity result of Brustle et al. (EC 2020). Comment: 15 pages, 2 figures
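    The closeness notion above is total variation (TV) distance, which for discrete distributions equals half the ℓ1 distance between their probability tables. A small sketch over explicit outcome-to-probability dictionaries (names hypothetical):

```python
def tv_distance(p, q):
    """Total variation distance between two discrete distributions.

    p, q: dicts mapping outcomes to probabilities (should each sum to 1).
    TV(p, q) = (1/2) * sum over the joint support of |p(x) - q(x)|.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)
```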

    Learning and Testing Latent-Tree Ising Models Efficiently

    We provide time- and sample-efficient algorithms for learning and testing latent-tree Ising models, i.e. Ising models that may only be observed at their leaf nodes. On the learning side, we obtain efficient algorithms for learning a tree-structured Ising model whose leaf-node distribution is close in total variation distance, improving on the results of prior work. On the testing side, we provide an efficient algorithm with fewer samples for testing whether two latent-tree Ising models have leaf-node distributions that are close or far in total variation distance. We obtain our algorithms by showing novel localization results for the total variation distance between the leaf-node distributions of tree-structured Ising models, in terms of their marginals on pairs of leaves.
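    For intuition, a tree-structured Ising model can be sampled root-to-leaves: with edge interaction θ on edge (u, v), the child spin agrees with its parent independently with probability e^θ / (e^θ + e^{-θ}). The sketch below (hypothetical names; the test uses a strongly ferromagnetic tree) samples full configurations; in the latent-tree setting only the leaf spins would be observed.

```python
import math
import random

def sample_tree_ising(children, theta, root, rng):
    """Sample ±1 spins from a tree-structured Ising model.

    children: dict node -> list of child nodes (rooted tree).
    theta:    dict (parent, child) edge -> interaction strength.
    Returns a dict node -> spin in {-1, +1}.
    """
    spins = {root: rng.choice([-1, 1])}  # uniform root spin by symmetry
    stack = [root]
    while stack:
        u = stack.pop()
        for v in children.get(u, []):
            t = theta[(u, v)]
            p_agree = math.exp(t) / (math.exp(t) + math.exp(-t))
            spins[v] = spins[u] if rng.random() < p_agree else -spins[u]
            stack.append(v)
    return spins
```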

    Online bipartite matching with imperfect advice

    We study the problem of online unweighted bipartite matching with n offline vertices and n online vertices, where one wishes to be competitive against the optimal offline algorithm. While the classic RANKING algorithm of Karp et al. [1990] provably attains a competitive ratio of 1 - 1/e > 1/2, we show that no learning-augmented method can be both 1-consistent and strictly better than 1/2-robust under the adversarial arrival model. Meanwhile, under the random arrival model, we show how one can utilize methods from distribution testing to design an algorithm that takes in external advice about the online vertices and provably achieves a competitive ratio interpolating between any ratio attainable by advice-free methods and the optimal ratio of 1, depending on the advice quality. Comment: Accepted into ICML 202
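    The RANKING algorithm referenced above is simple to state: draw one uniformly random permutation of the offline side up front, then match each arriving online vertex to its highest-ranked still-unmatched neighbor. A sketch with a hypothetical adjacency-list interface:

```python
import random

def ranking(offline, neighbors, online_order, rng):
    """RANKING for online bipartite matching.

    offline:      list of offline vertices.
    neighbors:    dict online vertex -> list of its offline neighbors.
    online_order: arrival order of the online vertices.
    Returns a dict online vertex -> matched offline vertex.
    """
    # One uniformly random rank over offline vertices, fixed up front.
    rank = {v: i for i, v in enumerate(rng.sample(offline, len(offline)))}
    matched = {}   # online vertex -> offline vertex
    used = set()   # offline vertices already matched
    for u in online_order:
        free = [v for v in neighbors.get(u, []) if v not in used]
        if free:
            best = min(free, key=rank.__getitem__)  # highest-ranked neighbor
            matched[u] = best
            used.add(best)
    return matched
```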

    Envy-Free House Allocation with Minimum Subsidy

    House allocation refers to the problem where m houses are to be allocated to n agents so that each agent receives one house. Since an envy-free house allocation does not always exist, we consider finding such an allocation in the presence of subsidy. We show that computing an envy-free allocation with minimum subsidy is NP-hard in general, but can be done efficiently if m differs from n by an additive constant or if the agents have identical utilities.
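    Envy-freeness with subsidies has a standard formalization: agent i does not envy agent j when u_i(house_i) + subsidy_i ≥ u_i(house_j) + subsidy_j. A minimal checker under that definition (all names hypothetical; finding minimum subsidies, the hard part, is not attempted here):

```python
def is_envy_free(util, alloc, subsidy):
    """Check envy-freeness of a house allocation with subsidies.

    util:    util[i][h] is agent i's utility for house h.
    alloc:   alloc[i] is the house assigned to agent i.
    subsidy: subsidy[i] is the payment given to agent i.
    """
    agents = list(alloc)
    return all(
        util[i][alloc[i]] + subsidy[i] >= util[i][alloc[j]] + subsidy[j]
        for i in agents for j in agents if i != j
    )
```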