    Solving Target Set Selection with Bounded Thresholds Faster than 2^n

    In this paper we consider the Target Set Selection problem, which arises naturally in fields such as economics, sociology, and medicine. In the Target Set Selection problem one is given a graph G with a threshold function thr: V(G) -> N cup {0} and integers k and l. The goal is to initially activate at most k vertices so that, at the end of the activation process, at least l vertices are activated. The activation process proceeds as follows: (i) once activated, a vertex stays activated forever; (ii) a vertex v becomes activated if at least thr(v) of its neighbours are activated. The problem and its special cases have been studied extensively from both the approximation and the parameterized points of view; for example, parameterizations by treewidth, feedback vertex set, diameter, size of the target set, vertex cover, cluster editing number, and other parameters have been considered. Despite this extensive study, it is still unknown whether the problem can be solved in O^*((2-epsilon)^n) time for some epsilon > 0. We partially answer this question by presenting several faster-than-trivial algorithms for the cases of constant thresholds, constant dual thresholds, or when the threshold of each vertex is bounded by one third of its degree. We also show that the problem parameterized by l is W[1]-hard even when all thresholds are constant.
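
    As a concrete illustration of the activation process defined above (not of the paper's faster-than-2^n algorithms), here is a minimal Python sketch that simulates a cascade from a given seed set; the use of networkx and the toy path graph are illustrative assumptions.

```python
import networkx as nx

def activation_process(G, thr, seeds):
    """Simulate the activation process: an inactive vertex activates once at
    least thr[v] of its neighbours are active, and stays active forever.
    Returns the final set of active vertices. (Illustrative sketch only.)"""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in G.nodes:
            if v in active:
                continue
            if sum(1 for u in G.neighbors(v) if u in active) >= thr[v]:
                active.add(v)
                changed = True
    return active

# Toy example: a path 0-1-2-3 with all thresholds equal to 1.
G = nx.path_graph(4)
thr = {v: 1 for v in G.nodes}
print(activation_process(G, thr, seeds={0}))  # {0, 1, 2, 3}: a full cascade
```

    Here the target set {0} of size k = 1 activates all four vertices, so it is feasible for any l <= 4.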

    Learning to Reason: Leveraging Neural Networks for Approximate DNF Counting

    Weighted model counting (WMC) has emerged as a prevalent approach to probabilistic inference. In its most general form, WMC is #P-hard. Weighted DNF counting (weighted #DNF) is a special case, where approximations with probabilistic guarantees can be obtained in O(nm) time, where n denotes the number of variables and m the number of clauses of the input DNF, but this is not scalable in practice. In this paper, we propose a neural model counting approach for weighted #DNF that combines approximate model counting with deep learning, and accurately approximates model counts in linear time when clause width is bounded. We conduct experiments to validate our method and show that our model learns and generalizes very well to large-scale #DNF instances.
    Comment: To appear in Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20). Code and data available at: https://github.com/ralphabb/NeuralDNF
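
    The O(nm)-time guarantee mentioned above refers to Karp-Luby style Monte Carlo estimation; the sketch below shows the unweighted variant of that classical scheme (the paper's neural approach is not reproduced here, and the clause encoding and sample count are illustrative assumptions).

```python
import random

def karp_luby_dnf_count(clauses, n, samples=100_000):
    """Karp-Luby Monte Carlo estimator for #DNF. A clause is a list of
    literals: +v means variable v is True, -v means v is False (v in 1..n).
    Illustrative sketch of the classical scheme, not the paper's method."""
    weights = [2 ** (n - len(c)) for c in clauses]   # assignments per clause
    total = sum(weights)
    hits = 0
    for _ in range(samples):
        # Pick a clause proportionally to the number of assignments it covers.
        i = random.choices(range(len(clauses)), weights=weights)[0]
        # Sample a uniform assignment consistent with clause i.
        assign = {abs(l): l > 0 for l in clauses[i]}
        for v in range(1, n + 1):
            assign.setdefault(v, random.random() < 0.5)
        # Count the sample only if i is the *first* satisfied clause, so each
        # satisfying assignment is counted exactly once across the union.
        first = next(j for j, c in enumerate(clauses)
                     if all(assign[abs(l)] == (l > 0) for l in c))
        hits += (first == i)
    return total * hits / samples

# (x1 AND x2) OR (NOT x1 AND x3) over 3 variables has exactly 4 models.
print(karp_luby_dnf_count([[1, 2], [-1, 3]], n=3))  # ~4.0
```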

    Least squares after model selection in high-dimensional sparse models

    In this article we study post-model-selection estimators that apply ordinary least squares (OLS) to the model selected by first-step penalized estimators, typically Lasso. It is well known that Lasso can estimate the nonparametric regression function at nearly the oracle rate, and is thus hard to improve upon. We show that the OLS post-Lasso estimator performs at least as well as Lasso in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the Lasso-based model selection "fails" in the sense of missing some components of the "true" regression model. By the "true" model, we mean the best s-dimensional approximation to the nonparametric regression function chosen by the oracle. Furthermore, the OLS post-Lasso estimator can perform strictly better than Lasso, in the sense of a strictly faster rate of convergence, if the Lasso-based model selection correctly includes all components of the "true" model as a subset and also achieves sufficient sparsity. In the extreme case, when Lasso perfectly selects the "true" model, the OLS post-Lasso estimator becomes the oracle estimator. An important ingredient in our analysis is a new sparsity bound on the dimension of the model selected by Lasso, which guarantees that this dimension is at most of the same order as the dimension of the "true" model. Our rate results are nonasymptotic and hold in both parametric and nonparametric models. Moreover, our analysis is not limited to the Lasso estimator acting as the selector in the first step, but applies to any other estimator with good rates and good sparsity properties, for example, various forms of thresholded Lasso. Our analysis covers both traditional thresholding and a new practical, data-driven thresholding scheme that induces additional sparsity subject to maintaining a certain goodness of fit. The latter scheme has theoretical guarantees similar to those of Lasso or OLS post-Lasso, but it dominates those procedures, as well as traditional thresholding, in a wide variety of experiments.
    Comment: Published at http://dx.doi.org/10.3150/11-BEJ410 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
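
    A minimal sketch of the two-step procedure described above, assuming scikit-learn; the penalty level alpha and the simulated data are illustrative choices, and the paper's theory also covers thresholded variants of the first step.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

def ols_post_lasso(X, y, alpha=0.1):
    """Step 1: run Lasso as a model selector. Step 2: refit OLS on the
    selected support to remove the shrinkage bias. (Illustrative sketch.)"""
    lasso = Lasso(alpha=alpha).fit(X, y)
    support = np.flatnonzero(lasso.coef_)      # model selected by Lasso
    beta = np.zeros(X.shape[1])
    if support.size:
        beta[support] = LinearRegression().fit(X[:, support], y).coef_
    return beta, support

# Simulated sparse model: 3 true regressors out of 50, n = 200 observations.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + 0.5 * rng.standard_normal(200)
beta, support = ols_post_lasso(X, y)
print(support, beta[support].round(2))  # refitted coefficients on the support
```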

    Some nonasymptotic results on resampling in high dimension, I: Confidence regions, II: Multiple tests

    We study generalized bootstrap confidence regions for the mean of a random vector whose coordinates have an unknown dependency structure. The random vector is assumed to be either Gaussian or to have a symmetric and bounded distribution. The dimensionality of the vector can possibly be much larger than the number of observations, and we focus on a nonasymptotic control of the confidence level, following ideas inspired by recent results in learning theory. We consider two approaches: the first is based on a concentration principle (valid for a large class of resampling weights), and the second on a resampled quantile, specifically using Rademacher weights. Several intermediate results established in the concentration-based approach are of interest in their own right. We also discuss the question of accuracy when using Monte Carlo approximations of the resampled quantities.
    Comment: Published at http://dx.doi.org/10.1214/08-AOS667 and http://dx.doi.org/10.1214/08-AOS668 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
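
    The second (resampled-quantile) approach can be sketched as follows: with Rademacher weights, the radius of a sup-norm confidence region around the empirical mean is taken to be a quantile of the symmetrized resampled statistic. This is a minimal Monte Carlo illustration under assumed choices of B, the level, and the simulated data; the paper additionally quantifies the Monte Carlo approximation error.

```python
import numpy as np

def rademacher_sup_radius(Y, B=2000, level=0.95, rng=None):
    """Monte Carlo estimate of the level-quantile of
    sup_k |(1/n) sum_i eps_i (Y_ik - Ybar_k)| over Rademacher signs eps.
    The confidence region is then {mu : max_k |Ybar_k - mu_k| <= radius}.
    (Illustrative sketch only.)"""
    rng = rng or np.random.default_rng()
    n, _ = Y.shape
    centered = Y - Y.mean(axis=0)
    stats = np.empty(B)
    for b in range(B):
        eps = rng.choice([-1.0, 1.0], size=n)         # Rademacher weights
        stats[b] = np.abs(eps @ centered / n).max()   # sup-norm statistic
    return np.quantile(stats, level)

# High dimension: 30 observations of a 500-dimensional Gaussian vector.
rng = np.random.default_rng(1)
Y = rng.standard_normal((30, 500))
print(round(rademacher_sup_radius(Y, rng=rng), 3))
```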

    Co-evolution of Content Popularity and Delivery in Mobile P2P Networks

    Mobile P2P technology provides a scalable approach to delivering content to a large number of users on their mobile devices. In this work, we study the dissemination of a single content item (e.g., an item of news, a song, or a video clip) among a population of mobile nodes. Each node in the population is either a destination (interested in the content) or a potential relay (not yet interested in the content). There is an interest evolution process by which nodes not yet interested in the content (i.e., relays) can become interested (i.e., become destinations) on learning about the popularity of the content (i.e., the number of already interested nodes). In our work, interest in the content evolves under the linear threshold model. The content is copied between nodes when they make random contact; for this we employ a controlled epidemic spread model. We model the joint evolution of the copying process and the interest evolution process, and derive the joint fluid-limit ordinary differential equations. We then study the selection of the parameters under the content provider's control, optimizing various objective functions that aim at maximizing content popularity and efficient content delivery.
    Comment: 21 pages, 16 figures
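
    Purely as a toy illustration of a joint fluid limit (the paper's actual ODEs follow from its specific linear-threshold and controlled-epidemic models and are not reproduced here; the dynamics and rate parameters below are hypothetical), one can integrate a coupled copying/interest system numerically:

```python
import numpy as np
from scipy.integrate import odeint

def fluid_ode(state, t, beta, gamma):
    """Toy coupled fluid model: x is the fraction of nodes holding the
    content, d the fraction of interested nodes (destinations). Content is
    copied on random contacts at rate beta; relays become interested at a
    rate growing with the content's current popularity d. (Hypothetical
    dynamics for illustration, not the paper's derived equations.)"""
    x, d = state
    dx = beta * x * (1 - x)    # epidemic copying on random contacts
    dd = gamma * d * (1 - d)   # interest evolution driven by popularity
    return [dx, dd]

t = np.linspace(0, 20, 200)
traj = odeint(fluid_ode, y0=[0.01, 0.05], t=t, args=(0.8, 0.5))
print(traj[-1].round(3))  # both fractions approach 1 over this horizon
```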