Solving Target Set Selection with Bounded Thresholds Faster than 2^n
In this paper we consider the Target Set Selection problem, which naturally arises in many fields such as economics, sociology, and medicine. In the Target Set Selection problem one is given a graph G with a threshold function thr: V(G) -> N ∪ {0} and integers k, l. The goal is to activate at most k vertices initially so that at the end of the activation process there are at least l activated vertices. The activation process proceeds as follows: (i) once activated, a vertex stays activated forever; (ii) a vertex v becomes activated if at least thr(v) of its neighbours are activated. The problem and its various special cases have been extensively studied from both the approximation and the parameterized points of view. For example, parameterizations by the following parameters have been studied: treewidth, feedback vertex set, diameter, size of the target set, vertex cover, cluster editing number, and others.
Despite this extensive study, it is still unknown whether the problem can be solved in O^*((2-epsilon)^n) time for some epsilon > 0. We partially answer this question by presenting several faster-than-trivial algorithms that work when all thresholds are constant, all dual thresholds are constant, or the threshold value of each vertex is bounded by one third of its degree. We also show that the problem parameterized by l is W[1]-hard even when all thresholds are constant.
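The activation rules (i)-(ii) above can be simulated directly by iterating to a fixed point. A minimal sketch in Python (function and variable names are illustrative, not from the paper):

```python
def activate(adj, thr, seed):
    """Simulate the Target Set Selection activation process.

    adj:  dict mapping each vertex to a list of its neighbours.
    thr:  dict mapping each vertex v to its threshold thr(v).
    seed: initially activated vertices (the target set, |seed| <= k).
    Returns the set of activated vertices once the process stabilizes.
    """
    active = set(seed)
    changed = True
    while changed:                  # iterate until no new vertex activates
        changed = False
        for v in adj:
            if v not in active and sum(u in active for u in adj[v]) >= thr[v]:
                active.add(v)       # rule (ii): enough activated neighbours
                changed = True      # rule (i): v now stays activated forever
    return active
```

The instance is a yes-instance exactly when some seed of size at most k yields |activate(adj, thr, seed)| >= l; the trivial 2^n algorithm tries all seeds, which is what the paper's results improve upon in the bounded-threshold cases.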
Learning to Reason: Leveraging Neural Networks for Approximate DNF Counting
Weighted model counting (WMC) has emerged as a prevalent approach for
probabilistic inference. In its most general form, WMC is #P-hard. Weighted DNF
counting (weighted #DNF) is a special case, where approximations with
probabilistic guarantees are obtained in O(nm), where n denotes the number of
variables, and m the number of clauses of the input DNF, but this is not
scalable in practice. In this paper, we propose a neural model counting
approach for weighted #DNF that combines approximate model counting with deep
learning, and accurately approximates model counts in linear time when width is
bounded. We conduct experiments to validate our method, and show that our model
learns and generalizes very well to large-scale #DNF instances.
Comment: To appear in Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20). Code and data available at: https://github.com/ralphabb/NeuralDNF
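The O(nm) baseline with probabilistic guarantees that the abstract refers to is in the spirit of the classical Karp-Luby estimator for #DNF. A minimal unweighted sketch (this is the classical Monte Carlo scheme, not the paper's neural model; names are illustrative):

```python
import random

def karp_luby_dnf(n, clauses, samples=10000, rng=None):
    """Karp-Luby-style estimate of the number of satisfying assignments
    of a DNF formula over variables 1..n.

    clauses: list of clauses, each a dict {variable: required truth value}.
    """
    rng = rng or random.Random(0)
    sizes = [2 ** (n - len(c)) for c in clauses]   # |S_i| = 2^(n - width_i)
    total = sum(sizes)                             # sum_i |S_i| (counts overlaps)
    hits = 0
    for _ in range(samples):
        # pick clause i with probability |S_i| / total
        i = rng.choices(range(len(clauses)), weights=sizes)[0]
        # sample an assignment uniformly from S_i: fix the clause's
        # literals, flip fair coins for every other variable
        x = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        x.update(clauses[i])
        # count the sample only when i is the FIRST clause x satisfies,
        # so each satisfying assignment is counted exactly once
        first = next(j for j, c in enumerate(clauses)
                     if all(x[v] == b for v, b in c.items()))
        hits += (first == i)
    return total * hits / samples
```

Each sample costs O(nm) work in the worst case (checking all m clauses over n variables), which matches the complexity the abstract cites and motivates learning a faster approximator for bounded-width instances.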
Least squares after model selection in high-dimensional sparse models
In this article we study post-model selection estimators that apply ordinary
least squares (OLS) to the model selected by first-step penalized estimators,
typically Lasso. It is well known that Lasso can estimate the nonparametric
regression function at nearly the oracle rate, and is thus hard to improve
upon. We show that the OLS post-Lasso estimator performs at least as well as
Lasso in terms of the rate of convergence, and has the advantage of a smaller
bias. Remarkably, this performance occurs even if the Lasso-based model
selection "fails" in the sense of missing some components of the "true"
regression model. By the "true" model, we mean the best s-dimensional
approximation to the nonparametric regression function chosen by the oracle.
Furthermore, OLS post-Lasso estimator can perform strictly better than Lasso,
in the sense of a strictly faster rate of convergence, if the Lasso-based model
selection correctly includes all components of the "true" model as a subset and
also achieves sufficient sparsity. In the extreme case, when Lasso perfectly
selects the "true" model, the OLS post-Lasso estimator becomes the oracle
estimator. An important ingredient in our analysis is a new sparsity bound on
the dimension of the model selected by Lasso, which guarantees that this
dimension is at most of the same order as the dimension of the "true" model.
Our rate results are nonasymptotic and hold in both parametric and
nonparametric models. Moreover, our analysis is not limited to the Lasso
estimator acting as a selector in the first step, but also applies to any other
estimator, for example, various forms of thresholded Lasso, with good rates and
good sparsity properties. Our analysis covers both traditional thresholding and
a new practical, data-driven thresholding scheme that induces additional
sparsity subject to maintaining a certain goodness of fit. The latter scheme
has theoretical guarantees similar to those of Lasso or OLS post-Lasso, but it
dominates those procedures as well as traditional thresholding in a wide
variety of experiments.
Comment: Published in the Bernoulli (http://isi.cbs.nl/bernoulli/) at http://dx.doi.org/10.3150/11-BEJ410 by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
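The two-step procedure itself is simple: run a first-step Lasso, keep its support, then refit by OLS on that support to remove the shrinkage bias. A self-contained sketch using a plain coordinate-descent Lasso (an illustrative stand-in for any first-step selector, not the paper's estimator or tuning):

```python
import numpy as np

def lasso_cd(X, y, alpha, n_iter=500):
    """Coordinate-descent Lasso for (1/2n)||y - Xb||^2 + alpha * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]          # residual excluding feature j
            rho = X[:, j] @ r
            # soft-threshold: this shrinkage is the bias OLS post-Lasso removes
            b[j] = np.sign(rho) * max(abs(rho) - n * alpha, 0.0) / col_sq[j]
    return b

def ols_post_lasso(X, y, alpha):
    """OLS refit on the model (support) selected by the first-step Lasso."""
    support = np.flatnonzero(np.abs(lasso_cd(X, y, alpha)) > 1e-10)
    beta = np.zeros(X.shape[1])
    if support.size:
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        beta[support] = coef                         # unbiased refit on the support
    return beta, support
```

On a sparse design this typically recovers the true support and returns nearly unshrunk coefficients, illustrating the smaller-bias advantage the abstract describes; the paper's rate results of course require the formal sparsity and restricted-eigenvalue conditions, not just this happy path.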
Some nonasymptotic results on resampling in high dimension, I: Confidence regions, II: Multiple tests
We study generalized bootstrap confidence regions for the mean of a random
vector whose coordinates have an unknown dependency structure. The random
vector is supposed to be either Gaussian or to have a symmetric and bounded
distribution. The dimensionality of the vector can possibly be much larger than
the number of observations and we focus on a nonasymptotic control of the
confidence level, following ideas inspired by recent results in learning
theory. We consider two approaches, the first based on a concentration
principle (valid for a large class of resampling weights) and the second on a
resampled quantile, specifically using Rademacher weights. Several intermediate
results established in the approach based on concentration principles are of
interest in their own right. We also discuss the question of accuracy when
using Monte Carlo approximations of the resampled quantities.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/08-AOS667 and http://dx.doi.org/10.1214/08-AOS668 by the Institute of Mathematical Statistics (http://www.imstat.org).
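The second approach, the resampled quantile with Rademacher weights, can be sketched in a few lines: recompute the centred, sign-flipped empirical mean many times and take a quantile of its sup-norm. A toy version (the paper's actual threshold includes additional normalization constants needed for the nonasymptotic guarantee; this only shows the resampling mechanics):

```python
import numpy as np

def rademacher_sup_quantile(Y, alpha=0.05, B=1000, rng=None):
    """Resampled (1 - alpha)-quantile of the sup-norm deviation of the mean.

    Y: (n, d) array; rows are i.i.d. observations of a d-dimensional
       vector with unknown coordinate dependency structure.
    """
    rng = rng or np.random.default_rng(0)
    n, d = Y.shape
    centred = Y - Y.mean(axis=0)                     # centre at the empirical mean
    stats = np.empty(B)
    for b in range(B):
        eps = rng.choice([-1.0, 1.0], size=n)        # Rademacher resampling weights
        stats[b] = np.abs(centred.T @ eps / n).max() # sup-norm of resampled mean
    return np.quantile(stats, 1 - alpha)
```

A confidence region is then of the form {mu : ||Ybar - mu||_inf <= threshold}, with the threshold built from this resampled quantile; note that d may far exceed n, which is exactly the high-dimensional regime the abstract targets, and the B Monte Carlo draws are the approximation whose accuracy the paper discusses.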
Co-evolution of Content Popularity and Delivery in Mobile P2P Networks
Mobile P2P technology provides a scalable approach to content delivery to a
large number of users on their mobile devices. In this work, we study the
dissemination of a \emph{single} content (e.g., an item of news, a song or a
video clip) among a population of mobile nodes. Each node in the population is
either a \emph{destination} (interested in the content) or a potential
\emph{relay} (not yet interested in the content). There is an interest
evolution process by which nodes not yet interested in the content (i.e.,
relays) can become interested (i.e., become destinations) on learning about the
popularity of the content (i.e., the number of already interested nodes). In
our work, the interest in the content evolves under the \emph{linear threshold
model}. The content is copied between nodes when they make random contact. For
this we employ a controlled epidemic spread model. We model the joint evolution
of the copying process and the interest evolution process, and derive the joint
fluid limit ordinary differential equations. We then study the selection of the
parameters under the content provider's control, for the optimization of
various objective functions that aim at maximizing content popularity and
efficient content delivery.
Comment: 21 pages, 16 figures
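The joint fluid limit describes, in the large-population regime, coupled ODEs for the fraction of nodes holding a copy and the fraction of interested nodes (destinations). A toy two-state caricature, Euler-integrated (the paper's actual ODE system, rates, and control parameters are richer; all symbols here are illustrative):

```python
import numpy as np

def fluid_limit(beta=0.5, gamma=0.3, x0=0.01, y0=0.05, T=60.0, dt=0.01):
    """Euler integration of a toy fluid limit of joint copying / interest
    evolution in a mobile P2P population:

        dx/dt = beta  * x * (1 - x)   # epidemic copying on random contacts
        dy/dt = gamma * x * (1 - y)   # relays become destinations at a rate
                                      # growing with content popularity x

    x(t): fraction of nodes holding a copy; y(t): fraction of destinations.
    """
    steps = int(T / dt)
    x, y = np.empty(steps + 1), np.empty(steps + 1)
    x[0], y[0] = x0, y0
    for k in range(steps):
        x[k + 1] = x[k] + dt * beta * x[k] * (1 - x[k])
        y[k + 1] = y[k] + dt * gamma * x[k] * (1 - y[k])
    return x, y
```

The content provider's optimization described in the abstract then amounts to choosing control parameters (here, the stand-ins beta and gamma) to trade off how fast y grows (popularity) against the copying cost implicit in x.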