1,085 research outputs found
On the Complexity of Mining Itemsets from the Crowd Using Taxonomies
We study the problem of frequent itemset mining in domains where data is not
recorded in a conventional database but only exists in human knowledge. We
provide examples of such scenarios, and present a crowdsourcing model for them.
The model uses the crowd as an oracle to find out whether an itemset is
frequent or not, and relies on a known taxonomy of the item domain to guide the
search for frequent itemsets. In the spirit of data mining with oracles, we
analyze the complexity of this problem in terms of (i) crowd complexity, that
measures the number of crowd questions required to identify the frequent
itemsets; and (ii) computational complexity, that measures the computational
effort required to choose the questions. We provide lower and upper complexity
bounds in terms of the size and structure of the input taxonomy, as well as the
size of a concise description of the output itemsets. We also provide
constructive algorithms that achieve the upper bounds, and consider more
efficient variants for practical situations.
Comment: 18 pages, 2 figures. To be published to ICDT'13. Added missing
acknowledgemen
Diamond-free Families
Given a finite poset $P$, we consider the largest size $\mathrm{La}(n,P)$ of a
family of subsets of $[n] := \{1,\dots,n\}$ that contains no subposet $P$. This
problem has been studied intensively in recent years, and it is conjectured that
$\pi(P) := \lim_{n\to\infty} \mathrm{La}(n,P)/\binom{n}{\lfloor n/2 \rfloor}$
exists for general posets $P$, and, moreover, that it is an integer. For
$k \ge 2$ let $\mathcal{D}_k$ denote the $k$-diamond poset
$\{A < B_1,\dots,B_k < C\}$. We study the average number of times a random
full chain meets a $P$-free family, called the Lubell function, and use it for
$P=\mathcal{D}_k$ to determine $\pi(\mathcal{D}_k)$ for infinitely many values
of $k$. A stubborn open problem is to show that $\pi(\mathcal{D}_2)=2$; here we
make progress by proving $\pi(\mathcal{D}_2)\le 2\frac{3}{11}$ (if it exists).
Comment: 16 pages
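As an illustration of the Lubell function mentioned in this abstract, the following minimal Python sketch computes it in two ways for a small family of subsets of {1, ..., n}: as the sum of 1/C(n, |F|) over the sets F in the family, and as the average number of family members met by a full chain, found by enumerating every chain. The function names and the example family are hypothetical and are not taken from the paper; the brute-force enumeration is only feasible for very small n.

```python
from fractions import Fraction
from itertools import permutations
from math import comb


def lubell_by_formula(family, n):
    """Sum of 1 / C(n, |F|) over the distinct sets F in the family."""
    members = {frozenset(s) for s in family}
    return sum(Fraction(1, comb(n, len(s))) for s in members)


def lubell_by_chains(family, n):
    """Average number of family members met by a full chain of subsets of {1..n}."""
    members = {frozenset(s) for s in family}
    total = 0
    count = 0
    for order in permutations(range(1, n + 1)):
        # A full chain consists of the prefixes of a permutation,
        # starting with the empty set and ending with the whole ground set.
        chain = [frozenset(order[:i]) for i in range(n + 1)]
        total += sum(1 for c in chain if c in members)
        count += 1
    return Fraction(total, count)


# Hypothetical example: the two middle levels of the subsets of {1, 2, 3}.
n = 3
family = [{1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}]
assert lubell_by_formula(family, n) == lubell_by_chains(family, n) == 2
```

Exact rational arithmetic (`Fraction`) is used so that the two computations can be compared for equality without floating-point error.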
Elementary bounds on Poincare and log-Sobolev constants for decomposable Markov chains
We consider finite-state Markov chains that can be naturally decomposed into
smaller "projection" and "restriction" chains. Possibly this decomposition
will be inductive, in that the restriction chains will be smaller copies of the
initial chain. We provide expressions for Poincare (resp. log-Sobolev)
constants of the initial Markov chain in terms of Poincare (resp. log-Sobolev)
constants of the projection and restriction chains, together with a further
parameter. In the case of the Poincare constant, our bound is always at least
as good as existing ones and, depending on the value of the extra parameter,
may be much better. There appears to be no previously published decomposition
result for the log-Sobolev constant. Our proofs are elementary and
self-contained.
Comment: Published at http://dx.doi.org/10.1214/105051604000000639 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org)
Quantitative Static Analysis of Communication Protocols using Abstract Markov Chains
In this paper we present a static analysis of probabilistic programs to quantify their performance properties by taking into account both the stochastic aspects of the language and those related to the execution environment. More particularly, we are interested in the analysis of communication protocols in lossy networks and we aim at inferring statically parametric bounds of some important metrics such as the expectation of the throughput or the energy consumption. Our analysis is formalized within the theory of abstract interpretation and soundly takes all possible executions into account. We model the concrete executions as a set of Markov chains and we introduce a novel notion of abstract Markov chains that provides a finite and symbolic representation to over-approximate the (possibly unbounded) set of concrete behaviors. We show that our proposed formalism is expressive enough to handle both probabilistic and pure non-deterministic choices within the same semantics. Our analysis operates in two steps. The first step is a classic abstract interpretation of the source code, using stock numerical abstract domains and a specific automata domain, in order to extract the abstract Markov chain of the program. The second step extracts from this chain particular invariants about the stationary distribution and computes its symbolic bounds using a parametric Fourier-Motzkin elimination algorithm. We present a prototype implementation of the analysis and we discuss some preliminary experiments on a number of communication protocols. We compare our prototype to the state-of-the-art probabilistic model checker Prism and we highlight the advantages and shortcomings of both approaches.
Colouring set families without monochromatic k-chains
A coloured version of classic extremal problems dates back to Erdős and
Rothschild, who in 1974 asked which $n$-vertex graph has the maximum number of
2-edge-colourings without monochromatic triangles. They conjectured that the
answer is simply given by the largest triangle-free graph. Since then, this new
class of coloured extremal problems has been extensively studied by various
researchers. In this paper we pursue the Erdős--Rothschild versions of
Sperner's Theorem, the classic result in extremal set theory on the size of the
largest antichain in the Boolean lattice, and Erdős' extension to
$k$-chain-free families.
Given a family $\mathcal{F}$ of subsets of $[n]$, we define an
$(r,k)$-colouring of $\mathcal{F}$ to be an $r$-colouring of the sets without
any monochromatic $k$-chains $F_1 \subset F_2 \subset \dots \subset F_k$. We
prove that for $n$ sufficiently large in terms of $k$, the largest
$k$-chain-free families also maximise the number of $(2,k)$-colourings. We also
show that the middle level, $\binom{[n]}{\lfloor n/2 \rfloor}$, maximises the
number of $(3,2)$-colourings, and give asymptotic results on the maximum
possible number of $(3,k)$-colourings whenever $k-1$ is divisible by three.
Comment: 30 pages, final version
Poset Ramsey number $R(P,Q_n)$. III. N-shaped poset
Given partially ordered sets (posets) $(P, \leq_P)$ and $(P', \leq_{P'})$, we
say that $P'$ contains a copy of $P$ if for some injective function
$f\colon P \to P'$ and for any $A, B \in P$, $A \leq_P B$ if and only if
$f(A) \leq_{P'} f(B)$. For any posets $P$ and $Q$, the poset Ramsey number
$R(P,Q)$ is the least positive integer $N$ such that no matter how the elements
of an $N$-dimensional Boolean lattice are colored in blue and red, there is
either a copy of $P$ with all blue elements or a copy of $Q$ with all red
elements.
We focus on the poset Ramsey number $R(P,Q_n)$ for a fixed poset $P$ and an
$n$-dimensional Boolean lattice $Q_n$, as $n$ grows large. It is known that
$n + c_1(P) \leq R(P,Q_n) \leq c_2(P)\, n$, for positive constants $c_1$ and
$c_2$. However, there is no poset $P$ known, for which
$R(P,Q_n) > (1+\epsilon)n$, for $\epsilon > 0$. This paper is devoted to a new
method for finding upper bounds on $R(P,Q_n)$ using a duality between copies of
$Q_n$ and sets of elements that cover them, referred to as blockers. We prove
several properties of blockers and their direct relation to the Ramsey numbers.
Using these properties we show that $R(\mathcal{N},Q_n) = n + \Theta(n/\log n)$,
for a poset $\mathcal{N}$ with four elements $A$, $B$, $C$, and $D$, such that
$A<C$, $B<D$, $B<C$, and the remaining pairs of elements are incomparable.
Comment: 19 pages, 6 figures