A Tight Upper Bound on the Number of Candidate Patterns
In the context of mining for frequent patterns using the standard levelwise
algorithm, the following question arises: given the current level and the
current set of frequent patterns, what is the maximal number of candidate
patterns that can be generated on the next level? We answer this question by
providing a tight upper bound, derived from a combinatorial result from the
sixties by Kruskal and Katona. Our result is useful to reduce the number of
database scans.
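
The abstract does not state the bound itself. As a minimal sketch, assuming the standard cascade (k-binomial) form of the Kruskal-Katona theorem, one could compute an upper bound on the number of candidate (k+1)-patterns from the count of frequent k-patterns as follows. The function names `cascade` and `candidate_upper_bound` are illustrative and not taken from the paper.

```python
from math import comb


def cascade(n, k):
    """Greedy k-cascade of n: n = C(m_k, k) + C(m_{k-1}, k-1) + ...

    Returns a list of (m_i, i) pairs with m_k > m_{k-1} > ...
    """
    terms = []
    while n > 0 and k > 0:
        # Find the largest m such that C(m, k) <= n.
        m = k
        while comb(m + 1, k) <= n:
            m += 1
        terms.append((m, k))
        n -= comb(m, k)
        k -= 1
    return terms


def candidate_upper_bound(num_frequent, k):
    """Assumed Kruskal-Katona style bound on the number of (k+1)-candidates.

    Each cascade term C(m_i, i) is replaced by C(m_i, i + 1).
    """
    return sum(comb(m, i + 1) for m, i in cascade(num_frequent, k))
```

For example, `candidate_upper_bound(5, 2)` evaluates to 2: five frequent pairs can support at most two candidate triples, which is attained by the five edges of a complete graph on four items with one edge removed.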
On the Minimum/Stopping Distance of Array Low-Density Parity-Check Codes
In this work, we study the minimum/stopping distance of array low-density
parity-check (LDPC) codes. An array LDPC code is a quasi-cyclic LDPC code
specified by two integers q and m, where q is an odd prime and m <= q. In the
literature, the minimum/stopping distance of these codes (denoted by d(q,m) and
h(q,m), respectively) has been thoroughly studied for m <= 5. Both exact
results, for small values of q and m, and general (i.e., independent of q)
bounds have been established. For m=6, the best known minimum distance upper
bound, derived by Mittelholzer (IEEE Int. Symp. Inf. Theory, Jun./Jul. 2002),
is d(q,6) <= 32. In this work, we derive an improved upper bound of d(q,6) <=
20 and a new upper bound d(q,7) <= 24 by using the concept of a template
support matrix of a codeword/stopping set. The bounds are tight with high
probability in the sense that we have not been able to find codewords of
strictly lower weight for several values of q using a minimum distance
probabilistic algorithm. Finally, we provide new specific minimum/stopping
distance results for m <= 7 and low-to-moderate values of q <= 79.
Comment: To appear in IEEE Trans. Inf. Theory. The material in this paper was
presented in part at the 2014 IEEE International Symposium on Information
Theory, Honolulu, HI, June/July 201
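
The abstract does not spell out the code construction. The sketch below assumes the commonly used definition of an array LDPC code, in which the parity-check matrix is an m x q array of q x q circulant permutation blocks and block (i, j) is the cyclic shift matrix raised to the power i*j. The function name `array_ldpc_parity_check` is hypothetical, and the primality of q is not checked.

```python
def array_ldpc_parity_check(q, m):
    """Build the (m*q) x (q*q) binary parity-check matrix as a list of rows.

    Assumption: block (i, j) is P^(i*j), where P is the q x q cyclic shift
    permutation matrix, with 0 <= i < m and 0 <= j < q.
    """
    if not 1 <= m <= q:
        raise ValueError("expected 1 <= m <= q")
    rows = [[0] * (q * q) for _ in range(m * q)]
    for i in range(m):
        for j in range(q):
            shift = (i * j) % q
            for r in range(q):
                # Row r of P^shift has its single 1 in column (r + shift) mod q.
                rows[i * q + r][j * q + (r + shift) % q] = 1
    return rows
```

Under this assumed construction every row has weight q and every column has weight m, which is a quick sanity check on the output.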
Finding Statistically Significant Interactions between Continuous Features
The search for higher-order feature interactions that are statistically
significantly associated with a class variable is of high relevance in fields
such as Genetics or Healthcare, but the combinatorial explosion of the
candidate space makes this problem extremely challenging in terms of
computational efficiency and proper correction for multiple testing. While
recent progress has been made regarding this challenge for binary features, we
here present the first solution for continuous features. We propose an
algorithm which overcomes the combinatorial explosion of the search space of
higher-order interactions by deriving a lower bound on the p-value for each
interaction, which enables us to massively prune interactions that can never
reach significance and to thereby gain more statistical power. In our
experiments, our approach efficiently detects all significant interactions in a
variety of synthetic and real-world datasets.
Comment: 13 pages, 5 figures, 2 tables, accepted to the 28th International
Joint Conference on Artificial Intelligence (IJCAI 2019
Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases
Many studies have sought efficient solutions for subgraph similarity search
over certain (deterministic) graphs because of its wide
application in many fields, including bioinformatics, social network analysis,
and Resource Description Framework (RDF) data management. All these works
assume that the underlying data are certain. However, in reality, graphs are
often noisy and uncertain due to various factors, such as errors in data
extraction, inconsistencies in data integration, and privacy preserving
purposes. Therefore, in this paper, we study subgraph similarity search on
large probabilistic graph databases. Unlike previous works, which assume
that edges in an uncertain graph are independent of each other, we study
uncertain graphs in which edge occurrences are correlated. We formally prove
that subgraph similarity search over probabilistic graphs is #P-complete, so
we employ a filter-and-verify framework to speed up the search. In the
filtering phase, we develop tight lower and upper bounds of subgraph similarity
probability based on a probabilistic matrix index, PMI. PMI is composed of
discriminative subgraph features associated with tight lower and upper bounds
of subgraph isomorphism probability. Based on PMI, we can filter out a large
number of probabilistic graphs and maximize the pruning capability. During the
verification phase, we develop an efficient sampling algorithm to validate the
remaining candidates. The efficiency of our proposed solutions has been
verified through extensive experiments.
Comment: VLDB201
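
The filter-and-verify idea described in this abstract can be sketched generically as follows. This is not the paper's PMI index or its sampling algorithm: the bounds are assumed to be precomputed, the names `filter_and_verify` and `monte_carlo` are hypothetical, and `sample_world` stands in for whatever joint sampler a correlated-edge model would require.

```python
import random


def filter_and_verify(candidates, threshold, estimate):
    """candidates: iterable of (graph_id, lower_bound, upper_bound).

    Returns (answers, number_of_candidates_sent_to_verification).
    """
    answers, verified = [], 0
    for graph_id, lower, upper in candidates:
        if upper < threshold:
            continue  # pruned: the probability can never reach the threshold
        if lower >= threshold:
            answers.append(graph_id)  # accepted without verification
            continue
        verified += 1  # bounds are inconclusive, so estimate the probability
        if estimate(graph_id) >= threshold:
            answers.append(graph_id)
    return answers, verified


def monte_carlo(match_in_world, sample_world, trials=2000, seed=0):
    """Estimate the match probability by sampling possible worlds."""
    rng = random.Random(seed)
    hits = sum(bool(match_in_world(sample_world(rng))) for _ in range(trials))
    return hits / trials
```

The tighter the bounds, the fewer candidates reach the sampling step, which is where most of the cost lies in such a scheme.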
MoMo: a group mobility model for future generation mobile wireless networks
Existing group mobility models were not designed to meet the requirements for
accurate simulation of current and future short-distance wireless network
scenarios, which need, in particular, accurate, up-to-date information on the
position of each node in the network, combined with a simple and flexible
approach to group mobility modeling. A new model for group mobility in wireless
networks, named MoMo, is proposed in this paper, based on the combination of a
memory-based individual mobility model with a flexible group behavior model.
MoMo is capable of accurately describing all mobility scenarios, from
individual mobility, in which nodes move independently of one another, to
tight group mobility, where mobility patterns of different nodes are strictly
correlated. A new set of intrinsic properties for a mobility model is proposed
and adopted in the analysis and comparison of MoMo with existing models. Next,
MoMo is compared with existing group mobility models in a typical 5G network
scenario, in which a set of mobile nodes cooperate in the realization of a
distributed MIMO link. Results show that MoMo leads to accurate, robust and
flexible modeling of mobility of groups of nodes in discrete event simulators,
making it suitable for the performance evaluation of networking protocols and
resource allocation algorithms in the wide range of network scenarios expected
to characterize 5G networks.
Comment: 25 pages, 17 figure