
    A Tight Upper Bound on the Number of Candidate Patterns

    In the context of mining for frequent patterns using the standard levelwise algorithm, the following question arises: given the current level and the current set of frequent patterns, what is the maximal number of candidate patterns that can be generated on the next level? We answer this question by providing a tight upper bound, derived from a combinatorial result from the sixties by Kruskal and Katona. Our result is useful to reduce the number of database scans
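The abstract above does not spell out the bound itself. As a rough sketch, assuming it takes the usual Kruskal-Katona form: write the number of frequent k-patterns in its k-cascade representation, m = C(a_k, k) + C(a_{k-1}, k-1) + ... + C(a_r, r), then raise every lower index by one to get the maximal number of (k+1)-candidates. The function names `cascade` and `max_candidates` are illustrative and do not come from the paper.

```python
from math import comb


def cascade(m, k):
    """Greedy k-cascade representation of m.

    Returns pairs (a_i, i) with m == sum(comb(a_i, i)) and
    a_k > a_{k-1} > ... > a_r >= r >= 1.
    """
    terms = []
    i = k
    while m > 0 and i >= 1:
        a = i
        # Find the largest a such that comb(a, i) <= m.
        while comb(a + 1, i) <= m:
            a += 1
        terms.append((a, i))
        m -= comb(a, i)
        i -= 1
    return terms


def max_candidates(num_frequent, k):
    """Assumed Kruskal-Katona-style bound on the number of (k+1)-candidates
    that num_frequent frequent k-patterns can generate."""
    return sum(comb(a, i + 1) for a, i in cascade(num_frequent, k))


if __name__ == "__main__":
    # Frequent 2-patterns seen as graph edges; 3-candidates are triangles.
    for m in (3, 4, 5, 6):
        print(m, max_candidates(m, 2))
```

For k = 2 the bound can be sanity-checked by hand: 6 frequent pairs can at best form a complete graph on 4 items, which contains 4 triples.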

    On the Minimum/Stopping Distance of Array Low-Density Parity-Check Codes

    In this work, we study the minimum/stopping distance of array low-density parity-check (LDPC) codes. An array LDPC code is a quasi-cyclic LDPC code specified by two integers q and m, where q is an odd prime and m <= q. In the literature, the minimum/stopping distance of these codes (denoted by d(q,m) and h(q,m), respectively) has been thoroughly studied for m <= 5. Both exact results, for small values of q and m, and general (i.e., independent of q) bounds have been established. For m = 6, the best known minimum distance upper bound, derived by Mittelholzer (IEEE Int. Symp. Inf. Theory, Jun./Jul. 2002), is d(q,6) <= 32. In this work, we derive an improved upper bound of d(q,6) <= 20 and a new upper bound d(q,7) <= 24 by using the concept of a template support matrix of a codeword/stopping set. The bounds are tight with high probability in the sense that we have not been able to find codewords of strictly lower weight for several values of q using a minimum distance probabilistic algorithm. Finally, we provide new specific minimum/stopping distance results for m <= 7 and low-to-moderate values of q <= 79. Comment: To appear in IEEE Trans. Inf. Theory. The material in this paper was presented in part at the 2014 IEEE International Symposium on Information Theory, Honolulu, HI, June/July 2014

    Finding Statistically Significant Interactions between Continuous Features

    The search for higher-order feature interactions that are statistically significantly associated with a class variable is of high relevance in fields such as Genetics or Healthcare, but the combinatorial explosion of the candidate space makes this problem extremely challenging in terms of computational efficiency and proper correction for multiple testing. While recent progress has been made regarding this challenge for binary features, we here present the first solution for continuous features. We propose an algorithm which overcomes the combinatorial explosion of the search space of higher-order interactions by deriving a lower bound on the p-value for each interaction, which enables us to massively prune interactions that can never reach significance and to thereby gain more statistical power. In our experiments, our approach efficiently detects all significant interactions in a variety of synthetic and real-world datasets. Comment: 13 pages, 5 figures, 2 tables, accepted to the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019)

    Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases

    Many studies have sought efficient solutions for subgraph similarity search over certain (deterministic) graphs due to its wide application in many fields, including bioinformatics, social network analysis, and Resource Description Framework (RDF) data management. All these works assume that the underlying data are certain. However, in reality, graphs are often noisy and uncertain due to various factors, such as errors in data extraction, inconsistencies in data integration, and privacy-preserving purposes. Therefore, in this paper, we study subgraph similarity search on large probabilistic graph databases. Unlike previous works, which assume that edges in an uncertain graph are independent of each other, we study uncertain graphs where edges' occurrences are correlated. We formally prove that subgraph similarity search over probabilistic graphs is #P-complete; thus, we employ a filter-and-verify framework to speed up the search. In the filtering phase, we develop tight lower and upper bounds of subgraph similarity probability based on a probabilistic matrix index, PMI. PMI is composed of discriminative subgraph features associated with tight lower and upper bounds of subgraph isomorphism probability. Based on PMI, we can sort out a large number of probabilistic graphs and maximize the pruning capability. During the verification phase, we develop an efficient sampling algorithm to validate the remaining candidates. The efficiency of our proposed solutions has been verified through extensive experiments. Comment: VLDB201

    MoMo: a group mobility model for future generation mobile wireless networks

    Existing group mobility models were not designed to meet the requirements for accurate simulation of current and future short distance wireless networks scenarios, that need, in particular, accurate, up-to-date information on the position of each node in the network, combined with a simple and flexible approach to group mobility modeling. A new model for group mobility in wireless networks, named MoMo, is proposed in this paper, based on the combination of a memory-based individual mobility model with a flexible group behavior model. MoMo is capable of accurately describing all mobility scenarios, from individual mobility, in which nodes move independently one from the other, to tight group mobility, where mobility patterns of different nodes are strictly correlated. A new set of intrinsic properties for a mobility model is proposed and adopted in the analysis and comparison of MoMo with existing models. Next, MoMo is compared with existing group mobility models in a typical 5G network scenario, in which a set of mobile nodes cooperate in the realization of a distributed MIMO link. Results show that MoMo leads to accurate, robust and flexible modeling of mobility of groups of nodes in discrete event simulators, making it suitable for the performance evaluation of networking protocols and resource allocation algorithms in the wide range of network scenarios expected to characterize 5G networks. Comment: 25 pages, 17 figures