    Seeded Graph Matching: Efficient Algorithms and Theoretical Guarantees

    In this paper, a new information-theoretic framework for graph matching is introduced. Using this framework, the graph isomorphism and seeded graph matching problems are studied. The maximum degree algorithm for graph isomorphism is analyzed, and sufficient conditions for successful matching are rederived using type analysis. Furthermore, a new seeded matching algorithm with polynomial time complexity is introduced. The algorithm uses 'typicality matching' and techniques from point-to-point communications for reliable matching. Assuming an Erdős-Rényi model on the correlated graph pair, it is shown that successful matching is guaranteed when the number of seeds grows logarithmically with the number of vertices in the graphs. The logarithmic coefficient is shown to be inversely proportional to the mutual information between the edge variables in the two graphs.
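
    To make the claimed scaling concrete, here is a hedged reading of the result in display form; the symbols (n for the number of vertices, \Lambda_n for the number of seeds, X_1 and X_2 for the correlated edge variables) are our notation, not necessarily the paper's:

    % Hedged sketch of the seed-scaling claim (our notation, not the paper's):
    % \Lambda_n is the number of seeds, I(X_1;X_2) the mutual information
    % between the edge variables of the two correlated graphs.
    \[
      \Lambda_n \;\gtrsim\; \frac{\log n}{I(X_1;X_2)}
      \quad\Longrightarrow\quad
      \Pr[\text{successful matching}] \to 1 \quad (n \to \infty).
    \]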

    Matching Theory for Future Wireless Networks: Fundamentals and Applications

    The emergence of novel wireless networking paradigms such as small cell and cognitive radio networks has forever transformed the way in which wireless systems are operated. In particular, the need for self-organizing solutions to manage the scarce spectral resources has become a prevalent theme in many emerging wireless systems. In this paper, the first comprehensive tutorial on the use of matching theory, a Nobel-prize-winning framework, for resource management in wireless networks is developed. To cater for the unique features of emerging wireless networks, a novel, wireless-oriented classification of matching theory is proposed. The key solution concepts and algorithmic implementations of this framework are then exposed, and the developed concepts are applied in three important wireless networking areas in order to demonstrate the usefulness of this analytical tool. Results show how matching theory can effectively improve the performance of resource allocation in all three applications discussed.
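
    The canonical algorithmic building block of matching theory is Gale-Shapley deferred acceptance. The abstract does not spell out which algorithms the tutorial covers, so the following Python sketch, with hypothetical users and small cells as the two sides, is only an illustrative one-to-one instance:

    # Illustrative Gale-Shapley deferred acceptance for a one-to-one matching
    # between users and small cells. The agents and preference lists here are
    # hypothetical; the tutorial does not prescribe this instantiation.
    def deferred_acceptance(proposer_prefs, reviewer_prefs):
        """Each argument maps an agent to a preference-ordered list of
        agents on the other side (most preferred first)."""
        # rank[r][p] = position of proposer p in reviewer r's list (lower is better)
        rank = {r: {p: i for i, p in enumerate(prefs)}
                for r, prefs in reviewer_prefs.items()}
        free = list(proposer_prefs)           # proposers not yet matched
        next_choice = {p: 0 for p in proposer_prefs}
        match = {}                            # reviewer -> proposer

        while free:
            p = free.pop()
            r = proposer_prefs[p][next_choice[p]]
            next_choice[p] += 1
            if r not in match:                # reviewer currently unmatched
                match[r] = p
            elif rank[r][p] < rank[r][match[r]]:  # reviewer prefers newcomer
                free.append(match[r])         # old proposer becomes free again
                match[r] = p
            else:
                free.append(p)                # rejected; tries next choice later
        return {p: r for r, p in match.items()}

    users = {"u1": ["c1", "c2"], "u2": ["c1", "c2"]}
    cells = {"c1": ["u2", "u1"], "c2": ["u1", "u2"]}
    print(deferred_acceptance(users, cells))  # {'u2': 'c1', 'u1': 'c2'}

    The deferred structure is what yields stability: in the resulting matching, no user-cell pair would both prefer each other over their assigned partners.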

    Entropy-scaling search of massive biological data

    Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that search scales in time with metric entropy (the number of covering hyperspheres) if the fractal dimension of the dataset is low, and scales in space with the sum of metric entropy and information-theoretic entropy (the randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology.
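
    As a rough illustration of how metric entropy can bound search time, here is a minimal Python sketch under our own naming and parameters, not the paper's actual tools: data are greedily grouped into covering balls of radius r, and a radius-q query descends, by the triangle inequality, only into balls whose center lies within q + r.

    # Minimal sketch of entropy-scaling similarity search (our naming):
    # scan only the ball centers, then search inside the few balls that
    # could possibly contain a hit.
    import random

    def build_cover(points, r, dist):
        """Greedy metric cover: each point joins some center within r."""
        clusters = {}                      # center -> list of member points
        for p in points:
            for c in clusters:
                if dist(p, c) <= r:
                    clusters[c].append(p)
                    break
            else:
                clusters[p] = [p]          # p becomes a new center
        return clusters

    def search(clusters, query, q, r, dist):
        hits = []
        for c, members in clusters.items():
            if dist(query, c) <= q + r:    # triangle inequality: only these
                hits += [p for p in members if dist(p, query) <= q]
        return hits

    # Toy usage on 1-D points with the absolute-difference metric.
    dist = lambda a, b: abs(a - b)
    data = [random.uniform(0, 100) for _ in range(1000)]
    cover = build_cover(data, r=5.0, dist=dist)
    print(len(cover), "balls;", len(search(cover, 50.0, 2.0, 5.0, dist)), "hits")

    The coarse scan touches one center per covering hypersphere, which is exactly the metric-entropy term; only a low fractal dimension keeps the number of balls that must be opened small.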

    The matching polytope does not admit fully-polynomial size relaxation schemes

    The groundbreaking work of Rothvoß [arxiv:1311.2369] established that every linear program expressing the matching polytope has an exponential number of inequalities (formally, the matching polytope has exponential extension complexity). We generalize this result by deriving strong bounds on the polyhedral inapproximability of the matching polytope: for fixed $0 < \varepsilon < 1$, every polyhedral $(1 + \varepsilon/n)$-approximation requires an exponential number of inequalities, where $n$ is the number of vertices. This is sharp given the well-known $\rho$-approximation of size $O\bigl(\binom{n}{\rho/(\rho-1)}\bigr)$ provided by the odd sets of size up to $\rho/(\rho-1)$. Thus matching is the first problem in $P$ whose natural linear encoding does not admit a fully polynomial-size relaxation scheme (the polyhedral equivalent of an FPTAS), which provides a sharp separation from the polynomial-size relaxation scheme obtained, e.g., via the constant-sized odd sets mentioned above. Our approach reuses ideas from Rothvoß [arxiv:1311.2369]; however, the main lower-bounding technique is different. While the original proof is based on the hyperplane separation bound (also called the rectangle corruption bound), we employ the information-theoretic notion of common information as introduced in Braun and Pokutta [http://eccc.hpi-web.de/report/2013/056/], which allows us to analyze perturbations of slack matrices. It turns out that the high extension complexity of the matching polytope stems from the same source of hardness as for the correlation polytope: a direct sum structure.
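
    For context, the exact linear description these lower bounds are measured against is Edmonds' odd-set formulation of the matching polytope, a standard fact not restated in the abstract:

    % Edmonds' description of the matching polytope M(G) of a graph G = (V, E):
    % degree constraints plus one inequality per odd-sized vertex set S.
    \[
      M(G) = \Bigl\{ x \in \mathbb{R}^{E}_{\ge 0} \;\Bigm|\;
        \sum_{e \ni v} x_e \le 1 \ \ \forall v \in V, \quad
        \sum_{e \subseteq S} x_e \le \tfrac{|S|-1}{2}
        \ \ \forall S \subseteq V,\ |S| \text{ odd} \Bigr\}
    \]
    % The \rho-approximation cited above keeps only the odd sets with
    % |S| \le \rho/(\rho-1), giving on the order of \binom{n}{\rho/(\rho-1)}
    % inequalities.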

    Graph ambiguity

    In this paper, we propose a rigorous way to define the concept of ambiguity in the domain of graphs. In past studies, the classical definition of ambiguity has been derived from fuzzy set and fuzzy information theories. Our aim is to show that in the domain of graphs, too, it is possible to derive a formulation that captures the same semantic and mathematical concept. To strengthen the theoretical results, we discuss the application of the graph ambiguity concept to the graph classification setting, conceiving a new kind of inexact graph matching procedure. The results prove that graph ambiguity is a characterizing and discriminative property of graphs.
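
    The abstract does not reproduce the fuzzy-set definition it builds on; purely for orientation, and as our own assumption about the relevant starting point, one classical measure of ambiguity in that literature is the De Luca-Termini fuzzy entropy of membership values \mu_i \in [0, 1]:

    % De Luca--Termini fuzzy entropy of a fuzzy set A with membership values
    % \mu_i; maximal when every \mu_i = 1/2, i.e. maximally ambiguous.
    \[
      H(A) = -\sum_{i=1}^{n} \bigl[ \mu_i \log \mu_i
             + (1 - \mu_i) \log (1 - \mu_i) \bigr]
    \]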

    Coherent frequentism

    By representing the range of fair betting odds according to a pair of confidence set estimators, dual probability measures on parameter space called frequentist posteriors secure the coherence of subjective inference without any prior distribution. The closure of the set of expected losses corresponding to the dual frequentist posteriors constrains decisions without arbitrarily forcing optimization under all circumstances. This decision theory reduces to those that maximize expected utility when the pair of frequentist posteriors is induced by an exact or approximate confidence set estimator or when an automatic reduction rule is applied to the pair. In such cases, the resulting frequentist posterior is coherent in the sense that, as a probability distribution of the parameter of interest, it satisfies the axioms of the decision-theoretic and logic-theoretic systems typically cited in support of the Bayesian posterior. Unlike the p-value, the confidence level of an interval hypothesis derived from such a measure is suitable as an estimator of the indicator of hypothesis truth since it converges in sample-space probability to 1 if the hypothesis is true or to 0 otherwise under general conditions.