Seeded Graph Matching: Efficient Algorithms and Theoretical Guarantees
In this paper, a new information theoretic framework for graph matching is
introduced. Using this framework, the graph isomorphism and seeded graph
matching problems are studied. The maximum degree algorithm for graph
isomorphism is analyzed and sufficient conditions for successful matching are
rederived using type analysis. Furthermore, a new seeded matching algorithm
with polynomial time complexity is introduced. The algorithm uses 'typicality
matching' and techniques from point-to-point communications for reliable
matching. Assuming an Erdős-Rényi model on the correlated graph pair, it is
shown that successful matching is guaranteed when the number of seeds grows
logarithmically with the number of vertices in the graphs. The logarithmic
coefficient is shown to be inversely proportional to the mutual information
between the edge variables in the two graphs.
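As a rough illustration of the quantity named in the last sentence, the sketch below computes the mutual information between a pair of correlated edge indicators. The joint distribution is an assumption made for illustration only and is not taken from the paper: an edge exists in a parent graph with probability `p`, and each of the two graphs keeps it independently with probability `s`. The names `edge_mutual_information`, `p`, and `s` are hypothetical.

```python
import math


def edge_mutual_information(p: float, s: float) -> float:
    """Mutual information (in nats) between edge indicators X and Y.

    Assumed model (illustrative, not from the paper): a parent edge exists
    with probability p; each graph keeps it independently with probability s.
    """
    # Joint pmf of (X, Y) under the assumed subsampling model.
    p11 = p * s * s
    p10 = p * s * (1 - s)
    p01 = p10
    p00 = 1 - p11 - p10 - p01
    joint = {(1, 1): p11, (1, 0): p10, (0, 1): p01, (0, 0): p00}

    # Both marginals are Bernoulli(p * s) by symmetry.
    marginal = {1: p * s, 0: 1 - p * s}

    mi = 0.0
    for (x, y), pxy in joint.items():
        if pxy > 0:  # terms with zero probability contribute nothing
            mi += pxy * math.log(pxy / (marginal[x] * marginal[y]))
    return mi
```

Under this assumed model, `s = 1` makes the two graphs identical, so the mutual information equals the entropy of a single edge, while `p = 1` makes the two edge indicators independent and the mutual information zero. By the abstract's result, a smaller mutual information means a larger logarithmic coefficient, that is, more seeds.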
Matching Theory for Future Wireless Networks: Fundamentals and Applications
The emergence of novel wireless networking paradigms such as small cell and
cognitive radio networks has forever transformed the way in which wireless
systems are operated. In particular, the need for self-organizing solutions to
manage the scarce spectral resources has become a prevalent theme in many
emerging wireless systems. In this paper, the first comprehensive tutorial on
the use of matching theory, a Nobel Prize-winning framework, for resource
management in wireless networks is developed. To cater for the unique features
of emerging wireless networks, a novel, wireless-oriented classification of
matching theory is proposed. Next, the key solution concepts and algorithmic
implementations of this framework are presented. Finally, the developed concepts are
applied in three important wireless networking areas in order to demonstrate
the usefulness of this analytical tool. Results show how matching theory can
effectively improve the performance of resource allocation in all three
applications discussed.
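The abstract does not name a specific algorithm. As a representative example of an algorithmic implementation in matching theory, the sketch below shows the classic Gale-Shapley deferred-acceptance procedure for one-to-one stable matching. The wireless framing (users proposing to channels) and all names (`deferred_acceptance`, `user_prefs`, `channel_prefs`) are illustrative assumptions, not the tutorial's own classification or notation.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """One-to-one stable matching via Gale-Shapley deferred acceptance.

    proposer_prefs / receiver_prefs map each agent to a list of agents on the
    other side, most preferred first. Returns a dict proposer -> receiver.
    """
    # rank[r][p] is the position of proposer p in receiver r's list.
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next index to propose to
    held = {}                                     # receiver -> proposer held
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        prefs = proposer_prefs[p]
        if next_choice[p] >= len(prefs):
            continue  # p has exhausted its list and stays unmatched
        r = prefs[next_choice[p]]
        next_choice[p] += 1
        if p not in rank[r]:
            free.append(p)  # r finds p unacceptable; p tries its next choice
            continue
        current = held.get(r)
        if current is None:
            held[r] = p                 # r tentatively accepts p
        elif rank[r][p] < rank[r][current]:
            held[r] = p                 # r trades up and releases current
            free.append(current)
        else:
            free.append(p)              # r rejects p
    return {p: r for r, p in held.items()}


# Hypothetical example: two users compete for the same preferred channel.
user_prefs = {"u1": ["c1", "c2"], "u2": ["c1", "c2"]}
channel_prefs = {"c1": ["u2", "u1"], "c2": ["u1", "u2"]}
matching = deferred_acceptance(user_prefs, channel_prefs)
```

In the example both users prefer `c1`, but `c1` ranks `u2` higher, so `u1` is rejected and settles on `c2`. The result is stable: no user-channel pair would both prefer each other over their assigned partners.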
Entropy-scaling search of massive biological data
Many datasets exhibit a well-defined structure that can be exploited to
design faster search tools, but it is not always clear when such acceleration
is possible. Here, we introduce a framework for similarity search based on
characterizing a dataset's entropy and fractal dimension. We prove that
searching scales in time with metric entropy (number of covering hyperspheres),
if the fractal dimension of the dataset is low, and scales in space with the
sum of metric entropy and information-theoretic entropy (randomness of the
data). Using these ideas, we present accelerated versions of standard tools,
with no loss in specificity and little loss in sensitivity, for use in three
domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics
(MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search
(esFragBag, 10x speedup of FragBag). Our framework can be used to achieve
"compressive omics," and the general theory can be readily applied to data
science problems outside of biology.
Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
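One common way to exploit a cover of the data by hyperspheres is a two-stage range search: compare the query with the sphere centers first, then scan only the spheres that could contain a hit. The sketch below is a minimal illustration under stated assumptions, not the implementation behind Ammolite, MICA, or esFragBag. It assumes `dist` is a true metric, so that the triangle inequality holds; the greedy cover construction and the names `build_cover` and `range_search` are hypothetical.

```python
def build_cover(points, radius, dist):
    """Greedily cover the points with balls of the given radius.

    Returns (centers, clusters), where clusters[i] holds the points assigned
    to centers[i]; every point lies within `radius` of its center.
    """
    centers, clusters = [], []
    for p in points:
        for i, c in enumerate(centers):
            if dist(p, c) <= radius:
                clusters[i].append(p)
                break
        else:
            # No existing ball contains p, so p becomes a new center.
            centers.append(p)
            clusters.append([p])
    return centers, clusters


def range_search(query, r, centers, clusters, radius, dist):
    """Return all points within distance r of the query.

    Coarse stage: by the triangle inequality, a ball can only contain a hit
    if its center is within r + radius of the query. Fine stage: scan only
    the members of those balls.
    """
    hits = []
    for c, members in zip(centers, clusters):
        if dist(query, c) <= r + radius:
            hits.extend(p for p in members if dist(query, p) <= r)
    return hits


# Hypothetical one-dimensional example with absolute difference as the metric.
points = [0, 1, 2, 50, 51, 99]
metric = lambda a, b: abs(a - b)
centers, clusters = build_cover(points, 5, metric)
hits = range_search(52, 2, centers, clusters, 5, metric)
```

The coarse stage costs one distance computation per ball, which is where the dependence on the number of covering hyperspheres comes from. Because the filter relies only on the triangle inequality, this sketch returns exactly the hits a brute-force scan would.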
The matching polytope does not admit fully-polynomial size relaxation schemes
The groundbreaking work of Rothvoß [arxiv:1311.2369] established that
every linear program expressing the matching polytope has an exponential number
of inequalities (formally, the matching polytope has exponential extension
complexity). We generalize this result by deriving strong bounds on the
polyhedral inapproximability of the matching polytope: for fixed
$0 < \varepsilon < 1$, every polyhedral $(1 + \varepsilon/n)$-approximation
requires an exponential number of inequalities, where $n$ is the number of
vertices. This is sharp given the well-known $\rho$-approximation of size
$O(\binom{n}{\rho/(\rho-1)})$ provided by the odd-sets of size up to
$\rho/(\rho-1)$. Thus matching is the first problem in $P$, whose natural
linear encoding does not admit a fully polynomial-size relaxation scheme (the
polyhedral equivalent of an FPTAS), which provides a sharp separation from the
polynomial-size relaxation scheme obtained e.g., via constant-sized odd-sets
mentioned above.
Our approach reuses ideas from Rothvoß [arxiv:1311.2369]; however, the
main lower bounding technique is different. While the original proof is based
on the hyperplane separation bound (also called the rectangle corruption
bound), we employ the information-theoretic notion of common information as
introduced in Braun and Pokutta [http://eccc.hpi-web.de/report/2013/056/],
which allows us to analyze perturbations of slack matrices. It turns out that the
high extension complexity for the matching polytope stems from the same source
of hardness as for the correlation polytope: a direct sum structure.
Comment: 21 pages, 3 figures
Graph ambiguity
In this paper, we propose a rigorous way to define the concept of ambiguity in the domain of graphs. In past studies, the classical definition of ambiguity has been derived from fuzzy set theory and fuzzy information theory. Our aim is to show that in the domain of graphs, too, it is possible to derive a formulation that captures the same semantic and mathematical concept. To strengthen the theoretical results, we discuss the application of the graph ambiguity concept to the graph classification setting, conceiving a new kind of inexact graph matching procedure. The results show that graph ambiguity is a characterizing and discriminative property of graphs. (C) 2013 Elsevier B.V. All rights reserved.
Coherent frequentism
By representing the range of fair betting odds according to a pair of
confidence set estimators, dual probability measures on parameter space called
frequentist posteriors secure the coherence of subjective inference without any
prior distribution. The closure of the set of expected losses corresponding to
the dual frequentist posteriors constrains decisions without arbitrarily
forcing optimization under all circumstances. This decision theory reduces to
those that maximize expected utility when the pair of frequentist posteriors is
induced by an exact or approximate confidence set estimator or when an
automatic reduction rule is applied to the pair. In such cases, the resulting
frequentist posterior is coherent in the sense that, as a probability
distribution of the parameter of interest, it satisfies the axioms of the
decision-theoretic and logic-theoretic systems typically cited in support of
the Bayesian posterior. Unlike the p-value, the confidence level of an interval
hypothesis derived from such a measure is suitable as an estimator of the
indicator of hypothesis truth since it converges in sample-space probability to
1 if the hypothesis is true or to 0 otherwise under general conditions.
Comment: The confidence-measure theory of inference and decision is explicitly
extended to vector parameters of interest. The derivation of upper and lower
confidence levels from valid and nonconservative set estimators is formalized.