Info-Clustering: A Mathematical Theory for Data Clustering
We formulate an info-clustering paradigm based on a multivariate information
measure, called multivariate mutual information, that naturally extends
Shannon's mutual information between two random variables to the multivariate
case involving more than two random variables. With proper model reductions, we
show that the paradigm can be applied to study the human genome and connectome
in a more meaningful way than the conventional algorithmic approach. Not only
can info-clustering provide justifications and refinements to some existing
techniques, but it also inspires new computationally feasible solutions.
Comment: In celebration of Claude Shannon's Centenary
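As a rough illustration of the measure named in this abstract, the sketch below computes the multivariate mutual information of a small joint distribution by brute force. The abstract itself gives no formula; the sketch assumes the partition-based definition commonly used in the secret key agreement literature, namely the minimum over all partitions P of the variables into at least two blocks of (sum of block entropies minus joint entropy) / (|P| - 1). The function names and the dictionary representation of the distribution are hypothetical choices made for this example. Enumerating all partitions is exponential in the number of variables, so this is not one of the computationally feasible solutions the abstract refers to.

```python
from math import log2


def entropy(pmf, coords):
    """Shannon entropy in bits of the marginal of `pmf` on the indices `coords`."""
    marginal = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in coords)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)


def partitions(items):
    """Yield every set partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Place `first` into each existing block in turn ...
        for k in range(len(part)):
            yield part[:k] + [[first] + part[k]] + part[k + 1:]
        # ... or into a new block of its own.
        yield [[first]] + part


def multivariate_mutual_information(pmf, n):
    """Brute-force MMI (assumed partition-based definition) of n variables.

    `pmf` maps length-n outcome tuples to probabilities.
    """
    h_all = entropy(pmf, range(n))
    best = float("inf")
    for part in partitions(list(range(n))):
        if len(part) < 2:
            continue  # the trivial one-block partition is excluded
        total = sum(entropy(pmf, block) for block in part)
        best = min(best, (total - h_all) / (len(part) - 1))
    return best


# Three copies of one uniform bit, and two independent bits with their XOR.
copies = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
xor = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
print(multivariate_mutual_information(copies, 3))
print(multivariate_mutual_information(xor, 3))
```

For two variables the only admissible partition is into singletons, so the quantity reduces to Shannon's mutual information, which matches the extension described in the abstract.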
Secret key agreement for hypergraphical sources with limited total discussion
This work considers the problem of multiterminal secret key agreement by
limited total public discussion under the hypergraphical source model. The
secrecy capacity as a function of the total discussion rate is completely
characterized by a polynomial-time computable linear program. Compared to the
existing solution for a particular hypergraphical source model called the
pairwise independent network (PIN) model, the current result is a non-trivial
extension as it applies to a strictly larger class of sources and a more
general scenario involving helpers and wiretapper's side information. In
particular, while the existing solution by tree-packing can be strictly
suboptimal for the PIN model with helpers and the hypergraphical source model
in general, we can show that decremental secret key agreement and linear
network coding are optimal, resolving a previous conjecture in the affirmative.
The converse is established by a single-letter upper bound on the secrecy
capacity for discrete memoryless multiple sources and individual discussion
rate constraints. The minimax optimization involved in the bound can be relaxed
to give the best existing upper bounds on secrecy capacities such as the
lamination bounds for hypergraphical sources, helper-set bound for general
sources, the bound at asymptotically zero discussion rate via the multivariate
Gács-Körner common information, and the lower bound on communication
complexity via a multivariate extension of the Wyner common information. These
reductions unify existing bounding techniques and reveal surprising connections
between seemingly different information-theoretic notions. Further challenges
are posed in this work along with a simple example of a finite linear source
where the current converse techniques fail even though the proposed achieving
scheme remains optimal.