58 research outputs found
Zero-error communication over adder MAC
The adder MAC is a simple noiseless multiple-access channel (MAC): if the users each send a codeword, the receiver receives the coordinatewise sum of these codewords, with addition over the integers. Communication over the noiseless adder MAC has been studied for more than fifty years. There are two models of particular interest: uniquely decodable code tuples and $B_h$-codes. Despite the similarities between these two models, the known lower and upper bounds on the optimal sum rate of uniquely decodable code tuples asymptotically match as the number of users goes to infinity, while a factor-of-two gap remains between the lower and upper bounds on the optimal rate of $B_h$-codes.
The best currently known $B_h$-codes for larger numbers of users are constructed using random coding. In this work, we study variants of the random coding method and related problems, in the hope of obtaining $B_h$-codes with better rate. Our contributions include the following. (1) We prove that changing the underlying distribution used in random coding cannot improve the rate. (2) We determine the rate of a list-decoding version of $B_h$-codes achieved by the random coding method. (3) We study several related problems about Rényi entropy.
Comment: An updated version of the author's master thesis
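As a concrete illustration of the channel model described above, the following minimal sketch (not taken from the paper) simulates the adder MAC and brute-force checks whether a small code tuple is uniquely decodable, i.e., whether distinct message tuples always produce distinct channel outputs.

```python
# A minimal sketch of the noiseless adder MAC: each user sends a binary
# codeword and the receiver sees the coordinatewise integer sum.  For small
# codes we can brute-force whether a code tuple is uniquely decodable.
import itertools

def adder_mac_output(codewords):
    """Coordinatewise sum over the integers of the transmitted codewords."""
    return tuple(sum(bits) for bits in zip(*codewords))

def is_uniquely_decodable(codes):
    """codes: one list of binary codewords per user (all of equal length)."""
    seen = {}
    for choice in itertools.product(*codes):
        y = adder_mac_output(choice)
        if y in seen and seen[y] != choice:
            return False          # two different message tuples collide
        seen[y] = choice
    return True

# Two users, block length 2: this pair is uniquely decodable.
code_a = [(0, 0), (1, 1)]
code_b = [(0, 0), (0, 1), (1, 0)]
print(is_uniquely_decodable([code_a, code_b]))  # True
```

This toy pair already achieves sum rate (log2 2 + log2 3)/2 ≈ 1.29 bits per channel use, above the 1 bit per use possible with a single binary user, which is the kind of gain such code tuples are designed for.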
Non-linear Log-Sobolev inequalities for the Potts semigroup and applications to reconstruction problems
Consider a Markov process on a finite state space with q elements that jumps, in continuous time, to a new state chosen uniformly at random, regardless of the previous state. The collection of transition kernels (indexed by time t) is the Potts semigroup. Diaconis and Saloff-Coste computed the maximum of the ratio of the relative entropy and the Dirichlet form, obtaining the constant in the 2-log-Sobolev inequality (2-LSI). In this paper, we obtain the best possible non-linear inequality relating entropy and the Dirichlet form (the p-NLSI, for p ≥ 1), and as an example we give an explicit expression in the case p = 1. The more precise NLSIs have been shown by Polyanskiy and Samorodnitsky to imply various geometric and Fourier-analytic results.
Beyond the Potts semigroup, we also analyze Potts channels -- Markov transition matrices that are constant on and constant off the diagonal. (The Potts semigroup corresponds to a (ferromagnetic) subset of such matrices with positive second eigenvalue.) By integrating the 1-NLSI we obtain a new strong data processing inequality (SDPI), which in turn allows us to improve results on reconstruction thresholds for Potts models on trees. A special case is the problem of reconstructing the color of the root of a q-colored tree given knowledge of the colors of all the leaves. We show that for a non-trivial reconstruction probability, the branching number of the tree must be at least an explicit threshold that we compute. This extends previous results (of Sly and of Bhatnagar et al.) to general trees and avoids the need for any specialized arguments. Similarly, we improve the state of the art on the reconstruction threshold for the stochastic block model with q balanced groups, for all q ≥ 3. These improvements advocate information-theoretic methods as a useful complement to the conventional techniques originating from statistical physics.
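To make the two central objects concrete, here is a minimal sketch using standard parametrizations (the exact normalizations are an assumption, not quoted from the paper): the Potts semigroup keeps the current state with probability e^{-t} and otherwise resamples uniformly among the q states, and a Potts channel is any transition matrix that is constant on and constant off the diagonal.

```python
# A minimal sketch (standard parametrizations, assumed rather than quoted).
import numpy as np

def potts_semigroup(q, t):
    """Transition kernel T_t = e^{-t} I + (1 - e^{-t}) * (1/q) * J."""
    return np.exp(-t) * np.eye(q) + (1 - np.exp(-t)) * np.full((q, q), 1.0 / q)

def potts_channel(q, lam):
    """Constant on/off diagonal; lam is the second eigenvalue
    (lam > 0 is the ferromagnetic case mentioned in the abstract)."""
    return lam * np.eye(q) + (1 - lam) * np.full((q, q), 1.0 / q)

q, t = 5, 0.3
T = potts_semigroup(q, t)
print(np.allclose(T, potts_channel(q, np.exp(-t))))  # semigroup kernels are Potts channels
print(np.allclose(T.sum(axis=1), 1.0))               # rows are probability distributions
```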
Community detection in the hypergraph stochastic block model and reconstruction on hypertrees
We study the weak recovery problem on the r-uniform hypergraph stochastic block model (r-HSBM) with two balanced communities. In this model, the vertices are randomly divided into two communities, and size-r hyperedges are added randomly with a probability that depends on whether all vertices in the hyperedge belong to the same community. The goal of weak recovery is to recover a non-trivial fraction of the community labels given the hypergraph. Pal and Zhu (2021) and Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight when r is small or the expected degree is small. For the remaining cases, the tightness of the KS threshold remained open.
In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for certain intermediate values of r and large enough expected degree, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for larger r there exist parameters for which the KS threshold is not tight. In particular, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or the expected degree is large enough. This provides further evidence supporting the existence of an information-computation gap in these cases.
Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed expected degree and large r. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold and the impossibility of robust reconstruction at criticality.
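A minimal sketch of the generative model described above follows; the parametrization via p_same and p_diff is illustrative (the paper works in a sparse regime governed by the expected degree), not the paper's exact notation.

```python
# Sample a sparse r-uniform HSBM with two balanced communities: each size-r
# subset becomes a hyperedge with probability p_same if all of its vertices
# lie in one community and p_diff otherwise.
import itertools
import random

def sample_hsbm(n, r, p_same, p_diff, seed=0):
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]        # two balanced communities
    rng.shuffle(labels)
    hyperedges = []
    for subset in itertools.combinations(range(n), r):
        monochromatic = len({labels[v] for v in subset}) == 1
        p = p_same if monochromatic else p_diff
        if rng.random() < p:
            hyperedges.append(subset)
    return labels, hyperedges

labels, edges = sample_hsbm(n=30, r=3, p_same=0.02, p_diff=0.005)
print(len(edges), "hyperedges; weak recovery asks to beat a random guess of the labels")
```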
Faster Algorithms for Structured Linear and Kernel Support Vector Machines
Quadratic programming is a ubiquitous prototype in convex programming. Many combinatorial optimization problems on graphs and many machine learning problems can be formulated as quadratic programs; for example, Support Vector Machines (SVMs). Linear and kernel SVMs have been among the most popular models in machine learning over the past three decades, prior to the deep learning era.
Generally, a quadratic program with n variables has an input size that is quadratic in n, and assuming the Strong Exponential Time Hypothesis (SETH), it is known that no algorithm running substantially faster than this input size exists (Backurs, Indyk, and Schmidt, NIPS'17). However, problems such as SVMs usually feature much smaller input sizes: one is given n data points, each of dimension d, with d much smaller than n. Furthermore, SVM programs have only a small number of linear constraints. This suggests that faster algorithms are feasible, provided the program exhibits certain underlying structures.
In this work, we design the first nearly-linear time algorithm for solving quadratic programs whenever the quadratic objective has small treewidth or admits a low-rank factorization, and the number of linear constraints is small. Consequently, we obtain a variety of results for SVMs:
* For linear SVM, where the quadratic constraint matrix has small treewidth, we can solve the corresponding program in nearly-linear time;
* For linear SVM, where the quadratic constraint matrix admits a low-rank factorization, we can solve the corresponding program in nearly-linear time;
* For Gaussian kernel SVM, where the data dimension and the squared dataset radius are small, we can solve it in almost-linear time. We also prove a complementary lower bound: when the squared dataset radius is large, substantially more time is required.
Comment: New results: almost-linear time algorithm for Gaussian kernel SVM and complementary lower bounds. Abstract shortened to meet arXiv requirements.
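To make the structured quadratic program behind these results concrete, here is a minimal sketch (not the paper's algorithm): it forms the Gaussian kernel matrix and runs projected gradient ascent on a simplified, bias-free soft-margin SVM dual with box constraints only.

```python
# A sketch of the QP behind a Gaussian-kernel SVM, solved crudely:
#     max_a  sum(a) - 0.5 * a^T Q a,  Q_ij = y_i y_j K(x_i, x_j),  0 <= a_i <= C,
# dropping the usual equality constraint sum_i a_i y_i = 0 for simplicity.
import numpy as np

def gaussian_kernel(X, gamma):
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # squared pairwise distances
    return np.exp(-gamma * d2)

def svm_dual_pg(X, y, C=1.0, gamma=0.5, steps=500, lr=0.01):
    K = gaussian_kernel(X, gamma)
    Q = (y[:, None] * y[None, :]) * K
    a = np.zeros(len(y))
    for _ in range(steps):
        grad = 1.0 - Q @ a                  # gradient of the dual objective
        a = np.clip(a + lr * grad, 0.0, C)  # project back onto the box
    return a, K

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)
a, K = svm_dual_pg(X, y)
pred = np.sign(K @ (a * y))                 # decision values on the training set
print("training accuracy:", np.mean(pred == y))
```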
Binary Hypothesis Testing for Softmax Models and Leverage Score Models
Softmax distributions are widely used in machine learning, including in Large Language Models (LLMs), where the attention unit uses softmax distributions. We abstract the attention unit as the softmax model: given a vector input, the model produces an output drawn from the softmax distribution determined by that input. We consider the fundamental problem of binary hypothesis testing in the setting of softmax models. That is, given an unknown softmax model, which is known to be one of two given softmax models, how many queries are needed to determine which one is the truth? We show that the sample complexity is characterized, asymptotically, by a certain distance between the parameters of the two models.
Furthermore, we draw an analogy between the softmax model and the leverage score model, an important tool for algorithm design in linear algebra and graph theory. The leverage score model, at a high level, is a model which, given a vector input, produces an output drawn from a distribution that depends on the input. We obtain similar results for the binary hypothesis testing problem for leverage score models.
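A minimal sketch of one plausible formalization of the softmax model and its binary hypothesis testing problem follows; the parametrization via a matrix A and the use of a log-likelihood-ratio test are assumptions for illustration, not necessarily the paper's exact setup.

```python
# Under parameter matrix A, a query x yields a sample from softmax(A @ x).
# Repeated queries are combined with a log-likelihood-ratio test to decide
# between two candidate parameter matrices A0 and A1.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sample_output(A, x, rng):
    return rng.choice(len(A), p=softmax(A @ x))

def llr_test(A0, A1, x, samples, rng):
    """Positive return value favours A0, negative favours A1."""
    llr = 0.0
    for _ in range(samples):
        k = sample_output(A0, x, rng)            # truth is A0 in this demo
        llr += np.log(softmax(A0 @ x)[k]) - np.log(softmax(A1 @ x)[k])
    return llr

rng = np.random.default_rng(1)
A0 = rng.normal(size=(5, 3))
A1 = A0 + 0.2 * rng.normal(size=(5, 3))          # nearby alternative
x = rng.normal(size=3)
print(llr_test(A0, A1, x, samples=2000, rng=rng) > 0)   # typically True
```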
Spanoids - An Abstraction of Spanning Structures, and a Barrier for LCCs
We introduce a simple logical inference structure we call a spanoid (generalizing the notion of a matroid), which captures well-studied problems in several areas. These include combinatorial geometry (point-line incidences), algebra (arrangements of hypersurfaces and ideals), statistical physics (bootstrap percolation), network theory (gossip / infection processes) and coding theory. We initiate a thorough investigation of spanoids, from computational and structural viewpoints, focusing on parameters relevant to the application areas above and, in particular, to questions regarding Locally Correctable Codes (LCCs).
One central parameter we study is the rank of a spanoid, extending the rank of a matroid and related to the dimension of codes. This leads to one main application of our work, establishing the first known barrier to improving the nearly 20-year-old bound of Katz-Trevisan (KT) on the dimension of LCCs. On the one hand, we prove that the KT bound (and its more recent refinements) holds for the much more general setting of spanoid rank. On the other hand, we show that there exist (random) spanoids whose rank matches these bounds. Thus, to significantly improve the known bounds one must step out of the spanoid framework.
Another parameter we explore is the functional rank of a spanoid, which captures the possibility of turning a given spanoid into an actual code. The question of the relationship between rank and functional rank is one of the main questions we raise, as it may reveal new avenues for constructing new LCCs (perhaps even matching the KT bound). As a first step, we develop an entropy relaxation of functional rank to create a small constant gap, and we amplify it by tensoring to construct a spanoid whose functional rank is polynomially smaller than its rank. This is evidence that the entropy method we develop can prove polynomially better bounds than KT-type methods on the dimension of LCCs.
To facilitate the above results we also develop some basic structural results on spanoids, including an equivalent formulation of spanoids as set systems and properties of spanoid products. We feel that, given these initial findings and their motivations, the abstract study of spanoids merits further investigation. We leave plenty of concrete open problems and directions.
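The following minimal sketch illustrates the basic objects under a small formalization that follows the abstract's matroid analogy (this formalization, in particular taking the rank to be the minimum size of a spanning set, is an assumption for illustration rather than the paper's definition): a spanoid is a ground set with inference rules "a subset derives an element", and the span of a set is its closure under these rules.

```python
# Toy spanoid: closure under inference rules and a brute-force "rank".
import itertools

def span(rules, start):
    """rules: list of (frozenset_of_premises, derived_element)."""
    closed = set(start)
    changed = True
    while changed:
        changed = False
        for premises, x in rules:
            if x not in closed and premises <= closed:
                closed.add(x)
                changed = True
    return closed

def rank(ground, rules):
    """Minimum size of a subset whose span is the whole ground set."""
    for size in range(len(ground) + 1):
        for subset in itertools.combinations(ground, size):
            if span(rules, subset) == set(ground):
                return size
    return len(ground)

# A toy spanoid on {0,1,2,3}: {0,1} derives 2 and {1,2} derives 3.
ground = [0, 1, 2, 3]
rules = [(frozenset({0, 1}), 2), (frozenset({1, 2}), 3)]
print(rank(ground, rules))  # 2, via the spanning set {0, 1}
```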
Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time
Given a matrix M, the low-rank matrix completion problem asks us to find a rank-k approximation of M as the product of two thin factors by observing only a few entries, specified by a set $\Omega$ of observed positions. In particular, we examine an approach that is widely used in practice -- the alternating minimization framework. Jain, Netrapalli and Sanghavi \cite{jns13} showed that if M has incoherent rows and columns, then alternating minimization provably recovers the matrix from a nearly linear (in the matrix dimension) number of observed entries. While the sample complexity has subsequently been improved \cite{glz17}, the alternating minimization steps are required to be computed exactly. This hinders the development of more efficient algorithms and fails to reflect the practical implementation of alternating minimization, where the updates are usually performed approximately in favor of efficiency.
In this paper, we take a major step towards a more efficient and error-robust alternating minimization framework. To this end, we develop an analytical framework for alternating minimization that can tolerate a moderate amount of error caused by approximate updates. Moreover, our algorithm runs in time $O(|\Omega| k)$ up to logarithmic factors, which is nearly linear in the time needed to verify the solution, while preserving the sample complexity. This improves upon all previously known alternating minimization approaches, which require strictly more time.
Comment: Improved the runtime to $O(|\Omega| k)$
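For reference, here is a minimal sketch of the classical alternating minimization framework discussed above, with exact row-wise least-squares updates (the paper's framework is designed to tolerate approximate updates and to run in time nearly linear in the number of observed entries).

```python
# Alternating minimization for low-rank matrix completion over observed entries.
import numpy as np

def altmin_complete(M, mask, k, iters=30, seed=0):
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.normal(size=(m, k))
    V = rng.normal(size=(n, k))
    for _ in range(iters):
        # Update each row of U from the observed entries in that row.
        for i in range(m):
            cols = np.flatnonzero(mask[i])
            if cols.size:
                U[i], *_ = np.linalg.lstsq(V[cols], M[i, cols], rcond=None)
        # Symmetric update for each row of V.
        for j in range(n):
            rows = np.flatnonzero(mask[:, j])
            if rows.size:
                V[j], *_ = np.linalg.lstsq(U[rows], M[rows, j], rcond=None)
    return U, V

rng = np.random.default_rng(1)
m, n, k = 60, 50, 3
M = rng.normal(size=(m, k)) @ rng.normal(size=(k, n))   # exactly rank-k ground truth
mask = rng.random((m, n)) < 0.3                          # observe ~30% of entries
U, V = altmin_complete(M, mask, k)
print("relative error:", np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))
```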
Faster Monotone Min-Plus Product, Range Mode, and Single Source Replacement Paths
One of the most basic graph problems, All-Pairs Shortest Paths (APSP) is known to be solvable in n^{3-o(1)} time, and it is widely open whether it has an O(n^{3-ε}) time algorithm for ε > 0. To better understand APSP, one often strives to obtain subcubic time algorithms for structured instances of APSP and problems equivalent to it, such as the Min-Plus matrix product. A natural structured version of Min-Plus product is Monotone Min-Plus product, which has been studied in the context of the Batch Range Mode [SODA'20] and Dynamic Range Mode [ICALP'20] problems.
This paper improves the known algorithms for Monotone Min-Plus Product and for Batch and Dynamic Range Mode, and establishes a connection between Monotone Min-Plus Product and the Single Source Replacement Paths (SSRP) problem on an n-vertex graph with potentially negative edge weights in {-M, …, M}.
SSRP with positive integer edge weights bounded by M can be solved in Õ(Mn^ω) time, whereas the prior fastest algorithm for graphs with possibly negative weights [FOCS'12] runs in O(M^{0.7519} n^{2.5286}) time, the current best running time for directed APSP with small integer weights. Using Monotone Min-Plus Product, we obtain an improved O(M^{0.8043} n^{2.4957}) time SSRP algorithm, showing that SSRP with constant negative integer weights is likely easier than directed unweighted APSP, a problem that is believed to require n^{2.5-o(1)} time.
Complementing our algorithm for SSRP, we give a reduction from the Bounded-Difference Min-Plus Product problem studied by Bringmann et al. [FOCS'16] to negative weight SSRP. This reduction shows that it might be difficult to obtain an Õ(M n^{ω}) time algorithm for SSRP with negative weight edges, thus separating the problem from SSRP with only positive weight edges.
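For reference, here is the naive cubic-time (min,+) product that the structured variants above speed up (a minimal sketch, not one of the paper's algorithms); Monotone Min-Plus product is the special case where one matrix has monotone rows with bounded integer entries.

```python
# Naive (min,+) matrix product: C[i,j] = min_k (A[i,k] + B[k,j]).
import numpy as np

def min_plus(A, B):
    n, m = A.shape[0], B.shape[1]
    C = np.full((n, m), np.inf)
    for i in range(n):
        for j in range(m):
            C[i, j] = np.min(A[i, :] + B[:, j])   # min over k of A[i,k] + B[k,j]
    return C

A = np.array([[0, 3], [2, 0]], dtype=float)
B = np.array([[0, 1], [5, 0]], dtype=float)
print(min_plus(A, B))   # entry (i,j) is the cheapest two-hop cost i -> k -> j
```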