Fair redistricting is hard
Gerrymandering is a long-standing issue within the U.S. political system, and it has recently received scrutiny from the U.S. Supreme Court. In this note, we
prove that deciding whether there exists a fair redistricting among legal maps
is NP-hard. To make this precise, we use simplified notions of "legal" and
"fair" that account for desirable traits such as geographic compactness of
districts and sufficient representation of voters. The proof of our result is
inspired by the work of Mahajan, Nimbhorkar, and Varadarajan, which proves that planar k-means is NP-hard.
On the equivalence between graph isomorphism testing and function approximation with GNNs
Graph neural networks (GNNs) have achieved considerable success on graph-structured data. In light of this, there has been increasing interest
in studying their representation power. One line of work focuses on the
universal approximation of permutation-invariant functions by certain classes
of GNNs, and another demonstrates the limitation of GNNs via graph isomorphism
tests.
Our work connects these two perspectives and proves their equivalence. We
further develop a framework for the representation power of GNNs in the language of sigma-algebras, which incorporates both viewpoints. Using this
framework, we compare the expressive power of different classes of GNNs as well
as other methods on graphs. In particular, we prove that order-2 Graph
G-invariant networks fail to distinguish non-isomorphic regular graphs with the
same degree. We then extend them to a new architecture, Ring-GNNs, which succeeds in distinguishing these graphs and provides improvements on real-world social network datasets.
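To make the limitation concrete, here is a minimal, hypothetical sketch (not code from the paper): a message-passing network with sum aggregation and identical initial node features produces exactly the same graph-level readout on two non-isomorphic 2-regular graphs, the 6-cycle and two disjoint triangles. The feature dimension, weights, and choice of graphs are illustrative assumptions, not the paper's architecture or experiments.

    import numpy as np

    def message_passing_readout(adj, num_layers=3, dim=4, seed=0):
        """Sum-aggregation message passing with random per-layer weights,
        followed by a permutation-invariant sum readout. The same seed gives
        identical weights across calls, so the two graphs are compared fairly."""
        rng = np.random.default_rng(seed)
        h = np.ones((adj.shape[0], dim))            # identical initial node features
        for _ in range(num_layers):
            W = rng.standard_normal((dim, dim))
            h = np.tanh((adj @ h + h) @ W)          # aggregate neighbors plus self
        return h.sum(axis=0)                        # graph-level sum readout

    def cycle_graph(n):
        adj = np.zeros((n, n))
        for i in range(n):
            adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1
        return adj

    c6 = cycle_graph(6)                             # the 6-cycle
    two_triangles = np.zeros((6, 6))                # two disjoint triangles
    two_triangles[:3, :3] = cycle_graph(3)
    two_triangles[3:, 3:] = cycle_graph(3)

    print(np.allclose(message_passing_readout(c6),
                      message_passing_readout(two_triangles)))   # True

Because every node in both graphs has the same degree and the same starting feature, the aggregated messages coincide at every layer, so no choice of weights separates the two graphs.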
Shuffled linear regression through graduated convex relaxation
The shuffled linear regression problem aims to recover linear relationships
in datasets where the correspondence between input and output is unknown. This
problem arises in a wide range of applications including survey data, in which
one needs to decide whether the anonymity of the responses can be preserved
while uncovering significant statistical connections. In this work, we propose
a novel optimization algorithm for shuffled linear regression based on a
posterior-maximizing objective function, assuming a Gaussian noise prior. We
compare and contrast our approach with existing methods on synthetic and real
data. We show that our approach performs competitively while achieving
empirical running-time improvements. Furthermore, we demonstrate that our algorithm is able to utilize side information in the form of seeds, which has recently come to prominence in related problems.
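For intuition about the objective, the sketch below implements a simple alternating-minimization baseline for shuffled linear regression under the squared loss that a Gaussian noise model induces: alternately match responses to fitted values (by sorting, which is the optimal assignment for scalar fits) and re-fit the coefficients by least squares. This is an illustrative baseline, not the graduated convex relaxation proposed in the work, and it does not use seed side information; all names and constants are assumptions.

    import numpy as np

    def shuffled_regression_baseline(X, y, num_iters=50, seed=0):
        """Alternate between (1) aligning responses with fitted values by rank,
        the optimal assignment step for a scalar squared loss, and
        (2) re-fitting the coefficients by ordinary least squares."""
        rng = np.random.default_rng(seed)
        w = rng.standard_normal(X.shape[1])
        for _ in range(num_iters):
            ranks = np.argsort(np.argsort(X @ w))   # rank of each fitted value
            y_aligned = np.sort(y)[ranks]           # pair i-th smallest fit with i-th smallest response
            w, *_ = np.linalg.lstsq(X, y_aligned, rcond=None)
        return w

    # Illustrative usage on synthetic data with an unknown random shuffle.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 3))
    y = rng.permutation(X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(200))
    print(shuffled_regression_baseline(X, y))       # coefficient estimate; may stall in a local optimum

Alternating schemes of this kind can be sensitive to initialization, which is part of why relaxation-based formulations of the same objective are of interest.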
Relax, descend and certify: optimization techniques for typically tractable data problems
In this thesis we explore different mathematical techniques for extracting information from data. In particular, we focus on machine learning problems such as clustering and data cloud alignment. Both problems are intractable in the "worst case", but we show that convex relaxations can efficiently find the exact or almost exact solution for classes of "typical" instances.
We study different roles that optimization techniques can play in understanding and processing data. These include efficient algorithms with mathematical guarantees, a posteriori methods for quality evaluation of solutions, and algorithmic relaxation of mathematical models.
We develop probabilistic and data-driven techniques to model data and evaluate the performance of algorithms for data problems.
A Short Tutorial on the Weisfeiler-Lehman Test and Its Variants
Graph neural networks are designed to learn functions on graphs. Typically,
the relevant target functions are invariant with respect to actions by
permutations. Therefore the design of some graph neural network architectures
has been inspired by graph-isomorphism algorithms. The classical
Weisfeiler-Lehman algorithm (WL) -- a graph-isomorphism test based on color
refinement -- became relevant to the study of graph neural networks. The WL
test can be generalized to a hierarchy of higher-order tests, known as k-WL.
This hierarchy has been used to characterize the expressive power of graph
neural networks, and to inspire the design of graph neural network
architectures. A few variants of the WL hierarchy appear in the literature. The
goal of this short note is pedagogical and practical: We explain the
differences between the WL and folklore-WL formulations, with pointers to
existing discussions in the literature. We illuminate the differences between
the formulations by visualizing an example.
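As a companion to the note, here is a minimal, self-contained sketch of the classical (1-dimensional) WL color-refinement test. The implementation details (adjacency-list input, running refinement on the disjoint union, comparing color histograms) are conventional choices for illustration, not taken from the note itself.

    from collections import Counter

    def wl_test(adj_a, adj_b):
        """Classical WL color refinement run on the disjoint union of two graphs
        given as adjacency lists. Each round, a node's new color is determined by
        its current color and the multiset of its neighbors' colors. Returns False
        if the color histograms of the two parts differ (the graphs are certifiably
        non-isomorphic); True means the test is inconclusive."""
        na = len(adj_a)
        union = ([list(nbrs) for nbrs in adj_a]
                 + [[u + na for u in nbrs] for nbrs in adj_b])
        colors = [0] * len(union)                    # identical initial colors
        for _ in range(len(union)):                  # refinement stabilizes within n rounds
            signatures = [(colors[v], tuple(sorted(colors[u] for u in union[v])))
                          for v in range(len(union))]
            relabel = {sig: i for i, sig in enumerate(sorted(set(signatures)))}
            colors = [relabel[sig] for sig in signatures]
        return Counter(colors[:na]) == Counter(colors[na:])

    # The 6-cycle and two disjoint triangles are non-isomorphic, but every node
    # is 2-regular, so color refinement never separates them.
    c6 = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
    two_triangles = [[1, 2], [0, 2], [0, 1], [4, 5], [3, 5], [3, 4]]
    print(wl_test(c6, two_triangles))                # True (inconclusive)

The folklore-WL and higher-order k-WL variants discussed in the note refine colors over tuples of nodes rather than single nodes, but follow the same refine-until-stable pattern.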