Spectral MLE: Top-$K$ Rank Aggregation from Pairwise Comparisons
This paper explores the preference-based top-$K$ rank aggregation problem.
Suppose that a collection of items is repeatedly compared in pairs, and one
wishes to recover a consistent ordering that emphasizes the top-$K$ ranked
items, based on partially revealed preferences. We focus on the
Bradley-Terry-Luce (BTL) model that postulates a set of latent preference
scores underlying all items, where the odds of paired comparisons depend only
on the relative scores of the items involved.
We characterize the minimax limits on identifiability of top-$K$ ranked
items, in the presence of random and non-adaptive sampling. Our results
highlight a separation measure that quantifies the gap of preference scores
between the $K$-th and $(K+1)$-th ranked items. The minimum
sample complexity required for reliable top-$K$ ranking scales inversely with
the separation measure irrespective of other preference distribution metrics.
To approach this minimax limit, we propose a nearly linear-time ranking scheme,
called \emph{Spectral MLE}, that returns the indices of the top-$K$ items in
accordance with a careful score estimate. In a nutshell, Spectral MLE starts with
an initial score estimate with minimal squared loss (obtained via a spectral
method), and then successively refines each component with the assistance of
coordinate-wise MLEs. Encouragingly, Spectral MLE allows perfect top-$K$ item
identification under minimal sample complexity. The practical applicability of
Spectral MLE is further corroborated by numerical experiments.
Comment: accepted to International Conference on Machine Learning (ICML), 2015
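The spectral initialization described above can be sketched as a Rank Centrality-style stationary-distribution computation. This is a minimal illustration only: the function names, the `d_max` normalization, and the fixed iteration count are assumptions, and the coordinate-wise MLE refinement stage of Spectral MLE is omitted.

```python
import numpy as np

def spectral_scores(wins, d_max):
    """Spectral score estimate from a pairwise win-count matrix.

    wins[i, j] = number of times item i beat item j. We build a Markov
    chain that steps from an item toward items that beat it more often;
    its stationary distribution serves as the initial score estimate.
    d_max must be at least the maximum number of opponents of any item.
    """
    n = wins.shape[0]
    totals = wins + wins.T                      # comparisons per pair
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and totals[i, j] > 0:
                # move toward j in proportion to j's empirical win rate
                P[i, j] = wins[j, i] / (totals[i, j] * d_max)
        P[i, i] = 1.0 - P[i].sum()              # lazy self-loop
    pi = np.full(n, 1.0 / n)
    for _ in range(1000):                       # power iteration
        pi = pi @ P
    return pi

def top_k(scores, k):
    """Indices of the k highest-scoring items (ascending index order)."""
    return sorted(np.argsort(scores)[::-1][:k].tolist())
```

Items that win more often accumulate more stationary mass, so ranking by `pi` recovers the ordering of the latent BTL scores.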
Intermittency of surface layer wind velocity series in the mesoscale range
We study various time series of surface layer wind velocity at different
locations and provide evidence for the intermittent nature of the wind
fluctuations in the mesoscale range. By means of the magnitude covariance analysis,
which is shown to be a more efficient tool to study intermittency than
classical scaling analysis, we find that all wind series exhibit features
similar to those observed for laboratory turbulence. Our findings suggest
the existence of a "universal" cascade mechanism associated with the energy
transfer between synoptic motions and turbulent microscales in the atmospheric
boundary layer.
Comment: 6 figures
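Magnitude covariance analysis, as used above, studies the covariance of the logarithms of absolute velocity increments at a given lag; a slow decay of this covariance with lag is the intermittency signature. A minimal sketch (the function name and windowing choices are illustrative, not the authors' code):

```python
import math

def magnitude_covariance(series, tau):
    """Cov(ln|dv(t)|, ln|dv(t + tau)|) over the increments of `series`.

    Zero increments are skipped before taking logarithms. For an
    intermittent signal the magnitudes are persistent, so this
    covariance stays positive over a wide range of lags.
    """
    incr = [abs(b - a) for a, b in zip(series, series[1:])]
    logs = [math.log(x) for x in incr if x > 0]
    n = len(logs) - tau
    m1 = sum(logs[:n]) / n
    m2 = sum(logs[tau:tau + n]) / n
    return sum((logs[i] - m1) * (logs[i + tau] - m2)
               for i in range(n)) / n
```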
Answering Range Queries Under Local Differential Privacy
Counting the fraction of a population having an input within a specified
interval, i.e., a \emph{range query}, is a fundamental data analysis primitive.
Range queries can also be used to compute other interesting statistics such as
\emph{quantiles}, and to build prediction models. However, frequently the data
is subject to privacy concerns when it is drawn from individuals, and relates
for example to their financial, health, religious or political status. In this
paper, we introduce and analyze methods to support range queries under the
local variant of differential privacy, an emerging standard for
privacy-preserving data analysis.
The local model requires that each user releases a noisy view of her private
data under a privacy guarantee. While many works address the problem of range
queries in the trusted aggregator setting, this problem has not been addressed
specifically under the untrusted aggregation (local DP) model, even though many
primitives have been developed recently for estimating a discrete distribution.
We describe and analyze two classes of approaches for range queries, based on
hierarchical histograms and the Haar wavelet transform. We show that both have
strong theoretical accuracy guarantees on variance. In practice, both methods
are fast and require minimal computation and communication resources. Our
experiments show that the wavelet approach is most accurate in high privacy
settings, while the hierarchical approach dominates for weaker privacy
requirements.
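The hierarchical-histogram approach rests on the classical dyadic decomposition of a range: any interval over a power-of-two domain splits into O(log n) tree nodes, so a range estimate sums only that many noisy node frequencies instead of one per point. A minimal sketch of the decomposition itself (noise addition is omitted; names are illustrative):

```python
def dyadic_intervals(lo, hi, size):
    """Decompose [lo, hi) over a domain [0, size) (size a power of two)
    into the canonical set of dyadic intervals.

    A hierarchical-histogram range estimate sums the (noisy) estimated
    frequencies of just these O(log size) tree nodes.
    """
    out = []

    def recurse(node_lo, node_hi):
        if lo <= node_lo and node_hi <= hi:
            out.append((node_lo, node_hi))     # node fully inside range
            return
        if node_hi <= lo or hi <= node_lo:     # node disjoint from range
            return
        mid = (node_lo + node_hi) // 2         # split and recurse
        recurse(node_lo, mid)
        recurse(mid, node_hi)

    recurse(0, size)
    return out
```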
Graph Neural Networks Exponentially Lose Expressive Power for Node Classification
Graph Neural Networks (graph NNs) are a promising deep learning approach for
analyzing graph-structured data. However, it is known that they do not improve
(or sometimes worsen) their predictive performance as we pile up many layers
and add non-linearity. To tackle this problem, we investigate the expressive
power of graph NNs via their asymptotic behaviors as the layer size tends to
infinity. Our strategy is to generalize the forward propagation of a Graph
Convolutional Network (GCN), which is a popular graph NN variant, as a specific
dynamical system. In the case of a GCN, we show that when its weights satisfy
the conditions determined by the spectra of the (augmented) normalized
Laplacian, its output exponentially approaches the set of signals that carry
information of the connected components and node degrees only for
distinguishing nodes. Our theory enables us to relate the expressive power of
GCNs with the topological information of the underlying graphs inherent in the
graph spectra. To demonstrate this, we characterize the asymptotic behavior of
GCNs on the Erd\H{o}s -- R\'{e}nyi graph. We show that when the Erd\H{o}s --
R\'{e}nyi graph is sufficiently dense and large, a broad range of GCNs on it
suffers from the "information loss" in the limit of infinite layers with high
probability. Based on the theory, we provide a principled guideline for weight
normalization of graph NNs. We experimentally confirm that the proposed weight
scaling enhances the predictive performance of GCNs in real data. Code is
available at https://github.com/delta2323/gnn-asymptotics.
Comment: 9 pages, Supplemental material 28 pages. Accepted in International
Conference on Learning Representations (ICLR) 2020
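The role of the augmented normalized adjacency (equivalently, of the augmented normalized Laplacian's spectrum) can be illustrated on a toy graph: repeated propagation drives any input signal toward the dominant eigenvector, which is proportional to the square root of the augmented degrees, so only connected-component and degree information survives. A sketch under the simplifying assumptions of identity weight matrices and no non-linearity:

```python
import numpy as np

def augmented_norm_adj(A):
    """Propagation matrix of a GCN layer: D^{-1/2} (A + I) D^{-1/2},
    where D holds the degrees of the self-loop-augmented graph."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def propagate(A, x, layers):
    """Apply `layers` propagation steps (identity weights, no
    non-linearity). As the layer count grows, x aligns with the
    dominant eigenvector sqrt(augmented degree): for a connected
    graph, only degree information remains to distinguish nodes."""
    S = augmented_norm_adj(A)
    for _ in range(layers):
        x = S @ x
    return x
```

On a 3-node path graph, a one-hot input signal converges to a vector proportional to the square roots of the augmented degrees (2, 3, 2), regardless of which node carried the initial mass.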
Democracy under uncertainty: The ‘wisdom of crowds’ and the free-rider problem in group decision making
We introduce a game theory model of individual decisions to cooperate by contributing personal resources to group decisions versus free-riding on the contributions of other members. In contrast to most public-goods games that assume group returns are linear in individual contributions, the present model assumes decreasing marginal group production as a function of aggregate individual contributions. This diminishing marginal returns assumption is more realistic and generates starkly different predictions compared to the linear model. One important implication is that, under most conditions, there exist equilibria where some, but not all, members of a group contribute, even with completely self-interested motives. An agent-based simulation confirms the individual and group advantages of the equilibria in which behavioral asymmetry emerges from a game structure that is a priori perfectly symmetric for all agents (all agents have the same payoff function and action space, but take different actions in equilibria). A behavioral experiment demonstrates that cooperators and free-riders coexist in a stable manner in groups performing with the non-linear production function. A collateral result demonstrates that, compared to a "dictatorial" decision scheme guided by the best member in a group, the majority-plurality decision rules can pool information effectively and produce greater individual net welfare at equilibrium, even if free-riding is not sanctioned. This is an original proof that cooperation in ad hoc decision-making groups can be understood in terms of self-interested motivations and that, despite the free-rider problem, majority-plurality decision rules can function robustly as simple, efficient social decision heuristics.
Keywords: group decision making under uncertainty, free-rider problem, majority-plurality rules, marginally-diminishing group returns, evolutionary games, behavioral experiment
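The partial-contribution equilibria can be illustrated with a toy payoff under a diminishing-returns production function. The square-root production form, the group size, and the parameter values below are illustrative assumptions, not the paper's specification:

```python
import math

def payoff(contributes, k, n, benefit=6.0, cost=1.0):
    """Payoff to one member when k of n members contribute, with a
    diminishing-returns group production G(k) = benefit * sqrt(k)
    shared equally (illustrative functional form)."""
    share = benefit * math.sqrt(k) / n
    return share - (cost if contributes else 0.0)

def is_equilibrium(k, n, **kw):
    """k contributors form a Nash equilibrium if no contributor gains
    by quitting and no free-rider gains by joining."""
    contributor_stays = k == 0 or \
        payoff(True, k, n, **kw) >= payoff(False, k - 1, n, **kw)
    freerider_stays = k == n or \
        payoff(False, k, n, **kw) >= payoff(True, k + 1, n, **kw)
    return contributor_stays and freerider_stays
```

With these parameters and n = 4, the only equilibrium has exactly one contributor: contributors and free-riders coexist even though the game is perfectly symmetric ex ante.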
Optimizing Locally Differentially Private Protocols
Protocols satisfying Local Differential Privacy (LDP) enable parties to
collect aggregate information about a population while protecting each user's
privacy, without relying on a trusted third party. LDP protocols (such as
Google's RAPPOR) have been deployed in real-world scenarios. In these
protocols, a user encodes his private information and perturbs the encoded
value locally before sending it to an aggregator, who combines values that
users contribute to infer statistics about the population. In this paper, we
introduce a framework that generalizes several LDP protocols proposed in the
literature. Our framework yields a simple and fast aggregation algorithm, whose
accuracy can be precisely analyzed. Our in-depth analysis enables us to choose
optimal parameters, resulting in two new protocols (i.e., Optimized Unary
Encoding and Optimized Local Hashing) that provide better utility than
protocols previously proposed. We present precise conditions for when each
proposed protocol should be used, and perform experiments that demonstrate the
advantage of our proposed protocols.
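Optimized Unary Encoding one-hot encodes a user's value and reports each bit independently, keeping the true 1-bit with probability p = 1/2 and flipping each 0-bit to 1 with probability q = 1/(e^ε + 1); the aggregator then debiases the counts. A minimal sketch with these published parameters (variable names are illustrative, not the authors' implementation):

```python
import math
import random

def oue_perturb(value, d, eps, rng):
    """Report a noisy one-hot encoding of `value` over domain size d:
    the true bit survives with probability 1/2, and each other bit
    flips on with probability q = 1 / (e^eps + 1)."""
    q = 1.0 / (math.exp(eps) + 1.0)
    return [
        (1 if rng.random() < 0.5 else 0) if i == value
        else (1 if rng.random() < q else 0)
        for i in range(d)
    ]

def oue_aggregate(reports, d, eps):
    """Unbiased frequency estimates: (count_i - n*q) / (p - q)."""
    n = len(reports)
    q = 1.0 / (math.exp(eps) + 1.0)
    return [(sum(r[i] for r in reports) - n * q) / (0.5 - q)
            for i in range(d)]
```

Each user's report reveals any single value only within the ε guarantee, yet the debiased counts concentrate around the true frequencies as the population grows.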
Power-law scaling in dimension-to-biomass relationship of fish schools
Motivated by the finding that there is some biological universality in the
relationship between school geometry and school biomass of various pelagic
fishes in various conditions, I here establish a scaling law for school
dimensions: the school diameter increases as a power-law function of school
biomass. The power-law exponent is extracted through the data collapse, and is
close to 3/5. This value of the exponent implies that the mean packing density
decreases as the school biomass increases, and the packing structure displays a
mass-fractal dimension of 5/3. By exploiting an analogy between school geometry
and polymer chain statistics, I examine the behavioral algorithm governing the
swollen conformation of large-sized schools of pelagics, and I explain the
value of the exponent.
Comment: 25 pages, 6 figures, to appear in J. Theor. Bio
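The power-law exponent can be read off as the slope of a log-log regression of school diameter on school biomass; a minimal sketch (a generic least-squares fit, not the paper's data-collapse procedure):

```python
import math

def fit_power_law_exponent(biomass, diameter):
    """Least-squares slope of log(diameter) against log(biomass),
    i.e. the exponent alpha in diameter ~ biomass^alpha."""
    xs = [math.log(b) for b in biomass]
    ys = [math.log(d) for d in diameter]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

An exponent near 3/5 means diameter grows faster than the 1/3 power expected at constant density, i.e. larger schools are more dilute.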
Optimization Methods for Large-Scale Machine Learning
This paper provides a review and commentary on the past, present, and future
of numerical optimization algorithms in the context of machine learning
applications. Through case studies on text classification and the training of
deep neural networks, we discuss how optimization problems arise in machine
learning and what makes them challenging. A major theme of our study is that
large-scale machine learning represents a distinctive setting in which the
stochastic gradient (SG) method has traditionally played a central role while
conventional gradient-based nonlinear optimization techniques typically falter.
Based on this viewpoint, we present a comprehensive theory of a
straightforward, yet versatile SG algorithm, discuss its practical behavior,
and highlight opportunities for designing algorithms with improved performance.
This leads to a discussion about the next generation of optimization methods
for large-scale machine learning, including an investigation of two main
streams of research on techniques that diminish noise in the stochastic
directions and methods that make use of second-order derivative approximations.
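The straightforward SG method discussed above can be sketched in a few lines: each update uses the gradient of the loss at a single randomly drawn sample rather than the full-batch gradient. The one-dimensional least-squares example and parameter values below are illustrative:

```python
import random

def sgd(grad, w, data, lr, epochs, rng):
    """Stochastic gradient: each update uses the gradient at one
    randomly ordered sample; one shuffled pass per epoch."""
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            w -= lr * grad(w, x, y)
    return w

def squared_loss_grad(w, x, y):
    """Gradient of the per-sample squared loss (w*x - y)^2 / 2."""
    return (w * x - y) * x
```

On data consistent with y = 2x, every per-sample update shrinks the error multiplicatively, so the iterate converges to the exact solution.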
Learning More Universal Representations for Transfer-Learning
A representation is said to be universal if it encodes any element of the
visual world (e.g., objects, scenes) in any configuration (e.g., scale,
context). While not expecting pure universal representations, the goal in the
literature is to improve the universality level, starting from a representation
with a certain level. To do so, the state-of-the-art consists in learning
CNN-based representations on a diversified training problem (e.g., ImageNet
modified by adding annotated data). While it effectively increases
universality, such an approach still requires a large amount of effort to
satisfy the need for annotated data. In this work, we propose two methods to
improve universality while paying special attention to limiting the need for
annotated data. We also propose a unified framework of the methods based on
diversifying the training problem. Finally, to better match Atkinson's
cognitive study about universal human representations, we propose to rely on
the transfer-learning scheme as well as a new metric to evaluate universality.
The latter allows us to demonstrate the benefit of our methods on 10 target
problems, spanning the classification task and a variety of visual domains.
Comment: Submitted to IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI)
Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence
Data in the form of pairwise comparisons arises in many domains, including
preference elicitation, sporting competitions, and peer grading among others.
We consider parametric ordinal models for such pairwise comparison data
involving a latent vector $w^*$ that represents the
"qualities" of the items being compared; this class of models includes the
two most widely used parametric models--the Bradley-Terry-Luce (BTL) and the
Thurstone models. Working within a standard minimax framework, we provide tight
upper and lower bounds on the optimal error in estimating the quality score
vector $w^*$ under this class of models. The bounds depend on the topology of
the comparison graph induced by the subset of pairs being compared via its
Laplacian spectrum. Thus, in settings where the subset of pairs may be chosen,
our results provide principled guidelines for making this choice. Finally, we
compare these error rates to those under cardinal measurement models and show
that the error rates in the ordinal and cardinal settings have identical
scalings apart from constant pre-factors.
Comment: 39 pages, 5 figures. Significant extension of arXiv:1406.661
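The comparison-graph Laplacian that drives these bounds is easy to form explicitly; its second-smallest eigenvalue (the algebraic connectivity) is zero exactly when the comparison graph is disconnected, in which case relative scores across components are unidentifiable. A minimal sketch (names are illustrative):

```python
import numpy as np

def comparison_laplacian(n, pairs):
    """Laplacian L = D - A of the comparison graph on n items, with
    one edge per compared pair; its spectrum enters the minimax
    estimation-error bounds."""
    L = np.zeros((n, n))
    for i, j in pairs:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return L

def algebraic_connectivity(L):
    """Second-smallest Laplacian eigenvalue; zero iff the comparison
    graph is disconnected."""
    return np.sort(np.linalg.eigvalsh(L))[1]
```

When the set of compared pairs may be chosen, favoring topologies with larger algebraic connectivity (e.g. expanders over paths) is the kind of principled guideline the bounds suggest.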