On the capacity of information processing systems
We propose and analyze a family of information processing systems, where a
finite set of experts or servers are employed to extract information about a
stream of incoming jobs. Each job is associated with a hidden label drawn from
some prior distribution. An inspection by an expert produces a noisy outcome
that depends both on the job's hidden label and the type of the expert, and
occupies the expert for a finite time duration. A decision maker's task is to
dynamically assign inspections so that the resulting outcomes can be used to
accurately recover the labels of all jobs, while keeping the system stable.
Among our chief motivations are applications in crowd-sourcing, diagnostics,
and experiment designs, where one wishes to efficiently learn the nature of a
large number of items, using a finite pool of computational resources or human
agents.
We focus on the capacity of such an information processing system. Given a
level of accuracy guarantee, we ask how many experts are needed in order to
stabilize the system, and through what inspection architecture. Our main result
provides an adaptive inspection policy that is asymptotically optimal in the
following sense: the ratio between the required number of experts under our
policy and the theoretical optimum converges to one, as the probability of
error in label recovery tends to zero.
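As a concrete illustration of the inspection primitive described above, here is a minimal sketch (ours, not the paper's policy; the binary label space, the confusion matrix, and the function name are illustrative assumptions) of how repeated noisy expert outcomes update the posterior over a job's hidden label:

```python
import numpy as np

def posterior_after_inspections(prior, confusion, outcomes):
    """Posterior over a job's hidden label after noisy inspections.

    prior:     (L,) prior distribution over labels
    confusion: (L, O) confusion[l, o] = P(outcome o | label l) for one expert type
    outcomes:  list of observed outcome indices
    """
    log_post = np.log(prior)
    for o in outcomes:
        log_post += np.log(confusion[:, o])   # Bayes update in log space
    log_post -= log_post.max()                # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Example: binary labels, a 90%-accurate expert, three consistent outcomes.
prior = np.array([0.5, 0.5])
confusion = np.array([[0.9, 0.1],
                      [0.1, 0.9]])
print(posterior_after_inspections(prior, confusion, [0, 0, 0]))
```

With three agreeing outcomes the posterior concentrates sharply on one label; a decision maker choosing which job to inspect next would trade off this accuracy gain against keeping expert queues stable.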
Gossiping with Multiple Messages
This paper investigates the dissemination of multiple pieces of information
in large networks where users contact each other in a random uncoordinated
manner, and users upload one piece per unit time. The underlying motivation is
the design and analysis of piece selection protocols for peer-to-peer networks
which disseminate files by dividing them into pieces. We first investigate
one-sided protocols, where piece selection is based on the states of either the
transmitter or the receiver. We show that any such protocol relying only on
pushes, or alternatively only on pulls, is inefficient in disseminating all
pieces to all users. We propose a hybrid one-sided piece selection protocol --
INTERLEAVE -- and show that by using both pushes and pulls it disseminates $k$
pieces from a single source to $n$ users in $O(k + \log n)$ time, while obeying
the constraint that each user can upload at most one piece in one unit of time,
with high probability for large $n$. An optimal, unrealistic centralized
protocol would take $k + \log_2 n$ time in this setting. Moreover, efficient
dissemination is also possible if the source implements forward erasure coding,
and users push the latest-released coded pieces (but do not pull). We also
investigate two-sided protocols where piece selection is based on the states of
both the transmitter and the receiver. We show that it is possible to
disseminate $n$ pieces to $n$ users in $n + O(\log n)$ time, starting from an
initial state where each user has a unique piece.
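The following toy simulation (our sketch; it implements a generic random useful-piece push/pull exchange, not the INTERLEAVE selection rule, and it does not enforce the one-upload-per-round constraint) gives a feel for the dissemination process the abstract analyzes:

```python
import random

def gossip(n_users, n_pieces, rounds, seed=0):
    """Toy push/pull gossip: each round, every user contacts a random peer
    and transfers one piece the other side is missing."""
    rng = random.Random(seed)
    have = [set() for _ in range(n_users)]
    have[0] = set(range(n_pieces))          # user 0 is the source
    for t in range(rounds):
        for u in range(n_users):
            v = rng.randrange(n_users)
            if v == u:
                continue
            push = have[u] - have[v]        # pieces u could push to v
            pull = have[v] - have[u]        # pieces u could pull from v
            if push and (not pull or rng.random() < 0.5):
                have[v].add(rng.choice(sorted(push)))
            elif pull:
                have[u].add(rng.choice(sorted(pull)))
        if all(len(h) == n_pieces for h in have):
            return t + 1                    # rounds until full dissemination
    return None

print(gossip(n_users=64, n_pieces=8, rounds=200))
```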
An Accelerated Decentralized Stochastic Proximal Algorithm for Finite Sums
Modern large-scale finite-sum optimization relies on two key aspects:
distribution and stochastic updates. For smooth and strongly convex problems,
existing decentralized algorithms are slower than modern accelerated
variance-reduced stochastic algorithms when run on a single machine, and are
therefore not efficient. Centralized algorithms are fast, but their scaling is
limited by global aggregation steps that result in communication bottlenecks.
In this work, we propose an efficient \textbf{A}ccelerated
\textbf{D}ecentralized stochastic algorithm for \textbf{F}inite \textbf{S}ums
named ADFS, which uses local stochastic proximal updates and randomized
pairwise communications between nodes. On $n$ machines, ADFS learns from $nm$
samples in the same time it takes optimal algorithms to learn from $m$ samples
on one machine. This scaling holds until a critical network size is reached,
which depends on communication delays, on the number of samples $m$, and on the
network topology. We provide a theoretical analysis based on a novel augmented
graph approach combined with a precise evaluation of synchronization times and
an extension of the accelerated proximal coordinate gradient algorithm to
arbitrary sampling. We illustrate the improvement of ADFS over state-of-the-art
decentralized approaches with experiments.
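To make the communication pattern concrete, here is a minimal sketch (ours, not ADFS itself; plain gossip averaging and local SGD steps stand in for the paper's accelerated proximal updates, and all names are illustrative) of alternating local stochastic updates with randomized pairwise communications:

```python
import numpy as np

def decentralized_pairwise(local_grads, x_dim, edges, steps,
                           lr=0.1, p_comm=0.5, seed=0):
    """At each tick, either one node takes a local stochastic gradient step,
    or one randomly chosen edge averages its two iterates."""
    rng = np.random.default_rng(seed)
    n = len(local_grads)
    x = np.zeros((n, x_dim))
    for _ in range(steps):
        if rng.random() < p_comm:
            u, v = edges[rng.integers(len(edges))]   # pairwise gossip step
            avg = 0.5 * (x[u] + x[v])
            x[u] = avg
            x[v] = avg
        else:
            i = rng.integers(n)                      # local update step
            x[i] -= lr * local_grads[i](x[i])
    return x.mean(axis=0)

# Example: node i minimizes 0.5 * ||x - t_i||^2 on a ring of 4 nodes;
# the consensus minimizer is the mean of the targets (2.5 here).
targets = [np.array([1.0]), np.array([2.0]), np.array([3.0]), np.array([4.0])]
grads = [(lambda t: (lambda x: x - t))(t) for t in targets]
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(decentralized_pairwise(grads, 1, ring, steps=5000))
```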
Adaptive Matching for Expert Systems with Uncertain Task Types
A matching in a two-sided market often incurs an externality: a matched
resource may become unavailable to the other side of the market, at least for a
while. This is especially an issue in online platforms involving human experts,
as expert resources are often scarce. The efficient utilization of experts
in these platforms is made challenging by the fact that the information
available about the parties involved is usually limited.
To address this challenge, we develop a model of a task-expert matching
system where a task is matched to an expert using not only the prior
information about the task but also the feedback obtained from the past
matches. In our model the tasks arrive online while the experts are fixed and
constrained by a finite service capacity. For this model, we characterize the
maximum task resolution throughput a platform can achieve. We show that the
natural greedy approach, where each expert is assigned the task most suitable
to her skill, is suboptimal, as it does not internalize the above externality. We
develop a throughput optimal backpressure algorithm which does so by accounting
for the `congestion' among different task types. Finally, we validate our model
and confirm our theoretical findings with data-driven simulations via logs of
Math.StackExchange, a Stack Exchange forum dedicated to mathematics.
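A minimal sketch of the assignment rule's flavor (ours; a generic MaxWeight/backpressure score with hypothetical queue and rate inputs, rather than the paper's exact algorithm):

```python
def maxweight_assign(queues, rates):
    """When an expert frees up, assign her the task type maximizing
    (queue length) x (her success rate for that type), rather than
    greedily matching her best skill.

    queues: dict task_type -> current backlog size
    rates:  dict task_type -> this expert's resolution rate for the type
    """
    return max(queues, key=lambda t: queues[t] * rates[t])

# Example: greedy would pick 'algebra' (highest rate), but the congested
# 'geometry' queue wins under the backpressure score.
queues = {'algebra': 2, 'geometry': 30}
rates = {'algebra': 0.9, 'geometry': 0.4}
print(maxweight_assign(queues, rates))   # -> 'geometry'
```

The queue-length weighting is what accounts for the congestion externality described above: a scarce skill is rationed toward the task types that are actually backing up.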
Group Synchronization on Grids
Group synchronization requires estimating unknown elements $(\theta_v)_{v \in V}$
of a compact group $\mathfrak{G}$ associated to the
vertices of a graph $\mathcal{G} = (V, E)$, using noisy observations of the group
differences associated to the edges. This model is relevant to a variety of
applications ranging from structure from motion in computer vision to graph
localization and positioning, to certain families of community detection
problems.
We focus on the case in which the graph $\mathcal{G}$ is the $d$-dimensional grid.
Since the unknowns $(\theta_v)_{v \in V}$ are only determined up to a global
action of the group, we consider the following weak recovery question. Can we
determine the group difference $\theta_u^{-1}\theta_v$ between far apart
vertices $u, v$ better than by random guessing? We prove that weak recovery is
possible (provided the noise is small enough) for $d \ge 3$ and, for certain
finite groups, for $d \ge 2$. Vice versa, for some continuous groups, we prove
that weak recovery is impossible for $d = 2$. Finally, for strong enough noise,
weak recovery is always impossible.
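For intuition, here is a toy experiment (ours, for the simplest group $\mathbb{Z}_2 = \{+1, -1\}$; a naive path-multiplication estimator, not the paper's method) showing why recovering far-apart group differences on a grid is delicate: the single-path estimate decays to chance level as distance grows.

```python
import numpy as np

def z2_grid_sync(L=40, p_flip=0.05, seed=0):
    """Observe noisy edge differences theta_u * theta_v on an L x L grid
    (each flipped with probability p_flip) and estimate each vertex by
    multiplying observations along a tree from the corner. Errors
    accumulate with path length, so the far-corner estimate is no better
    than a coin flip."""
    rng = np.random.default_rng(seed)
    theta = rng.choice([-1, 1], size=(L, L))     # hidden labels
    def obs(a, b):                               # noisy edge observation
        flip = -1 if rng.random() < p_flip else 1
        return theta[a] * theta[b] * flip
    est = np.zeros((L, L), dtype=int)
    est[0, 0] = theta[0, 0]                      # fix the global gauge
    for i in range(L):
        for j in range(L):
            if i == 0 and j == 0:
                continue
            if i > 0:
                est[i, j] = est[i - 1, j] * obs((i - 1, j), (i, j))
            else:
                est[i, j] = est[i, j - 1] * obs((i, j - 1), (i, j))
    # Does the estimated far-apart difference agree with the truth?
    return est[0, 0] * est[L - 1, L - 1] == theta[0, 0] * theta[L - 1, L - 1]

print(np.mean([z2_grid_sync(seed=s) for s in range(100)]))  # near 0.5
```

The printed agreement rate hovers near chance, since each edge flip on the length-$2(L-1)$ path toggles the estimate; beating random guessing requires aggregating many paths, which is exactly the weak recovery question studied above.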
Exponential random graphs as models of overlay networks
In this paper, we give an analytic solution for graphs with n nodes and E
edges for which the probability of obtaining a given graph G is specified in
terms of the degree sequence of G. We describe how this model naturally appears
in the context of load balancing in communication networks, namely Peer-to-Peer
overlays. We then analyse the degree distribution of such graphs and show that
the degrees are concentrated around their mean value. Finally, we derive
asymptotic results on the number of edges crossing a graph cut and use these
results to compute the graph expansion and conductance, and to
analyse the graph resilience to random failures.
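A minimal Metropolis sketch (ours; the degree-sequence weight $\exp(\beta \sum_i d_i)$ and all parameter names are illustrative assumptions, and the paper proceeds analytically rather than by sampling) of drawing from a degree-weighted exponential random graph:

```python
import random
from math import exp

def ergm_degree_mcmc(n, beta, steps, seed=0):
    """Sample a graph with P(G) proportional to exp(beta * sum_i d_i):
    toggle a uniformly random vertex pair and accept with the Metropolis
    ratio. Since sum_i d_i = 2E, toggling one edge changes it by +/- 2."""
    rng = random.Random(seed)
    edges = set()
    for _ in range(steps):
        u, v = rng.sample(range(n), 2)
        e = (min(u, v), max(u, v))
        delta = -2 if e in edges else 2
        if rng.random() < min(1.0, exp(beta * delta)):
            edges.symmetric_difference_update({e})   # toggle the edge
    return edges

g = ergm_degree_mcmc(n=20, beta=-0.5, steps=20000)
print(len(g), "edges")   # negative beta penalizes edges: a sparse graph
```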
Faithfulness in Internet Algorithms
Proving or disproving faithfulness (a property describing robustness to rational manipulation in action as well as information revelation) is an appealing goal when reasoning about distributed systems containing rational participants. Recent work formalizes the notion of faithfulness and its foundational properties, and presents a general proof technique in the course of proving the ex post Nash faithfulness of a theoretical routing problem [11].
In this paper, we use a less formal approach and take some first steps in faithfulness analysis for existing algorithms running on the Internet. To this end, we consider the expected faithfulness of BitTorrent, a popular file download system, and show how manual backtracing (similar to the ideas behind program slicing [22]) can be used to find rational manipulation problems. Although this primitive technique has serious drawbacks, it can be useful in disproving faithfulness.
Building provably faithful Internet protocols and their corresponding specifications can be quite difficult depending on the system knowledge assumptions and problem complexity. We present some of the open problems that are associated with these challenges.