Optimal Principal Component Analysis in Distributed and Streaming Models
We study the Principal Component Analysis (PCA) problem in the distributed
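and streaming models of computation. Given a matrix $A \in \mathbb{R}^{m \times n}$, a
rank parameter $k < \mathrm{rank}(A)$, and an accuracy parameter $0 < \varepsilon < 1$, we
want to output an $m \times k$ orthonormal matrix $U$ for which $\|A - UU^{T}A\|_F^2 \le (1+\varepsilon)\,\|A - A_k\|_F^2$, where $A_k \in \mathbb{R}^{m \times n}$ is the best rank-$k$ approximation to $A$.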
This paper provides improved algorithms for distributed PCA and streaming
PCA. Comment: STOC2016 full version
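
As a point of reference for the approximation criterion in this abstract, the sketch below computes the exact (centralized) rank-k solution with a full SVD and checks the relative-error condition. This is not the paper's distributed or streaming algorithm. The function names, the random test matrix, and the values of `k` and `eps` are illustrative assumptions.

```python
import numpy as np

def best_rank_k_basis(A, k):
    """Return an m x k matrix with orthonormal columns: the top-k left singular vectors of A."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k]

def projection_cost(A, U):
    """Squared Frobenius error of projecting A onto the column span of U."""
    return np.linalg.norm(A - U @ (U.T @ A), "fro") ** 2

# Hypothetical test instance (not from the paper).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
k, eps = 5, 0.1

U = best_rank_k_basis(A, k)
singular_values = np.linalg.svd(A, compute_uv=False)
# By the Eckart-Young theorem, the best rank-k error is the sum of the
# squared singular values beyond the k-th.
optimal_cost = np.sum(singular_values[k:] ** 2)
satisfies_guarantee = projection_cost(A, U) <= (1 + eps) * optimal_cost
```

Because the exact SVD basis attains the optimal cost, it satisfies the condition for any positive `eps`. An approximate algorithm is allowed to return any orthonormal basis whose cost is within the `(1 + eps)` factor.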
Making recommendations bandwidth aware
This paper asks how much we can gain in terms of bandwidth and user
satisfaction, if recommender systems became bandwidth aware and took into
account not only the user preferences, but also the fact that they may need to
serve these users under bandwidth constraints, as is the case over wireless
networks. We formulate this as a new problem in the context of index coding: we
relax the index coding requirements to capture scenarios where each client has
preferences associated with messages. The client is satisfied to receive any
message she does not already have, with a satisfaction proportional to her
preference for that message. We consistently find, over a number of scenarios
we sample, that although the optimization problems are in general NP-hard,
significant bandwidth savings are possible even when restricted to polynomial
time algorithms.
Large-scale Join-Idle-Queue system with general service times
A parallel server system with $n$ identical servers is considered. The
service time distribution has a finite mean $1/\mu$, but otherwise is
arbitrary. Arriving customers are routed to one of the servers immediately
upon arrival. The Join-Idle-Queue routing algorithm is studied, under which an
arriving customer is sent to an idle server, if one is available, and to a
server chosen uniformly at random otherwise. We consider the asymptotic regime
where $n \to \infty$ and the customer input flow rate is $\lambda n$. Under the
condition $\lambda/\mu < 1/2$, we prove that, as $n \to \infty$, the sequence of
(appropriately scaled) stationary distributions concentrates at the natural
equilibrium point, with the fraction of occupied servers being constant and equal to
$\lambda/\mu$. In particular, this implies that the steady-state probability of
an arriving customer waiting for service vanishes. Comment: Revision. 11 pages
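
The routing rule described in this abstract can be sketched as a single function. The abstract does not say which idle server is picked when several are idle, so choosing uniformly among them is an assumption here. The names `jiq_route` and `busy` are illustrative.

```python
import random

def jiq_route(busy, rng):
    """Pick a server index for an arriving customer under Join-Idle-Queue.

    busy[i] is True when server i is occupied. If any server is idle, one of
    the idle servers is returned (uniformly at random, an assumption);
    otherwise a server is chosen uniformly at random among all servers.
    """
    idle = [i for i, occupied in enumerate(busy) if not occupied]
    if idle:
        return rng.choice(idle)
    return rng.randrange(len(busy))

rng = random.Random(0)
to_idle = jiq_route([True, False, True], rng)    # only server 1 is idle
fallback = jiq_route([True, True, True], rng)    # no idle server: random pick
```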
Online Distributed Sensor Selection
A key problem in sensor networks is to decide which sensors to query when, in
order to obtain the most useful information (e.g., for performing accurate
prediction), subject to constraints (e.g., on power and bandwidth). In many
applications the utility function is not known a priori, must be learned from
data, and can even change over time. Furthermore, for large sensor networks,
solving a centralized optimization problem to select sensors is not feasible,
and thus we seek a fully distributed solution. In this paper, we present
Distributed Online Greedy (DOG), an efficient, distributed algorithm for
repeatedly selecting sensors online, only receiving feedback about the utility
of the selected sensors. We prove very strong theoretical no-regret guarantees
that apply whenever the (unknown) utility function satisfies a natural
diminishing returns property called submodularity. Our algorithm has extremely
low communication requirements, and scales well to large sensor deployments. We
extend DOG to allow observation-dependent sensor selection. We empirically
demonstrate the effectiveness of our algorithm on several real-world sensing
tasks.
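
To illustrate the diminishing-returns (submodularity) property this abstract relies on, here is a minimal sketch of the classical centralized, offline greedy selection rule on a coverage utility. It is not the DOG algorithm, which is distributed and online. The sensor names, covered regions, and budget are hypothetical.

```python
def coverage(selected, regions):
    """Utility of a sensor set: number of distinct locations it covers."""
    covered = set()
    for sensor in selected:
        covered |= regions[sensor]
    return len(covered)

def greedy_select(regions, budget):
    """Repeatedly add the sensor with the largest marginal gain in coverage."""
    selected = []
    for _ in range(budget):
        remaining = [s for s in regions if s not in selected]
        if not remaining:
            break
        base = coverage(selected, regions)
        best = max(remaining, key=lambda s: coverage(selected + [s], regions) - base)
        selected.append(best)
    return selected

# Hypothetical sensors and the locations each one observes.
regions = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
chosen = greedy_select(regions, 2)
```

Coverage is submodular: adding sensor "b" to the empty set gains two locations, but adding it after "c" gains only one, since location 4 is already covered.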
Redundancy Scheduling with Locally Stable Compatibility Graphs
Redundancy scheduling is a popular concept to improve performance in
parallel-server systems. In the baseline scenario any job can be handled
equally well by any server, and is replicated to a fixed number of servers
selected uniformly at random. Quite often, however, there may be heterogeneity
in job characteristics or server capabilities, and jobs can only be replicated
to specific servers because of affinity relations or compatibility constraints.
In order to capture such situations, we consider a scenario where jobs of
various types are replicated to different subsets of servers as prescribed by a
general compatibility graph. We exploit a product-form stationary distribution
and weak local stability conditions to establish a state space collapse in
heavy traffic. In this limiting regime, the parallel-server system with
graph-based redundancy scheduling operates as a multi-class single-server
system, achieving full resource pooling and exhibiting strong insensitivity to
the underlying compatibility constraints. Comment: 28 pages, 4 figures
Fundamental limits of failure identifiability by Boolean Network Tomography
Boolean network tomography is a powerful tool to infer the state (working/failed) of individual nodes from path-level measurements obtained by edge nodes. We consider the problem of optimizing the capability of identifying network failures through the design of monitoring schemes. Finding an optimal solution is NP-hard, and a large body of work has been devoted to heuristic approaches providing lower bounds. Unlike previous works, we provide upper bounds on the maximum number of identifiable nodes, given the number of monitoring paths and different constraints on the network topology, the routing scheme, and the maximum path length. The proposed upper bounds represent a fundamental limit on the identifiability of failures via Boolean network tomography. This analysis provides insights into how to design topologies and related monitoring schemes to achieve the maximum identifiability under various network settings. Through analysis and experiments we demonstrate the tightness of the bounds and the efficacy of the design insights for engineered as well as real networks.
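
As a small illustration of identifiability, the sketch below assumes the simplified setting in which at most one node fails at a time. Under that assumption, a node is identifiable exactly when the set of monitoring paths crossing it is non-empty and differs from that of every other node. The function name and the example paths are hypothetical, and this does not reproduce the bounds derived in the paper.

```python
from collections import Counter

def identifiable_nodes(paths):
    """Return the nodes whose path-incidence signature is unique.

    paths is a list of monitoring paths, each a list of node names.
    Assumes at most one simultaneous failure (a simplification).
    """
    nodes = set().union(*paths)
    # Signature of a node: indices of the paths that traverse it.
    signature = {
        v: frozenset(i for i, path in enumerate(paths) if v in path)
        for v in nodes
    }
    counts = Counter(signature.values())
    return {v for v, sig in signature.items() if counts[sig] == 1}

# Three hypothetical monitoring paths over four nodes.
paths = [["a", "b", "c"], ["b", "d"], ["c", "d"]]
unique = identifiable_nodes(paths)
```

In this toy setting a simple counting argument applies: with m paths there are only 2^m - 1 distinct non-empty signatures, so no more than that many nodes can be told apart.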
Age-Optimal Updates of Multiple Information Flows
In this paper, we study an age of information minimization problem, where
multiple flows of update packets are sent over multiple servers to their
destinations. Two online scheduling policies are proposed. When the packet
generation and arrival times are synchronized across the flows, the proposed
policies are shown to be (near) optimal for minimizing any time-dependent,
symmetric, and non-decreasing penalty function of the ages of the flows over
time in a stochastic ordering sense.
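
For readers unfamiliar with the age-of-information metric, the following sketch computes the age of a single flow at a given time, using the standard definition: current time minus the generation time of the freshest packet delivered so far. The function name, the initial generation time of 0, and the sample deliveries are illustrative assumptions. The scheduling policies proposed in the paper are not modeled.

```python
def age_at(t, deliveries, initial_generation=0.0):
    """Age of information at time t for one flow.

    deliveries is a list of (generation_time, delivery_time) pairs.
    Only packets delivered by time t count; the freshest one sets the age.
    """
    freshest = initial_generation
    for generated, delivered in deliveries:
        if delivered <= t and generated > freshest:
            freshest = generated
    return t - freshest

# Hypothetical flow: two packets generated at times 1.0 and 3.0.
deliveries = [(1.0, 2.0), (3.0, 3.5)]
ages = [age_at(t, deliveries) for t in (1.5, 2.5, 4.0)]
```

The age grows linearly between deliveries and drops when a fresher packet arrives, so it depends on both delivery delay and how recently the delivered packet was generated.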