Compression in a Distributed Setting
Motivated by an attempt to understand the formation and development of (human) language, we introduce a "distributed compression" problem. In our problem, a sequence of pairs of players from a set of K players is chosen and tasked with communicating messages drawn from an unknown distribution Q.
Arguably languages are created and evolve to compress frequently occurring messages, and we focus on this aspect.
The only knowledge that players have about the distribution Q is from previously drawn samples, but these samples differ from player to player.
The only common knowledge between the players is restricted to a common prior distribution P and some constant number
of bits of information (such as a learning algorithm).
Letting T_epsilon denote the number of iterations it would take for a typical player
to obtain an epsilon-approximation to Q in total variation distance, we ask
whether T_epsilon iterations suffice to compress the messages down roughly to their
entropy and give a partial positive answer.
We show that a natural uniform algorithm can compress the communication down to an average cost per
message of O(H(Q) + log D(P || Q)) in tilde{O}(T_epsilon) iterations
while allowing for O(epsilon)-error,
where D(. || .) denotes the KL-divergence between distributions.
For large divergences
this compares favorably with the static algorithm that ignores all samples and
compresses down to H(Q) + D(P || Q) bits, while not requiring the T_epsilon * K iterations it would take the players to develop optimal but separate compressions for
each pair of players.
Along the way we introduce a "data-structural" view of the task of
communicating with a natural language and show that our natural algorithm can also be
implemented by an efficient data structure, whose storage is comparable to the storage requirements of Q and whose query complexity is comparable to the lengths of the messages to be
compressed.
Our results give a plausible mathematical analogy to the mechanisms by which
human languages are created and evolve, and in particular highlight the
possibility of coordination towards a joint task (agreeing on a language)
while engaging in distributed learning.
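The static baseline cost quoted above is the standard cross-entropy identity: coding messages drawn from Q with an ideal code matched to the prior P costs E_{x~Q}[-log2 P(x)] = H(Q) + KL(Q || P) bits per message on average (the abstract's D(P || Q) notation presumably denotes this divergence). A minimal numerical check, with hypothetical three-symbol distributions:

```python
import numpy as np

# Illustrative distributions (not from the paper): Q is the true message
# distribution, P the shared prior the players agree on.
Q = np.array([0.7, 0.2, 0.1])
P = np.array([0.3, 0.3, 0.4])

entropy_Q = -np.sum(Q * np.log2(Q))        # H(Q): optimal cost knowing Q
kl_Q_P = np.sum(Q * np.log2(Q / P))        # divergence penalty for using P
cross_entropy = -np.sum(Q * np.log2(P))    # actual cost of P's code on Q's messages

# The static-algorithm cost decomposes exactly as entropy plus divergence.
assert np.isclose(cross_entropy, entropy_Q + kl_Q_P)
print(f"H(Q) = {entropy_Q:.3f} bits, divergence penalty = {kl_Q_P:.3f} bits")
```

The uniform algorithm in the abstract shrinks the additive divergence penalty to a logarithmic one once enough samples accumulate.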
A Bit of Secrecy for Gaussian Source Compression
In this paper, the compression of an independent and identically distributed
Gaussian source sequence is studied in an insecure network. Within a
game-theoretic setting for a three-party noiseless communication network (sender
Alice, legitimate receiver Bob, and eavesdropper Eve), the problem of how to
efficiently compress a Gaussian source with limited secret key in order to
guarantee that Bob can reconstruct with high fidelity while preventing Eve from
estimating an accurate reconstruction is investigated. It is assumed that Alice
and Bob share a secret key with limited rate. Three scenarios are studied, in
which the eavesdropper ranges from weak to strong in terms of the causal side
information she has. It is shown that one bit of secret key per source symbol
is enough to achieve perfect secrecy performance in the Gaussian squared error
setting, and the information-theoretic region is not optimized by jointly
Gaussian random variables.
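A toy sketch of why one shared key bit per symbol can suffice against a squared-error eavesdropper (our illustration, not the paper's construction): Alice multiplies each finely quantized sample by a shared random sign. Bob undoes the sign flip, while Eve observes ±X with equal probability, so by symmetry her best estimate is 0 and her distortion equals the source variance:

```python
import numpy as np

# Toy illustration (assumption, not the paper's scheme): a random sign
# flip, keyed by one shared secret bit per symbol, hides a Gaussian
# sample from a squared-error eavesdropper.
rng = np.random.default_rng(0)
n = 200_000
x = rng.standard_normal(n)           # i.i.d. Gaussian source, variance 1

step = 0.01
x_hat = np.round(x / step) * step    # Alice's fine uniform quantization

key = rng.choice([-1.0, 1.0], n)     # one secret key bit per symbol
c = key * x_hat                      # public message on the noiseless link

bob = key * c                        # Bob knows the key: recovers x_hat
# Eve sees c = +/-x_hat with equal probability, so E[x | c] = 0:
eve = np.zeros(n)                    # Eve's MMSE estimate under symmetry

bob_mse = np.mean((x - bob) ** 2)    # ~ step**2 / 12, tiny
eve_mse = np.mean((x - eve) ** 2)    # ~ Var(x) = 1: no better than guessing
print(f"Bob MSE = {bob_mse:.2e}, Eve MSE = {eve_mse:.3f}")
```

Bob's distortion is just the quantization error, while Eve does no better than guessing the prior mean.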
Centralized and distributed semi-parametric compression of piecewise smooth functions
This thesis introduces novel wavelet-based semi-parametric centralized and distributed
compression methods for a class of piecewise smooth functions. Our proposed compression schemes are based on a non-conventional transform coding structure with simple
independent encoders and a complex joint decoder.
Current centralized state-of-the-art compression schemes are based on the conventional structure, where the encoder is relatively complex and nonlinear. In addition, the
setting usually allows the encoder to observe the entire source. Recently, there has been
an increasing need for compression schemes where the encoder is lower in complexity
and, instead, the decoder has to handle more computationally intensive tasks. Furthermore, the setup may involve multiple encoders, where each one can only partially
observe the source. Such a scenario is often referred to as distributed source coding.
In the first part, we focus on the dual of the centralized compression setting, where
the encoder is linear and the decoder is nonlinear. Our analysis is centered on a
class of 1-D piecewise smooth functions. We show that, by incorporating parametric
estimation into the decoding procedure, it is possible to achieve the same distortion-
rate performance as that of a conventional wavelet-based compression scheme. We also
present a new constructive approach to parametric estimation based on the sampling
results of signals with finite rate of innovation.
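The linear-encoder / nonlinear-decoder idea can be sketched on a toy piecewise-constant signal (our simplification, not the thesis's wavelet scheme, and with the two levels assumed known): the encoder only takes block averages, a purely linear measurement, while the decoder exploits the one-step model and fits the discontinuity location parametrically:

```python
import numpy as np

# Minimal sketch (hypothetical parameters): linear encoder, parametric
# nonlinear decoder for a one-step piecewise-constant signal.
N, B = 256, 16                        # signal length, number of blocks
true_k, lo, hi = 137, 0.0, 1.0        # step location and (known) levels
x = np.where(np.arange(N) < true_k, lo, hi)

# Linear encoder: B block means (a simple low-pass, linear measurement).
y = x.reshape(B, N // B).mean(axis=1)

# Nonlinear decoder: search over candidate step locations k and pick the
# candidate whose block means best match the received measurements.
def measure(k):
    cand = np.where(np.arange(N) < k, lo, hi)
    return cand.reshape(B, N // B).mean(axis=1)

errors = [np.sum((measure(k) - y) ** 2) for k in range(N + 1)]
k_est = int(np.argmin(errors))
print(f"true step at {true_k}, decoder estimate {k_est}")
```

Even though the measurements are coarse and linear, the model-based decoder recovers the step location exactly, which is the intuition behind matching the distortion-rate performance of a complex nonlinear encoder.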
The second part of the thesis focuses on the distributed compression scenario, where
each independent encoder partially observes the 1-D piecewise smooth function. We
propose a new wavelet-based distributed compression scheme that uses parametric estimation to perform joint decoding. Our distortion-rate analysis shows that it is possible
for the proposed scheme to achieve the same compression performance as that of a
joint encoding scheme.
Lastly, we apply the proposed theoretical framework in the context of distributed
image and video compression. We start by considering a simplified model of the video
signal and show that we can achieve distortion-rate performance close to that of a joint
encoding scheme. We then present practical compression schemes for real world signals.
Our simulations confirm the improvement in performance over classical schemes, both
in terms of PSNR and visual quality.
Bidirectional compression in heterogeneous settings for distributed or federated learning with partial participation: tight convergence guarantees
We introduce a framework - Artemis - to tackle the problem of learning in a
distributed or federated setting with communication constraints and partial
device participation. Several workers (randomly sampled) perform the optimization process
using a central server to aggregate their computations. To alleviate the
communication cost, Artemis compresses the information sent in both
directions (from the workers to the server and conversely), combined with a
memory mechanism. It improves on existing algorithms that only consider unidirectional
compression (to the server), or use very strong assumptions on the compression
operator, and often do not take into account partial device participation. We
provide fast rates of convergence (linear up to a threshold) under weak
assumptions on the stochastic gradients (noise variance bounded only at the
optimal point) in the non-i.i.d. setting, highlight the impact of memory for
unidirectional and bidirectional compression, and analyze Polyak-Ruppert averaging.
We use convergence in distribution to obtain a lower bound on the asymptotic
variance that highlights the practical limits of compression, and we provide
experimental results to demonstrate the validity of our analysis.
Comment: 56 pages, 4 theorems, 1 algorithm, source code on GitHub
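A minimal sketch of the bidirectional-compression-with-memory idea (our simplification, not Artemis verbatim; all problem sizes, step sizes, and the random-sparsification compressor are illustrative choices): heterogeneous workers uplink a compressed difference between their gradient and a local memory, and the server downlinks a compressed model update, with only a random subset of workers participating each round:

```python
import numpy as np

# Sketch (hypothetical setup): distributed SGD with compression in both
# directions, per-worker memories h_i, and partial participation.
rng = np.random.default_rng(1)
d, n_workers, steps, lr, alpha = 20, 5, 1000, 0.002, 0.25

# Heterogeneous (non-i.i.d.) local least-squares objectives around a
# planted model: f_i(w) = 0.5 * ||A_i w - b_i||^2.
w_true = rng.standard_normal(d)
A = [rng.standard_normal((30, d)) for _ in range(n_workers)]
b = [Ai @ w_true + 0.1 * rng.standard_normal(30) for Ai in A]

def rand_sparsify(v, keep=0.25):
    """Unbiased random sparsification: keep each coordinate w.p. `keep`."""
    mask = rng.random(v.shape) < keep
    return np.where(mask, v / keep, 0.0)

w = np.zeros(d)
h = [np.zeros(d) for _ in range(n_workers)]      # uplink memories
for _ in range(steps):
    active = rng.choice(n_workers, size=3, replace=False)  # partial participation
    agg = np.zeros(d)
    for i in active:
        g = A[i].T @ (A[i] @ w - b[i])           # local gradient
        delta = rand_sparsify(g - h[i])          # compressed uplink difference
        agg += h[i] + delta                      # server's gradient estimate
        h[i] += alpha * delta                    # memory tracks the gradient
    update = -lr * agg / len(active)
    w = w + rand_sparsify(update)                # compressed downlink update

# Compare with the global least-squares optimum.
A_all, b_all = np.vstack(A), np.concatenate(b)
w_star = np.linalg.lstsq(A_all, b_all, rcond=None)[0]
print(f"distance to optimum: {np.linalg.norm(w - w_star):.3f}")
```

The memory term is what lets the compressed differences shrink as the iterates settle, which is the mechanism behind the linear-up-to-a-threshold rates in the non-i.i.d. setting.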