Learning-Based Optimization of Cache Content in a Small Cell Base Station
Optimal cache content placement in a wireless small cell base station (sBS)
with limited backhaul capacity is studied. The sBS has a large cache memory and
provides content-level selective offloading by delivering high data rate
contents to users in its coverage area. The goal of the sBS content controller
(CC) is to store the most popular contents in the sBS cache memory such that
the maximum amount of data can be fetched directly from the sBS, not relying on
the limited backhaul resources during peak traffic periods. If the popularity
profile is known in advance, the problem reduces to a knapsack problem.
However, it is assumed in this work that the popularity profile of the files
is not known by the CC, which can only observe the instantaneous demand for
the cached content. Hence, the cache content placement is optimized based on
the demand history. By refreshing the cache content at regular time intervals,
the CC tries to learn the popularity profile, while exploiting the limited
cache capacity in the best way possible. Three algorithms are studied for this
cache content placement problem, leading to different exploitation-exploration
trade-offs. We provide extensive numerical simulations in order to study the
time-evolution of these algorithms, and the impact of the system parameters,
such as the number of files, the number of users, the cache size, and the
skewness of the popularity profile, on the performance. It is shown that the
proposed algorithms quickly learn the popularity profile for a wide range of
system parameters.
Comment: Accepted to IEEE ICC 2014, Sydney, Australia. Minor typos corrected. Algorithm MCUCB corrected.
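The learning loop described above can be illustrated with a small sketch. This is a hypothetical toy model, not the paper's MCUCB algorithm: the CC observes demand only for cached files and refreshes the cache each period with the files of highest upper-confidence (UCB) index, and `true_popularity` (the per-period request probability of each file) is an assumed input used only to simulate demand.

```python
import math
import random

def ucb_cache_placement(num_files, cache_size, true_popularity, periods, seed=0):
    """Toy sketch of learning-based cache refreshing (assumed model, not
    the paper's exact algorithm). Demand is observed only for cached files;
    each period the cache is refilled with the files of highest UCB index."""
    rng = random.Random(seed)
    counts = [0] * num_files     # periods each file has spent in the cache
    rewards = [0.0] * num_files  # total demand observed while cached
    cache = list(range(cache_size))
    hits, slots = 0, 0
    for t in range(1, periods + 1):
        for f in cache:  # observe instantaneous demand for cached files only
            demand = 1 if rng.random() < true_popularity[f] else 0
            counts[f] += 1
            rewards[f] += demand
            hits += demand
        slots += cache_size

        def index(f):  # empirical mean + exploration bonus; unseen files first
            if counts[f] == 0:
                return float("inf")
            return rewards[f] / counts[f] + math.sqrt(2 * math.log(t) / counts[f])

        cache = sorted(range(num_files), key=index, reverse=True)[:cache_size]
    return cache, hits / slots
```

With a skewed popularity profile, the exploration bonus forces every file to be cached a few times early on, after which the cache settles on the most popular files, mirroring the exploitation-exploration trade-off the paper studies.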
Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets
This paper introduces new algorithms and data structures for quick counting
for machine learning datasets. We focus on the counting task of constructing
contingency tables, but our approach is also applicable to counting the number
of records in a dataset that match conjunctive queries. Subject to certain
assumptions, the costs of these operations can be shown to be independent of
the number of records in the dataset and loglinear in the number of non-zero
entries in the contingency table. We provide a very sparse data structure, the
ADtree, to minimize memory use. We provide analytical worst-case bounds for
this structure for several models of data distribution. We empirically
demonstrate that tractably-sized data structures can be produced for large
real-world datasets by (a) using a sparse tree structure that never allocates
memory for counts of zero, (b) never allocating memory for counts that can be
deduced from other counts, and (c) not bothering to expand the tree fully near
its leaves. We show how the ADtree can be used to accelerate Bayes net
structure finding algorithms, rule learning algorithms, and feature selection
algorithms, and we provide a number of empirical results comparing ADtree
methods against traditional direct counting approaches. We also discuss the
possible uses of ADtrees in other machine learning methods, and discuss the
merits of ADtrees in comparison with alternative representations such as
kd-trees, R-trees and Frequent Sets.
Comment: See http://www.jair.org/ for any accompanying file
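The core sparseness idea, storing counts only for value combinations that actually occur (property (a) above), can be sketched in a few lines. This is a minimal flat-dictionary illustration, not the full ADtree with its deduced counts and leaf-list cutoffs; the record and attribute names are made up for the example.

```python
from collections import Counter

def contingency_table(records, attrs):
    """Sparse contingency table: count only the attribute-value combinations
    that occur in the data, so memory is proportional to non-zero entries."""
    table = Counter()
    for rec in records:
        table[tuple(rec[a] for a in attrs)] += 1
    return table

def count_conjunctive(table, attrs, query):
    """Count records matching a conjunctive query, e.g. {'color': 'red'},
    by summing the matching non-zero cells of the table."""
    idx = {a: i for i, a in enumerate(attrs)}
    return sum(c for key, c in table.items()
               if all(key[idx[a]] == v for a, v in query.items()))
```

The ADtree goes further than this sketch by sharing counts across many contingency tables and by never materialising counts that can be deduced from others, which is what yields the worst-case bounds discussed in the paper.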
A Low-Complexity Approach to Distributed Cooperative Caching with Geographic Constraints
We consider caching in cellular networks in which each base station is
equipped with a cache that can store a limited number of files. The popularity
of the files is known and the goal is to place files in the caches such that
the probability that a user at an arbitrary location in the plane will find the
file that she requires in one of the covering caches is maximized.
We develop distributed asynchronous algorithms for deciding which contents to
store in which cache. Such cooperative algorithms require communication only
between caches with overlapping coverage areas and can operate in an asynchronous
manner. The development of the algorithms is principally based on an
observation that the problem can be viewed as a potential game. Our basic
algorithm is derived from the best response dynamics. We demonstrate that the
complexity of each best response step is independent of the number of files,
linear in the cache capacity and linear in the maximum number of base stations
that cover a certain area. Then, we show that the overall algorithm complexity
for a discrete cache placement is polynomial in both network size and catalog
size. In practical examples, the algorithm converges in just a few iterations.
Also, in most cases of interest, the basic algorithm finds the best Nash
equilibrium corresponding to the global optimum. We provide two extensions of
our basic algorithm based on stochastic and deterministic simulated annealing
which find the global optimum.
Finally, we demonstrate the hit probability evolution on real and synthetic
networks numerically and show that our distributed caching algorithm performs
significantly better than storing the most popular content, a probabilistic
content placement policy, and Multi-LRU caching policies.
Comment: 24 pages, 9 figures, presented at SIGMETRICS'1
Exact Analysis of TTL Cache Networks: The Case of Caching Policies driven by Stopping Times
TTL caching models have recently regained significant research interest,
largely due to their ability to fit popular caching policies such as LRU. This
paper advances the state-of-the-art analysis of TTL-based cache networks by
developing two exact methods with orthogonal generality and computational
complexity. The first method generalizes existing results for line networks
under renewal requests to the broad class of caching policies whereby evictions
are driven by stopping times. The obtained results are further generalized,
using the second method, to feedforward networks with Markov arrival processes
(MAP) requests. MAPs are particularly suitable for non-line networks because
they are closed not only under superposition and splitting, as known, but also
under input-output caching operations as proven herein for phase-type TTL
distributions. The crucial benefit of the two closure properties is that they
jointly enable the first exact analysis of feedforward networks of TTL caches
in great generality.
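For a single TTL cache under renewal requests, the basic quantity behind such analyses is easy to state: if the timer is reset on every request (the TTL analogue of LRU), a request is a hit exactly when it arrives within the TTL of the previous one, so the hit probability is F(T), the inter-request CDF at the TTL. A small simulation sketch under assumed Poisson requests, where F(T) = 1 - exp(-rate*T):

```python
import random

def ttl_hit_probability(rate, ttl, num_requests=100_000, seed=1):
    """Simulate one item in a reset-on-request TTL cache under Poisson
    requests (assumed toy model). A request hits iff it arrives within
    `ttl` of the previous request, so the hit probability approaches
    1 - exp(-rate * ttl)."""
    rng = random.Random(seed)
    hits = 0
    t_last = -float("inf")  # first request is always a miss
    t = 0.0
    for _ in range(num_requests):
        t += rng.expovariate(rate)  # exponential inter-request time
        if t - t_last <= ttl:       # content still cached
            hits += 1
        t_last = t
    return hits / num_requests
```

The exact methods in the paper replace this single-cache simulation with closed-form analysis, and extend it to stopping-time-driven evictions and to feedforward networks of caches fed by MAP requests.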