Parallel Maximum Clique Algorithms with Applications to Network Analysis and Storage
We propose a fast, parallel maximum clique algorithm for large sparse graphs
that is designed to exploit characteristics of social and information networks.
The method exhibits a roughly linear runtime scaling over real-world networks
ranging from 1000 to 100 million nodes. In a test on a social network with 1.8
billion edges, the algorithm finds the largest clique in about 20 minutes. Our
method employs a branch and bound strategy with novel and aggressive pruning
techniques. For instance, we use the core number of a vertex in combination
with a good heuristic clique finder to efficiently remove the vast majority of
the search space. In addition, we parallelize the exploration of the search
tree. During the search, processes immediately communicate changes to upper and
lower bounds on the size of the maximum clique, which occasionally results in a
super-linear speedup because vertices with large search spaces can be pruned by
other processes. We apply the algorithm to two problems: to compute temporal
strong components and to compress graphs.
Comment: 11 pages
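The pruning pipeline described above — core numbers combined with a heuristic clique that supplies a lower bound, inside branch and bound — can be sketched in a minimal single-threaded form. The key fact is that a vertex in a clique of size s has core number at least s − 1, so any vertex with core(v) + 1 ≤ lower bound can be discarded. This is an illustrative reconstruction under our own function names, not the paper's parallel implementation (the shared-bound communication between processes is omitted):

```python
def core_numbers(adj):
    """Core number of each vertex via min-degree peeling (k-core decomposition)."""
    deg = {v: len(adj[v]) for v in adj}
    remaining, core, k = set(adj), {}, 0
    while remaining:
        v = min(remaining, key=deg.__getitem__)
        k = max(k, deg[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return core

def greedy_clique(adj, core):
    """Heuristic clique: grow greedily, always adding the highest-core candidate."""
    v0 = max(adj, key=core.__getitem__)
    clique, cand = [v0], set(adj[v0])
    while cand:
        v = max(cand, key=core.__getitem__)
        clique.append(v)
        cand &= adj[v]
    return clique

def max_clique(adj):
    """Exact maximum clique via branch and bound with core-number pruning.
    `adj` maps each vertex to an iterable of its neighbours (no self-loops)."""
    adj = {v: set(ns) for v, ns in adj.items()}
    core = core_numbers(adj)
    best = greedy_clique(adj, core)               # heuristic lower bound

    def expand(clique, cand):
        nonlocal best
        if len(clique) > len(best):
            best = list(clique)
        for v in sorted(cand, key=core.__getitem__, reverse=True):
            if len(clique) + len(cand) <= len(best):
                return                            # bound: cannot beat the incumbent
            if core[v] + 1 <= len(best):
                cand.discard(v)                   # core-number pruning
                continue
            cand.discard(v)
            expand(clique + [v], cand & adj[v])

    # Any clique beating the bound uses only vertices with core(v) + 1 > bound.
    expand([], {v for v in adj if core[v] + 1 > len(best)})
    return best

demo = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0}}
print(sorted(max_clique(demo)))  # → [0, 1, 2, 3]
```

On the demo graph the heuristic already finds the optimum clique {0, 1, 2, 3}, so the core-number filter empties the search space before branching begins — a small-scale analogue of the "vast majority of the search space" removal described in the abstract.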
Resilient Source Coding
This paper provides a source coding theorem for multi-dimensional information
signals when, at a given instant, the distribution associated with one
arbitrary component of the signal to be compressed is not known and side
information is available at the destination. This new framework appears to be
both of information-theoretic and game-theoretic interest: it provides a new
type of constraint for compressing an information source, and it is useful for
designing certain types of mediators in games and for characterizing utility
regions in games with signals. Regarding the latter aspect, we apply the
derived source coding theorem to the prisoner's dilemma and the battle of the
sexes.
On Coding for Cache-Aided Delivery of Dynamic Correlated Content
Cache-aided coded multicast leverages side information at wireless edge
caches to efficiently serve multiple unicast demands via common multicast
transmissions, leading to load reductions that are proportional to the
aggregate cache size. However, the increasingly dynamic, unpredictable, and
personalized nature of the content that users consume challenges the efficiency
of existing caching-based solutions in which only exact content reuse is
explored. This paper generalizes the cache-aided coded multicast problem to
specifically account for the correlation among content files, such as that
between updated versions of dynamic data. It is shown that (i)
caching content pieces based on their correlation with the rest of the library,
and (ii) jointly compressing requested files using cached information as
references during delivery, can provide load reductions that go beyond those
achieved with existing schemes. This is accomplished via the design of a class
of correlation-aware achievable schemes, shown to significantly outperform
state-of-the-art correlation-unaware solutions. Our results show that as we
move towards real-time and/or personalized media-dominated services, where
exact cache hits are almost non-existent but updates can exhibit high levels
of correlation, network-cached information can still be useful as a reference
for network compression.
Comment: To appear in IEEE Journal on Selected Areas in Communications
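The delivery-phase idea — using cached, correlated content as a compression reference — can be illustrated with zlib's preset-dictionary support. This is a hedged stand-in for the paper's correlation-aware coded schemes, not their actual construction: here the receiver's stale cached copy serves as the dictionary, so an updated file that shares most of its bytes with the cache compresses to a small delta even when it is incompressible on its own:

```python
import random
import zlib

def compress_with_reference(update: bytes, cached: bytes) -> bytes:
    """Compress `update` using the receiver's cached copy as a preset
    dictionary, so substrings shared with the cache cost almost nothing."""
    c = zlib.compressobj(level=9, zdict=cached)
    return c.compress(update) + c.flush()

def decompress_with_reference(payload: bytes, cached: bytes) -> bytes:
    """Invert compress_with_reference; the receiver must hold the same cache."""
    d = zlib.decompressobj(zdict=cached)
    return d.decompress(payload) + d.flush()

# A cached file and a correlated update; random bytes make the update
# incompressible on its own, so any savings come from the cached reference.
random.seed(7)
cached = bytes(random.randrange(256) for _ in range(4096))
update = cached[:2000] + b"NEW SEGMENT" + cached[2000:]

delta = compress_with_reference(update, cached)  # tiny: mostly back-references
plain = zlib.compress(update, 9)                 # ~ full file size
```

The size gap between `delta` and `plain` is the toy analogue of the load reduction the paper obtains from exploiting inter-file correlation rather than exact cache hits.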
GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra
We propose GraphMineSuite (GMS): the first benchmarking suite for graph
mining that facilitates evaluating and constructing high-performance graph
mining algorithms. First, GMS comes with a benchmark specification based on
extensive literature review, prescribing representative problems, algorithms,
and datasets. Second, GMS offers a carefully designed software platform for
seamless testing of different fine-grained elements of graph mining algorithms,
such as graph representations or algorithm subroutines. The platform includes
parallel implementations of more than 40 baselines, and it
facilitates developing complex and fast mining algorithms. High modularity is
possible by harnessing set algebra operations such as set intersection and
difference, which enables breaking complex graph mining algorithms into simple
building blocks that can be experimented with separately. GMS is supported by
a broad concurrency analysis, which makes its performance insights portable,
and by a novel performance metric for assessing the throughput of graph mining
algorithms,
enabling more insightful evaluation. As use cases, we harness GMS to rapidly
redesign and accelerate state-of-the-art baselines of core graph mining
problems: degeneracy reordering (by up to >2x), maximal clique listing (by up
to >9x), k-clique listing (by 1.1x), and subgraph isomorphism (by up to 2.5x),
while also obtaining better theoretical performance bounds.
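The set-algebra decomposition can be illustrated in plain Python (GMS itself ships parallel, high-performance implementations; these toy routines only show how mining algorithms factor into intersections over adjacency sets, assuming an adjacency dict of sets with comparable vertex labels):

```python
def triangle_count(adj):
    """Count triangles: for each edge (u, v) with u < v, the shared
    neighbours N(u) ∩ N(v) each close a triangle; every triangle is
    counted once per edge, i.e. three times in total."""
    hits = sum(len(adj[u] & adj[v]) for u in adj for v in adj[u] if u < v)
    return hits // 3

def k_clique_list(adj, k):
    """List k-cliques by repeated candidate-set intersection, restricting
    candidates to higher-labelled vertices so each clique is emitted once."""
    out = []
    def extend(clique, cand):
        if len(clique) == k:
            out.append(tuple(clique))
            return
        for v in sorted(cand):
            extend(clique + [v], {u for u in cand & adj[v] if u > v})
    extend([], set(adj))
    return out

# Complete graph on 4 vertices: 4 triangles, hence 4 distinct 3-cliques.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
```

Both routines are built from the same primitive — set intersection on adjacency sets — which is exactly the kind of interchangeable building block the suite lets one swap, represent, and benchmark independently.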
Smoothed Complexity Theory
Smoothed analysis is a new way of analyzing algorithms introduced by Spielman
and Teng (J. ACM, 2004). Classical methods like worst-case or average-case
analysis have accompanying complexity classes, like P and AvgP, respectively.
While worst-case and average-case analysis give us a means to talk about the
running time of a particular algorithm, complexity classes allow us to talk
about the inherent difficulty of problems.
Smoothed analysis is a hybrid of worst-case and average-case analysis and
compensates for some of their drawbacks. Despite its success in the analysis of
single algorithms and problems, there is no embedding of smoothed analysis into
computational complexity theory, which is necessary to classify problems
according to their intrinsic difficulty.
We propose a framework for smoothed complexity theory, define the relevant
classes, and prove first hardness results (for bounded halting and tiling) and
tractability results (for binary optimization problems, graph coloring, and
satisfiability). Furthermore, we discuss extensions and shortcomings of our
model and relate it to semi-random models.
Comment: to be presented at MFCS 201
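A toy experiment conveys the intuition behind smoothed analysis (it is not the paper's formal perturbation model): deterministic quicksort with a first-element pivot needs Θ(n²) comparisons on its worst-case, already-sorted input, but once the values receive small Gaussian perturbations the adversarial structure is destroyed and the expected cost falls to O(n log n) — the gap between worst-case and smoothed behaviour that the theory formalizes:

```python
import random

def quicksort_comparisons(xs):
    """Comparison count of deterministic quicksort (first element as pivot)."""
    comps = 0
    def qs(a):
        nonlocal comps
        if len(a) <= 1:
            return a
        pivot, rest = a[0], a[1:]
        comps += len(rest)
        return (qs([x for x in rest if x < pivot]) + [pivot]
                + qs([x for x in rest if x >= pivot]))
    qs(xs)
    return comps

random.seed(0)
n, sigma = 300, 0.05
adversarial = [i / n for i in range(n)]                      # worst case: sorted
perturbed = [x + random.gauss(0, sigma) for x in adversarial]

worst = quicksort_comparisons(adversarial)   # exactly n*(n-1)//2 = 44850
smoothed = quicksort_comparisons(perturbed)  # roughly n log n on this draw
```

Here the noise scale `sigma` plays the role of the perturbation parameter: as it shrinks toward the gap between adjacent inputs, the perturbed instance approaches the worst case, which is exactly the trade-off smoothed complexity classes are built around.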