Optimal Elephant Flow Detection
Monitoring the traffic volumes of elephant flows, including the total byte
count per flow, is a fundamental capability for online network measurements. We
present an asymptotically optimal algorithm for solving this problem in terms
of both space and time complexity. This improves on previous approaches, which
can only count the number of packets in constant time. We evaluate our work on
real packet traces, demonstrating a speedup of up to 2.5x compared to the best
alternative.
Comment: Accepted to IEEE INFOCOM 201
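To make the measurement task concrete, the following is a minimal sketch of byte-volume heavy-hitter tracking using a weighted variant of the classic Space-Saving scheme. It is not the algorithm from the paper above; the function name `heavy_flows`, the `capacity` parameter, and the `(flow_id, size)` packet format are assumptions made for illustration. The eviction step here scans all counters, so each update costs time linear in `capacity`, which is the kind of per-packet overhead a constant-time algorithm would avoid.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def heavy_flows(
    packets: Iterable[Tuple[Hashable, int]], capacity: int
) -> List[Tuple[Hashable, int]]:
    """Track approximate byte counts for at most `capacity` flows.

    Illustrative weighted Space-Saving: estimates never undercount a
    tracked flow, and the overcount is bounded by the evicted minimum.
    """
    counters: Dict[Hashable, int] = {}
    for flow_id, size in packets:
        if flow_id in counters:
            # Flow already tracked: add this packet's bytes.
            counters[flow_id] += size
        elif len(counters) < capacity:
            # Free slot available: start tracking the flow.
            counters[flow_id] = size
        else:
            # Table full: evict the smallest counter and let the new
            # flow inherit its value plus this packet's bytes.
            victim = min(counters, key=counters.get)
            floor = counters.pop(victim)
            counters[flow_id] = floor + size
    # Largest estimated byte volume first.
    return sorted(counters.items(), key=lambda kv: kv[1], reverse=True)
```

For example, calling `heavy_flows(trace, capacity=2)` on a short trace dominated by one flow keeps that flow in the table while the small flows displace each other.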
Compact Tensor Pooling for Visual Question Answering
Performing high level cognitive tasks requires the integration of feature
maps with drastically different structure. In Visual Question Answering (VQA)
image descriptors have spatial structures, while lexical inputs inherently
follow a temporal sequence. The recently proposed Multimodal Compact Bilinear
pooling (MCB) forms the outer products, via count-sketch approximation, of the
visual and textual representation at each spatial location. While this
procedure preserves spatial information locally, outer-products are taken
independently for each fiber of the activation tensor, and therefore do not
include spatial context. In this work, we introduce multi-dimensional sketch
(MD-sketch), a novel extension of count-sketch to tensors. Using this new
formulation, we propose Multimodal Compact Tensor Pooling (MCT) to fully
exploit the global spatial context during bilinear pooling operations.
In contrast to MCB, our approach preserves spatial context by directly
convolving the MD-sketch from the visual tensor features with the text vector
feature using higher-order FFT. Furthermore, we apply MCT incrementally at each
step of the question embedding and accumulate the multi-modal vectors with a
second LSTM layer before the final answer is chosen.
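The count-sketch approximation of an outer product mentioned above rests on a standard identity: the count sketch of an outer product, taken with combined hash functions, equals the circular convolution of the two individual count sketches. The sketch below checks that identity for ordinary vectors only; it does not implement MD-sketch or MCT. All function names are hypothetical, and the convolution is computed directly in quadratic time to stay within the standard library, whereas in practice it would be computed with an FFT.

```python
import random
from typing import List, Tuple


def make_hashes(n: int, d: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw a random bucket in [0, d) and a random sign for each of n coordinates."""
    buckets = [rng.randrange(d) for _ in range(n)]
    signs = [rng.choice((-1, 1)) for _ in range(n)]
    return buckets, signs


def count_sketch(x: List[float], buckets: List[int], signs: List[int], d: int) -> List[float]:
    """Project x into d buckets, adding each signed coordinate to its bucket."""
    out = [0.0] * d
    for value, b, s in zip(x, buckets, signs):
        out[b] += s * value
    return out


def circular_convolve(a: List[float], b: List[float]) -> List[float]:
    """Direct circular convolution of two equal-length vectors."""
    d = len(a)
    out = [0.0] * d
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % d] += ai * bj
    return out


def sketch_outer_product(x, y, hx, sx, hy, sy, d) -> List[float]:
    """Count sketch of the outer product of x and y, computed explicitly.

    Entry (i, j) goes to bucket (hx[i] + hy[j]) mod d with sign sx[i] * sy[j].
    """
    out = [0.0] * d
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[(hx[i] + hy[j]) % d] += sx[i] * sy[j] * xi * yj
    return out
```

Convolving `count_sketch(x, ...)` with `count_sketch(y, ...)` gives the same vector as `sketch_outer_product`, without ever forming the full outer product.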
Catching the head, tail, and everything in between: a streaming algorithm for the degree distribution
The degree distribution is one of the most fundamental graph properties of
interest for real-world graphs. It has been widely observed in numerous domains
that graphs typically have a tailed or scale-free degree distribution. While
the average degree is usually quite small, the variance is quite high and there
are vertices with degrees at all scales. We focus on the problem of
approximating the degree distribution of a large streaming graph, with small
storage. We design an algorithm headtail, whose main novelty is a new estimator
of infrequent degrees using truncated geometric random variables. We give a
mathematical analysis of headtail and show that it has excellent behavior in
practice. We can process streams with millions of edges with storage less than
1% and get extremely accurate approximations for all scales in the degree
distribution.
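As a point of reference for the quantity being approximated, here is a minimal exact baseline that computes the degree histogram of an edge stream, plus a cumulative tail count. It is not the headtail algorithm: it stores one counter per vertex, which is exactly the storage cost a small-space streaming estimator is meant to avoid. The function names are hypothetical, and the sketch assumes an undirected graph whose stream contains no repeated edges.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def degree_histogram(edges: Iterable[Tuple[Hashable, Hashable]]) -> Counter:
    """Return a map from degree d to the number of vertices with degree d."""
    degree: Counter = Counter()
    for u, v in edges:
        # Each undirected edge contributes one to both endpoints.
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def tail_counts(hist: Dict[int, int]) -> Dict[int, int]:
    """Return a map from degree d to the number of vertices with degree >= d."""
    result: Dict[int, int] = {}
    running = 0
    # Walk degrees from largest to smallest, accumulating vertex counts.
    for d in sorted(hist, reverse=True):
        running += hist[d]
        result[d] = running
    return result
```

The tail counts make the sparse high-degree end of the distribution visible, which a plain histogram over a skewed graph tends to hide.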
We also introduce a new notion of Relative Hausdorff distance between tailed
histograms. Existing notions of distances between distributions are not
suitable, since they ignore infrequent degrees in the tail. The Relative
Hausdorff distance measures deviations at all scales, and is a more suitable
distance for comparing degree distributions. By tracking this new measure, we
are able to give strong empirical evidence of the convergence of headtail
- …