Learning Spatiotemporal Features for Infrared Action Recognition with 3D Convolutional Neural Networks
Infrared (IR) imaging has the potential to enable more robust action
recognition systems compared to visible spectrum cameras due to lower
sensitivity to lighting conditions and appearance variability. While the action
recognition task on videos collected from visible spectrum imaging has received
much attention, action recognition in IR videos is significantly less explored.
Our objective is to exploit imaging data in this modality for the action
recognition task. In this work, we propose a novel two-stream 3D convolutional
neural network (CNN) architecture by introducing the discriminative code layer
and the corresponding discriminative code loss function. The proposed network
processes IR image sequences and the corresponding IR-based optical flow
fields. We pretrain the 3D CNN model on the visible-spectrum Sports-1M action
dataset and finetune it on the Infrared Action Recognition (InfAR) dataset. To
the best of our knowledge,
this is the first application of the 3D CNN to action recognition in the IR
domain. We conduct an elaborate analysis of different fusion schemes (weighted
average, single and double-layer neural nets) applied to different 3D CNN
outputs. Experimental results demonstrate that our approach can achieve
state-of-the-art average precision (AP) on the InfAR dataset: (1) the proposed
two-stream 3D CNN achieves the best reported AP of 77.5%, and (2) our 3D CNN
model applied to the optical flow fields achieves the best reported
single-stream AP of 75.42%.
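As a concrete illustration of the weighted-average fusion scheme mentioned
above, here is a minimal numpy sketch. The class count, logits, and the weight
alpha are made-up illustrative values, not taken from the paper:

```python
import numpy as np

def softmax(x):
    """Convert raw class scores (logits) to a probability distribution."""
    e = np.exp(x - x.max())
    return e / e.sum()

def weighted_average_fusion(ir_logits, flow_logits, alpha=0.5):
    """Late fusion: convex combination of the two streams' class posteriors."""
    return alpha * softmax(ir_logits) + (1 - alpha) * softmax(flow_logits)

# toy scores for 4 hypothetical action classes from each stream
ir_logits = np.array([2.0, 1.0, 0.5, 0.0])
flow_logits = np.array([0.5, 3.0, 0.2, 0.1])

fused = weighted_average_fusion(ir_logits, flow_logits, alpha=0.4)
pred = int(np.argmax(fused))  # class index chosen after fusion
```

The single- and double-layer neural-net fusion variants the paper compares
would replace the fixed convex combination with a small learned network over
the concatenated stream outputs.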
Universal and Robust Distributed Network Codes
Random linear network codes can be designed and implemented in a distributed
manner, with low computational complexity. However, these codes are classically
implemented over finite fields whose size depends on some global network
parameters (size of the network, the number of sinks) that may not be known
prior to code design. Also, if new nodes join, the entire network code may
have to be redesigned.
In this work, we present the first universal and robust distributed linear
network coding schemes. Our schemes are universal since they are independent of
all network parameters. They are robust since if nodes join or leave, the
remaining nodes do not need to change their coding operations and the receivers
can still decode. They are distributed since nodes need only have topological
information about the part of the network upstream of them, which can be
naturally streamed as part of the communication protocol.
We present both probabilistic and deterministic schemes that are all
asymptotically rate-optimal in the coding block-length, and have guarantees of
correctness. Our probabilistic designs are computationally efficient, with
order-optimal complexity. Our deterministic designs guarantee zero error
decoding, albeit via codes with high computational complexity in general. Our
coding schemes are based on network codes over "scalable fields". Instead of
choosing coding coefficients from one field at every node, each node uses
linear coding operations over an "effective field size" that depends on the
node's distance from the source node. The analysis of our schemes requires
technical tools that may be of independent interest. In particular, we
generalize the Schwartz-Zippel lemma by proving a non-uniform version, wherein
variables are chosen from sets of possibly different sizes. We also provide a
novel robust distributed algorithm to assign unique IDs to network nodes.
Comment: 12 pages, 7 figures, 1 table, under submission to INFOCOM 201
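The random-linear-coding idea this abstract builds on, where nodes draw
coefficients at random and a receiver decodes by inverting the accumulated
coefficient matrix, can be sketched over a single fixed prime field. The
paper's scalable effective field sizes are not modeled here; the field size
257, the seed, and the packet contents are arbitrary choices of this sketch:

```python
import random

P = 257  # a small prime field stands in for the finite field

def solve_mod_p(A, B, p):
    """Solve A X = B (mod p) for square invertible A by Gauss-Jordan."""
    n = len(A)
    A = [row[:] for row in A]
    B = [row[:] for row in B]
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] % p)
        A[col], A[piv] = A[piv], A[col]
        B[col], B[piv] = B[piv], B[col]
        inv = pow(A[col][col], p - 2, p)   # Fermat inverse of the pivot
        A[col] = [a * inv % p for a in A[col]]
        B[col] = [b * inv % p for b in B[col]]
        for r in range(n):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * c) % p for a, c in zip(A[r], A[col])]
                B[r] = [(b - f * c) % p for b, c in zip(B[r], B[col])]
    return B

def independent(basis, vec, p):
    """Reduce vec against an echelon basis; append and report if innovative."""
    v = vec[:]
    for b in basis:
        lead = next(i for i, x in enumerate(b) if x)
        if v[lead]:
            f = v[lead] * pow(b[lead], p - 2, p) % p
            v = [(x - f * y) % p for x, y in zip(v, b)]
    if any(v):
        basis.append(v)
        return True
    return False

k = 3
source = [[10, 20, 30], [4, 5, 6], [7, 8, 9]]  # k source packets of 3 symbols
rng = random.Random(42)
basis, coeffs, payloads = [], [], []
while len(coeffs) < k:
    c = [rng.randrange(P) for _ in range(k)]   # random coding coefficients
    if not independent(basis, c, P):
        continue                               # keep only innovative packets
    coeffs.append(c)
    payloads.append([sum(ci * src[j] for ci, src in zip(c, source)) % P
                     for j in range(len(source[0]))])

decoded = solve_mod_p(coeffs, payloads, P)     # recovers the source packets
```

The receiver needs no topology knowledge beyond the coefficient vectors
carried with the coded packets, which is what makes such designs distributed.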
Structured Random Linear Codes (SRLC): Bridging the Gap between Block and Convolutional Codes
Several types of AL-FEC (Application-Level FEC) codes for the Packet Erasure
Channel exist. Random Linear Codes (RLC), where redundancy packets consist of
random linear combinations of source packets over a certain finite field, are a
simple yet efficient coding technique, for instance massively used for Network
Coding applications. However, the price to pay is a high encoding and decoding
complexity, especially when working over large finite fields, which seriously
limits the number of packets in the encoding window. By contrast, structured
block
codes have been designed for situations where the set of source packets is
known in advance, for instance with file transfer applications. Here the
encoding and decoding complexity is controlled, even for huge block sizes,
thanks to the sparse nature of the code and advanced decoding techniques that
exploit this sparseness (e.g., Structured Gaussian Elimination). But their
design also prevents their use in convolutional use-cases featuring an encoding
window that slides over a continuous set of incoming packets.
In this work we try to bridge the gap between these two code classes,
bringing some structure to RLC codes in order to enlarge the use-cases where
they can be efficiently used: in convolutional mode (as any RLC code), but also
in block mode with either tiny, medium or large block sizes. We also
demonstrate how to design compact signaling for these codes (for
encoder/decoder synchronization), which is an essential practical aspect.
Comment: 7 pages, 12 figures
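To make the convolutional (sliding-window) mode concrete, here is a
deliberately degenerate sketch over GF(2): all coding coefficients are fixed
to 1, so each repair packet is a plain XOR of the packets in the current
window. Actual RLC/SRLC draws random coefficients from a larger field; the
window size and packet contents below are illustrative choices:

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def window_repair(window):
    """Repair packet for one window position: XOR of all source packets in
    the window (RLC degenerated to all-ones coefficients over GF(2))."""
    r = bytes(len(window[0]))
    for pkt in window:
        r = xor_bytes(r, pkt)
    return r

# five 4-byte source packets; a window of size 3 slides one packet at a time
source = [bytes([i] * 4) for i in range(1, 6)]
w = 3
repairs = [window_repair(source[i:i + w]) for i in range(len(source) - w + 1)]

# if exactly one packet of a window is lost, XOR-ing its repair with the
# surviving packets of that window recovers it
recovered = xor_bytes(xor_bytes(repairs[0], source[0]), source[1])
```

With random coefficients over a larger field, a window can tolerate several
losses, at the decoding cost the abstract describes.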
eCMT-SCTP: Improving Performance of Multipath SCTP with Erasure Coding Over Lossy Links
Performance of transport protocols on lossy links is a well-researched topic;
however, only a few proposals exploit the opportunities of erasure coding
within the multipath transport protocol context. In this paper, we investigate
performance improvements of multipath CMT-SCTP through the novel integration
of an on-the-fly erasure code within its congestion control and reliability
mechanisms. Our contributions include: the integration of the transport
protocol and erasure codes with regard to congestion control; a proposal for a
variable retransmission delay parameter (aRTX) adjustment; and a
simulation-based performance evaluation of CMT-SCTP with erasure coding. We
have implemented the explicit congestion notification (ECN) and erasure coding
schemes in NS-2 and demonstrated improvements both in application goodput and
in the reduction of spurious retransmissions. Our results show that we can
achieve 10% to 80% improvements in goodput under lossy network conditions
without a significant penalty, and with minimal overhead due to the
encoding-decoding process.
Localized Dimension Growth in Random Network Coding: A Convolutional Approach
We propose an efficient Adaptive Random Convolutional Network Coding (ARCNC)
algorithm to address the issue of field size in random network coding. ARCNC
operates as a convolutional code, with the coefficients of local encoding
kernels chosen randomly over a small finite field. The lengths of local
encoding kernels increase with time until the global encoding kernel matrices
at related sink nodes all have full rank. Instead of estimating the necessary
field size a priori, ARCNC operates in a small finite field. It adapts to
unknown network topologies without prior knowledge, by locally incrementing the
dimensionality of the convolutional code. Because convolutional codes of
different constraint lengths can coexist in different portions of the network,
reductions in decoding delay and memory overheads can be achieved with ARCNC.
We show through analysis that this method performs no worse than random linear
network codes in general networks, and can provide significant gains in terms
of average decoding delay in combination networks.
Comment: 7 pages, 1 figure, submitted to IEEE ISIT 201
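The stopping criterion above (grow until the global encoding kernel matrices
at the sinks have full rank) can be illustrated with a toy loop that keeps
drawing random coding rows over a tiny field until the system becomes
decodable. This mimics only the full-rank stopping rule, not ARCNC's actual
growth of local kernel lengths over time; the field size and seed are
assumptions of the sketch:

```python
import random

P = 5  # deliberately tiny field, matching ARCNC's small-field setting

def rank_mod_p(rows, p):
    """Rank of an integer matrix over GF(p) via Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rank, len(rows)) if rows[r][col] % p),
                   None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p
                           for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def coded_rows_until_decodable(k, rng, max_rows=50):
    """Draw random coding rows over GF(P) until k source symbols become
    decodable (full rank); report how many rows that took."""
    rows = []
    while len(rows) < max_rows:
        rows.append([rng.randrange(P) for _ in range(k)])
        if rank_mod_p(rows, P) == k:
            return len(rows)
    return None

steps = coded_rows_until_decodable(4, random.Random(0))
```

Over a small field, a freshly drawn row has a nonnegligible chance of being
non-innovative, which is exactly the delay/field-size trade-off the paper
analyzes.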
Full Resolution Image Compression with Recurrent Neural Networks
This paper presents a set of full-resolution lossy image compression methods
based on neural networks. Each of the architectures we describe can provide
variable compression rates during deployment without requiring retraining of
the network: each network need only be trained once. All of our architectures
consist of a recurrent neural network (RNN)-based encoder and decoder, a
binarizer, and a neural network for entropy coding. We compare RNN types (LSTM,
associative LSTM) and introduce a new hybrid of GRU and ResNet. We also study
"one-shot" versus additive reconstruction architectures and introduce a new
scaled-additive framework. We compare to previous work, showing improvements of
4.3%-8.8% AUC (area under the rate-distortion curve), depending on the
perceptual metric used. As far as we know, this is the first neural network
architecture that is able to outperform JPEG at image compression across most
bitrates on the rate-distortion curve on the Kodak dataset images, with and
without the aid of entropy coding.
Comment: Updated with content for CVPR and moved supplemental material to an
external link due to size limitations
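The additive reconstruction idea can be sketched without any neural network:
here a uniform quantizer (an assumption of this sketch) stands in for one
encode/decode iteration, each iteration codes the residual left by the
previous ones at a finer step, and the reconstruction is the running sum:

```python
import numpy as np

def lossy_code(x, step):
    """Stand-in for one iteration's encode/decode: a uniform quantizer."""
    return np.round(x / step) * step

def additive_reconstruction(image, iters):
    """'Additive' scheme: each pass codes the residual left by earlier
    passes, and the reconstruction is the running sum of coded residuals."""
    recon = np.zeros_like(image)
    for i in range(iters):
        residual = image - recon
        recon = recon + lossy_code(residual, step=0.25 / 2 ** i)
    return recon

img = np.array([0.11, 0.52, 0.93, 0.37])  # toy "image" of 4 pixel values
err1 = np.abs(img - additive_reconstruction(img, iters=1)).max()
err4 = np.abs(img - additive_reconstruction(img, iters=4)).max()
```

Running more iterations spends more bits and shrinks the reconstruction
error, which is how such architectures provide variable rates from a single
trained network; the "one-shot" alternative re-predicts the full image at
every iteration instead of summing residuals.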