
    Learning Spatiotemporal Features for Infrared Action Recognition with 3D Convolutional Neural Networks

    Infrared (IR) imaging has the potential to enable more robust action recognition systems than visible spectrum cameras because it is less sensitive to lighting conditions and appearance variability. While action recognition on videos from visible spectrum imaging has received much attention, action recognition in IR videos is significantly less explored. Our objective is to exploit imaging data in this modality for the action recognition task. In this work, we propose a novel two-stream 3D convolutional neural network (CNN) architecture by introducing the discriminative code layer and the corresponding discriminative code loss function. The proposed network processes sequences of IR images and of IR-based optical flow fields. We pretrain the 3D CNN model on the visible spectrum Sports-1M action dataset and finetune it on the Infrared Action Recognition (InfAR) dataset. To the best of our knowledge, this is the first application of the 3D CNN to action recognition in the IR domain. We conduct an elaborate analysis of different fusion schemes (weighted average, single and double-layer neural nets) applied to different 3D CNN outputs. Experimental results demonstrate that our approach achieves state-of-the-art average precision (AP) performance on the InfAR dataset: (1) the proposed two-stream 3D CNN achieves the best reported 77.5% AP, and (2) our 3D CNN model applied to the optical flow fields achieves the best reported single stream 75.42% AP.
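
Of the fusion schemes the abstract lists, the weighted average is the simplest to illustrate. The following is a minimal sketch, assuming each stream outputs one score per action class; the function names, the weight `w_flow`, and any example scores are hypothetical and not taken from the paper:

```python
def fuse_weighted_average(ir_scores, flow_scores, w_flow=0.5):
    """Blend per-class scores from two streams with a convex weight.

    w_flow is a hypothetical fusion weight: 0 uses only the IR image
    stream, 1 uses only the optical flow stream.
    """
    if len(ir_scores) != len(flow_scores):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= w_flow <= 1.0:
        raise ValueError("w_flow must lie in [0, 1]")
    return [(1.0 - w_flow) * s_ir + w_flow * s_flow
            for s_ir, s_flow in zip(ir_scores, flow_scores)]


def predict(ir_scores, flow_scores, w_flow=0.5):
    """Return the index of the class with the highest fused score."""
    fused = fuse_weighted_average(ir_scores, flow_scores, w_flow)
    return max(range(len(fused)), key=fused.__getitem__)
```

In practice the weight would be chosen on validation data; the single and double-layer neural net fusions mentioned in the abstract replace this fixed blend with learned layers.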

    Universal and Robust Distributed Network Codes

    Random linear network codes can be designed and implemented in a distributed manner, with low computational complexity. However, these codes are classically implemented over finite fields whose size depends on some global network parameters (size of the network, the number of sinks) that may not be known prior to code design. Also, if new nodes join, the entire network code may have to be redesigned. In this work, we present the first universal and robust distributed linear network coding schemes. Our schemes are universal since they are independent of all network parameters. They are robust since if nodes join or leave, the remaining nodes do not need to change their coding operations and the receivers can still decode. They are distributed since nodes need only have topological information about the part of the network upstream of them, which can be naturally streamed as part of the communication protocol. We present both probabilistic and deterministic schemes that are all asymptotically rate-optimal in the coding block-length, and have guarantees of correctness. Our probabilistic designs are computationally efficient, with order-optimal complexity. Our deterministic designs guarantee zero error decoding, albeit via codes with high computational complexity in general. Our coding schemes are based on network codes over "scalable fields". Instead of choosing coding coefficients from one field at every node, each node uses linear coding operations over an "effective field-size" that depends on the node's distance from the source node. The analysis of our schemes requires technical tools that may be of independent interest. In particular, we generalize the Schwartz-Zippel lemma by proving a non-uniform version, wherein variables are chosen from sets of possibly different sizes. We also provide a novel robust distributed algorithm to assign unique IDs to network nodes.
    Comment: 12 pages, 7 figures, 1 table, under submission to INFOCOM 201
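
For context, the classical (uniform) Schwartz-Zippel lemma that the abstract says is generalized can be stated as follows. This is the standard textbook form, not the paper's non-uniform version, whose exact bound the abstract does not give:

```latex
Let $P \in \mathbb{F}[x_1, \dots, x_n]$ be a nonzero polynomial of total
degree $d$ over a field $\mathbb{F}$, and let $S \subseteq \mathbb{F}$ be a
finite set. If $r_1, \dots, r_n$ are drawn independently and uniformly at
random from $S$, then
\[
  \Pr\bigl[ P(r_1, \dots, r_n) = 0 \bigr] \;\le\; \frac{d}{|S|}.
\]
```

In the non-uniform setting the abstract describes, each $r_i$ would instead be drawn from its own set $S_i$, and the sizes $|S_i|$ may differ.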

    Structured Random Linear Codes (SRLC): Bridging the Gap between Block and Convolutional Codes

    Several types of AL-FEC (Application-Level FEC) codes for the Packet Erasure Channel exist. Random Linear Codes (RLC), where redundancy packets consist of random linear combinations of source packets over a certain finite field, are a simple yet efficient coding technique, for instance massively used for Network Coding applications. However, the price to pay is a high encoding and decoding complexity, especially when working on GF(2^8), which seriously limits the number of packets in the encoding window. Conversely, structured block codes have been designed for situations where the set of source packets is known in advance, for instance with file transfer applications. Here the encoding and decoding complexity is controlled, even for huge block sizes, thanks to the sparse nature of the code and advanced decoding techniques that exploit this sparseness (e.g., Structured Gaussian Elimination). But their design also prevents their use in convolutional use-cases featuring an encoding window that slides over a continuous set of incoming packets. In this work we try to bridge the gap between these two code classes, bringing some structure to RLC codes in order to enlarge the use-cases where they can be efficiently used: in convolutional mode (as any RLC code), but also in block mode with either tiny, medium or large block sizes. We also demonstrate how to design compact signaling for these codes (for encoder/decoder synchronization), which is an essential practical aspect.
    Comment: 7 pages, 12 figures
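
The basic RLC operation described above, a redundancy packet built as a random linear combination of source packets over GF(2^8), can be sketched as follows. This is an illustration only: the abstract does not name a field polynomial, so the reduction polynomial 0x11B is an assumption, and the function names are hypothetical. It shows plain RLC, not the structured variant the paper proposes.

```python
import random


def gf256_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8) by shift-and-add.

    poly is the reduction polynomial; 0x11B is an assumed choice.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly            # reduce back to 8 bits
        b >>= 1
    return result


def encode_repair(window, coeffs):
    """Build one repair packet: sum of coeffs[i] * window[i], byte by byte."""
    if not window or len(window) != len(coeffs):
        raise ValueError("need one coefficient per source packet")
    size = len(window[0])
    if any(len(pkt) != size for pkt in window):
        raise ValueError("source packets must have equal length")
    repair = bytearray(size)
    for coeff, pkt in zip(coeffs, window):
        for j, byte in enumerate(pkt):
            repair[j] ^= gf256_mul(coeff, byte)
    return bytes(repair)


def random_repair(window, rng=random):
    """Draw nonzero random coefficients and return them with the repair packet."""
    coeffs = [rng.randrange(1, 256) for _ in window]
    return coeffs, encode_repair(window, coeffs)
```

Each repair packet touches every byte of every packet in the encoding window, so the encoding cost in this sketch grows with the window size times the packet size.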

    eCMT-SCTP: Improving Performance of Multipath SCTP with Erasure Coding Over Lossy Links

    Performance of transport protocols on lossy links is a well-researched topic; however, only a few proposals make use of the opportunities of erasure coding within the multipath transport protocol context. In this paper, we investigate performance improvements of multipath CMT-SCTP with the novel integration of the on-the-fly erasure code within congestion control and reliability mechanisms. Our contributions include: integration of transport protocol and erasure codes with regard to congestion control; a proposal for a variable retransmission delay parameter (aRTX) adjustment; and performance evaluation of CMT-SCTP with erasure coding through simulations. We have implemented the explicit congestion notification (ECN) and erasure coding schemes in NS-2, and evaluated and demonstrated improvements in application goodput and a decline in spurious retransmissions. Our results show that we can achieve from 10% to 80% improvements in goodput under lossy network conditions without a significant penalty and with minimal overhead due to the encoding-decoding process.

    Localized Dimension Growth in Random Network Coding: A Convolutional Approach

    We propose an efficient Adaptive Random Convolutional Network Coding (ARCNC) algorithm to address the issue of field size in random network coding. ARCNC operates as a convolutional code, with the coefficients of local encoding kernels chosen randomly over a small finite field. The lengths of local encoding kernels increase with time until the global encoding kernel matrices at related sink nodes all have full rank. Instead of estimating the necessary field size a priori, ARCNC operates in a small finite field. It adapts to unknown network topologies without prior knowledge, by locally incrementing the dimensionality of the convolutional code. Because convolutional codes of different constraint lengths can coexist in different portions of the network, reductions in decoding delay and memory overheads can be achieved with ARCNC. We show through analysis that this method performs no worse than random linear network codes in general networks, and can provide significant gains in terms of average decoding delay in combination networks.
    Comment: 7 pages, 1 figure, submitted to IEEE ISIT 201

    Full Resolution Image Compression with Recurrent Neural Networks

    This paper presents a set of full-resolution lossy image compression methods based on neural networks. Each of the architectures we describe can provide variable compression rates during deployment without requiring retraining of the network: each network need only be trained once. All of our architectures consist of a recurrent neural network (RNN)-based encoder and decoder, a binarizer, and a neural network for entropy coding. We compare RNN types (LSTM, associative LSTM) and introduce a new hybrid of GRU and ResNet. We also study "one-shot" versus additive reconstruction architectures and introduce a new scaled-additive framework. We compare to previous work, showing improvements of 4.3%-8.8% AUC (area under the rate-distortion curve), depending on the perceptual metric used. As far as we know, this is the first neural network architecture that is able to outperform JPEG at image compression across most bitrates on the rate-distortion curve on the Kodak dataset images, with and without the aid of entropy coding.
    Comment: Updated with content for CVPR and removed supplemental material to an external link for size limitations
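
The variable-rate idea behind additive reconstruction can be illustrated with a toy example: each pass encodes the residual left by the previous passes as one bit per value, and the decoder sums the decoded passes, so stopping early gives fewer bits and a coarser result. This is a minimal sketch, assuming inputs scaled to [-1, 1]; the halving step size is a stand-in for the learned RNN encoder and decoder, and all names are hypothetical:

```python
def encode_additive(values, num_passes):
    """Return one list of sign bits per pass; each pass codes the current residual."""
    residual = list(values)
    passes = []
    step = 0.5  # toy stand-in for the learned decoder output magnitude
    for _ in range(num_passes):
        bits = [1 if r >= 0 else 0 for r in residual]  # binarizer
        # Subtract what the decoder will add back for these bits.
        residual = [r - (step if b else -step) for r, b in zip(residual, bits)]
        passes.append(bits)
        step /= 2
    return passes


def decode_additive(passes, length):
    """Sum the contribution of every received pass."""
    recon = [0.0] * length
    step = 0.5
    for bits in passes:
        recon = [x + (step if b else -step) for x, b in zip(recon, bits)]
        step /= 2
    return recon
```

Decoding only a prefix, for example `decode_additive(passes[:2], len(values))`, yields a coarser reconstruction from fewer bits without changing the encoder.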