497 research outputs found

    Coding local and global binary visual features extracted from video sequences

    Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks, while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the Bag-of-Visual-Words (BoVW) model. Several applications, including for example visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that aim at reducing the required bit budget, while attaining a target level of efficiency. In this paper we investigate a coding scheme tailored to both local and global binary features, which aims at exploiting both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC) paradigm. That is, visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed and then sent to a central unit for further processing, according to the Compress-Then-Analyze (CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: homography estimation and content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with CTA, especially in bandwidth-limited scenarios.
    Comment: submitted to IEEE Transactions on Image Processing
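The step from local binary features to a global BoVW representation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes descriptors are stored as Python integers, that a small visual vocabulary is already available, and that words are assigned by nearest Hamming distance; the function names `hamming` and `bovw_histogram` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def bovw_histogram(descriptors, vocabulary):
    """Assign each binary descriptor to its nearest visual word (Hamming
    distance) and return the normalized word-count histogram."""
    if not descriptors:
        return [0.0] * len(vocabulary)
    counts = Counter(
        min(range(len(vocabulary)), key=lambda k: hamming(d, vocabulary[k]))
        for d in descriptors
    )
    return [counts[k] / len(descriptors) for k in range(len(vocabulary))]


# Toy 8-bit example (illustrative values only).
vocabulary = [0b00000000, 0b11111111, 0b11110000]
descriptors = [0b00000001, 0b11111110, 0b11100000, 0b00000011]
histogram = bovw_histogram(descriptors, vocabulary)
```

The resulting histogram is the global feature: its length depends only on the vocabulary size, not on the number of local features in the image.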

    Compress-then-analyze vs. analyze-then-compress: Two paradigms for image analysis in visual sensor networks

    We compare two paradigms for image analysis in visual sensor networks (VSN). In the compress-then-analyze (CTA) paradigm, images acquired from camera nodes are compressed and sent to a central controller for further analysis. Conversely, in the analyze-then-compress (ATC) approach, camera nodes perform visual feature extraction and transmit a compressed version of these features to a central controller. We focus on state-of-the-art binary features which are particularly suitable for resource-constrained VSNs, and we show that the "winning" paradigm depends primarily on the network conditions. Indeed, while the ATC approach might be the only possible way to perform analysis at low available bitrates, the CTA approach reaches the best results when the available bandwidth enables the transmission of high-quality images.

    Enabling visual analysis in wireless sensor networks

    This demo showcases some of the results obtained by the GreenEyes project, whose main objective is to enable visual analysis on resource-constrained multimedia sensor networks. The demo features a multi-hop visual sensor network operated by BeagleBone Linux computers with IEEE 802.15.4 communication capabilities, and capable of recognizing and tracking objects according to two different visual paradigms. In the traditional compress-then-analyze (CTA) paradigm, JPEG compressed images are transmitted through the network from a camera node to a central controller, where the analysis takes place. In the alternative analyze-then-compress (ATC) paradigm, the camera node extracts and compresses local binary visual features from the acquired images (either locally or in a distributed fashion) and transmits them to the central controller, where they are used to perform object recognition/tracking. We show that, in a bandwidth-constrained scenario, the latter paradigm achieves higher application frame rates while still ensuring excellent analysis performance.

    Multi-view coding of local features in visual sensor networks

    Local visual features extracted from multiple camera views are employed nowadays in several application scenarios, such as object recognition, disparity matching, image stitching and many others. In several cases, local features need to be transmitted or stored on resource-limited devices, thus calling for efficient coding techniques. While recent works have addressed the problem of efficiently compressing local features extracted from still images or video sequences, in this paper we propose and evaluate an architecture for coding features extracted from multiple, overlapping views. The proposed Multi-View Feature Coding architecture can be applied to either real-valued or binary features, and yields bitrate reductions on the order of 10-20% with respect to simulcast coding.

    Coding binary local features extracted from video sequences

    Local features represent a powerful tool which is exploited in several applications such as visual search, object recognition, and tracking. In this context, binary descriptors provide an efficient alternative to real-valued descriptors, due to low computational complexity, limited memory footprint and fast matching algorithms. The descriptor consists of a binary vector, in which each bit is the result of a pairwise comparison between smoothed pixel intensities. In several cases, visual features need to be transmitted over a bandwidth-limited network. To this end, it is useful to compress the descriptor to reduce the required rate, while attaining a target accuracy for the task at hand. The past literature thoroughly addressed the problem of coding visual features extracted from still images and, only very recently, the problem of coding real-valued features (e.g., SIFT, SURF) extracted from video sequences. In this paper we propose a coding architecture specifically designed for binary local features extracted from video content. We exploit both spatial and temporal redundancy by means of intra-frame and inter-frame coding modes, showing that significant coding gains can be attained for a target level of accuracy of the visual analysis task.
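The descriptor construction described in the abstract above (each bit is a pairwise comparison between smoothed pixel intensities) can be sketched as follows. This is a BRIEF-style illustration under stated assumptions: a box filter stands in for the smoothing step, the comparison pairs are drawn uniformly at random with a fixed seed, and the helper names `box_smooth`, `sample_pairs`, `describe` and `hamming` are hypothetical. It is not the specific descriptor evaluated in the paper.

```python
import random


def box_smooth(patch, radius=1):
    """Smooth a 2-D patch (list of rows) with a (2r+1)x(2r+1) box filter,
    shrinking the window at the borders."""
    h, w = len(patch), len(patch[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [patch[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def sample_pairs(size, n_bits, seed=0):
    """Draw n_bits random pairs of (row, col) positions inside the patch."""
    rng = random.Random(seed)
    return [((rng.randrange(size), rng.randrange(size)),
             (rng.randrange(size), rng.randrange(size)))
            for _ in range(n_bits)]


def describe(patch, pairs):
    """One bit per pair: 1 if the first smoothed intensity is the smaller."""
    s = box_smooth(patch)
    return [1 if s[y1][x1] < s[y2][x2] else 0
            for (y1, x1), (y2, x2) in pairs]


def hamming(a, b):
    """Matching cost between two descriptors: number of differing bits."""
    return sum(x != y for x, y in zip(a, b))


# Toy 8x8 patch with a diagonal intensity ramp (illustrative values only).
patch = [[3 * x + 5 * y for x in range(8)] for y in range(8)]
pairs = sample_pairs(size=8, n_bits=32)
descriptor = describe(patch, pairs)
```

Because every bit depends only on the order of two intensities, adding a constant brightness offset to the patch leaves the descriptor unchanged, and matching reduces to a Hamming distance.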

    Rate-accuracy optimization of binary descriptors

    Binary descriptors have recently emerged as low-complexity alternatives to state-of-the-art descriptors such as SIFT. The descriptor is represented by means of a binary string, in which each bit is the result of the pairwise comparison of smoothed pixel values properly selected in a patch around each keypoint. Previous works have focused on the construction of the descriptor neglecting the opportunity of performing lossless compression. In this paper, we propose two contributions. First, we design an entropy coding scheme that seeks the internal ordering of the descriptor that minimizes the number of bits necessary to represent it. Second, we compare different selection strategies that can be adopted to identify which pairwise comparisons to use when building the descriptor. Unlike previous works, we evaluate the discriminative power of descriptors as a function of rate, in order to investigate the trade-offs in a bandwidth-constrained scenario.
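One way to picture the search for an internal ordering that lowers the coding cost is a greedy, first-order heuristic: start from the bit with the lowest marginal entropy, then repeatedly append the bit that is cheapest to code given the previously chosen bit. The sketch below is an assumption made for illustration, not the scheme from the paper; the function names and the toy training set are hypothetical.

```python
from math import log2


def entropy(p):
    """Binary entropy in bits of a Bernoulli(p) source."""
    return 0.0 if p in (0.0, 1.0) else -(p * log2(p) + (1 - p) * log2(1 - p))


def marginal_entropy(descs, i):
    """Empirical H(bit i) over a set of training descriptors."""
    return entropy(sum(d[i] for d in descs) / len(descs))


def conditional_entropy(descs, i, j):
    """Empirical H(bit i | bit j) over a set of training descriptors."""
    h = 0.0
    for v in (0, 1):
        subset = [d for d in descs if d[j] == v]
        if subset:
            p = sum(d[i] for d in subset) / len(subset)
            h += len(subset) / len(descs) * entropy(p)
    return h


def greedy_order(descs):
    """Return (ordering, estimated bits per descriptor) under a model in
    which each bit is coded conditionally on the bit coded just before it."""
    remaining = list(range(len(descs[0])))
    first = min(remaining, key=lambda i: marginal_entropy(descs, i))
    order, cost = [first], marginal_entropy(descs, first)
    remaining.remove(first)
    while remaining:
        prev = order[-1]
        nxt = min(remaining, key=lambda i: conditional_entropy(descs, i, prev))
        cost += conditional_entropy(descs, nxt, prev)
        order.append(nxt)
        remaining.remove(nxt)
    return order, cost


# Toy training set: bit 2 always equals bit 0, bit 1 is independent.
training = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
order, cost = greedy_order(training)
```

In this toy case the heuristic places the two correlated bits next to each other, so the estimated cost drops from 3 bits (independent coding) to 2 bits per descriptor.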

    Coding mode decision algorithm for binary descriptor coding

    In visual sensor networks, local feature descriptors can be computed at the sensing nodes, which work collaboratively on the data obtained to perform efficient visual analysis. In fact, with a minimal amount of computational effort, the detection and extraction of local features, such as binary descriptors, can provide a reliable and compact image representation. In this paper, it is proposed to extract and code binary descriptors to meet the energy and bandwidth constraints at each sensing node. The major contribution is a binary descriptor coding technique that exploits the correlation using two different coding modes: Intra, which exploits the correlation between the elements that compose a descriptor; and Inter, which exploits the correlation between descriptors of the same image. The experimental results show bitrate savings up to 35% without any impact on the performance of the image retrieval task. © 2014 EURASIP
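A coding mode decision between Intra and Inter can be sketched as a comparison of estimated code lengths. The example below is a simplified illustration under loudly stated assumptions: the Intra cost uses an i.i.d. bit model (so it ignores the within-descriptor correlation that the paper's Intra mode exploits), the Inter cost codes the XOR residual against an already coded descriptor plus the bits needed to signal which reference was used, and the probabilities `p_intra` and `p_flip` are made-up parameters. The function names are hypothetical.

```python
from math import log2


def code_length(bits, p_one):
    """Ideal code length in bits of a binary sequence under an i.i.d.
    model with P(bit = 1) = p_one, where 0 < p_one < 1."""
    return sum(-log2(p_one if b else 1.0 - p_one) for b in bits)


def choose_mode(desc, previous, p_intra=0.5, p_flip=0.1):
    """Return (mode, reference index or None, estimated cost in bits)."""
    best = ("intra", None, code_length(desc, p_intra))
    for idx, ref in enumerate(previous):
        residual = [a ^ b for a, b in zip(desc, ref)]
        # Residual cost plus the cost of signalling the chosen reference.
        cost = code_length(residual, p_flip) + log2(len(previous))
        if cost < best[2]:
            best = ("inter", idx, cost)
    return best


# Toy example with 8-bit descriptors (illustrative values only).
previous = [[0] * 8, [1, 1, 1, 1, 0, 0, 0, 0]]
desc = [1, 1, 1, 1, 0, 0, 0, 1]
mode, ref_idx, cost = choose_mode(desc, previous)
```

Here the descriptor differs from the second reference in a single bit, so the sparse residual is cheaper than the 8 bits of the Intra estimate and Inter is selected; a descriptor far from every reference falls back to Intra.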
