2,285 research outputs found

    Full Reference Objective Quality Assessment for Reconstructed Background Images

    Full text link
    With an increased interest in applications that require a clean background image, such as video surveillance, object tracking, street view imaging and location-based services on web-based maps, multiple algorithms have been developed to reconstruct a background image from cluttered scenes. Traditionally, statistical measures and existing image quality techniques have been applied for evaluating the quality of the reconstructed background images. Though these quality assessment methods have been widely used in the past, their performance in evaluating the perceived quality of the reconstructed background image has not been verified. In this work, we discuss the shortcomings in existing metrics and propose a full reference Reconstructed Background image Quality Index (RBQI) that combines color and structural information at multiple scales using a probability summation model to predict the perceived quality in the reconstructed background image given a reference image. To compare the performance of the proposed quality index with existing image quality assessment measures, we construct two different datasets consisting of reconstructed background images and corresponding subjective scores. The quality assessment measures are evaluated by correlating their objective scores with human subjective ratings. The correlation results show that the proposed RBQI outperforms all the existing approaches. Additionally, the constructed datasets and the corresponding subjective scores provide a benchmark to evaluate the performance of future metrics that are developed to evaluate the perceived quality of reconstructed background images.Comment: Associated source code: https://github.com/ashrotre/RBQI, Associated Database: https://drive.google.com/drive/folders/1bg8YRPIBcxpKIF9BIPisULPBPcA5x-Bk?usp=sharing (Email for permissions at: ashrotreasuedu

    A fast and robust hand-driven 3D mouse

    Get PDF
    The development of new interaction paradigms requires a natural interaction. This means that people should be able to interact with technology with the same models used to interact with everyday real life, that is through gestures, expressions, voice. Following this idea, in this paper we propose a non intrusive vision based tracking system able to capture hand motion and simple hand gestures. The proposed device allows to use the hand as a "natural" 3D mouse, where the forefinger tip or the palm centre are used to identify a 3D marker and the hand gesture can be used to simulate the mouse buttons. The approach is based on a monoscopic tracking algorithm which is computationally fast and robust against noise and cluttered backgrounds. Two image streams are processed in parallel exploiting multi-core architectures, and their results are combined to obtain a constrained stereoscopic problem. The system has been implemented and thoroughly tested in an experimental environment where the 3D hand mouse has been used to interact with objects in a virtual reality application. We also provide results about the performances of the tracker, which demonstrate precision and robustness of the proposed syste

    Towards Benchmarking Scene Background Initialization

    Full text link
    Given a set of images of a scene taken at different times, the availability of an initial background model that describes the scene without foreground objects is the prerequisite for a wide range of applications, ranging from video surveillance to computational photography. Even though several methods have been proposed for scene background initialization, the lack of a common groundtruthed dataset and of a common set of metrics makes it difficult to compare their performance. To move first steps towards an easy and fair comparison of these methods, we assembled a dataset of sequences frequently adopted for background initialization, selected or created ground truths for quantitative evaluation through a selected suite of metrics, and compared results obtained by some existing methods, making all the material publicly available.Comment: 6 pages, SBI dataset, SBMI2015 Worksho

    Search Tracker: Human-derived object tracking in-the-wild through large-scale search and retrieval

    Full text link
    Humans use context and scene knowledge to easily localize moving objects in conditions of complex illumination changes, scene clutter and occlusions. In this paper, we present a method to leverage human knowledge in the form of annotated video libraries in a novel search and retrieval based setting to track objects in unseen video sequences. For every video sequence, a document that represents motion information is generated. Documents of the unseen video are queried against the library at multiple scales to find videos with similar motion characteristics. This provides us with coarse localization of objects in the unseen video. We further adapt these retrieved object locations to the new video using an efficient warping scheme. The proposed method is validated on in-the-wild video surveillance datasets where we outperform state-of-the-art appearance-based trackers. We also introduce a new challenging dataset with complex object appearance changes.Comment: Under review with the IEEE Transactions on Circuits and Systems for Video Technolog

    A Framework for Symmetric Part Detection in Cluttered Scenes

    Full text link
    The role of symmetry in computer vision has waxed and waned in importance during the evolution of the field from its earliest days. At first figuring prominently in support of bottom-up indexing, it fell out of favor as shape gave way to appearance and recognition gave way to detection. With a strong prior in the form of a target object, the role of the weaker priors offered by perceptual grouping was greatly diminished. However, as the field returns to the problem of recognition from a large database, the bottom-up recovery of the parts that make up the objects in a cluttered scene is critical for their recognition. The medial axis community has long exploited the ubiquitous regularity of symmetry as a basis for the decomposition of a closed contour into medial parts. However, today's recognition systems are faced with cluttered scenes, and the assumption that a closed contour exists, i.e. that figure-ground segmentation has been solved, renders much of the medial axis community's work inapplicable. In this article, we review a computational framework, previously reported in Lee et al. (2013), Levinshtein et al. (2009, 2013), that bridges the representation power of the medial axis and the need to recover and group an object's parts in a cluttered scene. Our framework is rooted in the idea that a maximally inscribed disc, the building block of a medial axis, can be modeled as a compact superpixel in the image. We evaluate the method on images of cluttered scenes.Comment: 10 pages, 8 figure

    General Dynamic Scene Reconstruction from Multiple View Video

    Get PDF
    This paper introduces a general approach to dynamic scene reconstruction from multiple moving cameras without prior knowledge or limiting constraints on the scene structure, appearance, or illumination. Existing techniques for dynamic scene reconstruction from multiple wide-baseline camera views primarily focus on accurate reconstruction in controlled environments, where the cameras are fixed and calibrated and background is known. These approaches are not robust for general dynamic scenes captured with sparse moving cameras. Previous approaches for outdoor dynamic scene reconstruction assume prior knowledge of the static background appearance and structure. The primary contributions of this paper are twofold: an automatic method for initial coarse dynamic scene segmentation and reconstruction without prior knowledge of background appearance or structure; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes from multiple wide-baseline static or moving cameras. Evaluation is performed on a variety of indoor and outdoor scenes with cluttered backgrounds and multiple dynamic non-rigid objects such as people. Comparison with state-of-the-art approaches demonstrates improved accuracy in both multiple view segmentation and dense reconstruction. The proposed approach also eliminates the requirement for prior knowledge of scene structure and appearance
    • …
    corecore