
    SALSA: A Novel Dataset for Multimodal Group Behavior Analysis

    Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g., a cocktail party) is gratifying due to the wealth of information available at the group (mining social networks) and individual (recognizing native behavioral and personality traits) levels. However, analyzing social scenes involving FCGs is also highly challenging due to the difficulty in extracting behavioral cues such as target locations, their speaking activity, and head/body pose, owing to crowdedness and the presence of extreme occlusions. To this end, we propose SALSA, a novel dataset facilitating multimodal and Synergetic sociAL Scene Analysis, and make two main contributions to research on automated social interaction analysis: (1) SALSA records social interactions among 18 participants in a natural, indoor environment for over 60 minutes, under poster presentation and cocktail party contexts presenting difficulties in the form of low-resolution images, lighting variations, numerous occlusions, reverberations, and interfering sound sources; (2) to alleviate these problems, we facilitate multimodal analysis by recording the social interplay using four static surveillance cameras and sociometric badges worn by each participant, comprising microphone, accelerometer, Bluetooth, and infrared sensors. In addition to raw data, we also provide annotations concerning individuals' personality as well as their position, head and body orientation, and F-formation information over the entire event duration. Through extensive experiments with state-of-the-art approaches, we show (a) the limitations of current methods and (b) how the recorded multiple cues synergetically aid automatic analysis of social interactions. SALSA is available at http://tev.fbk.eu/salsa. (14 pages, 11 figures)

    Human object annotation for surveillance video forensics

    A system is presented that can automatically annotate surveillance video in a manner useful for locating a person from a given description of clothing. Each human is annotated based on two appearance features: the primary colors of the clothes and the presence of text/logos on the clothes. The annotation occurs after a robust foreground extraction stage employing a modified Gaussian mixture model-based approach. The proposed pipeline consists of a preprocessing stage where the color appearance of an image is improved using a color constancy algorithm. To annotate color information for human clothes, we use the color histogram feature in HSV space and find local maxima to extract dominant colors for different parts of a segmented human object. To detect text/logos on clothes, we begin with the extraction of connected components of enhanced horizontal, vertical, and diagonal edges in the frames. These candidate regions are classified as text or non-text on the basis of their local energy-based shape histogram features. Further, to detect humans, a novel technique is proposed that uses contourlet transform-based local binary pattern (CLBP) features. In the proposed method, we extract the uniform direction-invariant LBP feature descriptor for contourlet-transformed high-pass subimages from the vertical and diagonal directional bands. In the final stage, the extracted CLBP descriptors are classified by a trained support vector machine. Experimental results illustrate the superiority of our method on large-scale surveillance video data.
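
    To make the dominant-color step concrete, here is a minimal Python/OpenCV sketch that extracts dominant hues as local maxima of an HSV hue histogram; the function name, bin count, and prominence threshold are illustrative assumptions rather than the authors' implementation.

```python
import cv2
import numpy as np

def dominant_colors(region_bgr, n_bins=36, min_prominence=0.05):
    """Return dominant hues of a clothing region as HSV histogram peaks.

    Illustrative sketch of the idea above (HSV histogram + local maxima);
    the bin count and prominence threshold are assumptions, not the
    authors' published values.
    """
    hsv = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2HSV)
    # Histogram over the hue channel only (OpenCV hue range is [0, 180)).
    hist = cv2.calcHist([hsv], [0], None, [n_bins], [0, 180]).ravel()
    hist /= hist.sum() + 1e-8
    peaks = []
    for i in range(n_bins):
        left, right = hist[(i - 1) % n_bins], hist[(i + 1) % n_bins]
        # A bin is a local maximum if it beats both circular neighbours
        # and carries a minimum share of the region's pixels.
        if hist[i] >= left and hist[i] >= right and hist[i] >= min_prominence:
            peaks.append(((i + 0.5) * 180.0 / n_bins, float(hist[i])))
    return sorted(peaks, key=lambda p: -p[1])
```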

    Color-contrast landmark detection and encoding in outdoor images

    International Conference on Computer Analysis of Images and Patterns (CAIP), 2005, Versailles (France). This paper describes a system to extract salient regions from an outdoor image and match them against a database of previously acquired landmarks. Region saliency is based mainly on color contrast, although intensity and texture orientation are also taken into account. Remarkably, color constancy is embedded in the saliency detection process through a novel color ratio algorithm that makes the system robust to the illumination changes so common in outdoor environments. A region is characterized by a combination of its saliency and its color distribution in chromaticity space. Newly acquired landmarks are compared with those already stored in a database through a quadratic distance metric over their characterizations. Experimentation with a database containing 68 natural landmarks acquired with the system yielded good recognition results in terms of both recall and rank indices. However, the discrimination between landmarks should be improved to avoid false positives, as suggested by the low precision index. This work was supported by the project 'Sistema reconfigurable para la navegación basada en visión de robots caminantes y rodantes en entornos naturales.' (00). Peer Reviewed.
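
    As a sketch of the kind of quadratic distance metric mentioned above, a quadratic-form distance between chromaticity histograms can be written as follows; the Gaussian bin-similarity matrix is an assumption, not the paper's exact characterization.

```python
import numpy as np

def bin_similarity(centres, sigma=0.1):
    """Gaussian similarity between chromaticity bin centres.

    `centres` is an (n, 2) array of, e.g., (r, g) chromaticity coordinates;
    the Gaussian kernel is an illustrative choice.
    """
    d = np.linalg.norm(centres[:, None, :] - centres[None, :, :], axis=-1)
    return np.exp(-(d ** 2) / (2 * sigma ** 2))

def quadratic_form_distance(h1, h2, sim):
    """Quadratic-form distance d = (h1 - h2)^T S (h1 - h2).

    Sketch of a quadratic histogram metric; `sim` couples nearby bins so
    that similar chromaticities are not treated as fully distinct.
    """
    diff = h1 - h2
    return float(diff @ sim @ diff)
```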

    Automatic Color Inspection for Colored Wires in Electric Cables

    In this paper, an automatic optical inspection system for checking the sequence of colored wires in electric cables is presented. The system is able to inspect cables with flat connectors differing in the type and number of wires. This variability is managed automatically by means of a self-learning subsystem and does not require manual input from the operator or loading new data into the machine. The system is coupled to a connector crimping machine, and once the model of a correct cable is learned, it can automatically inspect each cable assembled by the machine. The main contributions of this paper are: (i) the self-learning system; (ii) a robust segmentation algorithm for extracting wires from images even if they are strongly bent and partially overlapped; (iii) a color recognition algorithm able to cope with highlights and different finishes of the wire insulation. We report the system's evaluation over a period of several months during the actual production of large batches of different cables; tests demonstrated a high level of accuracy and the absence of false negatives, which is a key point for guaranteeing defect-free production.
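
    The abstract does not give the color recognition algorithm itself; as a loose sketch of one way to be robust to highlights, the snippet below masks low-saturation, near-saturated-value pixels before taking a median hue. The function name, thresholds, and fallback are all illustrative assumptions.

```python
import cv2
import numpy as np

def wire_color(wire_bgr, sat_min=40, val_max=250):
    """Estimate a segmented wire's hue while ignoring specular highlights.

    Hedged sketch: highlight pixels tend to have low saturation and
    near-saturated value, so they are masked out before the median.
    """
    hsv = cv2.cvtColor(wire_bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    mask = (s >= sat_min) & (v <= val_max)
    if not mask.any():
        # Fully blown-out region: fall back to using every pixel.
        mask = np.ones_like(s, dtype=bool)
    return float(np.median(h[mask]))
```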

    Object Matching in Disjoint Cameras Using a Color Transfer Approach

    Object appearance models are a consequence of illumination, viewing direction, camera intrinsics, and other conditions that are specific to a particular camera. As a result, a model acquired in one view is often inappropriate for use in other viewpoints. In this work we treat this appearance-model distortion between two non-overlapping cameras as one in which some unknown color transfer function warps a known appearance model from one view to another. We demonstrate how to recover this function in the case where the distortion function is approximated as general affine and object appearance is represented as a mixture of Gaussians. Appearance models are brought into correspondence by searching for a bijection function that best minimizes an entropic metric for model dissimilarity. These correspondences lead to a solution for the transfer function that brings the parameters of the models into alignment in the UV chromaticity plane. Finally, a set of these transfer functions, acquired from a collection of object pairs, is generalized to a single camera-pair-specific transfer function via robust fitting. We demonstrate the method in the context of a video surveillance network and show that recognition of subjects in disjoint views can be significantly improved using the new color transfer approach.
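
    Given the matched Gaussian-mixture means that the bijection search produces, the affine transfer itself can be recovered with an ordinary least-squares solve; the sketch below assumes (N, 2) arrays of matched UV means and does not reproduce the paper's entropic matching or robust fitting stages.

```python
import numpy as np

def fit_affine_transfer(src_uv, dst_uv):
    """Least-squares affine map taking source UV chromaticities to target.

    `src_uv` and `dst_uv` are assumed (N, 2) arrays of corresponding
    mixture means from the two cameras; solves dst = src @ A.T + b.
    """
    n = src_uv.shape[0]
    # Homogeneous design matrix [u, v, 1] so one solve recovers A and b.
    X = np.hstack([src_uv, np.ones((n, 1))])
    params, *_ = np.linalg.lstsq(X, dst_uv, rcond=None)
    A, b = params[:2].T, params[2]
    return A, b  # apply as: dst approximately equals src @ A.T + b
```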

    Endoscopic Vision Augmentation Using Multiscale Bilateral-Weighted Retinex for Robotic Surgery

    Surgical vision in medical robotics is key to the success of minimally invasive surgery. Owing to the inherent limitations of medical electronic endoscopes, the surgical field suffers from unclear views, uneven illumination, smoke, and other problems, preventing surgeons from quickly and accurately perceiving and recognizing structures such as nerves, blood vessels, and lesion locations within internal organs, which increases surgical risk and operating time. To address these problems, this paper proposes a multiscale Retinex method based on bilateral-filter weight analysis to process and analyze patient videos captured during da Vinci robotic surgery. In subjective evaluations of the experimental results, surgeons agreed that the method substantially enhances the quality of the surgical view, while objective evaluations show that the proposed method outperforms current image enhancement and restoration methods in computer vision. Professor Xiongbiao Luo of the Department of Computer Science, School of Informatics, Xiamen University, is the first author of this paper. [Abstract] Endoscopic vision plays a significant role in minimally invasive surgical procedures. The visibility and maintenance of such direct in-situ vision is paramount not only for safety, by preventing inadvertent injury, but also to improve precision and reduce operating time. Unfortunately, endoscopic vision is unavoidably degraded due to illumination variations during surgery. This work aims to restore or augment such degraded visualization and quantitatively evaluate it during robotic surgery. A multiscale bilateral-weighted retinex method is proposed to remove non-uniform and highly directional illumination and enhance surgical vision, while an objective no-reference image visibility assessment method is defined in terms of sharpness, naturalness, and contrast, to quantitatively and objectively evaluate endoscopic visualization on surgical video sequences. The methods were validated on surgical data, with the experimental results showing that our method outperforms existing retinex approaches. In particular, the combined visibility was improved from 0.81 to 1.06, while three surgeons generally agreed that the results were restored with much better visibility. The authors thank Dr. Stephen Pautler for facilitating the data acquisition, and Dr. A. Jonathan McLeod and Dr. Uditha Jayarathne for helpful discussions.
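
    As an illustration of the general idea (not the authors' implementation), a multiscale retinex variant that estimates illumination with an edge-preserving bilateral filter might look like the following Python/OpenCV sketch; the scale parameters and equal per-scale weights are assumptions.

```python
import cv2
import numpy as np

def multiscale_bilateral_retinex(bgr, scales=((9, 75), (17, 150), (33, 300))):
    """Multiscale retinex using bilateral filtering for illumination.

    Each (diameter, sigma) pair defines one scale of an edge-preserving
    illumination estimate; the log-domain reflectance is averaged over
    scales and stretched back to 8-bit range. Parameters are illustrative.
    """
    img = bgr.astype(np.float32) + 1.0  # avoid log(0)
    log_img = np.log(img)
    out = np.zeros_like(log_img)
    for d, s in scales:
        # Edge-preserving illumination estimate at this scale.
        illum = cv2.bilateralFilter(img, d, sigmaColor=s, sigmaSpace=s)
        out += log_img - np.log(illum + 1e-6)
    out /= len(scales)
    # Min-max stretch of the recovered reflectance for display.
    return cv2.normalize(out, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
```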

    Robust Specularity Removal from Hand-held Videos

    Specular reflection arises when one records a photo or video through a transparent glass medium or from opaque surfaces such as plastics, ceramics, polyester, and human skin; the observed image can be well described as the superposition of a transmitted layer and a reflection layer. These specular reflections often confound algorithms developed for image analysis, computer vision, and pattern recognition. To obtain a pure diffuse reflection component, specularity (highlights) must be removed. To handle this problem, a novel and robust algorithm is formulated. The contributions of this work are three-fold. First, the smoothness of the video along with temporal coherence and illumination changes is preserved by reducing the flickering and jagged edges caused by hand-held video acquisition and homography transformation, respectively. Second, the algorithm improves upon state-of-the-art algorithms by automatically selecting the region of interest (ROI) for all frames, and reduces computational time and complexity by operating on the luminance (Y) channel and exploiting the Augmented Lagrange Multiplier (ALM) with Alternating Direction Minimizing (ADM) to facilitate the derivation of solution algorithms. Third, a quantitative metric is devised that objectively quantifies the amount of specularity in each frame of a hand-held video. The proposed specularity removal algorithm is compared against existing state-of-the-art algorithms using this newly developed metric. Experimental results validate that the developed algorithm has superior performance in terms of computation time, quality, and accuracy.
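
    As a rough illustration of what a per-frame specularity metric might look like (the paper's own metric is not specified here), the following sketch counts near-white pixels, i.e. high value and low saturation in HSV; both thresholds are assumptions.

```python
import cv2
import numpy as np

def specularity_score(frame_bgr, val_thresh=235, sat_thresh=40):
    """Fraction of pixels that look specular (very bright, weakly saturated).

    Hedged stand-in for a per-frame specularity measure: highlights tend
    to be near-white, so count high-value / low-saturation HSV pixels.
    """
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    specular = (v >= val_thresh) & (s <= sat_thresh)
    return float(specular.mean())
```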