11 research outputs found

    Application Dependent Video Segmentation Evaluation - A Case Study for Video Surveillance

    Evaluating the performance of video segmentation algorithms matters for both theoretical and practical reasons. This paper addresses the problem of video segmentation assessment, through both subjective and objective approaches, for the specific application of video surveillance. After an overview of state-of-the-art objective evaluation metrics for video segmentation, a general framework is proposed to cope with application-dependent evaluation. Finally, the performance of the proposed scheme is compared with state-of-the-art techniques and conclusions are drawn.

    New disagreement metrics incorporating spatial detail – applications to lung imaging

    Evaluation of medical image segmentation is increasingly important. Set-based agreement metrics are widespread, but they assess only the absolute overlap and fail to account for any spatial information related to the differences or to the shapes being analyzed. In this paper, we propose a family of new metrics that can be tailored to a broad class of assessment needs.
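
    To illustrate the limitation of set-based agreement metrics described in this abstract, the following minimal sketch computes the Dice and Jaccard coefficients for two hypothetical segmentations that differ from a reference in very different places yet receive identical scores. The helper names (`dice`, `jaccard`, `square`) and the toy masks are assumptions made for illustration; they are not taken from the paper, and the paper's proposed metrics are not reproduced here.

```python
def dice(a, b):
    """Dice coefficient between two sets of (x, y) pixel coordinates."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard index (intersection over union) between two pixel sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def square(x0, y0, side):
    """Hypothetical helper: all pixels of an axis-aligned square."""
    return {(x, y) for x in range(x0, x0 + side) for y in range(y0, y0 + side)}


# Reference object: a 10x10 square (100 pixels).
reference = square(0, 0, 10)

# Segmentation A misses 10 pixels along the object boundary.
seg_boundary = reference - {(x, y) for x in range(8, 10) for y in range(0, 5)}

# Segmentation B misses 10 pixels as a hole inside the object.
seg_hole = reference - {(x, y) for x in range(4, 6) for y in range(3, 8)}

# Both errors have the same size, so the overlap scores are identical
# even though the errors are spatially very different.
print(dice(reference, seg_boundary), dice(reference, seg_hole))
print(jaccard(reference, seg_boundary), jaccard(reference, seg_hole))
```

    Because both metrics depend only on set sizes, a boundary error and an interior hole of equal area cannot be told apart, which is the gap that spatially aware metrics aim to close.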

    Improved motion segmentation based on shadow detection

    In this paper, we discuss common colour models for background subtraction and the problems related to their use. We propose a novel approach to representing chrominance information that is more suitable for robust background modelling and shadow suppression. Our method relies on the ability to represent colours in a 3D polar coordinate system in which saturation is independent of the brightness function; specifically, we build upon the Improved Hue, Luminance, and Saturation (IHLS) space. A further feature of the approach is that we deal with the problem of unstable hue values at low saturation by modelling the hue-saturation relationship using saturation-weighted hue statistics. The effectiveness of the proposed method is shown in an experimental comparison with approaches based on RGB, Normalised RGB and HSV.
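
    The idea of weighting hue by saturation can be sketched as a weighted circular mean: each hue is treated as a unit vector on the colour circle and scaled by its saturation, so that unreliable low-saturation hues contribute little. The function name, the normalisation of the resultant length by the total saturation, and the sample values below are assumptions for illustration only; the abstract does not give the paper's exact formulation.

```python
import math


def saturation_weighted_mean_hue(hues_deg, saturations):
    """Circular mean of hue angles (degrees), weighted by saturation.

    Returns (mean_hue_deg, resultant_length). The resultant length is
    normalised by the total weight here, which is an illustrative choice.
    """
    c = sum(s * math.cos(math.radians(h)) for h, s in zip(hues_deg, saturations))
    d = sum(s * math.sin(math.radians(h)) for h, s in zip(hues_deg, saturations))
    total = sum(saturations)
    if total == 0:
        return None, 0.0  # hue is undefined when every sample is achromatic
    mean_hue = math.degrees(math.atan2(d, c)) % 360.0
    return mean_hue, math.hypot(c, d) / total


# Two well-saturated samples near 30 degrees and one almost grey outlier.
hues = [20.0, 40.0, 200.0]
sats = [0.8, 0.8, 0.05]

weighted, _ = saturation_weighted_mean_hue(hues, sats)
unweighted, _ = saturation_weighted_mean_hue(hues, [1.0, 1.0, 1.0])
print(weighted, unweighted)
```

    With equal weights the nearly grey sample pulls the mean hue noticeably away from the two saturated samples; with saturation weights its influence is small.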

    On the evaluation of background subtraction algorithms without ground-truth

    J. C. San Miguel and J. M. Martínez, "On the evaluation of background subtraction algorithms without ground-truth," in 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2013, pp. 180-187. In video-surveillance systems, the moving object segmentation stage (commonly based on background subtraction) has to deal with several issues such as noise, shadows and multimodal backgrounds. Hence, occasional failure is inevitable, and automatic evaluation of this stage is desirable for online analysis. In this paper, we propose a hierarchy of existing performance measures for video object segmentation that are not based on ground truth. Then, four measures based on color and motion are selected and examined in detail with different segmentation algorithms and standard test sequences for video object segmentation. Experimental results show that color-based measures perform better than motion-based measures and that background multimodality heavily reduces the accuracy of all obtained evaluation results. This work is partially supported by the Spanish Government (TEC2007-65400 SemanticVideo), by Cátedra Infoglobal-UAM for “Nuevas Tecnologías de video aplicadas a la seguridad”, by the Consejería de Educación of the Comunidad de Madrid and by the European Social Fund.

    A supervised visual model for finding regions of interest in basal cell carcinoma images

    This paper introduces a supervised learning method for finding diagnostic regions of interest in histopathological images. The method is based on the cognitive process of visual selection of relevant regions that arises during a pathologist's image examination. The proposed strategy emulates the interaction of the visual cortex areas V1, V2 and V4: V1 is responsible for assigning local levels of relevance to visual inputs, while V2 groups these small regions according to weights modulated by V4, which stores learned rules. This novel strategy can be considered a complex mix of "bottom-up" and "top-down" mechanisms, integrated by calculating a single index inside each region. The method was evaluated on a set of 338 images in which an expert pathologist had drawn the Regions of Interest (RoIs). The proposed method outperforms two state-of-the-art methods devised to determine RoIs in natural images. The average quality gain was 3.6 dB with respect to an adapted version of Itti's model for finding RoIs and 4.9 dB with respect to Achanta's proposal.

    On Evaluating Video Object Segmentation Quality: A Perceptually driven Objective Metric

    Segmentation of moving objects in image sequences plays an important role in video processing and analysis. Evaluating the quality of segmentation results is necessary to allow the appropriate selection of segmentation algorithms and to tune their parameters for optimal performance. Many segmentation algorithms have been proposed along with a number of evaluation criteria. Nevertheless, no formal psychophysical experiments evaluating the quality of different video object segmentation results have been conducted. In this paper, a generic framework for segmentation quality evaluation is presented. A perceptually driven automatic method for segmentation evaluation is proposed and compared against the state of the art. Moreover, on the basis of subjective results, weighting strategies are introduced into the proposed objective metric to meet the specific requirements of different segmentation applications such as video compression and mixed reality. Experimental results confirm the efficiency of the proposed approach.

    Semi-Automatic Video Object Extraction Menggunakan Alpha Matting Berbasis Motion Estimation

    Object extraction is an important task in video editing applications because independent objects are needed for compositing. Extraction is performed by image matting: manual scribbles are first defined to represent the foreground and background regions, and the unknown region is then determined by alpha estimation. The problem in image matting is that pixels in the unknown region do not clearly belong to either the foreground or the background. In the temporal domain, moreover, scribbles cannot be defined independently in every frame. To overcome these problems, an object extraction method is proposed that consists of adaptive threshold estimation for alpha matting, accuracy improvement of image matting, and temporal constraint estimation for scribble propagation. The Fuzzy C-Means (FCM) and Otsu algorithms are applied for adaptive threshold estimation. With FCM, evaluation using Mean Squared Error (MSE) shows that the average pixel error per frame decreases from 30,325.10 to 26,999.33, whereas with Otsu it decreases to 28,921.70. Matting quality degraded by intensity changes in compressed images is improved using the two-dimensional Discrete Cosine Transform (DCT-2D), which lowers the Root Mean Squared Error (RMSE) from 16.68 to 11.44. Temporal constraint estimation for scribble propagation is performed by predicting the motion vector from the current frame to the next frame. Motion vector prediction by exhaustive search is improved by defining a matrix whose size adapts dynamically to the size of the scribble; the motion vector is determined by the Sum of Absolute Differences (SAD) between the current frame and the next frame. Applied in the RGB colour space, this reduces the average pixel error per frame from 3,058.55 to 1,533.35, and in HSV space to 1,662.83.
    KiMoHar, the proposed framework, covers three contributions. First, image matting with the FCM adaptive threshold increases accuracy by 11.05%. Second, improving matting quality on compressed images using DCT-2D increases accuracy by 31.41%. Third, temporal constraint estimation increases accuracy by 56.30% in RGB space and by 52.61% in HSV space.
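
    The exhaustive-search motion estimation with the Sum of Absolute Differences (SAD) mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: frames are small grey-level grids stored as nested lists, the block is square with a fixed size passed by the caller (the thesis sizes the search matrix dynamically from the scribble, which is not modelled here), and all function names are hypothetical.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
        for j in range(size)
        for i in range(size)
    )


def best_motion_vector(cur, nxt, x, y, size, radius):
    """Exhaustively test every displacement within +/- radius pixels.

    Returns (dx, dy, cost) for the displacement with the lowest SAD.
    Assumes the block at (x, y) lies fully inside the current frame.
    """
    height, width = len(nxt), len(nxt[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            nx, ny = x + dx, y + dy
            # Skip candidate blocks that fall outside the next frame.
            if nx < 0 or ny < 0 or nx + size > width or ny + size > height:
                continue
            cost = sad(cur, nxt, x, y, nx, ny, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def frame_with_patch(width, height, patch, x, y):
    """Hypothetical helper: a black frame with a small patch pasted in."""
    frame = [[0] * width for _ in range(height)]
    for j, row in enumerate(patch):
        for i, value in enumerate(row):
            frame[y + j][x + i] = value
    return frame


patch = [[10, 20], [30, 40]]
current_frame = frame_with_patch(8, 8, patch, 2, 2)
next_frame = frame_with_patch(8, 8, patch, 4, 3)  # patch moved by (+2, +1)

print(best_motion_vector(current_frame, next_frame, x=2, y=2, size=2, radius=3))
```

    In a scribble-propagation setting, the displacement found for the block around a scribble would be used to shift that scribble into the next frame.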

    Perceptually-weighted evaluation criteria for segmentation masks in video sequences

    To complement subjective evaluation of the quality of segmentation masks, this paper introduces a procedure for assessing this quality automatically. Algorithmically computed figures of merit are proposed. Assuming the existence of a perfect reference mask (ground truth), generated manually or with a reliable procedure over a test set, these figures of merit take into account visually desirable properties of a segmentation mask in order to provide the user with metrics that best quantify the spatial and temporal accuracy of the segmentation masks. For ease of interpretation, results are presented on a logarithmic scale similar to the peak signal-to-noise ratio (PSNR).
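
    As a rough illustration of what a PSNR-like logarithmic scale for mask quality could look like, the sketch below converts the fraction of mislabelled pixels into decibels. It is an assumption made for illustration: the paper's figures of merit include perceptual weighting that the abstract does not specify, so the unweighted formula and the function name here are hypothetical.

```python
import math


def mask_quality_db(reference, estimate):
    """Unweighted PSNR-like score: 10 * log10(total pixels / wrong pixels).

    Masks are equally sized nested lists of 0/1 labels. A perfect match
    returns infinity, mirroring the behaviour of PSNR at zero error.
    """
    total = 0
    wrong = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_label, est_label in zip(ref_row, est_row):
            total += 1
            if ref_label != est_label:
                wrong += 1
    if wrong == 0:
        return math.inf
    return 10 * math.log10(total / wrong)


# 10x10 reference mask with a 4x4 object, and an estimate with one wrong pixel.
reference_mask = [[1 if 3 <= x < 7 and 3 <= y < 7 else 0 for x in range(10)]
                  for y in range(10)]
estimated_mask = [row[:] for row in reference_mask]
estimated_mask[3][3] = 0  # one missed object pixel out of 100

print(mask_quality_db(reference_mask, estimated_mask))
```

    On such a scale each tenfold reduction in the error count adds 10 dB, which makes scores easy to compare across sequences.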

    Watershed from propagated markers to interactive segmentation of objects in image sequences

    Advisor: Roberto de Alencar Lotufo. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Eletrica e de Computação. Abstract: This doctoral thesis introduces an interactive method for object segmentation in image sequences: the watershed from propagated markers. This method, a combination of classical morphological segmentation with motion estimation, has four important characteristics: i) interactivity, ii) generality, iii) rapid response and iv) progressive manual editing. Watershed from propagated markers consists of interactively segmenting the objects of interest in the first frame and, subsequently, computing and propagating markers in order to segment the same objects in the following frames. Besides proposing the watershed from propagated markers paradigm, this thesis also presents variations of that paradigm and a new benchmark for the quantitative evaluation of interactive object segmentation methods applied to image sequences. Degree: Doctorate in Computer Engineering (Doctor of Electrical Engineering).

    Image segmentation evaluation and its application to object detection

    The first parts of this Thesis focus on the supervised evaluation of image segmentation algorithms, supervised in the sense that the segmentation results are compared to a human-made annotation, known as ground truth, by means of different measures of similarity. The evaluation depends, therefore, on three main points. First, the image segmentation techniques we evaluate. We review the state of the art in image segmentation, making an explicit distinction between techniques that provide a flat output, that is, a single clustering of the set of pixels into regions, and those that produce a hierarchical segmentation, that is, a tree-like structure that represents regions at different scales from the details to the whole image. Second, ground-truth databases are of paramount importance in the evaluation. They can be divided into those annotated only at object level, that is, with marked sets of pixels that refer to objects that do not cover the whole image, and those with annotated full partitions, which provide a full clustering of all pixels in an image. Depending on the type of database, we say that the analysis is done from an object perspective or from a partition perspective. Finally, the similarity measures used to compare the generated results to the ground truth provide a quantitative tool to evaluate whether our results are good and in which way they can be improved. The main contributions of the first parts of the thesis are in the field of similarity measures. First of all, from an object perspective, we review the basic measures used to compare two object representations and show that some of them are equivalent. In order to evaluate full partitions and hierarchies against an object, one needs to select which of their regions form the object to be assessed. We review and improve these techniques by means of a mathematical model of the problem. This analysis allows us to show that hierarchies can represent objects much better, and with far fewer regions, than flat partitions. From a partition perspective, the literature about evaluation measures is large and entangled. Our first contribution is to review, structure, and deduplicate the measures available. We provide a new measure and show that it improves on previous ones in terms of a set of qualitative and quantitative meta-measures. We also extend the measures on flat partitions to cover hierarchical segmentations. The second part of this Thesis moves from the evaluation of image segmentation to its application to object detection. In particular, we build on some of the conclusions drawn in the first part to generate segmented object candidates. Given a set of hierarchies, we build pairs and triplets of regions, we learn to combine the set from each hierarchy, and we rank them using low-level and mid-level cues. We conduct an extensive experimental validation showing that our method outperforms the state of the art on many of the metrics tested.