
    The aceToolbox: low-level audiovisual feature extraction for retrieval and classification

    In this paper we present an overview of a software platform developed within the aceMedia project, termed the aceToolbox, that provides global and local low-level feature extraction from audio-visual content. The toolbox is based on the MPEG-7 eXperimental Model (XM), with extensions to provide descriptor extraction from arbitrarily shaped image segments, thereby supporting local descriptors reflecting real image content. We describe the architecture of the toolbox and provide an overview of the descriptors supported to date. We also briefly describe the segmentation algorithm provided. We then demonstrate the usefulness of the toolbox in the context of two different content processing scenarios: similarity-based retrieval in large collections and scene-level classification of still images.

    K-Space at TRECVid 2007

    In this paper we describe K-Space's participation in TRECVid 2007. K-Space participated in two tasks, high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches including logistic regression and support vector machines (SVM). Finally, we also experimented with both early and late fusion for feature combination. This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance. The first of the two systems was a ‘shot’ based interface, where the results from a query were presented as a ranked list of shots. The second interface was ‘broadcast’ based, where results were presented as a ranked list of broadcasts. Both systems made use of the outputs of our high-level feature submission as well as low-level visual features.
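The distinction between early and late fusion mentioned in this abstract can be illustrated with a minimal sketch. It is not the K-Space implementation: the function names, the plain concatenation, and the weighted average are assumptions chosen only to show the general idea. Early fusion joins the per-modality feature vectors before a single classifier sees them; late fusion combines the scores of separate per-modality classifiers.

```python
def early_fusion(modalities):
    """Early fusion: concatenate per-modality feature vectors into one vector.

    `modalities` is a list of feature lists (e.g. visual, audio, temporal).
    The fused vector would then be fed to a single classifier.
    """
    fused = []
    for features in modalities:
        fused.extend(features)
    return fused


def late_fusion(scores, weights=None):
    """Late fusion: weighted average of per-modality classifier scores.

    `scores` holds one confidence value per modality-specific detector.
    Uniform weights are assumed when none are given (an illustrative choice).
    """
    if weights is None:
        weights = [1.0] * len(scores)
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


if __name__ == "__main__":
    visual, audio = [0.1, 0.4], [0.9]
    print(early_fusion([visual, audio]))
    print(late_fusion([0.2, 0.8]))
```

Early fusion lets one model learn cross-modal interactions, while late fusion keeps each detector independent and only needs their scores.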

    Scene modelling using an adaptive mixture of Gaussians in colour and space

    We present an integrated pixel segmentation and region tracking algorithm, designed for indoor environments. Visual monitoring systems often use frame differencing techniques to independently classify each image pixel as either foreground or background. Typically, this level of processing does not take account of the global image structure, resulting in frequent misclassification. We use an adaptive Gaussian mixture model in colour and space to represent background and foreground regions of the scene. This model is used to probabilistically classify observed pixel values, incorporating the global scene structure into pixel-level segmentation. We evaluate our system over 4 sequences and show that it successfully segments foreground pixels and tracks major foreground regions as they move through the scene.
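As a rough illustration of classifying pixels with a mixture of Gaussians defined jointly over position and colour, the sketch below assigns an observation (x, y, r, g, b) to the most likely component and then adapts that component. The diagonal covariance, the running-average update, the learning rate, and all names are assumptions made for the example; the abstract does not specify these details.

```python
import math


class Component:
    """One Gaussian over (x, y, r, g, b) with a diagonal covariance (assumed)."""

    def __init__(self, label, mean, var, weight):
        self.label = label          # "background" or "foreground"
        self.mean = list(mean)
        self.var = list(var)
        self.weight = weight

    def log_likelihood(self, obs):
        """Log of weight times the Gaussian density at `obs`."""
        ll = math.log(self.weight)
        for o, m, v in zip(obs, self.mean, self.var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (o - m) ** 2 / v)
        return ll

    def update(self, obs, rate=0.05):
        """Move mean and variance towards `obs` (illustrative running average)."""
        for i, o in enumerate(obs):
            d = o - self.mean[i]
            self.mean[i] += rate * d
            self.var[i] = max((1 - rate) * self.var[i] + rate * d * d, 1e-3)


def classify(components, obs):
    """Return the component with the highest weighted likelihood for `obs`."""
    return max(components, key=lambda c: c.log_likelihood(obs))


if __name__ == "__main__":
    bg = Component("background", (10, 10, 200, 200, 200), [25.0] * 5, 0.7)
    fg = Component("foreground", (50, 50, 30, 30, 30), [25.0] * 5, 0.3)
    pixel = (12, 9, 198, 205, 199)
    winner = classify([bg, fg], pixel)
    winner.update(pixel)
    print(winner.label)
```

Because position is part of the feature vector, a pixel whose colour matches a foreground region but lies far from it spatially scores poorly for that component, which is how spatial structure enters the pixel-level decision.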

    Semi-automatic video object segmentation for multimedia applications

    A semi-automatic video object segmentation tool is presented for segmenting both still pictures and image sequences. The approach comprises both automatic segmentation algorithms and manual user interaction. The still image segmentation component comprises a conventional spatial segmentation algorithm (Recursive Shortest Spanning Tree (RSST)), a hierarchical segmentation representation method (Binary Partition Tree (BPT)), and user interaction. An initial segmentation partition of homogeneous regions is created using RSST. The BPT technique is then used to merge these regions and hierarchically represent the segmentation in a binary tree. The semantic objects are then manually built by selectively clicking on image regions. A video object-tracking component enables image sequence segmentation, and this subsystem is based on motion estimation, spatial segmentation, object projection, region classification, and user interaction. The motion between the previous frame and the current frame is estimated, and the previous object is then projected onto the current partition. A region classification technique is used to determine which regions in the current partition belong to the projected object. User interaction is allowed for object re-initialisation when the segmentation results become inaccurate. The combination of all these components enables offline video sequence segmentation. The results presented on standard test sequences illustrate the potential use of this system for object-based coding and representation of multimedia.
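The idea of merging an initial partition pairwise and recording the merges as a binary tree can be sketched as follows. This is a simplified illustration, not the tool described above: the size-weighted squared difference of region means used as merging cost, the grey-level region model, and the name `build_bpt` are assumptions, and the region adjacency graph is assumed to be connected.

```python
def build_bpt(regions, adjacency):
    """Greedily merge adjacent regions and record the merges as a binary tree.

    regions:   {region_id: (mean_intensity, pixel_count)}
    adjacency: iterable of frozenset({a, b}) pairs of neighbouring regions
    Returns (root_id, tree), where tree maps each merged node to its children.
    """
    regions = dict(regions)
    edges = set(adjacency)
    tree = {}
    root = None
    next_id = max(regions) + 1

    def cost(edge):
        # Assumed cost: squared mean difference weighted by region sizes.
        a, b = tuple(edge)
        (ma, sa), (mb, sb) = regions[a], regions[b]
        return (ma - mb) ** 2 * sa * sb / (sa + sb)

    while edges:
        # Pick the cheapest link; ties are broken by region ids.
        best = min(edges, key=lambda e: (cost(e), sorted(e)))
        a, b = sorted(best)
        (ma, sa), (mb, sb) = regions[a], regions[b]
        root = next_id
        next_id += 1
        regions[root] = ((ma * sa + mb * sb) / (sa + sb), sa + sb)
        tree[root] = (a, b)

        # Redirect links of the two merged regions to the new node.
        rewired = set()
        for e in edges:
            if e == best:
                continue
            new_edge = frozenset(root if r in (a, b) else r for r in e)
            if len(new_edge) == 2:
                rewired.add(new_edge)
        edges = rewired
        del regions[a], regions[b]

    return root, tree


if __name__ == "__main__":
    initial = {0: (10.0, 4), 1: (12.0, 4), 2: (100.0, 4)}
    links = {frozenset({0, 1}), frozenset({1, 2})}
    print(build_bpt(initial, links))
```

In an interactive setting, selecting a node of the resulting tree selects the union of all leaf regions beneath it, which is what makes a binary tree convenient for building objects by clicking.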

    Quadrilateral-based region segmentation for tracking

    We propose a novel quadrilateral-based region segmentation method that is favorable for object tracking. Instead of using groups of pixels or regular blocks, it uses groups of connected quadrilaterals to represent regions. The proposed method derives the vertices of each quadrilateral from the edge map using the concept of center of mass. Regions are then represented by merging the quadrilaterals. The proposed method offers better data reduction than pixelwise region representation and better boundary approximation than block-based segmentation methods. Experimental results show that it generates a more reasonable region map, which is more suitable for object tracking, and a smaller number of regions than the seeded region growing, K-means clustering, and constrained gravitational clustering methods. © 2002 Society of Photo-Optical Instrumentation Engineers.

    Using contour information and segmentation for object registration, modeling and retrieval

    This thesis considers different aspects of the utilization of contour information and syntactic and semantic image segmentation for object registration, modeling and retrieval in the context of content-based indexing and retrieval in large collections of images. Target applications include retrieval in collections of closed silhouettes, holistic word recognition in handwritten historical manuscripts and shape registration. The thesis also explores the feasibility of contour-based syntactic features for improving the correspondence of the output of bottom-up segmentation to semantic objects present in the scene, and discusses the feasibility of different strategies for image analysis utilizing contour information, e.g. segmentation driven by visual features versus segmentation driven by shape models or semi-automatic segmentation in selected application scenarios. There are three contributions in this thesis. The first contribution considers structure analysis based on the shape and spatial configuration of image regions (so-called syntactic visual features) and their utilization for automatic image segmentation. The second contribution is the study of novel shape features, matching algorithms and similarity measures. Various applications of the proposed solutions are presented throughout the thesis, providing the basis for the third contribution, which is a discussion of the feasibility of different recognition strategies utilizing contour information. In each case, the performance and generality of the proposed approach have been analyzed based on extensive rigorous experimentation using test collections that are as large as possible.

    Virtual Reality applied to biomedical engineering

    Virtual reality is currently trending and expanding into the medical field, enabling numerous applications designed to train doctors and treat patients more efficiently, as well as to optimize surgical planning processes. The medical need and objective of this project is to optimize the surgical planning process for congenital heart disease, which comprises the 3D reconstruction of the patient's heart and its integration into a virtual reality application. Along these lines, a 3D modelling process for heart images, obtained thanks to Hospital Sant Joan de Déu, has been combined with the design of the application using the Unity 3D software, thanks to the company VISYON. Improvements have been achieved in the software used for segmentation and reconstruction, and basic functionalities have been implemented in the application, such as importing, moving, rotating and taking 3D screenshots of the cardiac organ, so as to better understand the heart disease to be treated. The result is an optimized process in which the 3D reconstruction is fast and accurate, the method for importing into the designed app is very simple, and the application allows attractive and intuitive interaction, thanks to an immersive and realistic experience that meets the efficiency and precision requirements of the medical field.