97 research outputs found

    MediAssist: Using content-based analysis and context to manage personal photo collections

    We present work which organises personal digital photo collections based on contextual information, such as time and location, combined with content-based analysis such as face detection and other feature detectors. The MediAssist demonstration system illustrates the results of our research into digital photo management, showing how a combination of automatically extracted context and content-based information, together with user annotation, facilitates efficient searching of personal photo collections.

    Analyzing image-text relations for semantic media adaptation and personalization

    Progress in semantic media adaptation and personalisation requires that we know more about how different media types, such as texts and images, work together in multimedia communication. To this end, we present our ongoing investigation into image-text relations. Our idea is that the ways in which the meanings of images and texts relate in multimodal documents, such as web pages, can be classified on the basis of low-level media features and that this classification should be an early processing step in systems targeting semantic multimedia analysis. In this paper we present the first empirical evidence that humans can predict something about the main theme of a text from an accompanying image, and that this prediction can be emulated by a machine via analysis of low-level image features. We close by discussing how these findings could impact on applications for news adaptation and personalisation, and how they may generalise to other kinds of multimodal documents and to applications for semantic media retrieval, browsing, adaptation and creation.

    Adaptive online performance evaluation of video trackers

    Full text link
    Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

    J. C. SanMiguel, A. Cavallaro, and J. M. Martínez, "Adaptive Online Performance Evaluation of Video Trackers", IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2812-2823, May 2012.

    We propose an adaptive framework to estimate the quality of video tracking algorithms without ground-truth data. The framework is divided into two main stages, namely, the estimation of the tracker condition to identify temporal segments during which a target is lost and the measurement of the quality of the estimated track when the tracker is successful. A key novelty of the proposed framework is the capability of evaluating video trackers with multiple failures and recoveries over long sequences. Successful tracking is identified by analyzing the uncertainty of the tracker, whereas track recovery from errors is determined based on the time-reversibility constraint. The proposed approach is demonstrated on a particle filter tracker over a heterogeneous data set. Experimental results show the effectiveness and robustness of the proposed framework that improves state-of-the-art approaches in the presence of tracking challenges such as occlusions, illumination changes, and clutter and on sequences containing multiple tracking errors and recoveries.

    This work was partially supported by the Spanish Government (TEC2007-65400 SemanticVideo), Cátedra Infoglobal-UAM for "Nuevas Tecnologías de video aplicadas a la seguridad", Consejería de Educación of the Comunidad de Madrid and the European Social Fund.
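The time-reversibility constraint mentioned in the abstract can be illustrated with a minimal sketch: after a candidate recovery, the segment is re-tracked backwards in time, and the recovery is accepted only if the backward track approximately retraces the forward one. The function names and the tolerance threshold below are illustrative assumptions, not the paper's actual API.

```python
# Sketch of a time-reversibility check for track recovery (illustrative only).
import numpy as np

def reversibility_error(track_forward, track_backward):
    """Mean distance between the forward track and the time-aligned backward track."""
    fwd = np.asarray(track_forward, dtype=float)
    bwd = np.asarray(track_backward, dtype=float)[::-1]  # reverse to align in time
    return float(np.mean(np.linalg.norm(fwd - bwd, axis=1)))

def recovered(track_forward, track_backward, tol=5.0):
    """Accept the recovery if backward re-tracking retraces the forward track."""
    return reversibility_error(track_forward, track_backward) < tol

# Toy example: the backward track exactly retraces the forward one -> recovery.
fwd = [(10, 10), (12, 11), (14, 12)]
bwd = [(14, 12), (12, 11), (10, 10)]  # positions while tracking backwards in time
print(recovered(fwd, bwd))  # True
```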

    Context-aware person identification in personal photo collections

    Identifying the people in photos is an important need for users of photo management systems. We present MediAssist, one such system which facilitates browsing, searching and semi-automatic annotation of personal photos, using analysis of both image content and the context in which the photo is captured. This semi-automatic annotation includes annotation of the identity of people in photos. In this paper, we focus on such person annotation, and propose person identification techniques based on a combination of context and content. We propose language modelling and nearest neighbor approaches to context-based person identification, in addition to novel face color and image color content-based features (used alongside face recognition and body patch features). We conduct a comprehensive empirical study of these techniques using the real private photo collections of a number of users, and show that combining context- and content-based analysis improves performance over content or context alone.
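One simple way to combine context- and content-based cues, as the abstract describes, is late fusion: each cue produces a score per candidate identity, and a weighted sum ranks the candidates. The cue names, weights, and scores below are assumptions for illustration, not values from the paper.

```python
# Illustrative late-fusion sketch for combining context and content cues
# in person identification. All cue names and weights are hypothetical.
def fuse_scores(candidates, cue_scores, weights):
    """cue_scores: {cue: {person: score}}; returns candidates ranked by fused score."""
    fused = {}
    for person in candidates:
        fused[person] = sum(weights[cue] * cue_scores[cue].get(person, 0.0)
                            for cue in weights)
    return sorted(fused, key=fused.get, reverse=True)

candidates = ["alice", "bob"]
cue_scores = {
    "context":    {"alice": 0.9, "bob": 0.4},  # e.g. co-occurrence at this time/place
    "face":       {"alice": 0.6, "bob": 0.7},  # face recognition score
    "body_patch": {"alice": 0.8, "bob": 0.3},  # clothing/body appearance score
}
weights = {"context": 0.4, "face": 0.4, "body_patch": 0.2}
print(fuse_scores(candidates, cue_scores, weights))  # ['alice', 'bob']
```

The weighted sum is the simplest fusion rule; the point is only that a strong context score can outrank a slightly better face score, which is the behaviour the abstract's combined approach exploits.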

    Evaluation of on-line quality estimators for object tracking

    Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

    J. C. SanMiguel, A. Cavallaro, and J. M. Martínez, "Evaluation of on-line quality estimators for object tracking", in 17th IEEE International Conference on Image Processing, ICIP 2010, pp. 825-828.

    Failure of tracking algorithms is inevitable in real and on-line tracking systems. The online estimation of the track quality is therefore desirable for detecting tracking failures while the algorithm is operating. In this paper, we propose a taxonomy and present a comparative evaluation of online quality estimators for video object tracking. The measures are compared over a heterogeneous video dataset with standard sequences. Among other results, the experiments show that the Observation Likelihood (OL) measure is an appropriate quality measure for overall tracking performance evaluation, while the Template Inverse Matching (TIM) measure is appropriate to detect the start and end instants of tracking failures.

    Work partially supported by the Spanish Government (TEC2007-65400 SemanticVideo), Cátedra Infoglobal-UAM for "Nuevas Tecnologías de video aplicadas a la seguridad", Consejería de Educación of the Comunidad de Madrid and the European Social Fund. Part of the work reported in this paper was done during a research stay of the first author under a research grant (funded by UAM) at Queen Mary University of London (UK).

    Speech emotion classification using SVM and MLP on prosodic and voice quality features

    In this paper, a comparison of emotion classification undertaken by the Support Vector Machine (SVM) and the Multi-Layer Perceptron (MLP) Neural Network, using prosodic and voice quality features extracted from the Berlin Emotional Database, is reported. The features were extracted using PRAAT tools, while the WEKA tool was used for classification. Different parameters were set up for both SVM and MLP, which are used to obtain an optimized emotion classification. The results show that MLP outperforms SVM in overall emotion classification performance. Nevertheless, training the SVM was much faster than training the MLP. The overall accuracy was 76.82% for SVM and 78.69% for MLP. Sadness was the emotion most recognized by MLP, with an accuracy of 89.0%, while anger was the emotion most recognized by SVM, with an accuracy of 87.4%. The most confused emotions under MLP classification were happiness and fear, while for SVM the most confused emotions were disgust and fear.