5 research outputs found

    Probabilistic visual concept trees

    This paper presents probabilistic visual concept trees, a model for large visual semantic taxonomy structures and its use in visual concept detection. Organizing visual semantic knowledge systematically is one of the key challenges towards large-scale concept detection, and one that is complementary to optimizing visual classification for individual concepts. Semantic concepts have traditionally been treated as isolated nodes, a densely-connected web, or a tree. Our analysis shows that none of these models are sufficient for modeling the typical relationships in a real-world visual taxonomy, and that these relationships belong to three broad categories: semantic, appearance, and statistics. We propose probabilistic visual concept trees for modeling a taxonomy forest with observation uncertainty. As a Bayesian network with parameter constraints, this model is flexible enough to account for the key assumptions in all three types of taxonomy relations, yet robust enough to accommodate expansion or deletion in a taxonomy. Our evaluation results on a large web image dataset show that classification accuracy considerably improves upon baselines that use none, or only a subset, of the concept relationships.
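    To make the idea concrete, the sketch below builds a toy concept tree in which each node stores P(child present | parent present), computes a concept's marginal prior by the chain rule down the tree, and fuses that prior with a noisy binary detector in a Bayesian update. All concept names, probabilities, and detector rates here are hypothetical illustrations, not values from the paper, and the real model imposes richer parameter constraints than this sketch.

    ```python
    # Toy probabilistic concept tree (hypothetical structure and values).
    # Each node stores P(node present | parent present); a child is assumed
    # absent whenever its parent is absent (a common taxonomy constraint).
    tree = {
        "animal": {"parent": None,     "p_given_parent": 0.3},  # root prior
        "dog":    {"parent": "animal", "p_given_parent": 0.4},
        "poodle": {"parent": "dog",    "p_given_parent": 0.2},
    }

    def prior(concept):
        """Marginal P(concept present) via the chain rule down the tree."""
        node = tree[concept]
        if node["parent"] is None:
            return node["p_given_parent"]
        return node["p_given_parent"] * prior(node["parent"])

    def posterior(concept, fired, tpr=0.8, fpr=0.1):
        """Fuse the tree prior with a noisy binary detector observation,
        given assumed true-positive and false-positive rates."""
        p = prior(concept)
        like_present = tpr if fired else (1 - tpr)
        like_absent = fpr if fired else (1 - fpr)
        return (like_present * p) / (like_present * p + like_absent * (1 - p))

    print(round(prior("poodle"), 3))           # 0.3 * 0.4 * 0.2 = 0.024
    print(round(posterior("poodle", True), 3)) # detector fired: prior is raised
    ```

    Even in this toy form, the taxonomy helps: a weak detector firing on "poodle" is reweighted by how plausible "poodle" is given its ancestors, which is the kind of cross-concept evidence flow that isolated per-concept classifiers cannot exploit.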

    Visual Transfer Learning: Informal Introduction and Literature Overview

    Transfer learning techniques are important for handling small training sets and for allowing quick generalization even from only a few examples. The following paper is the introduction and literature overview part of my PhD thesis on transfer learning for visual recognition problems.

    Kernel Sharing With Joint Boosting For Multi-Class Concept Detection


    Deliverable D1.1 State of the art and requirements analysis for hypervideo

    This deliverable presents a state-of-the-art and requirements analysis report for hypervideo, authored as part of WP1 of the LinkedTV project. Initially, we present some use-case scenarios (from the viewers' perspective) in the LinkedTV project, and through analysis of the distinctive needs and demands of each scenario we point out the technical requirements from a user-side perspective. Subsequently, we study methods for the automatic and semi-automatic decomposition of audiovisual content in order to effectively support the annotation process. Considering that multimedia content comprises different types of information, i.e., visual, textual, and audio, we report various methods for the analysis of these three different streams. Finally, we present various annotation tools which could integrate the developed analysis results so as to effectively support users (video producers) in the semi-automatic linking of hypervideo content, and based on them we report on the initial progress in building the LinkedTV annotation tool. For each of the different classes of techniques discussed in the deliverable, we present evaluation results from the application of one such method from the literature to a dataset well-suited to the needs of the LinkedTV project, and we indicate the future technical requirements that should be addressed in order to achieve higher levels of performance (e.g., in terms of accuracy and time-efficiency), as necessary.