3 research outputs found

    Combining Text Semantics and Image Geometry to Improve Scene Interpretation

    In this paper, we describe a novel system that identifies relations between the objects extracted from an image. We started from the idea that, in addition to the geometric and visual properties of the image objects, we could exploit lexical and semantic information from the text accompanying the image. As an experimental setup, we gathered a corpus of images from Wikipedia together with their associated articles. We extracted two types of objects, human beings and horses, and we considered three relations that could hold between them: Ride, Lead, or None. We used geometric features as a baseline to identify the relations between the entities, and we describe the improvements brought by the addition of bag-of-words features and predicate–argument structures derived from the text. The best semantic model resulted in a relative error reduction of more than 18% over the baseline.
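    The approach described above, combining geometric features with bag-of-words features from the accompanying text in one relation classifier, can be sketched roughly as follows. This is a minimal illustration, not the paper's actual pipeline: the toy captions, labels, and the two geometric features (vertical offset and bounding-box overlap) are assumptions, and the classifier here is an ordinary logistic regression.

    ```python
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    # Hypothetical toy data: each sample pairs a person and a horse detected
    # in an image. The two geometric features (assumed for illustration) are
    # normalized vertical offset and bounding-box overlap.
    geometric = np.array([
        [0.8, 0.6],    # person above horse, high overlap
        [0.1, 0.1],    # side by side, little overlap
        [0.0, 0.0],    # far apart
        [0.9, 0.7],
        [0.2, 0.05],
        [0.05, 0.0],
    ])
    captions = [
        "the jockey rides the horse",
        "a groom leads the horse to the stable",
        "a horse grazes in a field",
        "a rider on horseback",
        "the trainer leads the mare",
        "horses in a paddock",
    ]
    labels = ["Ride", "Lead", "None", "Ride", "Lead", "None"]

    # Bag-of-words features derived from the text accompanying each image.
    vectorizer = CountVectorizer()
    bow = vectorizer.fit_transform(captions).toarray()

    # Concatenate geometric and textual features into one vector per sample,
    # so a single classifier sees both sources of evidence.
    features = np.hstack([geometric, bow])

    clf = LogisticRegression(max_iter=1000).fit(features, labels)
    print(clf.predict(features))
    ```

    The same concatenation pattern extends to the predicate–argument features mentioned in the abstract: each additional feature family simply widens the joint feature vector.
    
    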

    Text-aided object segmentation and classification in images

    Object recognition in images is a popular research field with many applications, including medicine, robotics, and face recognition. The task of automatically finding and identifying objects in an image is extremely challenging. By looking at the problem from a new angle and including additional information besides the visual, the problem becomes less ill-posed. In this thesis we investigate how the addition of text annotations to images affects the classification process. Classifications of different sets of labels, as well as clusters of labels, were carried out. A comparison is given between the results from using only visual information and from also including information from an image description. In most cases the additional information improved the accuracy of the classification. The obtained results were then used to design an algorithm that, given an image with a description, can find relevant words in the text and mark their presence in the image. A large set of overlapping segments is generated, and each segment is classified into a set of categories. The image descriptions are parsed by an algorithm (a so-called chunker), and visually relevant words (key-nouns) are extracted from the text. These key-nouns are then connected to the categories by metrics from WordNet. To create an optimal assignment of the visual segments to the key-nouns, combinatorial optimization was used. The resulting system was compared to manually segmented and classified images. The results are promising and have given rise to several new ideas for continued research.
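    The final assignment step in the abstract, matching visual segments to key-nouns via combinatorial optimization, is the classic linear assignment problem. A minimal sketch using SciPy's Hungarian-algorithm solver follows; the similarity matrix and key-nouns are invented for illustration, whereas in the thesis the scores would come from segment classifiers combined with WordNet-based metrics.

    ```python
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # Hypothetical similarity scores between image segments (rows) and
    # key-nouns extracted from the caption (columns).
    similarity = np.array([
        [0.9, 0.1, 0.2],   # segment 0 resembles "dog"
        [0.2, 0.8, 0.1],   # segment 1 resembles "ball"
        [0.3, 0.2, 0.7],   # segment 2 resembles "grass"
    ])
    key_nouns = ["dog", "ball", "grass"]

    # linear_sum_assignment minimizes total cost, so negate the similarities
    # to obtain the assignment that maximizes total similarity.
    rows, cols = linear_sum_assignment(-similarity)
    assignment = {r: key_nouns[c] for r, c in zip(rows, cols)}
    print(assignment)  # {0: 'dog', 1: 'ball', 2: 'grass'}
    ```

    Because the solver enforces a one-to-one matching, each key-noun claims at most one segment, which is what makes the assignment "optimal" globally rather than greedily per segment.
    
    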

    A System for Automatic Image Categorization

    Traditional multimedia classification techniques are based on the analysis of either low-level features or annotated textual information, yet bridging the semantic gap between raw data and its content remains a challenging task. In this paper, we describe a novel solution that automatically associates image analysis and processing algorithms with keywords and human annotations. We use the well-known Flickr system, which contains images, tags, keywords, and sometimes useful annotations describing both the content of an image and personal details about the scene. We have carried out several experiments demonstrating that the proposed categorization process achieves good performance in terms of efficiency and effectiveness.
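    One simple way to ground the idea of categorizing images from user-supplied tags is keyword-set overlap. The sketch below is an assumption-laden toy, not the paper's method: the category names and keyword sets are invented, and a real system would weight tags and combine them with visual features.

    ```python
    # Hypothetical category keyword sets; in practice these could be mined
    # from Flickr tags and human annotations.
    categories = {
        "nature": {"tree", "mountain", "river", "forest", "sky"},
        "urban": {"building", "street", "car", "city", "bridge"},
        "people": {"portrait", "family", "wedding", "crowd"},
    }

    def categorize(tags):
        """Assign the category whose keyword set overlaps most with the tags."""
        scores = {name: len(kw & set(tags)) for name, kw in categories.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else "unknown"

    print(categorize(["tree", "sky", "hike"]))  # nature
    print(categorize(["selfie", "blurry"]))     # unknown
    ```

    The "unknown" fallback matters in practice: Flickr tags are noisy, and forcing every image into a category would inflate apparent effectiveness.
    
    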