
    Iconic Indexing for Video Search

    Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London

    Classification of iconic images

    Iconic images represent an abstract topic and use a presentation that is intuitively understood within a certain cultural context. For example, the abstract topic “global warming” may be represented by a polar bear standing alone on an ice floe. Such images are widely used in media, and their automatic classification can help to identify high-level semantic concepts. This paper presents a system for the classification of iconic images. It uses a variation of the Bag of Visual Words approach with enhanced feature descriptors. Our novel color pyramids feature incorporates color information into the classification scheme, improving the average F1 measure of the classification by 0.117. The performance of our system is further evaluated under a variety of parameters.
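    The abstract does not give the exact formulation of the color pyramids feature, but the general idea of incorporating color into a spatial pyramid can be sketched as follows: split the image into progressively finer grids, compute a per-channel color histogram in each cell, and concatenate everything into one descriptor. All names and parameter choices (levels, bins) here are illustrative assumptions, not the paper's implementation.

    ```python
    import numpy as np

    def color_pyramid_feature(image, levels=2, bins=8):
        """Concatenate per-cell, per-channel color histograms over a
        spatial pyramid. At level l the image is split into a
        2**l x 2**l grid; each cell contributes `bins` histogram
        entries per color channel, normalized to sum to 1."""
        h, w, c = image.shape
        feats = []
        for level in range(levels + 1):
            cells = 2 ** level
            ys = np.linspace(0, h, cells + 1, dtype=int)
            xs = np.linspace(0, w, cells + 1, dtype=int)
            for i in range(cells):
                for j in range(cells):
                    cell = image[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                    for ch in range(c):
                        hist, _ = np.histogram(cell[..., ch],
                                               bins=bins, range=(0, 256))
                        total = hist.sum()
                        feats.append(hist / total if total else hist)
        return np.concatenate(feats)
    ```

    With the default settings an RGB image yields (1 + 4 + 16) cells × 3 channels × 8 bins = 504 dimensions, which could then be appended to a Bag of Visual Words histogram before classification.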

    Webly Supervised Learning of Convolutional Networks

    We present an approach that utilizes large amounts of web data for learning CNNs. Specifically, inspired by curriculum learning, we present a two-step approach for CNN training. First, we use easy images to train an initial visual representation. We then use this initial CNN and adapt it to harder, more realistic images by leveraging the structure of data and categories. We demonstrate that our two-stage CNN outperforms a fine-tuned CNN trained on ImageNet on Pascal VOC 2012. We also demonstrate the strength of webly supervised learning by localizing objects in web images and training an R-CNN style detector. It achieves the best performance on VOC 2007 where no VOC training data is used. Finally, we show our approach is quite robust to noise and performs comparably even when we use image search results from March 2013 (the pre-CNN image search era).
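    The two-stage curriculum schedule described above can be sketched in miniature. As an illustrative stand-in for the CNN, the snippet below trains a logistic-regression classifier on "easy" examples first and then continues training the same weights on "harder" examples; the function name and the use of plain gradient descent are assumptions for the sketch, not the paper's training procedure.

    ```python
    import numpy as np

    def train_curriculum(X_easy, y_easy, X_hard, y_hard, lr=0.1, epochs=200):
        """Two-stage curriculum training sketch: stage 1 fits the easy
        set only; stage 2 adapts the resulting weights on the hard set,
        mirroring the easy-web-images-then-realistic-images schedule."""
        rng = np.random.default_rng(0)
        w = rng.normal(scale=0.01, size=X_easy.shape[1])

        def descend(X, y, w):
            # full-batch gradient descent on the logistic loss
            for _ in range(epochs):
                p = 1.0 / (1.0 + np.exp(-X @ w))
                w = w - lr * X.T @ (p - y) / len(y)
            return w

        w = descend(X_easy, y_easy, w)   # stage 1: easy examples
        w = descend(X_hard, y_hard, w)   # stage 2: adapt to hard examples
        return w
    ```

    The key design point carried over from the abstract is that stage 2 starts from the stage-1 weights rather than from scratch, so the easy data shapes the initial representation.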

    Fine-grained sketch-based image retrieval by matching deformable part models

    © 2014. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms. An important characteristic of sketches, compared with text, rests with their ability to intrinsically capture object appearance and structure. Nonetheless, akin to traditional text-based image retrieval, conventional sketch-based image retrieval (SBIR) principally focuses on retrieving images of the same category, neglecting the fine-grained characteristics of sketches. In this paper, we advocate the expressiveness of sketches and examine their efficacy under a novel fine-grained SBIR framework. In particular, we study how sketches enable fine-grained retrieval within object categories. Key to this problem is introducing a mid-level sketch representation that not only captures object pose, but also possesses the ability to traverse sketch and image domains. Specifically, we learn a deformable part-based model (DPM) as a mid-level representation to discover and encode the various poses in sketch and image domains independently, after which graph matching is performed on DPMs to establish pose correspondences across the two domains. We further propose an SBIR dataset that covers the unique aspects of fine-grained SBIR. Through in-depth experiments, we demonstrate the superior performance of our SBIR framework, and showcase its unique ability in fine-grained retrieval.
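    The matching step above pairs DPM parts discovered in the sketch domain with parts discovered in the image domain. For the handful of parts a DPM typically has, the core of such a correspondence problem can be sketched as an exhaustive assignment search; the cost function (total displacement between matched part locations) and the brute-force search are simplifying assumptions, not the paper's graph-matching formulation.

    ```python
    from itertools import permutations
    import numpy as np

    def match_parts(sketch_parts, image_parts):
        """Find the one-to-one part assignment minimizing the total
        distance between matched sketch and image part locations.
        Brute force is fine for the small part counts of a DPM."""
        n = len(sketch_parts)
        best, best_cost = None, float("inf")
        for perm in permutations(range(n)):
            cost = sum(np.linalg.norm(sketch_parts[i] - image_parts[perm[i]])
                       for i in range(n))
            if cost < best_cost:
                best, best_cost = perm, cost
        return list(best), best_cost
    ```

    A practical system would replace the raw location distance with a cost that also compares part appearance and preserves pairwise geometry, which is what makes the problem a graph matching rather than a simple assignment.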

    Voicing the Other: Mock AAVE on Social Media

    This project looks at the use on social media sites of features of African American Vernacular English by nonspeakers of it. This outgroup use of AAVE neither requires nor reflects any true proficiency with the variety, but instead is often used to exaggerate the social distance between the stylizers using it on social media and the marginalized people for whom AAVE is a genuine mode of communication. Through double indexicality, nonspeakers of AAVE use features of it to annex certain positive qualities associated with Black or hip hop culture—toughness, coolness, an anti-establishment stance—for themselves, while reproducing negative stereotypes of the people generally thought to speak AAVE. An intertextual analysis of the data, made up of crowd-sourced social media posts exhibiting Mock AAVE, is used to establish the social meaning of the Mock register.

    Automatic Metro Map Layout Using Multicriteria Optimization

    This paper describes an automatic mechanism for drawing metro maps. We apply multicriteria optimization to find effective placement of stations with a good line layout and to label the map unambiguously. A number of metrics are defined, which are used in a weighted sum to find a fitness value for a layout of the map. A hill climbing optimizer is used to reduce the fitness value and find improved map layouts. To avoid local minima, we apply clustering techniques to the map; the hill climber moves both stations and clusters when finding improved layouts. We show the method applied to a number of metro maps, and describe an empirical study that provides some quantitative evidence that automatically-drawn metro maps can help users to find routes more efficiently than either published maps or undistorted maps. Moreover, we found that, in these cases, study subjects indicate a preference for automatically-drawn maps over the alternatives.
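    The optimization loop described above—propose a change to the layout, evaluate the weighted-sum fitness, keep the change only if fitness decreases—can be sketched generically. The function signature and the toy neighborhood used in the usage note are illustrative assumptions; the paper's actual moves operate on station positions, labels, and clusters.

    ```python
    import random

    def hill_climb(layout, fitness, neighbors, iterations=1000, seed=0):
        """Minimize a weighted-sum fitness by hill climbing: at each
        step pick a random neighbouring layout and accept it only when
        its fitness value is strictly lower than the current best."""
        rng = random.Random(seed)
        best, best_f = layout, fitness(layout)
        for _ in range(iterations):
            cand = rng.choice(neighbors(best))
            f = fitness(cand)
            if f < best_f:
                best, best_f = cand, f
        return best, best_f
    ```

    For example, minimizing the one-dimensional fitness (x − 3)² with neighbors x ± 1 from a start of 0 walks straight to x = 3. Because only strictly improving moves are accepted, the climber can get stuck in local minima, which is exactly why the paper adds cluster-level moves as a coarser escape mechanism.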

    Efficient video collection association using geometry-aware Bag-of-Iconics representations

    Recent years have witnessed dramatic growth in visual data volume and processing capabilities. For example, technical advances have enabled 3D modeling from large-scale crowdsourced photo collections. Compared to static image datasets, exploration and exploitation of Internet video collections are still largely unsolved. To address this challenge, we first propose to represent video contents using a histogram representation of iconic imagery attained from relevant visual datasets. We then develop a data-driven framework for a fully unsupervised extraction of such representations. Our novel Bag-of-Iconics (BoI) representation efficiently analyzes individual videos within a large-scale video collection. We demonstrate our proposed BoI representation with two novel applications: (1) finding video sequences connecting adjacent landmarks and aligning reconstructed 3D models and (2) retrieving geometrically relevant clips from video collections. Results on crowdsourced datasets illustrate the efficiency and effectiveness of our proposed Bag-of-Iconics representation.
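    The core of the histogram representation described above can be sketched as a nearest-neighbor assignment: each video frame descriptor is assigned to its closest iconic image, and the normalized counts form the video's BoI vector. The descriptor space, distance metric, and function name below are assumptions for illustration; the paper's pipeline also covers how the iconics themselves are mined without supervision.

    ```python
    import numpy as np

    def bag_of_iconics(frame_descriptors, iconic_descriptors):
        """Represent a video as a normalized histogram over a set of
        iconic images: each frame goes to its nearest iconic by L2
        distance, and the count vector is normalized to sum to 1."""
        dists = np.linalg.norm(
            frame_descriptors[:, None, :] - iconic_descriptors[None, :, :],
            axis=2)
        assign = dists.argmin(axis=1)
        hist = np.bincount(assign,
                           minlength=len(iconic_descriptors)).astype(float)
        return hist / hist.sum()
    ```

    Two videos can then be compared by any histogram distance over their BoI vectors, which is what makes retrieval across a large collection cheap once the iconics are fixed.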

    Recognition of Characters from Streaming Videos
