    Improving Small Object Proposals for Company Logo Detection

    Many modern approaches for object detection are two-stage pipelines. The first stage identifies regions of interest, which are then classified in the second stage. Faster R-CNN is such an approach for object detection which combines both stages into a single pipeline. In this paper we apply Faster R-CNN to the task of company logo detection. Motivated by its weak performance on small object instances, we examine in detail both the proposal and the classification stage with respect to a wide range of object sizes. We investigate the influence of feature map resolution on the performance of those stages. Based on theoretical considerations, we introduce an improved scheme for generating anchor proposals and propose a modification to Faster R-CNN which leverages higher-resolution feature maps for small objects. We evaluate our approach on the FlickrLogos dataset, improving the RPN performance from 0.52 to 0.71 (MABO) and the detection performance from 0.52 to 0.67 (mAP). Comment: 8 pages, ICMR 201
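
    The MABO figure quoted above (mean average best overlap) can be illustrated with a minimal sketch. It assumes the commonly used definition: for each ground-truth box take the best IoU achieved by any proposal in the same image, average those values per class, then average over classes. The function names, the data layout and the toy boxes and class names below are hypothetical and are not taken from the paper, whose exact evaluation protocol may differ.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mabo(ground_truth, proposals):
    """Return (MABO, per-class ABO).

    ground_truth: list of (image_id, class_name, box) tuples (assumed layout).
    proposals:    dict mapping image_id to a list of proposal boxes.
    """
    best_overlaps = defaultdict(list)
    for image_id, cls, gt_box in ground_truth:
        overlaps = [iou(gt_box, p) for p in proposals.get(image_id, [])]
        # A ground-truth box with no proposals contributes a best overlap of 0.
        best_overlaps[cls].append(max(overlaps, default=0.0))
    abo = {cls: sum(v) / len(v) for cls, v in best_overlaps.items()}
    if not abo:
        return 0.0, abo
    return sum(abo.values()) / len(abo), abo


# Toy data (made up for illustration only).
ground_truth = [
    ("img1", "logo_a", (0, 0, 10, 10)),
    ("img2", "logo_b", (0, 0, 10, 10)),
]
proposals = {
    "img1": [(0, 0, 10, 10), (5, 5, 15, 15)],  # first proposal matches exactly
    "img2": [(0, 0, 10, 5)],                   # covers half of the object
}
score, per_class = mabo(ground_truth, proposals)
print(score, per_class)
```

    In this toy case the first class reaches a best overlap of 1.0 and the second 0.5, so the class-averaged score is 0.75. Small objects tend to score low on this metric because a few pixels of misalignment remove a large share of the overlap.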

    Ranking News-Quality Multimedia

    News editors need to find the photos that best illustrate a news piece and fulfill news-media quality standards, while being pressed to also find the most recent photos of live events. Recently, it became common to use social-media content in the context of news media for its unique value in terms of immediacy and quality. Consequently, the number of images to be considered and filtered is now too large to be handled by one person. To aid the news editor in this process, we propose a framework designed to deliver high-quality, news-press type photos to the user. The framework, composed of two parts, is based on a ranking algorithm tuned to rank professional media highly and a visual SPAM detection module designed to filter out low-quality media. The core ranking algorithm leverages aesthetic, social and deep-learning semantic features. Evaluation showed that the proposed framework is effective at finding high-quality photos (true-positive rate), achieving a retrieval MAP of 64.5% and a classification precision of 70%. Comment: To appear in ICMR'1

    VideoAnalysis4ALL: An On-line Tool for the Automatic Fragmentation and Concept-based Annotation, and the Interactive Exploration of Videos.

    This paper presents the VideoAnalysis4ALL tool, which supports the automatic fragmentation and concept-based annotation of videos, and the exploration of the annotated video fragments through an interactive user interface. The developed web application decomposes the video at two different granularities, namely shots and scenes, and annotates each fragment by evaluating the presence of several hundred high-level visual concepts in the keyframes extracted from these fragments. Through this analysis, the tool enables the identification and labeling of semantically coherent video fragments, while its user interfaces allow the discovery of these fragments with the help of human-interpretable concepts. The integrated state-of-the-art video analysis technologies perform very well and, by exploiting the processing capabilities of multi-thread / multi-core architectures, reduce the time required for analysis to approximately one third of the video’s duration, thus making the analysis three times faster than real-time processing.

    PANEL: Challenges for multimedia/multimodal research in the next decade

    The multimedia and multimodal community has witnessed an explosive transformation in recent years, with major societal impact. With the unprecedented deployment of multimedia devices and systems, multimedia research is critical to our abilities and prospects in advancing state-of-the-art technologies and solving real-world challenges facing society and the nation. To respond to these challenges and further advance the frontiers of the field of multimedia, this panel will discuss the challenges and visions that may guide future research in the next ten years.

    Exquisitor: Breaking the Interaction Barrier for Exploration of 100 Million Images

    In this demonstration, we present Exquisitor, a media explorer capable of learning user preferences in real-time during interactions with the 99.2 million images of YFCC100M. Exquisitor owes its efficiency to innovations in data representation, compression, and indexing. Exquisitor can complete each interaction round, including learning preferences and presenting the most relevant results, in less than 30 ms using only a single CPU core and modest RAM. In short, Exquisitor can bring large-scale interactive learning to standard desktops and laptops, and even high-end mobile devices.

    A benchmark of visual storytelling in social media

    Media editors in the newsroom are constantly pressed to provide "like being there" coverage of live events. Social media provides a disorganised collection of images and videos that media professionals need to grasp before publishing their latest news updates. Automated news visual storyline editing with social media content can be very challenging, as it entails not only finding the right content but also making sure that news content evolves coherently over time. To tackle these issues, this paper proposes a benchmark for assessing social media visual storylines. The SocialStories benchmark, comprising a total of 40 curated stories covering sports and cultural events, provides the experimental setup and introduces novel quantitative metrics to perform a rigorous evaluation of visual storytelling with social media data. (CMUP-ERI/TIC/0046/2014)