Improving Small Object Proposals for Company Logo Detection
Many modern approaches for object detection are two-staged pipelines. The
first stage identifies regions of interest which are then classified in the
second stage. Faster R-CNN is such an approach for object detection which
combines both stages into a single pipeline. In this paper we apply Faster
R-CNN to the task of company logo detection. Motivated by its weak performance
on small object instances, we examine in detail both the proposal and the
classification stage with respect to a wide range of object sizes. We
investigate the influence of feature map resolution on the performance of those
stages.
Based on theoretical considerations, we introduce an improved scheme for
generating anchor proposals and propose a modification to Faster R-CNN which
leverages higher-resolution feature maps for small objects. We evaluate our
approach on the FlickrLogos dataset improving the RPN performance from 0.52 to
0.71 (MABO) and the detection performance from 0.52 to 0.67 (mAP). Comment: 8 pages, ICMR 201
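The abstract above describes an improved anchor-generation scheme for small objects but does not give its exact parameters. As a minimal illustrative sketch, the snippet below shows how RPN-style anchors are typically enumerated over scales and aspect ratios; extending the scale list toward smaller values (as done here) is one way to cover small logo instances. All scale and ratio values are assumptions, not the paper's settings.

```python
# Illustrative sketch of multi-scale anchor generation for an RPN.
# The extra small scales (2, 4) are an assumption meant to cover
# small object instances; they are not the paper's exact values.
import itertools

def generate_anchors(base_size=16, scales=(2, 4, 8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Return (width, height) anchor shapes, one per scale/ratio pair."""
    anchors = []
    for scale, ratio in itertools.product(scales, ratios):
        area = float(base_size * scale) ** 2   # target anchor area
        w = (area / ratio) ** 0.5              # width so that w * h == area
        h = w * ratio                          # height from aspect ratio
        anchors.append((w, h))
    return anchors

anchors = generate_anchors()  # 5 scales x 3 ratios -> 15 anchor shapes
```

Each anchor shape would then be replicated at every feature-map position; using a higher-resolution feature map for the smallest scales is the kind of modification the paper proposes.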
Ranking News-Quality Multimedia
News editors need to find the photos that best illustrate a news piece and
fulfill news-media quality standards, while being pressed to also find the most
recent photos of live events. Recently, it became common to use social-media
content in the context of news media for its unique value in terms of immediacy
and quality. Consequently, the amount of images to be considered and filtered
through is now too much to be handled by a person. To aid the news editor in
this process, we propose a framework designed to deliver high-quality,
news-press type photos to the user. The framework, composed of two parts, is
based on a ranking algorithm tuned to rank professional media highly and a
visual SPAM detection module designed to filter-out low-quality media. The core
ranking algorithm is leveraged by aesthetic, social and deep-learning semantic
features. Evaluation showed that the proposed framework is effective at finding
high-quality photos (true-positive rate) achieving a retrieval MAP of 64.5% and
a classification precision of 70%. Comment: To appear in ICMR'1
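The framework above combines a SPAM filter with a ranker driven by aesthetic, social, and deep-learning semantic features. As a hedged sketch (not the authors' implementation), one simple way to realize this pipeline is a weighted linear combination of per-feature scores applied after spam filtering; the field names and weights below are hypothetical.

```python
# Illustrative sketch: filter out visual SPAM, then rank the remaining
# photos by a weighted sum of aesthetic, social and semantic scores.
# Feature names, score ranges and weights are assumptions.
def rank_photos(photos, weights=(0.4, 0.3, 0.3)):
    """photos: list of dicts with 'aesthetic', 'social', 'semantic'
    scores in [0, 1] and a boolean 'is_spam' flag."""
    wa, ws, wd = weights
    kept = [p for p in photos if not p["is_spam"]]      # SPAM filtering stage
    scored = [(wa * p["aesthetic"] + ws * p["social"] + wd * p["semantic"], p)
              for p in kept]                            # core ranking score
    scored.sort(key=lambda pair: pair[0], reverse=True) # best photo first
    return [p for _, p in scored]
```

In practice each score would come from a trained model (e.g. a CNN for the semantic features), and the weights would be tuned on labeled news-media data.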
VideoAnalysis4ALL: An On-line Tool for the Automatic Fragmentation and Concept-based Annotation, and the Interactive Exploration of Videos.
This paper presents the VideoAnalysis4ALL tool that supports the automatic fragmentation and concept-based annotation of videos, and the exploration of the annotated video fragments through an interactive user interface. The developed web application decomposes the video into two different granularities, namely shots and scenes, and annotates each fragment by evaluating the existence of a number (several hundreds) of high-level visual concepts in the keyframes extracted from these fragments. Through this analysis the tool enables the identification and labeling of semantically coherent video fragments, while its user interfaces allow the discovery of these fragments with the help of human-interpretable concepts. The integrated state-of-the-art video analysis technologies perform very well and, by exploiting the processing capabilities of multi-thread / multi-core architectures, reduce the time required for analysis to approximately one third of the video’s duration, thus making the analysis three times faster than real-time processing.
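The annotation step described above evaluates hundreds of concept detectors on the keyframes of each fragment. As a minimal sketch under stated assumptions (score aggregation by maximum over keyframes, and a fixed decision threshold, neither of which is specified in the abstract), fragment labeling could look like this:

```python
# Hypothetical sketch: label a video fragment with the high-level
# concepts whose detector scores on its keyframes pass a threshold.
# Max-pooling over keyframes and the 0.5 threshold are assumptions.
def annotate_fragment(keyframe_scores, threshold=0.5):
    """keyframe_scores: one {concept: score} dict per extracted keyframe.
    Returns the sorted list of concepts labeling the fragment."""
    best = {}
    for scores in keyframe_scores:
        for concept, score in scores.items():
            best[concept] = max(best.get(concept, 0.0), score)  # max over keyframes
    return sorted(c for c, s in best.items() if s >= threshold)
```

The same routine would be run once per shot and once per scene to produce annotations at both granularities.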
PANEL: Challenges for multimedia/multimodal research in the next decade
The multimedia and multimodal community has witnessed an
explosive transformation in recent years, with major
societal impact. Given the unprecedented deployment of
multimedia devices and systems, multimedia research is
critical to our ability and prospects to advance state-of-the-art
technologies and solve real-world challenges facing
society and the nation. To respond to these challenges and
further advance the frontiers of the field of multimedia, this
panel will discuss the challenges and visions that may guide
research in the next ten years.
Exquisitor: Breaking the Interaction Barrier for Exploration of 100 Million Images
In this demonstration, we present Exquisitor, a media explorer capable of learning user preferences in real-time during interactions with the 99.2 million images of YFCC100M. Exquisitor owes its efficiency to innovations in data representation, compression, and indexing. Exquisitor can complete each interaction round, including learning preferences and presenting the most relevant results, in less than 30 ms using only a single CPU core and modest RAM. In short, Exquisitor can bring large-scale interactive learning to standard desktops and laptops, and even high-end mobile devices.
A benchmark of visual storytelling in social media
Media editors in the newsroom are constantly pressed to provide a "like-being there" coverage of live events. Social media provides a disorganised collection of images and videos that media professionals need to grasp before publishing their latest news update. Automated news visual storyline editing with social media content can be very challenging, as it not only entails the task of finding the right content but also making sure that news content evolves coherently over time. To tackle these issues, this paper proposes a benchmark for assessing social media visual storylines. The SocialStories benchmark, comprising a total of 40 curated stories covering sports and cultural events, provides the experimental setup and introduces novel quantitative metrics to perform a rigorous evaluation of visual storytelling with social media data.