
AXES at TRECVID 2012: KIS, INS, and MED

Author: Aly Robin
Arandjelovic Relja
Chatfield Ken
Chen Shu
Douze Matthijs
Fernando Basura
Harchaoui Zaid
McGuinness Kevin
O'Connor Noel E.
Oneata Dan
Parkhi Omkar M.
Potapov Danila
Revaud Jérôme
Schmid Cordelia
Schwenninger Jochen
Tuytelaars Tinne
Verbeek Jakob
Wang Heng
Zisserman Andrew
Publication date: 01/01/2012

The AXES project participated in the interactive instance search task (INS), the known-item search task (KIS), and the multimedia event detection task (MED) for TRECVid 2012. As in our TRECVid 2011 system, we used nearly identical search systems and user interfaces for both INS and KIS. Our interactive INS and KIS systems focused this year on using classifiers trained at query time with positive examples collected from external search engines. Participants in our KIS experiments were media professionals from the BBC; our INS experiments were carried out by students and researchers at Dublin City University. We performed comparatively well in both experiments. Our best KIS run found 13 of the 25 topics, and our best INS runs outperformed all other submitted runs in terms of P@100. For MED, the system presented was based on a minimal number of low-level descriptors, which we chose to be as large as computationally feasible. These descriptors are aggregated to produce high-dimensional video-level signatures, which are used to train a set of linear classifiers. Our MED system achieved the second-best score of all submitted runs in the main track and the best score in the ad-hoc track, suggesting that a simple system based on state-of-the-art low-level descriptors can give relatively high performance. This paper describes in detail our KIS, INS, and MED systems and the results and findings of our experiments.
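The abstract above reports INS results in terms of P@100 (precision among the top 100 returned items). As a minimal sketch of how such a measure is commonly computed, the following assumes a ranked list of shot identifiers and a set of relevant identifiers; the function name and the sample data are hypothetical and are not taken from the paper or from the official evaluation tools.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked items that are relevant.

    Divides by k even when fewer than k items were returned,
    which is the usual convention for P@k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    hits = sum(1 for item in ranked_ids[:k] if item in relevant)
    return hits / k


# Hypothetical ranked result list and ground truth.
ranked = ["s3", "s7", "s1", "s9", "s4"]
relevant = {"s3", "s1", "s8"}
p_at_4 = precision_at_k(ranked, relevant, 4)  # two of the top four are relevant
```

P@100 is the same computation with `k=100`.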

Hal - Université Grenoble Alpes

Fraunhofer-ePrints

Irish Universities

INRIA a CCSD electronic archive server

DCU Online Research Access Service

HAL-Rennes 1

Circulant temporal encoding for video retrieval and temporal alignment

Author: Douze Matthijs
Jégou Hervé
Revaud Jérôme
Schmid Cordelia
Verbeek Jakob
Publication date: 30/11/2015

We address the problem of specific video event retrieval. Given a query video of a specific event, e.g., a Madonna concert, the goal is to retrieve other videos of the same event that temporally overlap with the query. Our approach encodes the frame descriptors of a video to jointly represent their appearance and temporal order. It exploits the properties of circulant matrices to efficiently compare the videos in the frequency domain. This significantly reduces the computational complexity and accurately localizes the matching parts of videos. The descriptors can be compressed in the frequency domain with a product quantizer adapted to complex numbers. In this case, video retrieval is performed without decompressing the descriptors. We also consider the temporal alignment of a set of videos. We exploit the matching confidence and an estimate of the temporal offset computed for all pairs of videos by our retrieval approach. Our robust algorithm aligns the videos on a global timeline by maximizing the set of temporally consistent matches. The global temporal alignment enables synchronous playback of the videos of a given scene.
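The frequency-domain comparison mentioned in the abstract can be illustrated with plain circular cross-correlation: the score for every temporal shift is obtained from per-dimension Fourier transforms instead of an explicit loop over shifts. The sketch below is an assumption-laden simplification, not the authors' method: it omits any regularization and the product quantization, uses a naive pure-Python DFT for self-containment (a real system would use an FFT library), and all names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def shift_scores(query, base):
    """Score every circular temporal shift between two videos.

    query, base: lists of n frame descriptors, each a list of d floats.
    Returns s with s[delta] = sum_t <query[t], base[(t + delta) % n]>.
    """
    n, d = len(query), len(query[0])
    spectrum = [0j] * n
    for i in range(d):
        q_hat = dft([frame[i] for frame in query])
        b_hat = dft([frame[i] for frame in base])
        for k in range(n):
            # Correlation theorem: conj(Q) * B in the frequency domain.
            spectrum[k] += q_hat[k].conjugate() * b_hat[k]
    return [v.real for v in dft(spectrum, inverse=True)]


# Hypothetical 2-D frame descriptors; base is the query delayed by 3 frames.
query = [[1, 0], [0, 1], [2, 1], [0, 0], [1, 3], [0, 0], [0, 0], [0, 0]]
base = [query[(j - 3) % 8] for j in range(8)]
scores = shift_scores(query, base)
best_offset = max(range(8), key=lambda delta: scores[delta])
```

In this toy case the highest score falls at the shift that realigns the two sequences, which is the kind of temporal offset estimate the abstract refers to.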

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1

Symbiosis between the TRECVid benchmark and video libraries at the Netherlands Institute for Sound and Vision

Author: AF Smeaton
Alan F. Smeaton
B Huurnink
CGM Snoek
CV Thornley
D. Tjondronegoro
H.-T. Pu
Johan Oomen
L. Hollink
M Hertzum
Paul Over
S Shatford
Wessel Kraaij
Y Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Audiovisual archives are investing in large-scale digitisation efforts of their analogue holdings and, in parallel, ingesting an ever-increasing amount of born-digital files in their digital storage facilities. Digitisation opens up new access paradigms and boosts re-use of audiovisual content. Query-log analyses show the shortcomings of manual annotation; therefore, archives are complementing these annotations by developing novel search engines that automatically extract information from both the audio and the visual tracks. Over the past few years, the TRECVid benchmark has developed a novel relationship with the Netherlands Institute for Sound and Vision (NISV) which goes beyond the NISV just providing data and use cases to TRECVid. Prototype and demonstrator systems developed as part of TRECVid are set to become a key driver in improving the quality of search engines at the NISV and will ultimately help other audiovisual archives to offer more efficient and more fine-grained access to their collections. This paper reports the experiences of NISV in leveraging the activities of the TRECVid benchmark.

Crossref

Irish Universities

DCU Online Research Access Service

Radboud Repository

Sound and Vision Publications

TRECVID 2019: an evaluation campaign to benchmark video activity detection, video captioning and matching, and video search & retrieval

Author: Awad George M.
Butt Asad A.
Delgado Andrew
Fiscus Jon
Godil Afzal
Graham Yvette
Lee Yooyoung
Smeaton Alan F.
Publication date: 12/11/2019

DCU Online Research Access Service

The AXES submissions at TrecVid 2013

Author: Aly Robin
Arandjelovic Relja
Chatfield Ken
Douze Matthijs
Fernando Basura
Harchaoui Zaid
McGuinness Kevin
O'Connor Noel E.
Oneata Dan
Parkhi Omkar M.
Potapov Danila
Revaud Jérôme
Schmid Cordelia
Schwenninger Jochen
Scott David
Tuytelaars Tinne
Verbeek Jakob
Wang Heng
Zisserman Andrew
Publication date: 01/11/2013

The AXES project participated in the interactive instance search task (INS), the semantic indexing task (SIN), the multimedia event recounting task (MER), and the multimedia event detection task (MED) for TRECVid 2013. Our interactive INS focused this year on using classifiers trained at query time with positive examples collected from external search engines. Our INS experiments were carried out by students and researchers at Dublin City University. Our best INS runs performed on par with the top-ranked INS runs in terms of P@10 and P@30, and around the median in terms of mAP. For SIN, MED and MER, we use systems based on state-of-the-art local low-level descriptors for motion, image, and sound, as well as high-level features to capture speech and text from the audio and visual streams, respectively. The low-level descriptors were aggregated by means of Fisher vectors into high-dimensional video-level signatures; the high-level features were aggregated into bag-of-words histograms. Using these features we train linear classifiers, and use early and late fusion to combine the different features. Our MED system achieved the best score of all submitted runs in the main track, as well as in the ad-hoc track. This paper describes in detail our INS, MER, and MED systems and the results and findings of our experiments.
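The abstract above mentions combining per-feature classifiers by late fusion. A minimal sketch of one common form of late fusion follows: each feature channel's scores are standardised and then averaged with weights. The normalisation scheme, the equal default weights, and the channel names are assumptions made for illustration; the abstract does not specify how the AXES system fuses its scores, and early fusion (combining features before training) is not shown.

```python
from statistics import mean, pstdev


def zscore(scores):
    """Standardise one channel's scores to zero mean and unit variance."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]  # a constant channel carries no ranking signal
    return [(s - mu) / sigma for s in scores]


def late_fusion(score_lists, weights=None):
    """Weighted average of standardised per-channel scores for the same videos."""
    if weights is None:
        weights = [1.0 / len(score_lists)] * len(score_lists)
    normalised = [zscore(channel) for channel in score_lists]
    n_videos = len(score_lists[0])
    return [sum(w * channel[v] for w, channel in zip(weights, normalised))
            for v in range(n_videos)]


# Hypothetical classifier outputs for three videos from three feature channels.
motion_scores = [0.2, 1.5, -0.3]
image_scores = [10.0, 30.0, 20.0]
sound_scores = [0.1, 0.1, 0.1]
fused = late_fusion([motion_scores, image_scores, sound_scores])
top_video = max(range(len(fused)), key=lambda v: fused[v])
```

Standardising first keeps a channel with a large numeric range (here the hypothetical image scores) from dominating the average.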

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Irish Universities

DCU Online Research Access Service

Ensemble Learning with LDA Topic Models for Visual Concept Detection

Author: Gang Cao
Jin-Tao Li
Sheng Tang
Yan-Tao Zheng
Yong-Dong Zhang
Publication venue: 'IntechOpen'
Publication date: 07/03/2012

IntechOpen

IRIM at TRECVID 2012: Semantic Indexing and Instance Search

The IRIM group is a consortium of French teams working on Multimedia Indexing and Retrieval. This paper describes its participation in the TRECVID 2012 semantic indexing and instance search tasks. For the semantic indexing task, our approach uses a six-stage processing pipeline for computing scores for the likelihood of a video shot to contain a target concept. These scores are then used for producing a ranked list of images or shots that are the most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classification, fusion of descriptor variants, higher-level fusion, and re-ranking. We evaluated a number of different descriptors and tried different fusion strategies. The best IRIM run has a Mean Inferred Average Precision of 0.2378, which ranked us 4th out of 16 participants. For the instance search task, our approach uses two steps. First, individual methods of participants are used to compute similarity between an example image of an instance and keyframes of a video clip. Then a two-step fusion method is used to combine these individual results and obtain a score for the likelihood of an instance to appear in a video clip. These scores are used to obtain a ranked list of the clips most likely to contain the queried instance. The best IRIM run has a MAP of 0.1192, which ranked us 29th out of 79 fully automatic runs.
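The instance search result above is reported as MAP (mean average precision). As a rough sketch, assuming complete relevance judgments and hypothetical function names and data, average precision for one query and its mean over queries can be computed as below. This is the plain, uninterpolated definition; it is not the inferred average precision used for the semantic indexing task, which is estimated from sampled judgments.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at the rank of each relevant item found.

    Relevant items that are never retrieved contribute zero.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two hypothetical queries.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = 1/2
]
map_score = mean_average_precision(runs)
```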

HAL-CentraleSupelec

Hal - Université Grenoble Alpes

HAL AMU

INRIA a CCSD electronic archive server

HAL

HAL Université de Savoie

HAL-CEA

Hal-Diderot

HAL-Rennes 1

Exploring EEG for object detection and retrieval

Author: Giró-i-Nieto Xavier
Healy Graham
McGuinness Kevin
Mohedano Eva
O'Connor Noel E.
Porta Caubet Sergi
Salvador Amaia
Smeaton Alan F.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/06/2015

This paper explores the potential for using brain-computer interfaces (BCI) as a relevance feedback mechanism in content-based image retrieval. Several experiments are performed using a rapid serial visual presentation (RSVP) of images at different rates (5 Hz and 10 Hz) on 8 users with different degrees of familiarization with BCI and the dataset. We compare the feedback from the BCI and mouse-based interfaces in a subset of TRECVid images, finding that, when users have limited time to annotate the images, both interfaces are comparable in performance. Comparing our best users in a retrieval task, we found that EEG-based relevance feedback can outperform mouse-based feedback.

DCU Online Research Access Service

A Data-Driven Approach for Tag Refinement and Localization in Web Videos

Author: Ballan Lamberto
Bertini Marco
Del Bimbo Alberto
Serra Giuseppe
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

Tagging of visual content is becoming more and more widespread as web-based services and social networks have popularized tagging functionalities among their users. These user-generated tags are used to ease browsing and exploration of media collections, e.g. using tag clouds, or to retrieve multimedia content. However, not all media are equally tagged by users. With current systems it is easy to tag a single photo, and even tagging a part of a photo, like a face, has become common in sites like Flickr and Facebook. On the other hand, tagging a video sequence is more complicated and time-consuming, so users typically tag only the overall content of a video. In this paper we present a method for automatic video annotation that increases the number of tags originally provided by users, and localizes them temporally, associating tags to keyframes. Our approach exploits collective knowledge embedded in user-generated tags and web sources, and visual similarity of keyframes and images uploaded to social sites like YouTube and Flickr, as well as web sources like Google and Bing. Given a keyframe, our method is able to select on the fly from these visual sources the training exemplars that should be the most relevant for this test sample, and proceeds to transfer labels across similar images. Compared to existing video tagging approaches that require training classifiers for each tag, our system has few parameters, is easy to implement and can deal with an open vocabulary scenario. We demonstrate the approach on tag refinement and localization on DUT-WEBV, a large dataset of web videos, and show state-of-the-art results. Comment: Preprint submitted to Computer Vision and Image Understanding (CVIU)
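The label transfer step described in the abstract, selecting visually similar exemplars for a keyframe and propagating their tags, can be illustrated with a simple nearest-neighbour vote. The sketch below is a generic stand-in under stated assumptions, not the paper's algorithm: it assumes precomputed feature vectors, cosine similarity, a fixed neighbourhood size, and a vote threshold, and every name and value in it is hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def transfer_tags(keyframe_feature, exemplars, k=3, min_votes=2):
    """Return tags shared by at least min_votes of the k most similar exemplars.

    exemplars: list of (feature_vector, tags) pairs from tagged images.
    """
    neighbours = sorted(exemplars,
                        key=lambda e: cosine(keyframe_feature, e[0]),
                        reverse=True)[:k]
    votes = Counter(tag for _, tags in neighbours for tag in set(tags))
    return sorted(tag for tag, count in votes.items() if count >= min_votes)


# Hypothetical tagged exemplars with toy 3-D features.
exemplars = [
    ([1.0, 0.0, 0.0], ["dog", "park"]),
    ([0.9, 0.1, 0.0], ["dog"]),
    ([0.8, 0.2, 0.1], ["dog", "grass"]),
    ([0.0, 1.0, 0.0], ["car"]),
    ([0.0, 0.9, 0.2], ["car", "street"]),
]
keyframe_tags = transfer_tags([1.0, 0.05, 0.0], exemplars)
```

Because no per-tag classifier is trained, new tags can appear simply by adding tagged exemplars, which is in the spirit of the open vocabulary scenario the abstract mentions.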

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università degli Studi di Udine

Florence Research

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Archivio istituzionale della ricerca - Università di Padova