411 research outputs found

AXES at TRECVid 2011

Author: Aly Robin
Arandjelovic Relja
Beunders Henri
Chen Shu
Frappier Mathieu
Jawahar C. V.
Juneja Mayank
Lee Hyowon
Kleppe Martijn
McGuinness Kevin
O'Connor Noel E.
Ordelman Roeland
Schneider Daniel
Schwenninger Jochen
Smeaton Alan F.
Tschopel Sebastian
Vedaldi Andrea
Zisserman Andrew
Publication date: 01/01/2011

The AXES project participated in the interactive known-item search task (KIS) and the interactive instance search task (INS) for TRECVid 2011. We used the same system architecture and a nearly identical user interface for both the KIS and INS tasks. Both systems made use of text search on ASR, visual concept detectors, and visual similarity search. The user experiments were carried out with media professionals and media students at the Netherlands Institute for Sound and Vision, with media professionals performing the KIS task and media students participating in the INS task. This paper describes the results and findings of our experiments.

Fraunhofer-ePrints

Irish Universities

DCU Online Research Access Service

AXES at TRECVID 2012: KIS, INS, and MED

Author: Aly Robin
Arandjelovic Relja
Chatfield Ken
Chen Shu
Douze Matthijs
Fernando Basura
Harchaoui Zaid
McGuinness Kevin
O'Connor Noel E.
Oneata Dan
Parkhi Omkar M.
Potapov Danila
Revaud Jérôme
Schmid Cordelia
Schwenninger Jochen
Tuytelaars Tinne
Verbeek Jakob
Wang Heng
Zisserman Andrew
Publication date: 01/01/2012

The AXES project participated in the interactive instance search task (INS), the known-item search task (KIS), and the multimedia event detection task (MED) for TRECVid 2012. As in our TRECVid 2011 system, we used nearly identical search systems and user interfaces for both INS and KIS. Our interactive INS and KIS systems focused this year on using classifiers trained at query time with positive examples collected from external search engines. Participants in our KIS experiments were media professionals from the BBC; our INS experiments were carried out by students and researchers at Dublin City University. We performed comparatively well in both experiments. Our best KIS run found 13 of the 25 topics, and our best INS runs outperformed all other submitted runs in terms of P@100. For MED, the system presented was based on a minimal number of low-level descriptors, which we chose to be as large as computationally feasible. These descriptors are aggregated to produce high-dimensional video-level signatures, which are used to train a set of linear classifiers. Our MED system achieved the second-best score of all submitted runs in the main track, and the best score in the ad-hoc track, suggesting that a simple system based on state-of-the-art low-level descriptors can give relatively high performance. This paper describes in detail our KIS, INS, and MED systems and the results and findings of our experiments.
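The P@100 figure quoted in the abstract above is precision at rank 100. A minimal sketch of how such a score is typically computed follows; the function name, argument names, and toy data are illustrative assumptions, not taken from the AXES system or the TRECVid evaluation tools.

```python
def precision_at_k(ranked_ids, relevant_ids, k=100):
    """Fraction of the top-k ranked items that are relevant.

    Assumption: the score is divided by k even when fewer than k
    items were returned, so short result lists are penalised.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


# Toy usage with hypothetical shot identifiers.
ranking = ["shot1", "shot2", "shot3", "shot4"]
ground_truth = {"shot1", "shot3"}
score = precision_at_k(ranking, ground_truth, k=4)
```

Dividing by k rather than by the number of returned items is a design choice; it keeps scores comparable across runs that return result lists of different lengths.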

Hal - Université Grenoble Alpes

Fraunhofer-ePrints

Irish Universities

INRIA a CCSD electronic archive server

DCU Online Research Access Service

HAL-Rennes 1

Simulated evaluation of faceted browsing based on feature selection

Author: Bermejo Lopez P.
Hopfgartner F.
Jose J.M.
Urruty T.
Villa R.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

In this paper we explore the limitations of facet-based browsing, which uses sub-needs of an information need for querying and organising the search process in video retrieval. The underlying assumption of this approach is that the search effectiveness will be enhanced if such an approach is employed for interactive video retrieval using textual and visual features. We explore the performance bounds of a faceted system by carrying out a simulated user evaluation on TRECVid data sets, and also on the logs of a prior user experiment with the system. We first present a methodology to reduce the dimensionality of features by selecting the most important ones. Then, we discuss the simulated evaluation strategies employed in our evaluation and the effect of using both textual and visual features. Facets created by users are simulated by clustering video shots using textual and visual features. The experimental results of our study demonstrate that the faceted browser can potentially improve the search effectiveness.

Enlighten

Learning to detect video events from zero or very few video examples

Author: Galanopoulos Damianos
Mezaris Vasileios
Patras Ioannis
Tzelepis Christos
Publication venue: 'Elsevier BV'
Publication date: 25/11/2015

In this work we deal with the problem of high-level event detection in video. Specifically, we study the challenging problems of i) learning to detect video events from solely a textual description of the event, without using any positive video examples, and ii) additionally exploiting very few positive training samples together with a small number of "related" videos. For learning only from an event's textual description, we first identify a general learning framework and then study the impact of different design choices for various stages of this framework. For additionally learning from example videos, when true positive training samples are scarce, we employ an extension of the Support Vector Machine that allows us to exploit "related" event videos by automatically introducing different weights for subsets of the videos in the overall training set. Experimental evaluations performed on the large-scale TRECVID MED 2014 video dataset provide insight on the effectiveness of the proposed methods. Comment: Image and Vision Computing Journal, Elsevier, 2015, accepted for publication.

arXiv.org e-Print Archive

City Research Online

TagBook: A Semantic Video Representation without Supervision for Event Detection

Author: Li Xirong
Mazloom Masoud
Snoek Cees G. M.
Publication date: 01/01/2016

We consider the problem of event detection in video for scenarios where only few, or even zero, examples are available for training. For this challenging setting, the prevailing solutions in the literature rely on a semantic video representation obtained from thousands of pre-trained concept detectors. Different from existing work, we propose a new semantic video representation that is based on freely available social tagged videos only, without the need for training any intermediate concept detectors. We introduce a simple algorithm that propagates tags from a video's nearest neighbors, similar in spirit to the ones used for image retrieval, but redesign it for video event detection by including video source set refinement and varying the video tag assignment. We call our approach TagBook and study its construction, descriptiveness and detection performance on the TRECVID 2013 and 2014 multimedia event detection datasets and the Columbia Consumer Video dataset. Despite its simple nature, the proposed TagBook video representation is remarkably effective for few-example and zero-example event detection, even outperforming very recent state-of-the-art alternatives building on supervised representations. Comment: accepted for publication as a regular paper in the IEEE Transactions on Multimedia.
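The core idea in the abstract above, propagating tags from a video's nearest neighbors, can be illustrated with a generic neighbor-voting sketch. This is not the TagBook algorithm itself: the source set refinement and varied tag assignment the abstract mentions are omitted, and the cosine similarity, the equal-weight voting rule, and all names and toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def propagate_tags(query_vec, source_set, k=3):
    """Score tags for an untagged video by voting among its k nearest
    tagged neighbors.

    source_set is a list of (feature_vector, tags) pairs (hypothetical
    layout). Each neighbor casts one vote per distinct tag; a tag's
    score is the fraction of neighbors that carry it.
    """
    neighbors = sorted(
        source_set,
        key=lambda item: cosine(query_vec, item[0]),
        reverse=True,
    )[:k]
    if not neighbors:
        return {}
    votes = Counter(tag for _, tags in neighbors for tag in set(tags))
    return {tag: count / len(neighbors) for tag, count in votes.items()}


# Toy source set of socially tagged videos (made-up features and tags).
source = [
    ([1.0, 0.0], ["dog", "park"]),
    ([0.9, 0.1], ["dog"]),
    ([0.0, 1.0], ["cake"]),
]
tag_scores = propagate_tags([1.0, 0.0], source, k=2)
```

The resulting tag-score dictionary plays the role of a semantic representation of the video: no concept detector is trained, only neighbors are looked up.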

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

UvA-DARE

TRECVid 2013 experiments at Dublin City University

Author: Albatal Rami
Gurrin Cathal
Smeaton Alan F.
Zhang Zhenxing
Publication date: 22/11/2013

In a move away from previous years’ participation in TRECVid ([1] [2] [3]), this year our team focused on the instance search task (INS). We improved our system from last year by applying large vocabulary quantization, soft assignment of visual words, spatial verification and query expansion. Overall, four automatic runs were submitted for evaluation. In this paper, we first present our system, then discuss the results and findings of our experiments.
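Soft assignment of visual words, named in the abstract above, generally means mapping each local descriptor to several nearby vocabulary entries with distance-based weights rather than to a single nearest word. The sketch below shows one common formulation with Gaussian weights; the abstract does not specify the weighting used, so the kernel, the parameters r and sigma, and all names are assumptions.

```python
import math


def soft_assign(descriptor, vocabulary, r=3, sigma=1.0):
    """Assign a descriptor to its r nearest visual words.

    Returns {word_index: weight}, with weights proportional to
    exp(-d^2 / (2 * sigma^2)) and normalised to sum to 1.
    """
    squared_dists = [
        (sum((a - b) ** 2 for a, b in zip(descriptor, word)), idx)
        for idx, word in enumerate(vocabulary)
    ]
    nearest = sorted(squared_dists)[:r]
    if not nearest:
        return {}
    weights = [(idx, math.exp(-d2 / (2 * sigma ** 2))) for d2, idx in nearest]
    total = sum(w for _, w in weights)
    if total == 0.0:
        # All weights underflowed: fall back to hard assignment.
        return {nearest[0][1]: 1.0}
    return {idx: w / total for idx, w in weights}


# Toy vocabulary of three 2-D "visual words" (made-up values).
vocab = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
assignment = soft_assign([0.5, 0.0], vocab, r=2)
```

Here the descriptor lies midway between the first two words, so each receives half of the weight and the distant third word is ignored.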

Irish Universities

DCU Online Research Access Service