2,452 research outputs found
Descriptor Optimization for Multimedia Indexing and Retrieval
International audienceIn this paper, we propose and evaluate a method for optimizing descriptors used for content-based multimedia indexing and retrieval. A large variety of descriptors are commonly used for this purpose. However, the most efficient ones often have characteristics preventing them to be easily used in large scale systems. They may have very high dimensionality (up to tens of thousands dimensions) and/or be suited for a distance costly to compute (e.g. fflchi-square). The proposed method combines a PCA-based dimensionality reduction with pre- and post-PCA non-linear transformations. The resulting transformation is globally optimized. The produced descriptors have a much lower dimensionality while performing at least as well, and often significantly better, with the Euclidean distance than the original high dimensionality descriptors with their optimal distance. The method has been validated and evaluated for a variety of descriptors using TRECVid 2010 semantic indexing task data. It has then be applied at large scale for the TRECVid 2012 semantic indexing task on tens of descriptors of various types and with initial dimensionalities from 15 up to 32,768. The same transformation can be used also for multimedia retrieval in the context of query by example and/or relevance feedback
IRIM at TRECVID 2013: Semantic Indexing and Instance Search
International audienceThe IRIM group is a consortium of French teams working on Multimedia Indexing and Retrieval. This paper describes its participation to the TRECVID 2013 semantic indexing and instance search tasks. For the semantic indexing task, our approach uses a six-stages processing pipelines for computing scores for the likelihood of a video shot to contain a target concept. These scores are then used for producing a ranked list of images or shots that are the most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classiffication, fusion of descriptor variants, higher-level fusion, and re-ranking. We evaluated a number of different descriptors and tried different fusion strategies. The best IRIM run has a Mean Inferred Average Precision of 0.2796, which ranked us 4th out of 26 participants
Visual Information Retrieval in Endoscopic Video Archives
In endoscopic procedures, surgeons work with live video streams from the
inside of their subjects. A main source for documentation of procedures are
still frames from the video, identified and taken during the surgery. However,
with growing demands and technical means, the streams are saved to storage
servers and the surgeons need to retrieve parts of the videos on demand. In
this submission we present a demo application allowing for video retrieval
based on visual features and late fusion, which allows surgeons to re-find
shots taken during the procedure.Comment: Paper accepted at the IEEE/ACM 13th International Workshop on
Content-Based Multimedia Indexing (CBMI) in Prague (Czech Republic) between
10 and 12 June 201
Video Data Visualization System: Semantic Classification And Personalization
We present in this paper an intelligent video data visualization tool, based
on semantic classification, for retrieving and exploring a large scale corpus
of videos. Our work is based on semantic classification resulting from semantic
analysis of video. The obtained classes will be projected in the visualization
space. The graph is represented by nodes and edges, the nodes are the keyframes
of video documents and the edges are the relation between documents and the
classes of documents. Finally, we construct the user's profile, based on the
interaction with the system, to render the system more adequate to its
references.Comment: graphic
Scalable Image Retrieval by Sparse Product Quantization
Fast Approximate Nearest Neighbor (ANN) search technique for high-dimensional
feature indexing and retrieval is the crux of large-scale image retrieval. A
recent promising technique is Product Quantization, which attempts to index
high-dimensional image features by decomposing the feature space into a
Cartesian product of low dimensional subspaces and quantizing each of them
separately. Despite the promising results reported, their quantization approach
follows the typical hard assignment of traditional quantization methods, which
may result in large quantization errors and thus inferior search performance.
Unlike the existing approaches, in this paper, we propose a novel approach
called Sparse Product Quantization (SPQ) to encoding the high-dimensional
feature vectors into sparse representation. We optimize the sparse
representations of the feature vectors by minimizing their quantization errors,
making the resulting representation is essentially close to the original data
in practice. Experiments show that the proposed SPQ technique is not only able
to compress data, but also an effective encoding technique. We obtain
state-of-the-art results for ANN search on four public image datasets and the
promising results of content-based image retrieval further validate the
efficacy of our proposed method.Comment: 12 page
Creation of virtual worlds from 3D models retrieved from content aware networks based on sketch and image queries
The recent emergence of user generated content requires new content creation tools that will be both easy to learn and easy to use. These new tools should enable the user to construct new high-quality content with minimum effort; it is essential to allow existing multimedia content to be reused as building blocks when creating new content. In this work we present a new tool for automatically constructing virtual worlds with minimum user intervention. Users can create these worlds by drawing a simple sketch, or by using interactively segmented 2D objects from larger images. The system receives as a query the sketch or the segmented image, and uses it to find similar 3D models that are stored in a Content Centric Network. The user selects a suitable model from the retrieved models, and the system uses it to automatically construct a virtual 3D world
- …