Hausdorff-Distance Enhanced Matching of Scale Invariant Feature Transform Descriptors in Context of Image Querying
Reliable and effective matching of visual descriptors is a key step for many vision applications, e.g. image retrieval. In this paper, we propose to integrate Hausdorff distance matching together with our pairing algorithm, in order to obtain a robust yet computationally efficient process of matching feature descriptors for image-to-image querying in standard datasets. For this purpose, Scale Invariant Feature Transform (SIFT) descriptors have been matched using our presented algorithm, followed by the computation of our related similarity measure. This approach has shown excellent performance in both retrieval accuracy and speed.
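The paper's pairing algorithm and similarity measure are not spelled out in the abstract; as a rough illustration of the underlying distance only, here is a minimal NumPy sketch of the symmetric Hausdorff distance between two sets of SIFT descriptors (function names are hypothetical, not from the paper):

```python
import numpy as np

def directed_hausdorff(A, B):
    """Directed Hausdorff distance from descriptor set A to set B:
    max over a in A of the distance from a to its nearest neighbour in B."""
    # Pairwise Euclidean distances between every descriptor in A and B.
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return d.min(axis=1).max()

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two SIFT descriptor sets."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))
```

In practice each row of `A` and `B` would be a 128-dimensional SIFT descriptor; the pairwise-distance matrix makes this O(|A|·|B|), which is why the paper pairs it with a dedicated pairing algorithm for efficiency.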
MaskSearch: Querying Image Masks at Scale
Machine learning tasks over image databases often generate masks that
annotate image content (e.g., saliency maps, segmentation maps, depth maps) and
enable a variety of applications (e.g., determine if a model is learning
spurious correlations or if an image was maliciously modified to mislead a
model). While queries that retrieve examples based on mask properties are
valuable to practitioners, existing systems do not support them efficiently. In
this paper, we formalize the problem and propose MaskSearch, a system that
focuses on accelerating queries over databases of image masks while
guaranteeing the correctness of query results. MaskSearch leverages a novel
indexing technique and an efficient filter-verification query execution
framework. Experiments with our prototype show that MaskSearch, using indexes
that are approximately 5% of the compressed data size, accelerates individual
queries by up to two orders of magnitude and consistently outperforms existing
methods on various multi-query workloads that simulate dataset exploration and
analysis processes.
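The abstract does not detail MaskSearch's indexing technique; the following sketch only illustrates the general filter-verification pattern it names, using a hypothetical index of per-mask lower/upper bounds on a pixel-count predicate (tiling scheme and function names are assumptions, not the paper's design):

```python
import numpy as np

def build_index(masks, threshold=0.5):
    """Hypothetical index: for each mask, lower/upper bounds on the number
    of pixels exceeding `threshold`, derived from a coarse 2x2 tiling."""
    index = []
    for m in masks:
        h, w = m.shape
        lo = hi = 0
        for ti in (slice(0, h // 2), slice(h // 2, h)):
            for tj in (slice(0, w // 2), slice(w // 2, w)):
                tile = m[ti, tj]
                if tile.max() > threshold:
                    hi += tile.size   # every pixel in the tile *might* qualify
                if tile.min() > threshold:
                    lo += tile.size   # every pixel in the tile *must* qualify
        index.append((lo, hi))
    return index

def query(masks, index, threshold, min_pixels):
    """Filter-verification: prune masks whose bounds settle the predicate,
    and scan only the survivors exactly."""
    results = []
    for i, (lo, hi) in enumerate(index):
        if hi < min_pixels:
            continue                  # filtered out: cannot qualify
        if lo >= min_pixels:
            results.append(i)         # qualifies without touching the mask
            continue
        # Verification step: exact count over the raw mask.
        if (masks[i] > threshold).sum() >= min_pixels:
            results.append(i)
    return results
```

Because the bounds are conservative in both directions, the result set is exact; the index only decides which masks must be read, which is the correctness guarantee the abstract refers to.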
Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch
In this work we introduce a cross-modal image retrieval system that allows
both text and sketch as input modalities for the query. A cross-modal deep
network architecture is formulated to jointly model the sketch and text input
modalities as well as the image output modality, learning a common
embedding between text and images and between sketches and images. In addition,
an attention model is used to selectively focus the attention on the different
objects of the image, allowing for retrieval with multiple objects in the
query. Experiments show that the proposed method performs best in both
single- and multiple-object image retrieval on standard datasets.
Comment: Accepted at ICPR 201
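The network architecture itself is not reproduced in the abstract; assuming queries and images have already been projected into the learned common embedding space, retrieval reduces to a nearest-neighbour search such as this minimal cosine-similarity sketch (names are illustrative):

```python
import numpy as np

def retrieve(query_emb, image_embs, k=5):
    """Rank images by cosine similarity to a query embedding (from either
    the text or the sketch branch) in the shared embedding space."""
    q = query_emb / np.linalg.norm(query_emb)
    X = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sims = X @ q                      # cosine similarity per image
    return np.argsort(-sims)[:k]     # indices of the k best matches
```

The same routine serves both input modalities precisely because the model maps text, sketches, and images into one space; only the encoder producing `query_emb` differs.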
Text-based Semantic Annotation Service for Multimedia Content in the Esperonto project
Within the Esperonto project, NLP components, ontologies, and other knowledge bases are being integrated with the goal of implementing a semantic annotation service that upgrades the current Web towards the emerging Semantic Web. Research is currently being conducted on how to apply the Esperonto semantic annotation service to text material associated with still images in web pages. In doing so, the project will not only allow for semantic querying of still images on the web, but will also (automatically) create a large set of text-based semantic annotations of still images, which the Multimedia community can use to support the task of content indexing of image material, possibly combining the Esperonto type of annotations with the annotations resulting from image analysis.