137 research outputs found
Accessibility-based reranking in multimedia search engines
Traditional multimedia search engines retrieve results based mostly on the query submitted by the user, or use a log of previous searches to provide personalized results, without considering the accessibility of the results for users with vision or other types of impairments. In this paper, a novel approach is presented which incorporates the accessibility of images for users with various vision impairments, such as color blindness, cataract and glaucoma, in order to rerank the results of an image search engine. The accessibility of individual images is measured through the use of vision simulation filters. Multi-objective optimization techniques utilizing the image accessibility scores are used to handle users with multiple vision impairments, while the impairment profile of a specific user is used to select one of the Pareto-optimal solutions. The proposed approach has been tested on two image datasets, using both simulated and real impaired users, and the results verify its applicability. Although the proposed method has been used for vision accessibility-based reranking, it can also be extended to other types of personalization context.
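The combination described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each result carries a relevance score and per-impairment accessibility scores in [0, 1] (as would come from vision-simulation filters), and it scalarizes the multi-objective problem with user-profile weights, which corresponds to picking one point on the Pareto front.

```python
# Hedged sketch of accessibility-based reranking (illustrative only).
def rerank(results, profile, alpha=0.5):
    """Rerank results by blending relevance with a user-specific
    accessibility score.

    results: list of dicts {"id", "relevance", "accessibility": {impairment: score}}
    profile: dict mapping impairment name -> weight (a scalarization of
             the multi-objective problem via the user's impairment profile)
    alpha:   trade-off between relevance and accessibility
    """
    def combined(r):
        acc = sum(w * r["accessibility"].get(imp, 0.0)
                  for imp, w in profile.items())
        return alpha * r["relevance"] + (1 - alpha) * acc
    return sorted(results, key=combined, reverse=True)

# Toy example: "b" is less relevant but far more accessible for this profile.
results = [
    {"id": "a", "relevance": 0.9,
     "accessibility": {"color_blindness": 0.2, "glaucoma": 0.3}},
    {"id": "b", "relevance": 0.7,
     "accessibility": {"color_blindness": 0.9, "glaucoma": 0.8}},
]
profile = {"color_blindness": 0.7, "glaucoma": 0.3}
reranked = rerank(results, profile)
```

With these toy numbers the accessible image "b" overtakes the more relevant "a"; raising `alpha` shifts the balance back toward pure relevance.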
A Review on Video Search Engine Ranking
Search reranking is considered an effective and fundamental approach to improving retrieval accuracy. Videos are typically retrieved using associated textual information, such as surrounding text from the web page, so the performance of such systems depends largely on the relevance between the text and the videos. However, the two may not always match well, which leads to noisy ranking results; for example, visually similar videos may receive very different ranks. Reranking has therefore been proposed to address this problem. Video reranking is an effective way to improve the results of web-based video search, but the problem is non-trivial, especially when multiple features or modalities are considered for video search and retrieval. This paper proposes a new kind of reranking algorithm, circular reranking, which supports the mutual exchange of information across multiple modalities to improve search performance, following the philosophy that a strong-performing modality can teach weaker ones.
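The cross-modal exchange idea can be sketched as an iterative score fusion. This is an illustration of the general technique, not the paper's exact algorithm: each modality holds a score vector over the same result list, and in every round each modality's scores are smoothed toward a strength-weighted fusion of the other modalities, so stronger modalities pull weaker ones toward their ranking.

```python
import numpy as np

# Hedged sketch of circular multimodal reranking (illustrative only).
def circular_rerank(scores, strengths, rounds=5, mu=0.5):
    """scores:    dict modality -> array of per-result scores
       strengths: dict modality -> prior confidence in that modality
       mu:        how strongly each modality absorbs the others' view"""
    s = {m: v / v.sum() for m, v in scores.items()}   # normalize per modality
    for _ in range(rounds):
        new = {}
        for m in s:
            # Fuse the *other* modalities, weighted by their strengths.
            fused = sum(strengths[o] * s[o] for o in s if o != m)
            fused /= sum(strengths[o] for o in s if o != m)
            new[m] = (1 - mu) * s[m] + mu * fused
        s = new
    total = sum(strengths[m] * s[m] for m in s)        # final fusion
    return np.argsort(-total)                          # result indices, best first

# Toy example: text and visual modalities over three results.
text = np.array([0.8, 0.1, 0.1])
visual = np.array([0.5, 0.4, 0.1])
order = circular_rerank({"text": text, "visual": visual},
                        {"text": 0.7, "visual": 0.3})
```

Here the text modality is trusted more (`strength` 0.7), so the final ranking stays closest to its ordering while still absorbing the visual scores.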
Learning Object Categories From Internet Image Searches
In this paper, we describe a simple approach to learning models of visual object categories from images gathered from Internet image search engines. The images for a given keyword are typically highly variable, with a large fraction being unrelated to the query term, and thus pose a challenging environment from which to learn. By training our models directly from Internet images, we remove the need to laboriously compile training data sets, required by most other recognition approaches; this opens up the possibility of learning object category models “on-the-fly.” We describe two simple approaches, derived from the probabilistic latent semantic analysis (pLSA) technique for text document analysis, that can be used to automatically learn object models from these data. We show two applications of the learned model: first, to rerank the images returned by the search engine, thus improving the quality of the search results; and second, to recognize objects in other image data sets.
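The pLSA-based reranking technique can be sketched compactly. This is a generic illustration of pLSA with EM, not the authors' code: images are treated as "documents" of quantized visual words, EM fits P(w|z) and P(z|d), and results are reranked by the weight of the topic assumed to capture the query category.

```python
import numpy as np

# Hedged sketch of pLSA fit by EM, used for reranking (illustrative only).
def plsa(counts, n_topics, n_iter=50, seed=0):
    """counts: (n_docs, n_words) word-count matrix. Returns P(z|d), P(w|z)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics)); p_z_d /= p_z_d.sum(1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words)); p_w_z /= p_w_z.sum(1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities P(z|d,w) proportional to P(z|d) P(w|z)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]          # (d, z, w)
        joint /= joint.sum(1, keepdims=True) + 1e-12
        # M-step: re-estimate P(w|z) and P(z|d) from weighted counts
        weighted = counts[:, None, :] * joint                  # (d, z, w)
        p_w_z = weighted.sum(0); p_w_z /= p_w_z.sum(1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(2); p_z_d /= p_z_d.sum(1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

# Toy data over 2 visual words: the first three "images" share one word
# distribution (on-topic), the last three another (off-topic noise).
counts = np.array([[9, 1], [8, 2], [9, 0],
                   [1, 9], [2, 8], [0, 9]], float)
p_z_d, _ = plsa(counts, n_topics=2)
topic = p_z_d[0].argmax()            # topic dominant in a trusted top result
order = np.argsort(-p_z_d[:, topic]) # rerank images by that topic's weight
```

The reranking step mirrors the paper's first application: once a topic is identified with the query category, search results are reordered by how strongly they express it.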
Understanding of Visual Domains via the Lens of Natural Language
A joint understanding of vision and language can enable intelligent systems to perceive, act, and communicate with humans for a wide range of applications. For example, they can assist a human to navigate in an environment, edit the content of an image through natural language commands, or search through image collections using natural language queries. In this thesis, we aim to improve our understanding of visual domains through the lens of natural language. We specifically look into (1) images of categories within a fine-grained taxonomy such as species of birds or variants of aircraft, (2) images of textures that describe local color, shape, and patterns, and (3) regions in images that correspond to objects, materials, and textures.
In one line of work, we investigate ways to discover a domain-specific language by asking annotators to describe visual differences between instances within a fine-grained taxonomy. We show that a system trained to describe these differences leads to an accurate and interpretable basis for categorization. In another line of work, we investigate the effectiveness of language and vision models for describing textures, a problem that, despite the ubiquity of textures, has not been sufficiently studied in the literature. Textures are diverse, yet their local nature allows for the description of the appearance of a wide range of visual categories. The locality also allows us to systematically generate synthetic variations to investigate how disentangled visual representations are for properties such as shape, color, and figure-ground segmentation. Finally, instead of modeling an image as a whole, we design a system that allows descriptions of regions within an image. A challenge is to handle the long-tail distribution of names and appearances of concepts within natural scenes. We design a modular framework that integrates object detection, semantic segmentation, and contextual reasoning with language, which leads to better performance. In addition to methods and analysis, we contribute datasets and benchmarks to evaluate the performance of models in each of these domains.
The availability of large-scale pre-trained models for vision (e.g., ResNet) and language (e.g., BERT) has catalyzed improvements and novel applications in computer vision and natural language processing, but until recently similar models that could jointly reason about language and vision were not available. This has changed with the availability of models such as CLIP, which have been trained on a massive number of images with associated texts. We therefore analyze the effectiveness of CLIP-based representations for tasks posed in our earlier work. By comparing and contrasting these with the domain-specific representations presented in the earlier chapters, we shed some light on the nature of the learned representations and the biases they encode.
Image Retrieval based on Bag-of-Words model
This article gives a survey of the bag-of-words (BoW), or bag-of-features, model in image retrieval systems. In recent years, large-scale image retrieval has shown significant potential in both industry applications and research problems. As local descriptors like SIFT demonstrate great discriminative power in solving vision problems like object recognition, image classification and annotation, more and more state-of-the-art large-scale image retrieval systems rely on them. A common way to achieve this is to first quantize local descriptors into visual words, and then apply scalable textual indexing and retrieval schemes. We call this the bag-of-words or bag-of-features model. The goal of this survey is to give an overview of this model and to introduce different strategies for building a system based on it.
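The quantize-then-index pipeline the survey covers can be sketched end to end. This is a minimal illustration under stated assumptions: 2-D vectors stand in for SIFT descriptors, the visual vocabulary is a fixed toy array (in practice it comes from k-means over a large descriptor sample), and retrieval ranks images by cosine similarity of L2-normalized term-frequency histograms rather than a full inverted index.

```python
import numpy as np

# Hedged sketch of the bag-of-words retrieval pipeline (illustrative only).
vocab = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # 3 "visual words"

def quantize(descriptors):
    """Hard-assign each local descriptor to its nearest visual word."""
    d = np.linalg.norm(descriptors[:, None, :] - vocab[None, :, :], axis=2)
    return d.argmin(axis=1)

def bow_histogram(descriptors):
    """Build an L2-normalized term-frequency vector over the vocabulary."""
    h = np.bincount(quantize(descriptors), minlength=len(vocab)).astype(float)
    return h / (np.linalg.norm(h) + 1e-12)

def rank(query_descriptors, database):
    """Rank database images by cosine similarity to the query histogram."""
    q = bow_histogram(query_descriptors)
    sims = [q @ bow_histogram(d) for d in database]
    return np.argsort(sims)[::-1]   # image indices, best first

# Toy database: image 0 is dominated by word 0, image 1 by word 1.
db = [np.array([[0.1, 0.0], [0.0, 0.1]]),
      np.array([[0.9, 1.0], [1.0, 0.9]])]
order = rank(np.array([[0.05, 0.05]]), db)  # query descriptor near word 0
```

Because the histograms are sparse and the similarity is a dot product, the same scoring can be served at scale through an inverted index keyed by visual word, which is the textual-indexing analogy the survey draws.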