337 research outputs found

Accessibility-based reranking in multimedia search engines

Author: Anastasios Drosou
Dimitrios Tzovaras
DS Friedman
EM Fine
F Liu
H Brettel
H Hirvelä
H Kim
I Kalamaras
Ilias Kalamaras
IY Kim
J Liu
J Sang
JR Lavery
KW-T Leung
L Zhang
M Wang
Nikolaos Dimitriou
NJ Belkin
PK Atrey
S Lawrence
S Tajima
S Yang
T-L Ji
Y Nikulin
Z Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/08/2016

Traditional multimedia search engines retrieve results based mostly on the query submitted by the user, or use a log of previous searches to provide personalized results, but they do not consider the accessibility of the results for users with vision or other types of impairments. In this paper, a novel approach is presented which incorporates the accessibility of images for users with various vision impairments, such as color blindness, cataract and glaucoma, in order to rerank the results of an image search engine. The accessibility of individual images is measured through the use of vision simulation filters. Multi-objective optimization techniques utilizing the image accessibility scores are used to handle users with multiple vision impairments, while the impairment profile of a specific user is used to select one of the Pareto-optimal solutions. The proposed approach has been tested with two image datasets, using both simulated and real impaired users, and the results verify its applicability. Although the proposed method has been used for vision accessibility-based reranking, it can also be extended to other types of personalization context.
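The selection step described in the abstract (several accessibility objectives, a set of Pareto-optimal solutions, and a user profile that picks one of them) can be illustrated with a minimal sketch. This is not the authors' implementation: the candidate names, the score values, the weighted-sum selection rule, and the functions `dominates`, `pareto_front` and `select_by_profile` are all assumptions introduced here for illustration.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the candidates that no other candidate dominates."""
    return [
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    ]


def select_by_profile(
    scores: Dict[str, Sequence[float]], front: List[str], profile: Sequence[float]
) -> str:
    """Pick one Pareto-optimal candidate using the user's impairment weights.

    The weighted sum is an assumed selection rule, not taken from the paper.
    """
    return max(front, key=lambda name: sum(w * v for w, v in zip(profile, scores[name])))


# Hypothetical accessibility scores per candidate reranking, higher is better:
# (color blindness, cataract, glaucoma)
scores = {
    "ranking_a": (0.9, 0.2, 0.5),
    "ranking_b": (0.4, 0.8, 0.6),
    "ranking_c": (0.3, 0.1, 0.4),  # worse than ranking_a on every objective
    "ranking_d": (0.6, 0.6, 0.6),
}

front = pareto_front(scores)
# Hypothetical user whose profile is dominated by cataract.
choice = select_by_profile(scores, front, profile=(0.1, 0.7, 0.2))
print(front, choice)
```

With a different profile, for example one weighted toward color blindness, the same front yields a different choice, which is the point of deferring the selection to the user's impairment profile.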

Springer - Publisher Connector

Spiral - Imperial College Digital Repository

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

Author: Huang Zhiheng
Mao Junhua
Wang Jiang
Xu Wei
Yang Yi
Yuille Alan
Publication date: 01/01/2015

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated by sampling from this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on four benchmark datasets (IAPR TC-12, Flickr 8K, Flickr 30K and MS COCO). Our model outperforms the state-of-the-art methods. In addition, we apply the m-RNN model to retrieval tasks for retrieving images or sentences, and achieve significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval. The project page of this work is: www.stat.ucla.edu/~junhua.mao/m-RNN.html. Comment: Add a simple strategy to boost the performance of image captioning task significantly. More details are shown in Section 8 of the paper. The code and related data are available at https://github.com/mjhucla/mRNN-CR. arXiv admin note: substantial text overlap with arXiv:1410.109
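The generation procedure in the abstract (a distribution over the next word given the previous words and an image, with captions produced by sampling from it) can be sketched as a short loop. The code below is a toy illustration only: `toy_next_word_probs` is a hypothetical stand-in for the trained network, and the vocabulary, the feature vector and the scoring rule are invented for this example and are not part of the m-RNN code.

```python
import math
import random
from typing import Dict, List, Sequence

# Hypothetical toy vocabulary; "<end>" terminates a caption.
VOCAB = ["a", "dog", "cat", "runs", "sleeps", "<end>"]


def toy_next_word_probs(prev_words: List[str], image_feat: Sequence[float]) -> Dict[str, float]:
    """Stand-in for P(word | previous words, image): a softmax over hand-made scores."""
    logits = {}
    for i, word in enumerate(VOCAB):
        # The score depends on the image feature; "<end>" becomes more likely as the caption grows.
        bonus = float(len(prev_words)) if word == "<end>" else 0.0
        logits[word] = image_feat[i % len(image_feat)] + bonus
    top = max(logits.values())
    exps = {w: math.exp(v - top) for w, v in logits.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


def sample_caption(image_feat: Sequence[float], rng: random.Random, max_len: int = 10) -> List[str]:
    """Sample one word at a time until "<end>" is drawn or max_len words are produced."""
    words: List[str] = []
    for _ in range(max_len):
        probs = toy_next_word_probs(words, image_feat)
        word = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
        if word == "<end>":
            break
        words.append(word)
    return words


caption = sample_caption([0.5, 1.5, 0.2], random.Random(0))
print(" ".join(caption))
```

In the model described above, the stand-in function would be replaced by the multimodal layer that combines the recurrent sentence representation with the convolutional image representation; the sampling loop itself stays the same.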

arXiv.org e-Print Archive

CiteSeerX

Multimedia question answering

Author: NIE LIQIANG
Publication date: 04/07/2013

Ph.D, Doctor of Philosophy