Medical Image Classification via SVM using LBP Features from Saliency-Based Folded Data
Good results on image classification and retrieval using support vector
machines (SVM) with local binary patterns (LBPs) as features have been
extensively reported in the literature where an entire image is retrieved or
classified. In contrast, in medical imaging, not all parts of the image may be
equally significant or relevant to the image retrieval application at hand. For
instance, in a lung x-ray image, the lung region may contain a tumour and is
therefore highly significant, whereas the surrounding area carries little
information from a medical-diagnosis perspective. In this paper, we
propose to detect salient regions of images during training and fold the data
to reduce the effect of irrelevant regions. As a result, smaller image areas
are used for LBP feature calculation and, consequently, for classification by
SVM. We use the IRMA 2009 dataset of 14,410 x-ray images to verify the
performance of the proposed approach. The results demonstrate the benefits of
the saliency-based folding approach, which delivers classification accuracy
comparable to the state of the art while exhibiting lower computational cost
and storage requirements, factors highly important for big data analytics.
Comment: To appear in proceedings of The 14th International Conference on
Machine Learning and Applications (IEEE ICMLA 2015), Miami, Florida, USA, 2015
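The LBP-plus-SVM backbone this abstract builds on can be sketched briefly. This is a minimal illustration, not the paper's pipeline: the saliency detection, data folding, and IRMA dataset are omitted, and synthetic blurred-vs-raw noise textures stand in for x-ray regions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def lbp_histogram(image, points=8, radius=1):
    """Uniform-LBP histogram as a fixed-length texture descriptor."""
    codes = local_binary_pattern(image, points, radius, method="uniform")
    n_bins = points + 2  # P + 1 uniform patterns plus one non-uniform bin
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
    return hist

def sample(smooth):
    """Synthetic 64x64 'region': blurred (coarse) vs. raw (fine) noise."""
    img = rng.random((64, 64))
    if smooth:
        img = gaussian_filter(img, sigma=3)
    lo, hi = img.min(), img.max()
    return (255 * (img - lo) / (hi - lo)).astype(np.uint8)

# Two synthetic texture classes; features are the LBP histograms.
X = np.array([lbp_histogram(sample(s)) for s in [True] * 20 + [False] * 20])
y = np.array([1] * 20 + [0] * 20)

clf = SVC(kernel="rbf").fit(X, y)
```

In the paper's setting, folding would shrink each image to its salient region before `lbp_histogram` is applied, reducing both feature-computation time and storage.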
A words-of-interest model of sketch representation for image retrieval
In this paper we propose a method for sketch-based image retrieval. Sketching is an intuitive medium for conveying a user's semantic intent, and retrieving images by sketch accords with the user's cognitive psychology. To narrow the semantic gap between the user and the images in the database, we preprocess all the images into sketches with the coherent line drawing algorithm. During sketch extraction, saliency maps are used to filter out redundant background information while preserving the important semantic information. We use a variant of the Words-of-Interest (WoI) model to retrieve images relevant to the user's query. The WoI model is based on the Bag-of-Visual-Words (BoW) model, which has proven successful for information retrieval. However, BoW ignores the spatial relationships among visual words, which are important for sketch representation. Our method exploits the spatial information of the query to select words of interest. Experimental results demonstrate that our sketch-based retrieval method achieves a good tradeoff between retrieval accuracy and semantic representation of the user's query.
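The BoW backbone that the Words-of-Interest model extends can be sketched as follows. The descriptors here are synthetic stand-ins (a real system would extract them from sketch strokes), and the WoI spatial word selection itself is omitted; only codebook learning, histogram encoding, and similarity ranking are shown.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
K = 8  # codebook size (number of visual words); real systems use far more

# Latent prototypes stand in for recurring local sketch structures.
prototypes = rng.standard_normal((K, 16))

def fake_descriptors(word_mix, n=50):
    """Hypothetical 16-D local descriptors drawn near the prototypes."""
    protos = prototypes[rng.choice(K, size=n, p=word_mix)]
    return protos + 0.05 * rng.standard_normal(protos.shape)

# 1) Learn the codebook by clustering pooled training descriptors.
train = np.vstack([fake_descriptors(np.full(K, 1 / K)) for _ in range(5)])
codebook = KMeans(n_clusters=K, n_init=10, random_state=0).fit(train)

# 2) Represent each sketch as a normalized visual-word histogram.
def bow_histogram(desc):
    hist = np.bincount(codebook.predict(desc), minlength=K).astype(float)
    return hist / hist.sum()

# 3) Rank database entries against a query by cosine similarity.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

mix_a = np.array([0.65] + [0.05] * 7)   # query dominated by one word
mix_b = np.array([0.05] * 7 + [0.65])   # a dissimilar database entry
query = bow_histogram(fake_descriptors(mix_a))
scores = [cosine(query, bow_histogram(fake_descriptors(m)))
          for m in (mix_a, mix_b)]
```

The WoI variant described above would additionally use the query's spatial layout to keep only the words of interest before the similarity is computed.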
Salient Regions for Query by Image Content
Much previous work on image retrieval has used global features such as colour and texture to describe the content of the image. However, these global features are insufficient to accurately describe the image content when different parts of the image have different characteristics. This paper discusses how this problem can be circumvented by using salient interest points. It presents an extension of previous work in which the concept of scale is incorporated into the selection of salient regions, so that the most interesting areas of the image are selected and local descriptors are generated to describe the image characteristics in each region. The paper describes two such salient region descriptors and compares them through their repeatability rate under a range of common image transforms. Finally, it investigates the performance of one of the salient region detectors in an image retrieval setting.
Component-based Attention for Large-scale Trademark Retrieval
The demand for large-scale trademark retrieval (TR) systems has significantly
increased to combat the rise in international trademark infringement.
Unfortunately, the ranking accuracy of current approaches using either
hand-crafted or pre-trained deep convolutional neural network (DCNN) features is
inadequate for large-scale deployments. We show in this paper that the ranking
accuracy of TR systems can be significantly improved by incorporating hard and
soft attention mechanisms, which direct attention to critical information such
as figurative elements and reduce attention given to distracting and
uninformative elements such as text and background. Our proposed approach
achieves state-of-the-art results on a challenging large-scale trademark
dataset.
Comment: Fix typos related to authors' information
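The soft-attention pooling described above can be sketched in a few lines. This is a conceptual stand-in, not the paper's model: the feature map is random, and a random projection plays the role of the learned scoring layer.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical DCNN feature map for one trademark image: H x W x C.
features = rng.standard_normal((7, 7, 64))

# Soft attention: one score per spatial location, softmax-normalized.
w = rng.standard_normal(64)          # stand-in for a learned scoring layer
scores = features @ w                # (7, 7)
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                 # attention weights sum to 1

# Attention-weighted pooling replaces plain average pooling, so locations
# deemed informative (e.g. figurative elements) dominate the descriptor
# while distracting regions such as text and background are down-weighted.
descriptor = (features * alpha[..., None]).sum(axis=(0, 1))  # (64,)
```

Hard attention would instead zero out (mask) the uninformative locations entirely before pooling.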
Variational recurrent sequence-to-sequence retrieval for stepwise illustration
We address and formalise the task of sequence-to-sequence (seq2seq) cross-modal retrieval. Given a sequence of text passages as query, the goal is to retrieve a sequence of images that best describes and aligns with the query. This new task extends traditional cross-modal retrieval, where each image-text pair is treated independently, ignoring broader context. We propose a novel variational recurrent seq2seq (VRSS) retrieval model for this task. Unlike most cross-modal methods, we generate an image vector corresponding to the latent topic obtained from combining the text semantics and context. This synthetic image embedding, associated with every text embedding, can then be employed for either image generation or image retrieval as desired. We evaluate the model on the application of stepwise illustration of recipes, where a sequence of relevant images is retrieved to best match the steps described in the text. To this end, we build and release a new Stepwise Recipe dataset for research purposes, containing 10K recipes (sequences of image-text pairs) with a total of 67K image-text pairs. To our knowledge, it is the first publicly available dataset to offer rich semantic descriptions in a focused category such as food or recipes. Our model outperforms several competitive and relevant baselines in our experiments. We also provide a qualitative analysis, through human evaluation and comparison with relevant existing methods, of how semantically meaningful our model's results are.
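The seq2seq retrieval task itself can be illustrated with a toy nearest-neighbour baseline. Everything here is assumed for illustration: the embeddings are random stand-ins for encoder outputs, and a simple blend of the previous step's vector crudely approximates the recurrent context the VRSS model actually learns.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical embeddings in a shared 32-D space: 5 recipe steps (text)
# and 20 candidate images, produced by some upstream encoders.
steps = rng.standard_normal((5, 32))
images = rng.standard_normal((20, 32))

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def retrieve_sequence(steps, images, blend=0.3):
    """Pick one image per step by cosine similarity, carrying over a
    fraction of the previous step's context vector."""
    images_n = normalize(images)
    picks, prev = [], np.zeros(steps.shape[1])
    for s in steps:
        ctx = normalize((1 - blend) * s + blend * prev)
        picks.append(int(np.argmax(images_n @ ctx)))
        prev = ctx
    return picks

picks = retrieve_sequence(steps, images)
```

The VRSS model replaces both the context blend and the raw embeddings with a learned variational recurrent state, but the retrieval interface, one ranked image per text step, is the same.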
ViTOR: Learning to Rank Webpages Based on Visual Features
The visual appearance of a webpage carries valuable information about its
quality and can be used to improve the performance of learning to rank (LTR).
We introduce the Visual learning TO Rank (ViTOR) model that integrates
state-of-the-art visual features extraction methods by (i) transfer learning
from a pre-trained image classification model, and (ii) synthetic saliency heat
maps generated from webpage snapshots. Since there is currently no public
dataset for the task of LTR with visual features, we also introduce and release
the ViTOR dataset, containing visually rich and diverse webpages. The ViTOR
dataset consists of visual snapshots, non-visual features and relevance
judgments for ClueWeb12 webpages and TREC Web Track queries. We experiment with
the proposed ViTOR model on the ViTOR dataset and show that it significantly
improves the performance of LTR with visual featuresComment: In Proceedings of the 2019 World Wide Web Conference (WWW 2019), May
2019, San Francisc