Search CORE

47,606 research outputs found

Image classification by visual bag-of-words refinement and reduction

Author: Lu Zhiwu
Wang Liwei
Wen Ji-Rong
Publication venue: 'Elsevier BV'
Publication date: 18/01/2015
Field of study

This paper presents a new framework for visual bag-of-words (BOW) refinement and reduction to overcome the drawbacks associated with the visual BOW model which has been widely used for image classification. Although very influential in the literature, the traditional visual BOW model has two distinct drawbacks. Firstly, for efficiency purposes, the visual vocabulary is commonly constructed by directly clustering the low-level visual feature vectors extracted from local keypoints, without considering the high-level semantics of images. That is, the visual BOW model still suffers from the semantic gap, and thus may lead to significant performance degradation in more challenging tasks (e.g. social image classification). Secondly, typically thousands of visual words are generated to obtain better performance on a relatively large image dataset. Due to such large vocabulary size, the subsequent image classification may take sheer amount of time. To overcome the first drawback, we develop a graph-based method for visual BOW refinement by exploiting the tags (easy to access although noisy) of social images. More notably, for efficient image classification, we further reduce the refined visual BOW model to a much smaller size through semantic spectral clustering. Extensive experimental results show the promising performance of the proposed framework for visual BOW refinement and reduction

arXiv.org e-Print Archive

Scene Graph Generation with External Knowledge and Image Reconstruction

Author: Cai Jianfei
Gu Jiuxiang
Li Sheng
Lin Zhe
Ling Mingyang
Zhao Handong
Publication venue
Publication date: 01/01/2019
Field of study

Scene graph generation has received growing attention with the advancements in image understanding tasks such as object detection, attributes and relationship prediction,~\etc. However, existing datasets are biased in terms of object and relationship labels, or often come with noisy and missing annotations, which makes the development of a reliable scene graph prediction model very challenging. In this paper, we propose a novel scene graph generation algorithm with external knowledge and image reconstruction loss to overcome these dataset issues. In particular, we extract commonsense knowledge from the external knowledge base to refine object and phrase features for improving generalizability in scene graph generation. To address the bias of noisy object annotations, we introduce an auxiliary image reconstruction path to regularize the scene graph generation network. Extensive experiments show that our framework can generate better scene graphs, achieving the state-of-the-art performance on two benchmark datasets: Visual Relationship Detection and Visual Genome datasets.Comment: 10 pages, 5 figures, Accepted in CVPR 201

arXiv.org e-Print Archive

Crossref

Monash University Research Portal

A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images

Author: Akagunduz Erdem
Ulku Irem
Publication venue
Publication date: 14/05/2020
Field of study

Semantic segmentation is the pixel-wise labelling of an image. Since the problem is defined at the pixel level, determining image class labels only is not acceptable, but localising them at the original image pixel resolution is necessary. Boosted by the extraordinary ability of convolutional neural networks (CNN) in creating semantic, high level and hierarchical image features; excessive numbers of deep learning-based 2D semantic segmentation approaches have been proposed within the last decade. In this survey, we mainly focus on the recent scientific developments in semantic segmentation, specifically on deep learning-based methods using 2D images. We started with an analysis of the public image sets and leaderboards for 2D semantic segmantation, with an overview of the techniques employed in performance evaluation. In examining the evolution of the field, we chronologically categorised the approaches into three main periods, namely pre-and early deep learning era, the fully convolutional era, and the post-FCN era. We technically analysed the solutions put forward in terms of solving the fundamental problems of the field, such as fine-grained localisation and scale invariance. Before drawing our conclusions, we present a table of methods from all mentioned eras, with a brief summary of each approach that explains their contribution to the field. We conclude the survey by discussing the current challenges of the field and to what extent they have been solved.Comment: Updated with new studie

arXiv.org e-Print Archive

Directory of Open Access Journals

Deep Lesion Graphs in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-scale Lesion Database

Author: A Esteva
A Teramoto
AAA Setio
Geert Litjens
Guohui Wei
Hayit Greenspan
HC Shin
Helen Hong
Hoo-Chang Shin
JANE BROMLEY
JH Moltz
Jinzheng Cai
Johannes Hofmanninger
Jonathan Krause
Jose Ramos
JS Silva
Ke Yan
Kenneth Clark
M Everingham
M Sevenster
Nima Tajbakhsh
Tsung-Yi Lin
Zhe Wang
Zhongyu Li
Publication venue
Publication date: 28/07/2018
Field of study

Radiologists in their daily work routinely find and annotate significant abnormalities on a large number of radiology images. Such abnormalities, or lesions, have collected over years and stored in hospitals' picture archiving and communication systems. However, they are basically unsorted and lack semantic annotations like type and location. In this paper, we aim to organize and explore them by learning a deep feature representation for each lesion. A large-scale and comprehensive dataset, DeepLesion, is introduced for this task. DeepLesion contains bounding boxes and size measurements of over 32K lesions. To model their similarity relationship, we leverage multiple supervision information including types, self-supervised location coordinates and sizes. They require little manual annotation effort but describe useful attributes of the lesions. Then, a triplet network is utilized to learn lesion embeddings with a sequential sampling strategy to depict their hierarchical similarity structure. Experiments show promising qualitative and quantitative results on lesion retrieval, clustering, and classification. The learned embeddings can be further employed to build a lesion graph for various clinically useful applications. We propose algorithms for intra-patient lesion matching and missing annotation mining. Experimental results validate their effectiveness.Comment: Accepted by CVPR2018. DeepLesion url adde

arXiv.org e-Print Archive

Crossref

Context-Aware Zero-Shot Recognition

Author: Han Bohyung
Luo Ruotian
Yang Linjie
Zhang Ning
Publication venue
Publication date: 24/04/2019
Field of study

We present a novel problem setting in zero-shot learning, zero-shot object recognition and detection in the context. Contrary to the traditional zero-shot learning methods, which simply infers unseen categories by transferring knowledge from the objects belonging to semantically similar seen categories, we aim to understand the identity of the novel objects in an image surrounded by the known objects using the inter-object relation prior. Specifically, we leverage the visual context and the geometric relationships between all pairs of objects in a single image, and capture the information useful to infer unseen categories. We integrate our context-aware zero-shot learning framework into the traditional zero-shot learning techniques seamlessly using a Conditional Random Field (CRF). The proposed algorithm is evaluated on both zero-shot region classification and zero-shot detection tasks. The results on Visual Genome (VG) dataset show that our model significantly boosts performance with the additional visual context compared to traditional methods

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Automatic Leaf Extraction from Outdoor Images

Author: Anantrasirichai N.
Canagarajah Nishan
Hannuna Sion
Publication venue
Publication date: 19/09/2017
Field of study

Automatic plant recognition and disease analysis may be streamlined by an image of a complete, isolated leaf as an initial input. Segmenting leaves from natural images is a hard problem. Cluttered and complex backgrounds: often composed of other leaves are commonplace. Furthermore, their appearance is highly dependent upon illumination and viewing perspective. In order to address these issues we propose a methodology which exploits the leaves venous systems in tandem with other low level features. Background and leaf markers are created using colour, intensity and texture. Two approaches are investigated: watershed and graph-cut and results compared. Primary-secondary vein detection and a protrusion-notch removal are applied to refine the extracted leaf. The efficacy of our approach is demonstrated against existing work.Comment: 13 pages, India-UK Advanced Technology Centre of Excellence in Next Generation Networks, Systems and Services (IU-ATC), 201

arXiv.org e-Print Archive

CiteSeerX