Robust object representation by boosting-like deep learning architecture
This paper presents a new deep learning architecture for robust object representation that efficiently combines the proposed synchronized multi-stage feature (SMF) with a boosting-like algorithm. The SMF structure captures a variety of characteristics of the input object by fusing handcrafted features with deep learned features. With the proposed boosting-like algorithm, training the multi-layer network on boosted samples yields greater convergence stability. We show the generality of our object representation architecture by applying it to various tasks, i.e., pedestrian detection and action recognition. Our approach reduces the average miss rate by 15.89% and 3.85% compared with ACF and JointDeep, respectively, on the largest Caltech dataset, and achieves competitive results on the MSRAction3D dataset.
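The boosting-like idea of re-emphasising hard samples between training stages can be sketched roughly as follows. This is a hypothetical AdaBoost-style weight update, not the paper's exact algorithm; the function name `boost_weights` and the rate `lr` are illustrative assumptions.

```python
import numpy as np

def boost_weights(weights, losses, lr=0.5):
    """AdaBoost-style reweighting: samples with higher loss receive
    larger weights in the next training stage.
    (Illustrative sketch, not the paper's exact update rule.)"""
    w = weights * np.exp(lr * losses)   # emphasise hard samples
    return w / w.sum()                  # renormalise to a distribution

# toy usage: three samples, the last one is hardest
w = np.ones(3) / 3
losses = np.array([0.1, 0.2, 0.9])
w = boost_weights(w, losses)
```

Stages trained on samples drawn from such a distribution see the hard examples more often, which is one plausible reading of the "boosted samples" used in the paper.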
Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification
Designing discriminative powerful texture features robust to realistic
imaging conditions is a challenging computer vision problem with many
applications, including material recognition and analysis of satellite or
aerial imagery. In the past, most texture description approaches were based on
dense orderless statistical distribution of local features. However, most
recent approaches to texture recognition and remote sensing scene
classification are based on Convolutional Neural Networks (CNNs). The de facto
practice when learning these CNN models is to use RGB patches as input with
training performed on large amounts of labeled data (ImageNet). In this paper,
we show that Binary Patterns encoded CNN models, codenamed TEX-Nets, trained
using mapped coded images with explicit texture information provide
complementary information to the standard RGB deep models. Additionally, two
deep architectures, namely early and late fusion, are investigated to combine
the texture and color information. To the best of our knowledge, we are the
first to investigate Binary Patterns encoded CNNs and different deep network
fusion architectures for texture recognition and remote sensing scene
classification. We perform comprehensive experiments on four texture
recognition datasets and four remote sensing scene classification benchmarks:
UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with
7 categories and the recently introduced large scale aerial image dataset (AID)
with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary
information to standard RGB deep model of the same network architecture. Our
late fusion TEX-Net architecture always improves the overall performance
compared to the standard RGB network on both recognition problems. Our final
combination outperforms the state-of-the-art without employing fine-tuning or
ensemble of RGB network architectures. (Comment: To appear in ISPRS Journal of Photogrammetry and Remote Sensing.)
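The binary-pattern coding that such mapped coded images build on can be illustrated with a minimal basic LBP map. This is a sketch only: TEX-Nets are trained on mapped LBP variants, and this is not necessarily the authors' exact preprocessing.

```python
import numpy as np

def lbp_map(img):
    """Basic 8-neighbour Local Binary Pattern: for each interior pixel,
    neighbours >= centre contribute one bit of an 8-bit code. Such a code
    image could serve as a texture-coded input channel alongside RGB.
    (Illustrative sketch of the basic LBP, not a mapped variant.)"""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # neighbour offsets in clockwise order starting from top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[1:h-1, 1:w-1]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1+dy:h-1+dy, 1+dx:w-1+dx]
        out |= ((neigh >= centre).astype(np.uint8) << bit)
    return out

codes = lbp_map(np.arange(16, dtype=np.uint8).reshape(4, 4))
```

For production use, library implementations such as scikit-image's `local_binary_pattern` provide rotation-invariant and uniform mappings of this basic code.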
Dynamic fast local Laplacian completed local ternary pattern (dynamic FLapCLTP) for face recognition
Face recognition has become one of the typical biometric authentication technologies for high-security systems. Feature extraction is one of the most important steps in a face recognition system: the important and interesting parts of the image are represented as a compact feature vector. Many features, such as texture, colour and shape, have been proposed in image processing, and they can be classified as global or local depending on the image area from which they are extracted. Texture descriptors have recently played a crucial role as local descriptors. Different types of texture descriptors, such as the local binary pattern (LBP), local ternary pattern (LTP), completed local binary pattern (CLBP) and completed local ternary pattern (CLTP), have been proposed and utilised for face recognition tasks, and all have achieved good recognition accuracy. Although the LBP performs well in different tasks, it has two limitations: it is sensitive to noise, and it occasionally fails to distinguish between two different texture patterns that share the same LBP code. Most texture descriptors inherit these limitations from LBP. CLTP was proposed to overcome them and performs well on different image processing tasks, such as image classification and face recognition. However, CLTP suffers from two limitations of its own that may affect its performance: the threshold value used during CLTP extraction is fixed regardless of the dataset or system, and the CLTP histogram is longer than that of previous descriptors. This study focuses on the first limitation, threshold selection.
Firstly, a new texture descriptor is proposed by integrating the fast local Laplacian filter with the CLTP descriptor, namely the fast local Laplacian CLTP (FLapCLTP). The fast local Laplacian filter can improve the performance of CLTP through its extensive detail enhancement and tone mapping. A dynamic FLapCLTP is then proposed to address the fixed-threshold issue: instead of using one fixed threshold value for all datasets, a dynamic value is selected based on the image pixel values, so each texture pattern has its own threshold for extracting FLapCLTP. This dynamic value is selected automatically according to the centre value of the texture pattern. Finally, the proposed FLapCLTP and dynamic FLapCLTP are evaluated for facial recognition using the ORL Faces, Sheffield Face, Collection Facial Images, Georgia Tech Face, Caltech Pedestrian Faces 1999, JAFFE, FEI Face and YALE datasets. The results show the superiority of the proposed descriptors over previous texture descriptors: the dynamic FLapCLTP achieves the highest recognition accuracy rates of 100%, 99.96%, 99.75%, 99.69%, 94.86%, 90.33%, 86.86% and 82.43% on the UMIST, Collection Facial Images, JAFFE, ORL, Georgia Tech, YALE, Caltech 1999 and FEI datasets, respectively.
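The dynamic-threshold idea, deriving the ternary band from the centre value of each pattern rather than using one global constant, can be sketched as follows. The `alpha`-scaling rule here is a hypothetical stand-in for the paper's actual selection formula.

```python
import numpy as np

def ternary_codes(neighbours, centre, alpha=0.1):
    """Ternary encoding in the spirit of LTP/CLTP: each neighbour is
    coded +1 / 0 / -1 depending on whether it lies above, within, or
    below a band of half-width t around the centre pixel. Here t is
    derived from the centre value (dynamic, per pattern) instead of
    being a fixed constant as in classical LTP/CLTP; `alpha` is a
    hypothetical scaling factor, not the paper's exact rule."""
    t = alpha * centre                       # dynamic, per-pattern threshold
    codes = np.zeros_like(neighbours, dtype=np.int8)
    codes[neighbours >= centre + t] = 1
    codes[neighbours <= centre - t] = -1
    return codes

# toy usage: centre 120 gives a band of [108, 132]
codes = ternary_codes(np.array([100, 118, 125, 140]), centre=120.0)
```

A fixed threshold would use the same band width for every pattern in every dataset; tying `t` to the centre value makes the tolerance scale with local brightness, which is the intuition behind the dynamic variant.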
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Remote sensing image scene classification plays an important role in a wide
range of applications and hence has been receiving remarkable attention. During
the past years, significant efforts have been made to develop various datasets
or present a variety of approaches for scene classification from remote sensing
images. However, a systematic review of the literature concerning datasets and
methods for scene classification is still lacking. In addition, almost all
existing datasets have a number of limitations, including the small scale of
scene classes and the image numbers, the lack of image variations and
diversity, and the saturation of accuracy. These limitations severely limit the
development of new approaches especially deep learning-based methods. This
paper first provides a comprehensive review of the recent progress. Then, we
propose a large-scale dataset, termed "NWPU-RESISC45", which is a publicly
available benchmark for REmote Sensing Image Scene Classification (RESISC),
created by Northwestern Polytechnical University (NWPU). This dataset contains
31,500 images, covering 45 scene classes with 700 images in each class. The
proposed NWPU-RESISC45 (i) is large-scale on the scene classes and the total
image number, (ii) holds big variations in translation, spatial resolution,
viewpoint, object pose, illumination, background, and occlusion, and (iii) has
high within-class diversity and between-class similarity. The creation of this
dataset will enable the community to develop and evaluate various data-driven
algorithms. Finally, several representative methods are evaluated using the
proposed dataset and the results are reported as a useful baseline for future
research. (Comment: This manuscript is the accepted version for Proceedings of the IEEE.)
Very High Resolution (VHR) Satellite Imagery: Processing and Applications
Recently, growing interest has appeared in the use of remote sensing imagery to provide synoptic maps of water quality parameters in coastal and inland water ecosystems; to monitor complex land ecosystems for biodiversity conservation; for precision agriculture in the management of soils, crops and pests; for urban planning; for disaster monitoring; etc. However, for these maps to achieve their full potential, periodic monitoring and analysis of multi-temporal changes are essential. In this context, very high resolution (VHR) satellite-based optical, infrared and radar imaging instruments provide reliable information for implementing spatially based conservation actions. Moreover, they enable observation of environmental parameters at broader spatial and finer temporal scales than field observation alone allows. Recent VHR satellite technologies and image processing algorithms thus present the opportunity to develop quantitative techniques with the potential to improve on traditional techniques in terms of cost, mapping fidelity and objectivity. Typical applications include multi-temporal classification, recognition and tracking of specific patterns, multisensor data fusion, analysis of land/marine ecosystem processes and environment monitoring. This book aims to collect new developments, methodologies and applications of very high resolution satellite data for remote sensing. The selected works provide the research community with the most recent advances in all aspects of VHR satellite remote sensing.
Deep learning for land cover and land use classification
Recent advances in sensor technologies have produced a vast amount of very fine spatial resolution (VFSR) remotely sensed imagery collected on a daily basis. These VFSR images present fine spatial details that are spectrally and spatially complex, posing huge challenges for automatic land cover (LC) and land use (LU) classification. Deep learning has reignited the pursuit of artificial intelligence towards a general-purpose machine able to perform human-level tasks in an automated fashion. This is largely driven by its ability to model high-level abstractions through hierarchical feature representations without human-designed features or rules, which shows great potential for identifying and characterising LC and LU patterns from VFSR imagery. In this thesis, a set of novel deep learning methods is developed for LC and LU image classification, using deep convolutional neural networks (CNNs) as an example. Several difficulties, however, are encountered when applying the standard pixel-wise CNN to LC and LU classification with VFSR images, including geometric distortions, boundary uncertainties and huge computational redundancy. These technical challenges for LC classification were solved either by rule-based decision fusion or through uncertainty modelling using rough set theory. For land use, an object-based CNN method was proposed, in which each segmented object (a group of homogeneous pixels) is sampled and predicted by a CNN using both within-object and between-object information, so LU is classified with high accuracy and efficiency. LC and LU form a hierarchical ontology over the same geographical space; these representations are modelled by their joint distribution, in which LC and LU are classified simultaneously through iteration.
These deep learning techniques achieved by far the highest classification accuracy for both LC and LU, up to around 90%, about 5% higher than existing deep learning methods and 10% greater than traditional pixel-based and object-based approaches. This research makes a significant contribution to LC and LU classification through deep learning based innovations and has great potential utility in a wide range of geospatial applications.
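The object-based prediction step, classifying each segmented object with a patch classifier instead of every pixel, might be sketched like this. It is an illustrative simplification with a toy stand-in for the trained CNN, not the thesis implementation; the names `classify_objects` and `predict_patch` are assumptions.

```python
import numpy as np

def classify_objects(image, segments, predict_patch, size=8):
    """Object-based classification sketch: for each segmented object
    (a group of homogeneous pixels), crop a patch around its centroid
    and let a patch classifier assign one label to the whole object.
    `predict_patch` stands in for a trained CNN.
    (Illustrative simplification of the object-based CNN idea.)"""
    labels = {}
    half = size // 2
    for obj_id in np.unique(segments):
        ys, xs = np.nonzero(segments == obj_id)
        cy, cx = int(ys.mean()), int(xs.mean())
        # clamp the crop so it stays inside the image
        y0 = min(max(cy - half, 0), image.shape[0] - size)
        x0 = min(max(cx - half, 0), image.shape[1] - size)
        labels[obj_id] = predict_patch(image[y0:y0+size, x0:x0+size])
    return labels

# toy usage: mean brightness as a stand-in "CNN"
img = np.zeros((16, 16)); img[:, 8:] = 1.0
segs = np.zeros((16, 16), dtype=int); segs[:, 8:] = 1
labels = classify_objects(img, segs, lambda p: int(p.mean() > 0.5))
```

Predicting once per object rather than once per pixel is what yields the efficiency gain the thesis reports, since a segmented scene typically contains orders of magnitude fewer objects than pixels.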