69,228 research outputs found

Joint-SRVDNet: Joint Super Resolution and Vehicle Detection Network

Author: Ferdous Syeda Nyma
Mostofa Moktari
Nasrabadi Nasser M.
Riggan Benjamin S.
Publication date: 03/05/2020

In many domestic and military applications, aerial vehicle detection and super-resolution algorithms are frequently developed and applied independently. However, aerial vehicle detection on super-resolved images remains a challenging task due to the lack of discriminative information in the super-resolved images. To address this problem, we propose a Joint Super-Resolution and Vehicle Detection Network (Joint-SRVDNet) that tries to generate discriminative, high-resolution images of vehicles from low-resolution aerial images. First, aerial images are up-scaled by a factor of 4x using a Multi-scale Generative Adversarial Network (MsGAN), which has multiple intermediate outputs with increasing resolutions. Second, a detector is trained on super-resolved images that are up-scaled by a factor of 4x using the MsGAN architecture. Finally, the detection loss is minimized jointly with the super-resolution loss to encourage the target detector to be sensitive to the subsequent super-resolution training. The network jointly learns hierarchical and discriminative features of targets and produces optimal super-resolution results. We perform both quantitative and qualitative evaluation of our proposed network on the VEDAI, xView and DOTA datasets. The experimental results show that our proposed framework achieves better visual quality than the state-of-the-art methods for aerial super-resolution with a 4x up-scaling factor and improves the accuracy of aerial vehicle detection.
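
The abstract says only that the detection loss is minimized jointly with the super-resolution loss. It does not give the form of the combined objective. One common way to write such a joint objective, shown here purely as an illustrative assumption, is a weighted sum, where the weight \lambda and the symbols G (super-resolution generator) and D (detector) are hypothetical names not taken from the paper:

```latex
\mathcal{L}_{\text{joint}}(G, D)
  = \mathcal{L}_{\text{SR}}(G)
  + \lambda \, \mathcal{L}_{\text{det}}\bigl(D, G(x_{\text{LR}})\bigr),
  \qquad \lambda > 0
```

Here x_LR stands for a low-resolution input image. Gradients of the detection term would flow back through G, which is one way the super-resolved output could be pushed toward detector-relevant detail.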

arXiv.org e-Print Archive

DigitalCommons@University of Nebraska

Ancient Coin Classification Using Graph Transduction Games

Author: Aslan Sinem
Pelillo Marcello
Vascon Sebastiano
Publication date: 01/01/2018

Recognizing the type of an ancient coin requires theoretical expertise and years of experience in the field of numismatics. Our goal in this work is to automate this time-consuming and demanding task with a visual classification framework. Specifically, we propose to model ancient coin image classification using Graph Transduction Games (GTG). GTG casts the classification problem as a non-cooperative game in which the players (the coin images) decide their strategies (class labels) according to the choices made by the others, which results in a global consensus at the final labeling. Experiments are conducted on the only publicly available dataset, which is composed of 180 images of 60 types of Roman coins. We demonstrate that our approach outperforms prior work on the same dataset, with classification accuracies of 73.6% and 87.3% when there are one and two images per class in the training set, respectively.
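
The game described in this abstract can be sketched in a few lines of code. The version below is a minimal illustration and not the authors' implementation. It assumes the standard replicator-dynamics update often used for graph transduction games, a precomputed pairwise similarity matrix, and hypothetical names (`graph_transduction_game`, `similarity`, `labels`). Labeled items keep a fixed one-hot strategy. Unlabeled items start from a uniform strategy and shift probability toward the classes their similar neighbors support.

```python
def graph_transduction_game(similarity, labels, num_classes, iterations=50):
    """Label unlabeled items via replicator dynamics (illustrative sketch).

    similarity: n x n list of lists with non-negative pairwise similarities.
    labels: list of length n; a class index for labeled items, None otherwise.
    Returns the most probable class index for every item.
    """
    n = len(similarity)

    # Mixed strategies: one-hot for labeled players, uniform for unlabeled ones.
    strategies = []
    for label in labels:
        if label is None:
            strategies.append([1.0 / num_classes] * num_classes)
        else:
            strategies.append([1.0 if c == label else 0.0
                               for c in range(num_classes)])

    for _ in range(iterations):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                # Labeled players never change their strategy.
                updated.append(strategies[i])
                continue
            # Payoff of class c for player i: similarity-weighted support
            # that the other players currently give to class c.
            payoff = [
                sum(similarity[i][j] * strategies[j][c]
                    for j in range(n) if j != i)
                for c in range(num_classes)
            ]
            weighted = [strategies[i][c] * payoff[c]
                        for c in range(num_classes)]
            total = sum(weighted)
            if total > 0:
                # Replicator update: grow classes with above-average payoff.
                updated.append([w / total for w in weighted])
            else:
                updated.append(strategies[i])
        strategies = updated

    return [max(range(num_classes), key=lambda c: s[c]) for s in strategies]
```

In practice the similarity matrix would come from image features, and how it is built is outside the scope of this sketch.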

arXiv.org e-Print Archive

Crossref

Ege University Institutional Repository

Automatic Discovery, Association Estimation and Learning of Semantic Attributes for a Thousand Categories

Author: Al-Halah Ziad
Stiefelhagen Rainer
Publication date: 11/04/2017

Attribute-based recognition models, due to their impressive performance and their ability to generalize well on novel categories, have been widely adopted for many computer vision applications. However, usually both the attribute vocabulary and the class-attribute associations have to be provided manually by domain experts or a large number of annotators. This is very costly and not necessarily optimal regarding recognition performance, and most importantly, it limits the applicability of attribute-based models to large-scale data sets. To tackle this problem, we propose an end-to-end unsupervised attribute learning approach. We utilize online text corpora to automatically discover a salient and discriminative vocabulary that correlates well with the human concept of semantic attributes. Moreover, we propose a deep convolutional model to optimize class-attribute associations with a linguistic prior that accounts for noise and missing data in text. In a thorough evaluation on ImageNet, we demonstrate that our model is able to efficiently discover and learn semantic attributes at a large scale. Furthermore, we demonstrate that our model outperforms the state-of-the-art in zero-shot learning on three data sets: ImageNet, Animals with Attributes and aPascal/aYahoo. Finally, we enable attribute-based learning on ImageNet and will share the attributes and associations for future research.

Comment: Accepted as a conference paper at CVPR 201

arXiv.org e-Print Archive

Crossref

Analysis of Deep Neural Networks for Military Target Classification using Synthetic Aperture Radar Images

Author: Jacob S.
Sharif S.
Wall J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2023

Target detection and classification is a highly significant area on modern battlefields. Synthetic Aperture Radar images are especially valuable for classifying targets, as they are high-resolution images of the surface of the earth created using microwave radiation and can be acquired anytime, anywhere, and in any weather conditions. A target classification system using deep learning to classify military vehicles from Synthetic Aperture Radar images is proposed in this study. The system uses a baseline Convolutional Neural Network to classify the images of military vehicles from the MSTAR dataset, achieving a baseline accuracy of 90%. Transfer learning was then applied to the system using 5 different pre-trained networks, namely InceptionV3, VGG16, VGG19, ResNet50, and MobileNet. These models were analysed and evaluated using 3 different evaluation metrics, the confusion matrix, classification report, and Mean Average Precision, to discover the most accurate and efficient model for this task. The VGG16 and MobileNet models displayed the best performance on the dataset, achieving accuracies of 98% and 97%, respectively. The ResNet50 model displayed the worst performance among the models, achieving an accuracy of 82%. The other models, InceptionV3 and VGG19, achieved accuracies of 92% and 96%, respectively.

UEL Research Repository at University of East London

Deep object classification in low resolution LWIR imagery via transfer learning

Author: Abbott R.
Connor B.
Del Rincon J. M.
Robertson N.
Publication date: 23/11/2017

Queen's University Belfast Research Portal

Applying psychological science to the CCTV review process: a review of cognitive and ergonomic literature

Author: Hillstrom Anne
Hope Lorraine
Nee Claire
Publication venue: The Stationary Office
Publication date: 21/03/2008

As CCTV cameras are used more and more often to increase security in communities, police are spending a larger proportion of their resources, including time, in processing CCTV images when investigating crimes that have occurred (Levesley & Martin, 2005; Nichols, 2001). As with all tasks, there are ways to approach this task that will facilitate performance and other approaches that will degrade performance, either by increasing errors or by unnecessarily prolonging the process. A clearer understanding of psychological factors influencing the effectiveness of footage review will facilitate future training in best practice with respect to the review of CCTV footage. The goal of this report is to provide such understanding by reviewing research on footage review, research on related tasks that require similar skills, and experimental laboratory research about the cognitive skills underpinning the task. The report is organised to address five challenges to the effectiveness of CCTV review: the effects of the degraded nature of CCTV footage, distractions and interruptions, the length of the task, inappropriate mindset, and variability in people’s abilities and experience. Recommendations for optimising CCTV footage review include (1) doing a cognitive task analysis to increase understanding of the ways in which performance might be limited, (2) exploiting technology advances to maximise the perceptual quality of the footage, (3) training people to improve the flexibility of their mindset as they perceive and interpret the images seen, (4) monitoring performance either on an ongoing basis, by using psychophysiological measures of alertness, or periodically, by testing screeners’ ability to find evidence in footage developed for such testing, and (5) evaluating the relevance of possible selection tests to screen effective from ineffective screeners.

Southampton (e-Prints Soton)

Portsmouth University Research Portal (Pure)