Handwritten and machine-printed text discrimination using a template matching approach
We propose a novel template matching approach for the discrimination of handwritten and machine-printed text. We first pre-process the scanned document images by performing denoising, circle/line exclusion and word-block level segmentation. We then align and match characters in a flexibly sized gallery with the segmented regions, using parallelised normalised cross-correlation. The experimental results over the Pattern Recognition & Image Analysis Research Lab-Natural History Museum (PRImA-NHM) dataset show remarkably high robustness of the algorithm in classifying cluttered, occluded and noisy samples, in addition to those with a significantly high proportion of missing data. The algorithm, which achieves an 84.0% classification rate with a false positive rate of 0.16 over the dataset, does not require training samples and generates compelling results compared with the training-based approaches that have used the same benchmark.
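The matching step described above can be illustrated with a minimal sketch of zero-mean normalised cross-correlation. It is not the authors' implementation: the function names `normalised_cross_correlation` and `best_match` are hypothetical, the search is a plain exhaustive slide rather than the parallelised version the abstract mentions, and the alignment and gallery handling are omitted.

```python
import numpy as np

def normalised_cross_correlation(patch, template):
    """Zero-mean NCC between two equally sized 2-D arrays, in [-1, 1]."""
    p = patch.astype(float) - patch.mean()
    t = template.astype(float) - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    if denom == 0.0:
        # A constant patch or template carries no structure to correlate.
        return 0.0
    return float((p * t).sum() / denom)

def best_match(region, template):
    """Slide the template over the region; return (best score, (row, col))."""
    rh, rw = region.shape
    th, tw = template.shape
    best_score, best_pos = -1.0, (0, 0)
    for r in range(rh - th + 1):
        for c in range(rw - tw + 1):
            window = region[r:r + th, c:c + tw]
            score = normalised_cross_correlation(window, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos
```

In a training-free classifier of this kind, one plausible design (an assumption here, not stated in the abstract) would be to compare the best score over a gallery of printed characters against a threshold to decide whether a segmented region is machine-printed.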
POL-LWIR Vehicle Detection: Convolutional Neural Networks Meet Polarised Infrared Sensors
For vehicle autonomy, driver assistance and situational awareness, it is
necessary to operate at day and night, and in all weather conditions. In
particular, long wave infrared (LWIR) sensors that receive predominantly
emitted radiation have the capability to operate at night as well as during the
day. In this work, we employ a polarised LWIR (POL-LWIR) camera to acquire data
from a mobile vehicle, to compare and contrast four different convolutional
neural network (CNN) configurations to detect other vehicles in video
sequences. We evaluate two distinct and promising approaches, two-stage
detection (Faster-RCNN) and one-stage detection (SSD), in four different
configurations. We also employ two different image decompositions: the first
based on the polarisation ellipse and the second on the Stokes parameters
themselves. To evaluate our approach, the experimental trials were quantified
by mean average precision (mAP) and processing time, showing a clear trade-off
between the two factors. For example, the best mAP result of 80.94% was
achieved using Faster-RCNN, but at a frame rate of 6.4 fps. In contrast,
MobileNet SSD achieved only 64.51% mAP, but at 53.4 fps.
Comment: Computer Vision and Pattern Recognition Workshop 201
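As a rough illustration of the two image decompositions mentioned in this abstract, the sketch below computes the linear Stokes parameters and, from them, the degree and angle of linear polarisation that describe the polarisation ellipse. It assumes the sensor provides intensity images behind polarisers at 0, 45, 90 and 135 degrees, which the abstract does not state. The function names are hypothetical, and the exact channels fed to the CNNs in the paper may differ.

```python
import numpy as np

def stokes_from_intensities(i0, i45, i90, i135):
    """Linear Stokes parameters from four polariser-angle intensity images.

    Assumes measurements at 0, 45, 90 and 135 degrees (an assumption,
    not taken from the abstract).
    """
    i0, i45, i90, i135 = (
        np.asarray(x, dtype=float) for x in (i0, i45, i90, i135)
    )
    s0 = i0 + i90      # total intensity
    s1 = i0 - i90      # horizontal vs vertical component
    s2 = i45 - i135    # diagonal components
    return s0, s1, s2

def linear_polarisation(s0, s1, s2):
    """Degree (0..1) and angle (radians) of linear polarisation."""
    # Pixels with zero total intensity are assigned a degree of 0.
    dolp = np.divide(
        np.hypot(s1, s2), s0, out=np.zeros_like(s0), where=s0 > 0
    )
    aolp = 0.5 * np.arctan2(s2, s1)
    return dolp, aolp
```

Stacking either `(s0, s1, s2)` or `(s0, dolp, aolp)` as three image channels would be one way to present each decomposition to a standard detector input.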
Noise modelling for denoising and 3D face recognition algorithms performance evaluation
This study proposes an algorithm to quantitatively evaluate the performance of three‐dimensional (3D) holistic face recognition algorithms when various denoising methods are used. First, a method is proposed to model the noise on the 3D face datasets. The model not only identifies those regions of the face that are sensitive to noise but can also be used to simulate noise for any given 3D face. Then, by incorporating the noise model in a novel 3D face recognition pipeline, seven different classification and matching methods and six denoising techniques are used to quantify the performance of the face recognition algorithms at different noise powers. The outcome: (i) shows the most reliable parameters for the denoising methods to be used in a 3D face recognition pipeline; (ii) shows which parts of the face are more vulnerable to noise and require further post‐processing after data acquisition; and (iii) compares the performance of three different categories of recognition algorithms: training‐free matching‐based, subspace projection‐based and training‐based (without projection) classifiers. The results show the high performance of the bootstrap aggregating tree classifiers and median filtering for very high intensity noise. Moreover, when different noisy/denoised samples are used as probes or in the gallery, the matching algorithms significantly outperform the training‐based (including the subspace projection) methods.
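To make the denoising step concrete, the following minimal sketch adds noise to a depth map and removes it with a 3x3 median filter, one of the techniques the abstract names. The additive Gaussian noise used here is a stand-in assumption and is not the noise model proposed in the study. The function names and the `sigma` parameter are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def add_gaussian_noise(depth, sigma, seed=0):
    """Add zero-mean Gaussian noise of standard deviation sigma.

    A placeholder for the study's own noise model, which the abstract
    does not specify.
    """
    rng = np.random.default_rng(seed)
    return depth + rng.normal(0.0, sigma, size=depth.shape)

def median_filter_3x3(depth):
    """3x3 median filter with edge replication at the borders."""
    padded = np.pad(depth, 1, mode="edge")
    windows = sliding_window_view(padded, (3, 3))
    # Each output pixel is the median of its 3x3 neighbourhood.
    return np.median(windows, axis=(-2, -1))
```

Sweeping `sigma` over a range of values and re-running a recogniser on the filtered output would mirror, in simplified form, the evaluation across noise powers described above.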
Image processing for surveillance and security
Security is a fundamental issue in today's world. In this chapter we discuss various aspects of security in daily life that can be addressed using image processing techniques, grouping them into three main categories: visual tracking, biometrics and digital media security. Visual tracking refers to computer vision techniques that analyse the scene to extract features representing objects (e.g., pedestrians) and track them, providing input for the analysis of anomalous behaviour. Biometrics is the technology of detecting, extracting and analysing human physical or behavioural features for identification purposes. Digital media security typically includes multimedia signal processing techniques that can protect copyright by embedding information within the media content using watermarking approaches. Individual topics are discussed with reference to recent literature.