1,889 research outputs found
Pattern Recognition
Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one, two or three dimensional, the processing is done in real- time or takes hours and days, some systems look for one narrow object class, others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition
Towards robust 3D face recognition from noisy range images with low resolution
For a number of different security and industrial applications, there is the need for reliable person identification methods. Among these methods, face recognition has a number of advantages such as being non-invasive and potentially covert. Since the device for data acquisition is a conventional camera, other advantages of a 2D face recognition system are its low data capture duration and its low cost. However, the recent introduction of fast and comparatively inexpensive time-of-flight (TOF) cameras for the recording of 2.5D range data calls for a closer look at 3D face recognition in this context. One major disadvantage, however, is the low quality of the data aquired with such cameras. In this paper, we introduce a robust 3D face recognition system based on such noisy range images with low resolution
Robust watermarking for magnetic resonance images with automatic region of interest detection
Medical image watermarking requires special considerations compared to ordinary watermarking methods. The first issue is the detection of an important area of the image called the Region of Interest (ROI) prior to starting the watermarking process. Most existing ROI detection procedures use manual-based methods, while in automated methods the robustness against intentional or unintentional attacks has not been considered extensively. The second issue is the robustness of the embedded watermark against different attacks. A common drawback of existing watermarking methods is their weakness against salt and pepper noise. The research carried out in this thesis addresses these issues of having automatic ROI detection for magnetic resonance images that are robust against attacks particularly the salt and pepper noise and designing a new watermarking method that can withstand high density salt and pepper noise. In the ROI detection part, combinations of several algorithms such as morphological reconstruction, adaptive thresholding and labelling are utilized. The noise-filtering algorithm and window size correction block are then introduced for further enhancement. The performance of the proposed ROI detection is evaluated by computing the Comparative Accuracy (CA). In the watermarking part, a combination of spatial method, channel coding and noise filtering schemes are used to increase the robustness against salt and pepper noise. The quality of watermarked image is evaluated using Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM), and the accuracy of the extracted watermark is assessed in terms of Bit Error Rate (BER). Based on experiments, the CA under eight different attacks (speckle noise, average filter, median filter, Wiener filter, Gaussian filter, sharpening filter, motion, and salt and pepper noise) is between 97.8% and 100%. The CA under different densities of salt and pepper noise (10%-90%) is in the range of 75.13% to 98.99%. In the watermarking part, the performance of the proposed method under different densities of salt and pepper noise measured by total PSNR, ROI PSNR, total SSIM and ROI SSIM has improved in the ranges of 3.48-23.03 (dB), 3.5-23.05 (dB), 0-0.4620 and 0-0.5335 to 21.75-42.08 (dB), 20.55-40.83 (dB), 0.5775-0.8874 and 0.4104-0.9742 respectively. In addition, the BER is reduced to the range of 0.02% to 41.7%. To conclude, the proposed method has managed to significantly improve the performance of existing medical image watermarking methods
An Adaptive Threshold based FPGA Implementation for Object and Face detection
The moving object and face detection are vital requirement for real time security applications. In this paper, we propose an Adaptive Threshold based FPGA Implementation for Object and Face detection. The input Images and reference Images are preprocessed using Gaussian Filter to smoothen the high frequency components. The 2D-DWT is applied on Gaussian filter outputs and only LL bands are considered for further processing. The modified background with adaptive threshold are used to detect the object with LL band of reference image. The detected object is passed through Gaussian filter to enhance the quality of object. The matching unit is designed to recognize face from standard face database images. It is observed that the performance parameters such as percentage TSR and hardware utilizations are better compared to existing techniques
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts.Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics
and Visio
Recommended from our members
Evaluation and analysis of hybrid intelligent pattern recognition techniques for speaker identification
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.The rapid momentum of the technology progress in the recent years has led to a tremendous rise in the use of biometric authentication systems. The objective of this research is to investigate the problem
of identifying a speaker from its voice regardless of the content (i.e.
text-independent), and to design efficient methods of combining face and voice in producing a robust authentication system.
A novel approach towards speaker identification is developed using
wavelet analysis, and multiple neural networks including Probabilistic
Neural Network (PNN), General Regressive Neural Network (GRNN)and Radial Basis Function-Neural Network (RBF NN) with the AND
voting scheme. This approach is tested on GRID and VidTIMIT cor-pora and comprehensive test results have been validated with state-
of-the-art approaches. The system was found to be competitive and it improved the recognition rate by 15% as compared to the classical Mel-frequency Cepstral Coe±cients (MFCC), and reduced the recognition time by 40% compared to Back Propagation Neural Network (BPNN), Gaussian Mixture Models (GMM) and Principal Component Analysis (PCA).
Another novel approach using vowel formant analysis is implemented using Linear Discriminant Analysis (LDA). Vowel formant based speaker identification is best suitable for real-time implementation and requires only a few bytes of information to be stored for each speaker, making it both storage and time efficient. Tested on GRID and Vid-TIMIT, the proposed scheme was found to be 85.05% accurate when Linear Predictive Coding (LPC) is used to extract the vowel formants, which is much higher than the accuracy of BPNN and GMM. Since the proposed scheme does not require any training time other than creating a small database of vowel formants, it is faster as well. Furthermore, an increasing number of speakers makes it di±cult for BPNN and GMM to sustain their accuracy, but the proposed score-based methodology stays almost linear.
Finally, a novel audio-visual fusion based identification system is implemented using GMM and MFCC for speaker identi¯cation and PCA for face recognition. The results of speaker identification and face recognition are fused at different levels, namely the feature, score and decision levels. Both the score-level and decision-level (with OR voting) fusions were shown to outperform the feature-level fusion in terms of accuracy and error resilience. The result is in line with the distinct nature of the two modalities which lose themselves when combined at the feature-level. The GRID and VidTIMIT test results validate that
the proposed scheme is one of the best candidates for the fusion of
face and voice due to its low computational time and high recognition accuracy
- …