3 research outputs found
Video dizilerinden çoğul biyometrik kimlik doğrulama = combining face and voice modalities for person verification from video sequences
In this paper, a multimodal person verification system is presented. The system is based on face and voice modalities. Fusion of information derived from each modality is performed at the matching score level using the sum rule. For face verification, statistical subspace tools are used as feature extractors. For speaker verification, mel-frequency cepstral coefficients are used as features and Gaussian mixture models are used for modeling. Various combination cases are tried in the experiments, and the results show that in each case the combined modalities perform better than a single modality.
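As a rough illustration of sum-rule fusion at the matching score level, the sketch below adds one face score and one voice score per verification trial. The abstract does not say how scores are brought to a common range, so the min-max normalization, the function names, and the decision threshold are all assumptions made for this example, not the authors' implementation.

```python
def min_max_normalize(scores):
    """Map raw matcher scores to [0, 1] (assumed normalization, not from the paper)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no ranking information, return a neutral value.
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def sum_rule_fusion(face_scores, voice_scores):
    """Fuse per-trial face and voice scores by adding the normalized scores."""
    if len(face_scores) != len(voice_scores):
        raise ValueError("each trial needs one face score and one voice score")
    face_n = min_max_normalize(face_scores)
    voice_n = min_max_normalize(voice_scores)
    return [f + v for f, v in zip(face_n, voice_n)]


def verify(fused_score, threshold):
    """Accept the identity claim when the fused score reaches the threshold."""
    return fused_score >= threshold


if __name__ == "__main__":
    # Hypothetical similarity scores for three trials (higher means more similar).
    face = [0.0, 0.5, 1.0]
    voice = [0.0, 5.0, 10.0]
    fused = sum_rule_fusion(face, voice)
    print(fused)
    print([verify(s, threshold=1.0) for s in fused])
```

Normalizing before summing keeps one modality from dominating simply because its matcher reports scores on a larger numeric scale.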
Multimodal person verification from video sequences
In this paper, a multimodal person verification system based on fusing information derived from face and speech signals is proposed. Principal component
analysis and independent component analysis techniques are used for face verification, and mel-frequency cepstral coefficients are used for speaker
verification. The matching scores from the individual modalities are combined using the sum rule. The results indicate that fusing individual modalities improves the overall performance of the verification system.
Multi-modal person recognition for vehicular applications
In this paper, we present biometric person recognition experiments in
a real-world car environment using speech, face, and driving signals. We have
performed experiments on a subset of the in-car corpus collected at Nagoya
University, Japan. We have used Mel-frequency cepstral coefficients (MFCC)
for speaker recognition. For face recognition, we have reduced the feature
dimension of each face image through principal component analysis (PCA). As
for modeling the driving behavior, we have employed features based on the
pressure readings of acceleration and brake pedals and their time-derivatives.
For each modality, we use a Gaussian mixture model (GMM) to model each
person’s biometric data for classification. GMM is the most appropriate tool for
audio and driving signals. For face, even though a nearest-neighbor-classifier is
the preferred choice, we have experimented with a single mixture GMM as
well. We use background models for each modality and also normalize each
modality score using an appropriate sigmoid function. At the end, all modality
scores are combined using a weighted sum rule. The weights are optimized
using held-out data. Depending on the ultimate application, we consider three
different recognition scenarios: verification, closed-set identification, and
open-set identification. We show that each modality contributes to improving
recognition performance.
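The scoring chain described in this abstract (background-model scoring, sigmoid normalization, weighted sum) could be sketched as follows. The abstract gives neither the exact sigmoid nor the weights, so the log-likelihood-ratio form, the `slope` and `offset` parameters, the modality names, and the example weights are all hypothetical choices for illustration only.

```python
import math


def log_likelihood_ratio(client_ll, background_ll):
    """Score a claim against a background model (assumed log-domain difference)."""
    return client_ll - background_ll


def sigmoid_normalize(score, slope=1.0, offset=0.0):
    """Squash a raw score into (0, 1); slope and offset are hypothetical tuning knobs."""
    z = slope * (score - offset)
    # Numerically stable logistic function for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_sum_fusion(scores, weights):
    """Combine normalized per-modality scores with a weighted sum."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same modalities")
    return sum(weights[m] * scores[m] for m in scores)


if __name__ == "__main__":
    # Hypothetical (client, background) log-likelihoods for one identity claim.
    raw = {
        "speech": log_likelihood_ratio(-41.0, -43.5),
        "face": log_likelihood_ratio(-12.0, -12.4),
        "driving": log_likelihood_ratio(-30.0, -29.0),
    }
    normalized = {m: sigmoid_normalize(s) for m, s in raw.items()}
    # Example weights; in practice they would be tuned on held-out data.
    weights = {"speech": 0.5, "face": 0.3, "driving": 0.2}
    print(weighted_sum_fusion(normalized, weights))
```

With weights that sum to one, the fused score stays in the same (0, 1) range as the normalized inputs, which makes a single acceptance threshold easier to interpret across the verification and open-set scenarios.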