
    Recognition of Isolated Marathi words from Side Pose for multi-pose Audio Visual Speech Recognition

    Abstract: This paper presents a new multi-pose audio-visual speech recognition system based on the fusion of side-pose visual features and acoustic signals. The proposed method improves robustness over conventional multimodal speech recognition systems and circumvents their limitations. The work was implemented on the 'vVISWA' (Visual Vocabulary of Independent Standard Words) dataset, which comprises full-frontal, 45-degree, and side-pose visual streams. Visual features for the side pose were extracted using the 2D Stationary Wavelet Transform (2D-SWT), and acoustic features were extracted using Linear Predictive Coding (LPC); the fused features, classified with a KNN algorithm, yielded 90% accuracy. This work facilitates automatic recognition of isolated words from the side pose in the multi-pose audio-visual speech recognition domain, where only partial visual features of the face exist.
    Keywords: side-pose face detection, stationary wavelet transform, linear predictive analysis, feature-level fusion, KNN classifier
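The pipeline above (wavelet-based visual features + LPC acoustic features, fused and fed to KNN) can be sketched as follows. This is a minimal illustration, not the authors' implementation: a single-level undecimated Haar transform stands in for the full 2D-SWT, LPC is computed via the standard autocorrelation/Levinson-Durbin recursion, and the LPC order and KNN parameters are assumed, not taken from the paper.

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients via autocorrelation and Levinson-Durbin recursion."""
    n = len(x)
    r = np.array([x[:n - k] @ x[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    e = r[0]
    for i in range(1, order + 1):
        acc = r[i] + a[1:i] @ r[1:i][::-1]
        k = -acc / e
        a[1:i] = a[1:i] + k * a[1:i][::-1]
        a[i] = k
        e *= (1.0 - k * k)
    return a[1:]  # predictor coefficients a_1..a_p

def swt2_haar_approx(img):
    """Approximation band of a one-level undecimated (stationary) Haar
    transform with periodic boundaries -- a stand-in for a full 2D-SWT."""
    lo = lambda a, ax: (a + np.roll(a, -1, axis=ax)) / 2.0
    return lo(lo(img, 0), 1).ravel()

def fused_feature(img, audio, lpc_order=8):
    """Feature-level fusion: concatenate visual and acoustic descriptors."""
    return np.concatenate([swt2_haar_approx(img), lpc(audio, lpc_order)])

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training feature vectors."""
    d = np.linalg.norm(train_X - x, axis=1)
    idx = np.argsort(d)[:k]
    vals, counts = np.unique(train_y[idx], return_counts=True)
    return vals[np.argmax(counts)]
```

In the actual system each training vector would be the fused descriptor of one isolated-word utterance, with the word identity as its label.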

    Head Yaw Estimation From Asymmetry of Facial Appearance


    Semantic guided multi-future human motion prediction

    The thesis explores the use of an existing neural-network model, originally designed for multi-future prediction of human-agent motion in a static camera scene, adapted to forecast rotational trajectories of human joints. Given a trajectory with spatial information (in the form of relative joint angles) for a simplified human skeleton, the aim is to improve the model's prediction accuracy by adding semantic information, i.e. the high-level meaning of the action the human agent is performing. The study uses the AMASS and BABEL datasets for this purpose
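One simple way to inject such semantic information is to condition the model on the action label, e.g. by concatenating a one-hot action encoding with the joint-angle history. A minimal sketch of that encoding step, with a hypothetical label set (the real BABEL vocabulary is much larger):

```python
import numpy as np

# Hypothetical subset of action labels; BABEL defines many more.
ACTIONS = ["walk", "sit", "wave"]

def encode_motion_input(joint_angles, action):
    """Concatenate a window of relative joint angles (T frames x J angles)
    with a one-hot semantic label describing the agent's current action."""
    onehot = np.zeros(len(ACTIONS))
    onehot[ACTIONS.index(action)] = 1.0
    return np.concatenate([np.asarray(joint_angles).ravel(), onehot])
```

The resulting vector would feed the prediction network in place of the purely kinematic input.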

    A Joint Learning Approach to Face Detection in Wavelet Compressed Domain

    Face detection has been an important and active research topic in computer vision and image processing. In recent years, learning-based face detection algorithms have prevailed, with successful applications. In this paper, we propose a new face detection algorithm that works directly in the wavelet compressed domain. In order to simplify the processes of image decompression and feature extraction, we modify the AdaBoost learning algorithm to select a set of complementary joint-coefficient classifiers and integrate them to achieve optimal face detection. Since face detection in the wavelet compressed domain is restricted by the limited discrimination power of the designated feature space, the proposed learning mechanism is developed to achieve the best discrimination from the restricted feature space. The major contributions of the proposed AdaBoost face detection learning algorithm are the feature-space warping, joint feature representation, ID3-like plane quantization, and weak probabilistic classifier, which dramatically increase the discrimination power of the face classifier. Experimental results on the CBCL benchmark and the MIT + CMU real image dataset show that the proposed algorithm can detect faces in the wavelet compressed domain accurately and efficiently.

    Modification of the AdaBoost-based Detector for Partially Occluded Faces

    While face detection seems a solved problem under general conditions, most state-of-the-art systems degrade rapidly when faces are partially occluded by other objects. This paper presents a solution for detecting partially occluded faces by suitably modifying the AdaBoost-based face detector. Our basic idea is that the weak classifiers in the AdaBoost-based face detector, each corresponding to a Haar-like feature, are inherently a patch-based model. Therefore, one can divide the whole face region into multiple patches and map those weak classifiers to the patches. The weak classifiers belonging to each patch are re-formed into a new classifier that determines whether the patch is a valid (unoccluded) face patch. Finally, we combine all of the valid face patches, assigning different weights to the patches, to make the final decision on whether the input subwindow is a face. The experimental results show that the proposed method is promising for the detection of occluded faces.
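The final combination step can be sketched as below. This is an illustrative reading of the idea, with assumed thresholds and a weighted-fraction decision rule that the paper does not specify in these terms:

```python
import numpy as np

def face_decision(patch_scores, patch_weights,
                  patch_thresh=0.0, face_thresh=0.5):
    """Combine per-patch classifier scores into a face/non-face decision.
    A patch is 'valid' (unoccluded) if its score exceeds patch_thresh;
    the weighted fraction of valid patches decides the final label."""
    valid = np.asarray(patch_scores) > patch_thresh
    w = np.asarray(patch_weights, dtype=float)
    return float(w[valid].sum() / w.sum()) >= face_thresh
```

An occluded patch then simply drops out of the vote instead of dragging the whole cascade score down, which is what makes the detector tolerant to partial occlusion.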

    Template Adaptation for Face Verification and Identification

    Face recognition performance evaluation has traditionally focused on one-to-one verification, popularized by the Labeled Faces in the Wild dataset for imagery and the YouTube Faces dataset for videos. In contrast, the newly released IJB-A face recognition dataset unifies the evaluation of one-to-many face identification with one-to-one face verification over templates, i.e. sets of images and videos of a subject. In this paper, we study the problem of template adaptation, a form of transfer learning to the set of media in a template. Extensive performance evaluations on IJB-A show a surprising result: perhaps the simplest method of template adaptation, combining deep convolutional network features with template-specific linear SVMs, outperforms the state of the art by a wide margin. We study the effects of template size, negative-set construction, and classifier fusion on performance, then compare template adaptation to convolutional networks with metric learning and with 2D and 3D alignment. Our unexpected conclusion is that these other methods, when combined with template adaptation, all achieve nearly the same top performance on IJB-A for template-based face verification and identification.
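The "simplest method" described above — one linear SVM per template, trained on that template's deep features against a shared negative set — can be sketched as follows. This is a minimal hinge-loss SGD stand-in, with assumed hyperparameters, not the authors' training setup:

```python
import numpy as np

def fit_template_svm(pos, neg, epochs=200, lr=0.1, lam=1e-3, seed=0):
    """Linear SVM (hinge loss, SGD) separating one template's media
    features (pos) from a shared negative feature set (neg)."""
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            margin = y[i] * (X[i] @ w + b)
            grad = -y[i] * X[i] if margin < 1 else 0.0
            w -= lr * (lam * w + grad)   # L2-regularized hinge step
            if margin < 1:
                b += lr * y[i]
    return w, b

def template_similarity(wa, ba, wb, bb, feat_a, feat_b):
    """Verify two templates by cross-scoring their adapted classifiers."""
    return 0.5 * (feat_b @ wa + ba).mean() + 0.5 * (feat_a @ wb + bb).mean()
```

At verification time, each template's classifier scores the other template's media, and the symmetric average serves as the match score.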