Array of Multilayer Perceptrons with No-class Resampling Training for Face Recognition
A face recognition (FR) problem involves face detection, representation, and classification steps. Once a face is located in an image, it has to be represented through a feature extraction process in order to later perform a proper face classification task. The most widely used approach for feature extraction is the eigenfaces method, where an eigenspace is established from the image training samples using principal component analysis. In the classification phase, an input face is projected onto the obtained eigenspace and classified by an appropriate classifier. Neural network classifiers based on multilayer perceptron models have proven to be well suited to this task. This paper presents an array of multilayer perceptron neural networks trained with a novel no-class resampling strategy, which takes into account the balance problem between class and no-class examples and increases the generalization capabilities. The proposed model is compared against a classical multilayer perceptron classifier for face recognition over the AT&T database of faces, obtaining results that show an improvement over the classification rates of a classical classifier.
Fil: Capello, D.. Universidad Tecnológica Nacional. Facultad Regional Santa Fe; Argentina
Fil: Martínez, César Ernesto. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas; Argentina
Fil: Milone, Diego Humberto. Universidad Nacional de Entre Ríos; Argentina
Fil: Stegmayer, Georgina. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas; Argentina
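The eigenfaces pipeline described above can be sketched in a few lines of numpy. This is an illustrative toy (random "images", nearest-neighbour matching standing in for the paper's MLP classifier), not the authors' implementation:

```python
import numpy as np

# Toy eigenfaces sketch: training images flattened into rows of X.
# Sizes and data are illustrative assumptions, not a real face dataset.
rng = np.random.default_rng(0)
X = rng.random((40, 64))           # 40 training faces, 64 pixels each

mean_face = X.mean(axis=0)
A = X - mean_face                  # centre the data

# PCA via SVD: the rows of Vt are the principal components (eigenfaces).
U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 10
eigenfaces = Vt[:k]                # keep the k leading components

# Project a probe face onto the eigenspace, then classify by nearest
# training projection (a stand-in for the classifier stage).
train_proj = A @ eigenfaces.T
probe_proj = (X[7] - mean_face) @ eigenfaces.T
nearest = np.argmin(np.linalg.norm(train_proj - probe_proj, axis=1))
print(nearest)                     # 7: the probe matches itself exactly
```

In a real system the projections, not the raw pixels, are fed to the classifier, which is what makes the eigenspace step worthwhile.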
Registration of Face Image Using Modified BRISK Feature Descriptor
Automatic face recognition is an active area of research in the field of computer vision. Even though a lot of research has been done in this field, researchers are still unable to develop an algorithm that can detect face images under all possible real-time conditions. Automatic face recognition algorithms are used in a variety of applications such as surveillance, automatic tagging, and human-robot interaction. The main problem faced by researchers working with the above real-time problems is uncertainty about the pose of the detected face: if the pose of the sensed image differs from that of the images in the training database, most existing algorithms will fail. Researchers have therefore suggested, and shown, that detection accuracy under pose variation can be improved if image registration is applied as a preprocessing step prior to face recognition. In this work, scale- and rotation-invariant features are used for image registration. The important steps in feature-based image registration are preprocessing, feature detection, feature matching, transformation estimation, and resampling. In this work, feature detectors and descriptors such as SIFT, SURF, FAST, DAISY, and BRISK are used; among these, the BRISK descriptor performs best. To avoid mismatches, a modified BRISK descriptor using threshold values is proposed in this work. The modified BRISK descriptor performs best in terms of maximum matching compared with other state-of-the-art descriptors. The next step is to calculate the transformation model capable of transforming the coordinates of the sensed image to the coordinates of the reference image. Radial basis functions are used in this step to design the proper transformation function. In the resampling step, bilinear interpolation is used to compute pixel values in the output image.
A new algorithm is also proposed to find the image pairs in the training database corresponding to the input image for image registration. Image registration algorithms are simulated in MATLAB with different detector-descriptor combinations and affine transformation matrices. To measure the similarity between the registered output image and the reference image, the SSIM index and mutual information are used.
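Two of the registration steps named above, transformation estimation and resampling, can be sketched concretely. The following is a minimal numpy illustration (illustrative names and toy data; it uses a least-squares affine fit rather than the radial-basis functions of the paper):

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares affine fit: solve dst ~= [src 1] @ params."""
    A = np.hstack([src, np.ones((len(src), 1))])   # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return params                                   # 3x2: 2x2 matrix rows, then offset

def bilinear(img, x, y):
    """Sample img at a real-valued (x, y) by bilinear interpolation."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * img[y0, x0]
            + dx * (1 - dy) * img[y0, x0 + 1]
            + (1 - dx) * dy * img[y0 + 1, x0]
            + dx * dy * img[y0 + 1, x0 + 1])

# Recover a known affine map from four point correspondences.
src = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
M_true = np.array([[1.0, 0.2], [-0.1, 0.9]])
t_true = np.array([3.0, 4.0])
dst = src @ M_true + t_true
est = estimate_affine(src, dst)                    # recovers M_true and t_true

# Resample a toy 4x4 image at a non-integer location.
img = np.arange(16, dtype=float).reshape(4, 4)
v = bilinear(img, 1.5, 1.5)                        # midpoint of pixels 5, 6, 9, 10
```

In a full pipeline, matched feature locations from the descriptor stage (BRISK here) would supply `src` and `dst`, and `bilinear` would be evaluated at every output pixel of the registered image.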
Single camera pose estimation using Bayesian filtering and Kinect motion priors
Traditional approaches to upper body pose estimation using monocular vision
rely on complex body models and a large variety of geometric constraints. We
argue that this is not ideal and somewhat inelegant as it results in large
processing burdens, and instead attempt to incorporate these constraints
through priors obtained directly from training data. A prior distribution
covering the probability of a human pose occurring is used to incorporate
likely human poses. This distribution is obtained offline, by fitting a
Gaussian mixture model to a large dataset of recorded human body poses, tracked
using a Kinect sensor. We combine this prior information with a random walk
transition model to obtain an upper body model, suitable for use within a
recursive Bayesian filtering framework. Our model can be viewed as a mixture of
discrete Ornstein-Uhlenbeck processes, in that states behave as random walks,
but drift towards a set of typically observed poses. This model is combined
with measurements of the human head and hand positions, using recursive
Bayesian estimation to incorporate temporal information. Measurements are
obtained using face detection and a simple skin colour hand detector, trained
using the detected face. The suggested model is designed with analytical
tractability in mind and we show that the pose tracking can be
Rao-Blackwellised using the mixture Kalman filter, allowing for computational
efficiency while still incorporating bio-mechanical properties of the upper
body. In addition, the use of the proposed upper body model allows reliable
three-dimensional pose estimates to be obtained indirectly for a number of
joints that are often difficult to detect using traditional object recognition
strategies. Comparisons with Kinect sensor results and the state of the art in
2D pose estimation highlight the efficacy of the proposed approach.
Comment: 25 pages, technical report, related to Burke and Lasenby, AMDO 2014 conference paper. Code sample: https://github.com/mgb45/SignerBodyPose Video: https://www.youtube.com/watch?v=dJMTSo7-uF
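The transition model described above, a random walk that drifts toward typically observed poses, can be illustrated in one dimension with a single Gaussian prior mode and scalar Kalman updates. This is a hedged sketch under assumed parameter values (`mu`, `alpha`, `q`, `r` are illustrative, not the paper's), collapsing the Gaussian mixture and the mixture Kalman filter down to one component:

```python
import numpy as np

mu, alpha = 0.0, 0.1    # typical pose (prior mode) and drift rate toward it
q, r = 0.05, 0.2        # process and measurement noise variances (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle of a discrete Ornstein-Uhlenbeck tracker."""
    # Predict: random walk drifting toward mu; linear part is (1 - alpha).
    x_pred = x + alpha * (mu - x)
    P_pred = (1 - alpha) ** 2 * P + q
    # Update with a direct noisy pose measurement z.
    K = P_pred / (P_pred + r)
    x_new = x_pred + K * (z - x_pred)
    P_new = (1 - K) * P_pred
    return x_new, P_new

rng = np.random.default_rng(1)
x, P = 2.0, 1.0                          # start far from the typical pose
for _ in range(50):
    z = rng.normal(mu, np.sqrt(r))       # measurements near the true pose
    x, P = kalman_step(x, P, z)
```

The full model replaces the single mode `mu` with a Gaussian mixture fitted to Kinect-tracked poses, and Rao-Blackwellisation runs one such Kalman filter per mixture component.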
Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
With advanced image journaling tools, one can easily alter the semantic
meaning of an image by exploiting certain manipulation techniques such as
copy-clone, object splicing, and removal, which mislead the viewers. In
contrast, the identification of these manipulations becomes a very challenging
task as manipulated regions are not visually apparent. This paper proposes a
high-confidence manipulation localization architecture which utilizes
resampling features, Long Short-Term Memory (LSTM) cells, and an
encoder-decoder network to segment out manipulated regions from
non-manipulated ones.
Resampling features are used to capture artifacts like JPEG quality loss,
upsampling, downsampling, rotation, and shearing. The proposed network exploits
larger receptive fields (spatial maps) and frequency domain correlation to
analyze the discriminative characteristics between manipulated and
non-manipulated regions by incorporating an encoder and an LSTM network.
Finally, the decoder network learns the mapping from low-resolution feature
maps to pixel-wise predictions for image tamper localization. With the
predicted mask provided by the final (softmax) layer of the proposed
architecture, end-to-end training is performed to learn the network
parameters through back-propagation
using ground-truth masks. Furthermore, a large image splicing dataset is
introduced to guide the training process. The proposed method is capable of
localizing image manipulations at pixel level with high precision, which is
demonstrated through rigorous experimentation on three diverse datasets
- …
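The decoder's final stage described above, mapping low-resolution feature maps to a pixel-wise softmax mask, can be sketched in numpy. Shapes, values, and the nearest-neighbour upsampling are illustrative assumptions, not the paper's trained network:

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return fmap.repeat(2, axis=0).repeat(2, axis=1)

def pixel_softmax(logits):
    """Per-pixel softmax over the channel axis of (H, W, C) logits."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy 2x2 feature map: channel 0 = pristine logit, channel 1 = tampered logit.
logits = np.zeros((2, 2, 2))
logits[0, 0, 1] = 4.0                       # top-left region scores as manipulated

probs = pixel_softmax(upsample2x(logits))   # (4, 4, 2) probability map
mask = probs[..., 1] > 0.5                  # predicted tamper mask
```

During training, the probability map `probs` would be compared against the ground-truth binary mask with a cross-entropy loss and the error back-propagated through the decoder, LSTM, and encoder.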