31 research outputs found
The Extraction and Recognition of Facial Images in the Complex Background
Article. 信州大学工学部紀要 74: 77-88 (1994). Departmental bulletin paper
An Automatic Human Face Detection Method
This article proposes an automatic human face detection method that combines several approaches proposed by different authors. The method is based on the detection of shape features (eye pairs) and skin color. It assumes certain conditions and constraints, so it is not universally applicable. Within those constraints, it is effective enough for applications that require fast execution. Test results are given, and directions for future work are discussed at the end.
Learning to See with Minimal Human Supervision
Deep learning has significantly advanced computer vision in the past decade, paving the way for practical applications such as facial recognition and autonomous driving. However, current techniques depend heavily on human supervision, limiting their broader deployment. This dissertation tackles this problem by introducing algorithms and theories to minimize human supervision in three key areas: data, annotations, and neural network architectures, in the context of various visual understanding tasks such as object detection, image restoration, and 3D generation.
First, we present self-supervised learning algorithms to handle in-the-wild images and videos that traditionally require time-consuming manual curation and labeling. We demonstrate that when a deep network is trained to be invariant to geometric and photometric transformations, representations from its intermediate layers are highly predictive of object semantic parts such as eyes and noses. This insight offers a simple unsupervised learning framework that significantly improves the efficiency and accuracy of few-shot landmark prediction and matching. We then present a technique for learning single-view 3D object pose estimation models by utilizing in-the-wild videos where objects turn (e.g., cars in roundabouts). This technique achieves competitive performance with respect to existing state-of-the-art without requiring any manual labels during training. We also contribute an Accidental Turntables Dataset, containing a challenging set of 41,212 images of cars in cluttered backgrounds, motion blur, and illumination changes that serve as a benchmark for 3D pose estimation.
Second, we address variations in labeling styles across different annotators, which lead to a type of label noise referred to as heterogeneous labels. This variability in human annotation can cause subpar performance during both the training and testing phases. To mitigate this, we have developed a framework that models the labeling styles of individual annotators, reducing the impact of human annotation variations and enhancing the performance of standard object detection models. We have also applied this framework to analyze ecological data, which are often collected opportunistically across different case studies without consistent annotation guidelines. Through this application, we have obtained several insights into large-scale bird migration behaviors and their relationship to climate change.
Our next study explores the challenges of designing neural networks, an area that lacks a comprehensive theoretical understanding. By linking deep neural networks with Gaussian processes, we propose a novel Bayesian interpretation of the deep image prior, which parameterizes a natural image as the output of a convolutional network with random parameters and random input. This approach offers valuable insights to optimize the design of neural networks for various image restoration tasks.
Lastly, we introduce several machine-learning techniques to reconstruct and edit 3D shapes from 2D images with minimal human effort. We first present a generic multi-modal generative model that bridges 2D images and 3D shapes via a shared latent space, and demonstrate its applications to versatile 3D shape generation and manipulation tasks. Additionally, we develop a framework for joint estimation of 3D neural scene representation and camera poses. This approach outperforms prior works and, unlike the baselines, allows us to operate in the general SE(3) camera pose setting. The results also indicate that this method can be complementary to classical structure-from-motion (SfM) pipelines, as it compares favorably to SfM on low-texture and low-resolution images.
A Real-time Model for Multiple Human Face Tracking from Low-resolution Surveillance Videos
This article discusses a novel approach to multiple-face tracking in low-resolution surveillance videos. There has been significant research on face detection using neural networks. Neural-network-based face detection methods are highly accurate but computationally intensive, and are therefore not suitable for real-time applications. The proposed approach approximately detects faces in an image using color information alone. It detects skin regions in an image and checks each skin region for eye and mouth regions. If they are found, it marks the skin region as a face and fits an oriented rectangle to it. The approach requires little computation and can therefore be applied to successive frames of a video. The proposed approach is tested on FERET face database images, on images containing multiple faces captured in unconstrained environments, and on frames extracted from an IP surveillance camera.
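The color-only skin detection step described above can be illustrated with a minimal Python sketch. The abstract does not say which color rule the authors use, so the RGB thresholds below are an assumption (a widely cited heuristic for skin under uniform daylight), and the function names `is_skin`, `skin_mask` and `bounding_box` are hypothetical. The sketch also returns an axis-aligned box, whereas the article fits an oriented rectangle.

```python
def is_skin(r, g, b):
    """Heuristic RGB skin test (assumed rule, not taken from the article)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_mask(image):
    """image: list of rows, each row a list of (r, g, b) tuples."""
    return [[is_skin(*pixel) for pixel in row] for row in image]


def bounding_box(mask):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around all True pixels."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, flag in enumerate(row) if flag]
    if not rows:
        return None  # no skin-colored pixel found
    return (min(cols), min(rows), max(cols), max(rows))


if __name__ == "__main__":
    dark = (30, 30, 30)
    skin = (200, 120, 90)
    frame = [
        [dark, dark, dark],
        [dark, skin, skin],
        [dark, dark, skin],
    ]
    print(bounding_box(skin_mask(frame)))
```

A full tracker would additionally split the mask into connected regions and test each region for eye and mouth candidates before accepting it as a face.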
A New Texture Based Segmentation Method to Extract Object from Background
Extracting object regions from a complex background is a hard task and an essential part of image segmentation and recognition. Image segmentation, the process of dividing an image into different regions, plays a vital role in image analysis, and several segmentation approaches have been developed. According to several authors, segmentation terminates when the observer's goal is satisfied. The fundamental problem of segmentation is that no single general method exists: algorithm performance varies with the application. This paper studies insect segmentation against a complex background. The segmentation methodology for insect images consists of five steps. First, the original RGB image is converted into the Lab color space. Second, the 'a' component of the Lab color space is extracted. Then segmentation by two-dimensional Otsu automatic thresholding is performed on the 'a' channel. Based on the color segmentation result and the texture differences between the background and the required object, the object is extracted by texture segmentation using the gray-level co-occurrence matrix. The algorithm was tested on the Dreamstime image database, and the results prove to be satisfactory.
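Two of the steps above, automatic thresholding and the gray-level co-occurrence matrix (GLCM), can be sketched in plain Python. This is a simplified illustration under stated assumptions: it uses the classic one-dimensional Otsu criterion instead of the two-dimensional variant the paper names, it assumes the 'a' channel has already been extracted and quantized to integers, and the function names are hypothetical.

```python
def otsu_threshold(values, levels=256):
    """Return t maximizing between-class variance; class 0 is values <= t."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of samples in class 0
    sum0 = 0    # sum of sample values in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def glcm(gray, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels at pixel offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    height, width = len(gray), len(gray[0])
    for y in range(height):
        for x in range(width):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < height and 0 <= x2 < width:
                matrix[gray[y][x]][gray[y2][x2]] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast feature: mean squared gray-level difference over all pairs."""
    total = sum(map(sum, matrix))
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total
```

In a pipeline like the one described, the threshold would separate candidate object pixels on the 'a' channel, and GLCM features such as contrast would then be compared between regions to separate object texture from background texture.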
Enhanced face detection framework based on skin color and false alarm rejection
Fast and precise face detection is a challenging task in computer vision. Human face detection plays an essential role in the first stage of face processing applications such as recognition, tracking, and image database management. In these applications, faces often occupy only a small part of images that exhibit variations such as different illumination, pose, and occlusion. These variations can noticeably decrease the face detection rate. In addition, detection time is an important factor, especially in real-time systems. Most existing face detection approaches are not accurate because they cannot handle unstructured images with large appearance variations and can detect human faces under only one particular variation. Existing face detection frameworks need enhancement to detect human faces under the stated variations, improve the detection rate, and reduce detection time. In this study, an enhanced face detection framework is proposed that improves the detection rate based on skin color and adds a validity process. A preliminary segmentation of input images based on skin color can significantly reduce the search space and accelerate face detection. The main detection process is based on Haar-like features and the AdaBoost algorithm. A validity process is introduced to reject non-face objects that may be selected during face detection. The validity process is based on two-stage Extended Local Binary Patterns. Experimental results on the CMU-MIT and Caltech 10000 datasets, over a wide range of facial variations in color, position, scale, and lighting conditions, indicated a successful face detection rate. In conclusion, the proposed enhanced face detection framework achieves a high detection rate and reduces overall detection time in color images with varying lighting conditions and different poses.
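Haar-like features are normally evaluated with an integral image, which makes any rectangle sum a four-lookup operation. The sketch below is a generic illustration of that idea in plain Python, not the framework's own code; the function names and the choice of a horizontal two-rectangle feature are assumptions made for the example.

```python
def integral_image(gray):
    """Build a (h+1) x (w+1) table where ii[y][x] = sum of gray[0:y][0:x]."""
    height, width = len(gray), len(gray[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle feature: left half minus right half (w should be even)."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return left - right


if __name__ == "__main__":
    patch = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
    ]
    table = integral_image(patch)
    print(rect_sum(table, 1, 0, 2, 2))
    print(haar_two_rect_horizontal(table, 0, 0, 4, 2))
```

In a boosted detector, AdaBoost would select a small set of such features and thresholds from a very large pool; restricting the scan to skin-colored regions, as the abstract describes, reduces how many windows need to be evaluated.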
Video Face Swapping
Face swapping is the task of replacing one or more faces in a target image with a face from a source image. The conditions of the source image (lighting and pose) need to be transformed to match those of the target image. Code for Image Face Swapping (IFS) was refactored and used to perform face swapping in videos. The basic logic behind Video Face Swapping (VFS) is the same as that of IFS, since a video is just a sequence of images (frames) stitched together to imitate movement. To achieve VFS, the face(s) in an input image are detected, and their facial landmark key points are calculated and assigned to their corresponding (X, Y) coordinates. The faces are then aligned using Procrustes analysis. Next, a mask is created for each image to determine which parts of the source and target images need to be shown in the output. The shape of the source image is then warped onto the shape of the target image, and color correction is performed so that the output looks as natural as possible. Finally, the two masks are blended to generate a new output image showing the face swap. The results were analysed, obstacles in the VFS code were identified, and the code was optimized.
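The Procrustes alignment step can be sketched compactly for 2D landmarks by treating each point as a complex number. This is an illustrative least-squares similarity fit (scale, rotation, translation, no reflection), not the thesis code, which may well use a matrix-based formulation; the function name `procrustes_2d` is hypothetical.

```python
def procrustes_2d(src, dst):
    """Fit z -> a * (z - mean_src) + mean_dst minimizing squared error.

    src, dst: equal-length lists of (x, y) landmark coordinates.
    Returns (a, transform), where the complex number a encodes scale
    (abs(a)) and rotation (its phase), and transform maps one (x, y) point.
    """
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    mean_p = sum(p) / len(p)
    mean_q = sum(q) / len(q)
    p_centered = [z - mean_p for z in p]
    q_centered = [z - mean_q for z in q]
    # Closed-form least-squares solution for the complex gain a.
    denom = sum(abs(z) ** 2 for z in p_centered)
    a = sum(z.conjugate() * w for z, w in zip(p_centered, q_centered)) / denom

    def transform(point):
        z = a * (complex(*point) - mean_p) + mean_q
        return (z.real, z.imag)

    return a, transform


if __name__ == "__main__":
    source = [(0, 0), (1, 0), (0, 1)]
    # Target: source rotated by 90 degrees, scaled by 2, shifted by (3, 4).
    target = [(3, 4), (3, 6), (1, 4)]
    gain, warp = procrustes_2d(source, target)
    print(abs(gain), [warp(pt) for pt in source])
```

With real landmarks the fit is not exact, and the residual after alignment gives a rough measure of how different the two face shapes are.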
Particle Filters for Colour-Based Face Tracking Under Varying Illumination
Automatic human face tracking is the basis of robotic and active vision systems used for facial feature analysis, automatic surveillance, video conferencing, intelligent transportation, human-computer interaction and many other applications. Superior human face tracking will enable future safety surveillance systems that monitor drowsy drivers, or patients and elderly people at risk of seizure or sudden falls, and that perform with a lower risk of failure in unexpected situations. This area has been actively researched in the current literature in an attempt to make automatic face trackers more stable in challenging real-world environments. To detect faces in video sequences, features such as colour, texture, intensity, shape or motion are used. Among these features, colour has been the most popular because of its insensitivity to orientation and size changes and because it is fast to process. The challenge for colour-based face trackers, however, has been their instability when colours change due to drastic variation in environmental illumination. Meanwhile, probabilistic tracking and the use of particle filters as powerful Bayesian stochastic estimators are increasing in the visual tracking field, thanks to their ability to handle multi-modal distributions in cluttered scenes. Traditional particle filters use the transition prior as the importance sampling function, but this can result in poor posterior sampling. The objective of this research is to investigate and propose a stable face tracker capable of dealing with challenges such as rapid and random head motion, scale changes when people move closer to or further from the camera, motion of multiple people with similar skin tones in the vicinity of the tracked person, clutter, and occlusion of the face. The main focus has been on investigating an efficient method to address the sensitivity of colour-based trackers to gradual or drastic illumination variations.
The particle filter is used to overcome the instability of face trackers due to nonlinear and random head motions. To increase the sampling efficiency of the traditional particle filter, an improved version is introduced that takes the latest measurements into account. This improved particle filter employs a new colour-based bottom-up approach that leads particles to generate an effective proposal distribution. The colour-based bottom-up approach is a classification technique for fast skin colour segmentation. This method is independent of the distribution shape and does not require excessive memory storage or exhaustive prior training. Finally, to address the adaptability of the colour-based face tracker to illumination changes, an original likelihood model is proposed based on spatial rank information, which considers both the illumination-invariant colour ordering of a face's pixels in an image or video frame and the spatial interaction between them. The original contribution of this work lies in the unique mixture of existing and proposed components to improve colour-based recognition and tracking of faces in complex scenes, especially where drastic illumination changes occur. Experimental results for the final version of the proposed face tracker, which combines the methods developed, are provided in the last chapter of this manuscript.
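For orientation, the "traditional" particle filter that the abstract contrasts itself with can be sketched as a bootstrap filter, in which particles are propagated with the transition prior alone and only then weighted by the measurement. The sketch below tracks a one-dimensional position; the Gaussian motion and measurement models, the noise levels and the function names are illustrative assumptions, and it does not implement the thesis's improved colour-based proposal.

```python
import math
import random


def gaussian_likelihood(z, x, sigma):
    """Unnormalized likelihood of measurement z given state x."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2)


def particle_filter_step(particles, z, rng, process_sigma=0.2, meas_sigma=0.5):
    """One bootstrap step: propagate, weight, estimate, resample."""
    # 1. Propagate with the transition prior (random-walk motion model).
    predicted = [x + rng.gauss(0.0, process_sigma) for x in particles]
    # 2. Weight each particle by how well it explains the measurement.
    weights = [gaussian_likelihood(z, x, meas_sigma) for x in predicted]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # 3. Posterior mean estimate.
    estimate = sum(w * x for w, x in zip(weights, predicted))
    # 4. Multinomial resampling to counter weight degeneracy.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
    for _ in range(20):
        particles, estimate = particle_filter_step(particles, 5.0, rng)
    print(round(estimate, 2))
```

Because step 1 ignores the current measurement, many particles can land where the likelihood is negligible. That is the sampling inefficiency the abstract attributes to the transition prior, and it is what a measurement-informed proposal is meant to reduce.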
Improved facial feature fitting for model based coding and animation
EThOS - Electronic Theses Online Service, GB, United Kingdom