187 research outputs found

Facial Beauty Prediction and Analysis based on Deep Convolutional Neural Network: A Review

Author: Abdulazeez Adnan Mohsin
Saeed Jwan
Publication venue: 'Penerbit UTHM'
Publication date: 15/04/2021

Abstract: Facial attractiveness or facial beauty prediction (FBP) is an active research topic with several potential uses. It is a difficult area in the computer vision domain because few public databases related to FBP exist and experiments have been run on small-scale databases. Moreover, the evaluation of facial beauty is subjective in nature, with people having personal preferences of beauty. Deep learning techniques have displayed a significant ability in terms of analysis and feature representation. Previous studies focused on scattered aspects of facial beauty, with few comparisons between diverse techniques. Thus, this article reviews recent research on computer prediction and analysis of facial beauty based on deep convolutional neural networks (DCNN). Furthermore, the possible lines of research and challenges provided in this article can help researchers advance the state of the art in future work.

Journals of Universiti Tun Hussein Onn Malaysia (UTHM)

Modeling and Mapping Location-Dependent Human Appearance

Author: Bessinger Zachary
Publication venue: UKnowledge
Publication date: 01/01/2018

Human appearance is highly variable and depends on individual preferences, such as fashion, facial expression, and makeup. These preferences depend on many factors including a person's sense of style, what they are doing, and the weather. These factors, in turn, are dependent upon geographic location and time. In our work, we build computational models to learn the relationship between human appearance, geographic location, and time. The primary contributions are a framework for collecting and processing geotagged imagery of people, a large dataset collected by our framework, and several generative and discriminative models that use our dataset to learn the relationship between human appearance, location, and time. Additionally, we build interactive maps that allow for inspection and demonstration of what our models have learned.

University of Kentucky

Semi-supervised auto-encoder for facial attributes recognition

Author: Bouhlel Med Salim
Boujneh Nouredine
Zaghbani Soumaya
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/08/2020

The particularity of our faces encourages many researchers to exploit their features in different domains such as user identification, behaviour analysis, computer technology, security, and psychology. In this paper, we present a method for facial attribute analysis. The work analyses facial images and extracts features in order to recognize demographic attributes: age, gender, and ethnicity (AGE). We exploit the robustness of deep learning (DL) using an updated version of autoencoders called the deep sparse autoencoder (DSAE). We use a new DSAE architecture that adds supervision to the classic model, and we control the overfitting problem by regularizing the model. The move from DSAE to the semi-supervised autoencoder (DSSAE) facilitates the supervision process and achieves excellent feature-extraction performance. We focus on estimating AGE jointly. The experimental results show that DSSAE recognizes facial features with high precision. The whole system achieves good performance and high rates in AGE using the MORPH II database.

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Human Age and Gender Classification using Convolutional Neural Networks

Author: Kelliher Eamon
Publication venue: Technological University Dublin
Publication date: 01/01/2021

In a world relying ever more on human classification, this paper aims to improve on age and gender image classification through the use of Convolutional Neural Networks (CNN). Age and gender classification has become a popular area of study in the past number of years; however, there are still improvements to be made, particularly in the area of age classification. This research paper aims to test the currently accepted view that CNN models are the superior model type for image classification by comparing CNN performance against Support Vector Machine performance on the same dataset. Using the Adience image classification dataset, this research also focuses on the implementation of data augmentation techniques, some more novel than others, as a means of improving CNN performance. In terms of standard popular methods of augmentation, image mirroring and image rotation were applied. In addition, a more novel approach to augmentation was applied to the area of age classification. This technique was carried out using Faceapp, an AI image editor in the form of a mobile application. This application allows for the placement of "filters" on images of human beings in order to alter their appearance. The results of the data-augmented models were superior to those of the standard CNN models, with gender classification improving by 2.6% while age classification improved by 7.1%. The results of this research establish the potential for further improvements through the inclusion of more augmentation techniques or through the use of more filter types provided in the Faceapp application.

Arrow@TUDublin

Representations of racial minorities in popular movies: A content-analytic synergy of computer vision and network science

Author: Hopp F.R.
Malik M.
Weber R.
Publication date: 14/12/2021

In the Hollywood film industry, racial minorities remain underrepresented. Characters from racially underrepresented groups receive less screen time, fewer central story positions, and frequently inherit plotlines, motivations, and actions that are primarily driven by White characters. Currently, there are no clearly defined, standardized, and scalable metrics for taking stock of racial minorities’ cinematographic representation. In this paper, we combine methodological tools from computer vision and network science to develop a content analytic framework for identifying visual and structural racial biases in film productions. We apply our approach to a set of 89 popular, full-length movies, demonstrating that this method provides a scalable examination of racial inclusion in film production and predicts movie performance. We integrate our method into larger theoretical discussions on audiences’ perception of racial minorities and illuminate future research trajectories towards the computational assessment of racial biases in audiovisual narratives.

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Multimodal Adversarial Learning

Author: Osahor Uche
Publication venue: 'West Virginia University Libraries'
Publication date: 01/01/2022

Deep Convolutional Neural Networks (DCNN) have proven to be an exceptional tool for object recognition, generative modelling, and multi-modal learning in various computer vision applications. However, recent findings have shown that such state-of-the-art models can be easily deceived by inserting slight imperceptible perturbations to key pixels in the input. A good target detection system can accurately identify targets by localizing their coordinates on the input image of interest. This is ideally achieved by labeling each pixel in an image as a background or a potential target pixel. However, prior research still confirms that such state-of-the-art target models are susceptible to adversarial attacks. In the case of generative models, facial sketches drawn by artists, mostly used by law enforcement agencies, depend on the ability of the artist to clearly replicate all the key facial features that aid in capturing the true identity of a subject. Recent works have attempted to synthesize these sketches into plausible visual images to improve visual recognition and identification. However, synthesizing photo-realistic images from sketches proves to be an even more challenging task, especially for sensitive applications such as suspect identification. The incorporation of hybrid discriminators, which perform attribute classification of multiple target attributes, and a quality-guided encoder that minimizes the perceptual dissimilarity of the latent space embeddings of the synthesized and real images at different layers in the network have been shown to be powerful tools towards better multi-modal learning techniques. In general, our overall approach was aimed at improving target detection systems and the visual appeal of synthesized images while incorporating multiple attribute assignment to the generator without compromising the identity of the synthesized image.
We synthesized sketches using the XDOG filter for the CelebA, Multi-modal and CelebA-HQ datasets and from an auxiliary generator trained on sketches from the CUHK, IIT-D and FERET datasets. Our overall results for different model applications are impressive compared to the current state of the art.

The Research Repository @ WVU (West Virginia University)

Towards End-to-end Video-based Eye-Tracking

Author: E Chong
O Chapelle
PK Mital
Q Huang
S Hochreiter
S Park
T Fischer
Y Cheng
Y Sugano
Z Li
Publication date: 01/01/2020

Estimating eye-gaze from images alone is a challenging task, in large part due to unobservable person-specific factors. Achieving high accuracy typically requires labeled data from test users, which may not be attainable in real applications. We observe that there exists a strong relationship between what users are looking at and the appearance of the user's eyes. In response to this understanding, we propose a novel dataset and accompanying method which aims to explicitly learn these semantic and temporal relationships. Our video dataset consists of time-synchronized screen recordings, user-facing camera views, and eye gaze data, which allows for new benchmarks in temporal gaze tracking as well as label-free refinement of gaze. Importantly, we demonstrate that the fusion of information from visual stimuli as well as eye images can lead towards achieving performance similar to literature-reported figures acquired through supervised personalization. Our final method yields significant performance improvements on our proposed EVE dataset, with up to a 28 percent improvement in Point-of-Gaze estimates (resulting in 2.49 degrees in angular error), paving the path towards high-accuracy screen-based eye tracking purely from webcam sensors. The dataset and reference source code are available at https://ait.ethz.ch/projects/2020/EVE
Comment: Accepted at ECCV 2020

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

An Investigation into the Performance of Ethnicity Verification Between Humans and Machine Learning Algorithms

Author: Jilani Shelina K.
Publication venue: Faculty of Engineering and Informatics. School of Media, Design and Technology
Publication date: 01/01/2020

There has been a significant increase in interest in the task of classifying demographic profiles, i.e. race and ethnicity. Ethnicity is a significant human characteristic, and applying facial image data for the discrimination of ethnicity is integral to face-related biometric systems. Given the diversity in the application of ethnicity-specific information, such as face recognition and iris recognition, and the availability of image datasets for more commonly available human populations, i.e. Caucasian, African-American, Asians, and South-Asian Indians, a gap has been identified for the development of a system which analyses the full face and its individual feature components (eyes, nose and mouth) for the Pakistani ethnic group. An efficient system is proposed for the verification of the Pakistani ethnicity, which incorporates a two-tier (computer vs human) approach. Firstly, hand-crafted features were used to ascertain the descriptive nature of a frontal image and facial profile for the Pakistani ethnicity. A total of 26 facial landmarks were selected (16 frontal and 10 for the profile), together with 2 models for redundant information removal and a linear classifier for the binary task. The experimental results concluded that the facial profile image of a Pakistani face is distinct amongst other ethnicities. However, the methodology had limitations, for example low performance accuracy, the laborious nature of manual data annotation (i.e. facial landmark annotation), and the small facial image dataset. To make the system more accurate and robust, Deep Learning models are employed for ethnicity classification. Various state-of-the-art Deep models are trained on a range of facial image conditions, i.e. full-face and partial-face images, plus standalone feature components such as the nose and mouth.
Since ethnicity is pertinent to the research, a novel facial image database entitled Pakistani Face Database (PFDB) was created using a criterion-specific selection process, to ensure confidence in each of the assigned class memberships, i.e. Pakistani and Non-Pakistani. Comparative analysis between 6 Deep Learning models was carried out on augmented image datasets, and the analysis demonstrates that Deep Learning yields better performance accuracy compared to low-level features. The human phase of the ethnicity classification framework tested the discrimination ability of novice Pakistani and Non-Pakistani participants, using a computerised ethnicity task. The results suggest that humans are better at discriminating between Pakistani and Non-Pakistani full-face images, relative to individual face-feature components (eyes, nose, mouth), struggling the most with the nose when making judgements of ethnicity. To understand the effects of display conditions on ethnicity discrimination accuracy, two conditions were tested: (i) Two-Alternative Forced Choice (2-AFC) and (ii) Single image procedure. The results concluded that participants perform significantly better in trials where the target (Pakistani) image is shown alongside a distractor (Non-Pakistani) image. To conclude the proposed framework, directions for future study are suggested to advance the current understanding of image-based ethnicity verification.
Acumé Forensi

Bradford Scholars

Survey of Social Bias in Vision-Language Models

Author: Bang Yejin
Cahyawijaya Samuel
Dai Wenliang
Fung Pascale
Lee Nayeon
Lovenia Holy
Publication date: 24/09/2023

In recent years, the rapid advancement of machine learning (ML) models, particularly transformer-based pre-trained models, has revolutionized Natural Language Processing (NLP) and Computer Vision (CV) fields. However, researchers have discovered that these models can inadvertently capture and reinforce social biases present in their training datasets, leading to potential social harms, such as uneven resource allocation and unfair representation of specific social groups. Addressing these biases and ensuring fairness in artificial intelligence (AI) systems has become a critical concern in the ML community. The recent introduction of pre-trained vision-and-language (VL) models in the emerging multimodal field demands attention to the potential social biases present in these models as well. Although VL models are susceptible to social bias, there is a limited understanding compared to the extensive discussions on bias in NLP and CV. This survey aims to provide researchers with a high-level insight into the similarities and differences of social bias studies in pre-trained models across NLP, CV, and VL. By examining these perspectives, the survey aims to offer valuable guidelines on how to approach and mitigate social bias in both unimodal and multimodal settings. The findings and recommendations presented here can benefit the ML community, fostering the development of fairer and non-biased AI models in various applications and research endeavors.

arXiv.org e-Print Archive