Face image quality assessment (FIQA) attempts to improve face recognition
(FR) performance by providing additional information about sample quality.
Because FIQA methods attempt to estimate the utility of a sample for face
recognition, it is reasonable to assume that these methods are heavily
influenced by the underlying face recognition system. Although modern face
recognition systems are known to perform well, several studies have found that
such systems often exhibit problems with demographic bias. It is therefore
likely that such problems are also present with FIQA techniques. To investigate
the demographic biases associated with FIQA approaches, this paper presents a
comprehensive study involving a variety of quality assessment methods
(general-purpose image quality assessment, supervised face quality assessment,
and unsupervised face quality assessment methods) and three diverse
state-of-the-art FR models. Our analysis on the Balanced Faces in the Wild (BFW)
dataset shows that all techniques considered are affected more by variations in
race than by variations in sex. While the general-purpose image quality assessment methods
appear to be less biased with respect to the two demographic factors
considered, the supervised and unsupervised face image quality assessment
methods both show strong bias with a tendency to favor white individuals (of
either sex). In addition, we found that methods that are less racially biased
perform worse overall. This suggests that the observed bias in FIQA methods is
to a significant extent related to the underlying face recognition system.

Comment: The content of this paper was published in EUSIPCO 202