Despite the development of effective deepfake detection models in recent
years, several studies have demonstrated that biases in the training data used
to develop these models can lead to unfair performance across demographic
groups, such as those defined by race and gender. Such disparities can result
in these groups being unfairly targeted or excluded from detection, allowing
misclassified deepfakes to manipulate public opinion and erode trust in the
model. While these studies have focused on identifying and evaluating
unfairness in deepfake detection, no method has yet been developed to address
the fairness issue at the algorithm level. In this work,
we make the first attempt to improve deepfake detection fairness by proposing
novel loss functions for training fair deepfake detection models, in ways that
are either agnostic to or aware of demographic factors. Extensive experiments
on four deepfake datasets and five deepfake detectors demonstrate the
effectiveness and flexibility of our approach in improving deepfake detection
fairness.
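As a concrete illustration of the demographic-agnostic setting, the sketch below shows one standard way such a loss can be realized: a CVaR-style distributionally robust objective that automatically up-weights the worst-classified fraction of training samples without requiring demographic labels. This is a minimal sketch of the general technique, not the exact formulation proposed here; the function name cvar_detection_loss and the hyperparameter alpha are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def cvar_detection_loss(logits, labels, alpha=0.5):
    """CVaR-style distributionally robust loss (illustrative sketch).

    Instead of averaging over all samples, the objective focuses on the
    hardest alpha-fraction, so latent demographic subgroups that the
    detector handles poorly are up-weighted without group annotations.
    """
    # Per-sample binary cross-entropy for real-vs-fake classification.
    per_sample = F.binary_cross_entropy_with_logits(
        logits, labels.float(), reduction="none")
    # CVaR_alpha(l) = min_lam { lam + E[(l - lam)_+] / alpha }; the inner
    # minimum is attained at the (1 - alpha)-quantile of the losses.
    lam = torch.quantile(per_sample.detach(), 1.0 - alpha)
    return lam + torch.clamp(per_sample - lam, min=0).mean() / alpha
```

In a training loop, this serves as a drop-in replacement for the mean cross-entropy over each batch of detector logits; smaller values of alpha concentrate training on harder samples, trading some average accuracy for robustness on the worst-performing groups.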