An Open Source Framework for Standardized Comparisons of Face Recognition Algorithms
In this paper we introduce the facereclib, the first software library that allows researchers to compare a variety of face recognition algorithms on most of the known facial image databases and to rapidly prototype novel ideas and test meta-parameters of face recognition algorithms. The facereclib is built on Bob, an open source signal processing and machine learning library, and it uses well-specified face recognition protocols to ensure that results are comparable and reproducible. We show that both the face recognition algorithms implemented in Bob and third-party face recognition libraries can be used to run experiments within the facereclib framework. As a proof of concept, we execute four state-of-the-art face recognition algorithms: local Gabor binary pattern histogram sequences (LGBPHS), Gabor graph comparisons with a Gabor-phase-based similarity measure, inter-session variability modeling (ISV) of DCT block features, and linear discriminant analysis on two different color channels (LDA-IR). We run them on two databases, The Good, The Bad, & The Ugly and BANCA, in each case using their fixed protocols. The results show that no single face recognition algorithm outperforms all others; rather, performance depends strongly on the database employed.
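What makes such results comparable is that every algorithm is scored on the same frozen list of enrollment/probe comparisons defined by a database's protocol. Below is a minimal Python sketch of that idea; the `evaluate` helper, its signature, and the pair-list format are illustrative assumptions, not the facereclib API.

```python
from typing import Callable, Dict, List, Tuple

# A "protocol" here is just a frozen list of (enroll, probe, same_identity)
# comparisons; every algorithm must be scored on exactly this list.
Protocol = List[Tuple[str, str, bool]]

def evaluate(score_fn: Callable[[str, str], float],
             protocol: Protocol,
             threshold: float) -> Dict[str, float]:
    """Score every comparison in the fixed protocol and report the
    false-accept rate (FAR) and false-reject rate (FRR) at one threshold."""
    false_accepts = false_rejects = genuine = impostor = 0
    for enroll, probe, same_identity in protocol:
        score = score_fn(enroll, probe)
        if same_identity:
            genuine += 1
            false_rejects += score < threshold
        else:
            impostor += 1
            false_accepts += score >= threshold
    return {"FAR": false_accepts / max(impostor, 1),
            "FRR": false_rejects / max(genuine, 1)}
```

Feeding LGBPHS, ISV, and the other algorithms through the same call over the same protocol is what keeps the comparison fair.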
ImageNet Large Scale Visual Recognition Challenge
The ImageNet Large Scale Visual Recognition Challenge is a benchmark in
object category classification and detection on hundreds of object categories
and millions of images. The challenge has been run annually from 2010 to
present, attracting participation from more than fifty institutions.
This paper describes the creation of this benchmark dataset and the advances
in object recognition that have been possible as a result. We discuss the
challenges of collecting large-scale ground truth annotation, highlight key
breakthroughs in categorical object recognition, provide a detailed analysis of
the current state of the field of large-scale image classification and object
detection, and compare the state-of-the-art computer vision accuracy with human
accuracy. We conclude with lessons learned in the five years of the challenge,
and propose future directions and improvements.
Comment: 43 pages, 16 figures. v3 includes additional comparisons with PASCAL VOC (per-category comparisons in Table 3, distribution of localization difficulty in Fig. 16), a list of queries used for obtaining object detection images (Appendix C), and some additional references.
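The classification track behind those human-versus-machine comparisons is scored by top-5 error: an image counts as correct when the true label appears among the model's five highest-scoring predictions. A self-contained sketch of that computation follows (the function name and toy data are illustrative):

```python
import numpy as np

def top5_error(scores: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of images whose true label is missing from the five
    highest-scoring class predictions."""
    # argsort is ascending, so the last five columns are the top-5 guesses
    top5 = np.argsort(scores, axis=1)[:, -5:]
    hits = (top5 == labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()

# Toy check: 4 images, 10 classes
rng = np.random.default_rng(0)
print(top5_error(rng.normal(size=(4, 10)), np.array([3, 1, 7, 2])))
```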
Model Cards for Model Reporting
Trained machine learning models are increasingly used to perform high-impact
tasks in areas such as law enforcement, medicine, education, and employment. In
order to clarify the intended use cases of machine learning models and minimize
their usage in contexts for which they are not well suited, we recommend that
released models be accompanied by documentation detailing their performance
characteristics. In this paper, we propose a framework that we call model
cards, to encourage such transparent model reporting. Model cards are short
documents accompanying trained machine learning models that provide benchmarked
evaluation in a variety of conditions, such as across different cultural,
demographic, or phenotypic groups (e.g., race, geographic location, sex,
Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex
and Fitzpatrick skin type) that are relevant to the intended application
domains. Model cards also disclose the context in which models are intended to
be used, details of the performance evaluation procedures, and other relevant
information. While we focus primarily on human-centered machine learning models
in the application fields of computer vision and natural language processing,
this framework can be used to document any trained machine learning model. To
solidify the concept, we provide cards for two supervised models: One trained
to detect smiling faces in images, and one trained to detect toxic comments in
text. We propose model cards as a step towards the responsible democratization
of machine learning and related AI technology, increasing transparency into how
well AI technology works. We hope this work encourages those releasing trained
machine learning models to accompany model releases with similar detailed
evaluation numbers and other relevant documentation.
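The card itself is organized into named sections: model details, intended use, factors, metrics, evaluation data, training data, quantitative analyses, ethical considerations, and caveats and recommendations. A minimal sketch of that structure as a Python dataclass follows; the field values are placeholder illustrations, not numbers from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ModelCard:
    """Skeleton mirroring the section headings proposed in the paper."""
    model_details: str
    intended_use: str
    factors: List[str]          # groups the evaluation is sliced by
    metrics: List[str]
    evaluation_data: str
    training_data: str
    # group name -> metric name -> value, enabling disaggregated reporting
    quantitative_analyses: Dict[str, Dict[str, float]] = field(default_factory=dict)
    ethical_considerations: str = ""
    caveats_and_recommendations: str = ""

# Placeholder example loosely echoing the paper's smiling-detection case study
card = ModelCard(
    model_details="Smiling-detection classifier (illustrative placeholder)",
    intended_use="Research on facial attribute classification only",
    factors=["Fitzpatrick skin type", "age bracket", "sex"],
    metrics=["false positive rate", "false negative rate"],
    evaluation_data="Benchmark with self-reported demographic labels",
    training_data="Disjoint split of the same benchmark",
    quantitative_analyses={"overall": {"FPR": 0.05, "FNR": 0.05}},
)
```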
DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection
A critical yet frequently overlooked challenge in the field of deepfake
detection is the lack of a standardized, unified, comprehensive benchmark. This
issue leads to unfair performance comparisons and potentially misleading
results. Specifically, there is a lack of uniformity in data processing
pipelines, resulting in inconsistent data inputs for detection models.
Additionally, there are noticeable differences in experimental settings, and
evaluation strategies and metrics lack standardization. To fill this gap, we
present the first comprehensive benchmark for deepfake detection, called
DeepfakeBench, which offers three key contributions: 1) a unified data
management system to ensure consistent input across all detectors, 2) an
integrated framework for implementing state-of-the-art methods, and 3)
standardized evaluation metrics and protocols to promote transparency and
reproducibility. Featuring an extensible, modular codebase, DeepfakeBench
contains 15 state-of-the-art detection methods, 9 deepfake datasets, a series
of deepfake detection evaluation protocols and analysis tools, as well as
comprehensive evaluations. Moreover, we provide new insights based on extensive
analysis of these evaluations from various perspectives (e.g., data
augmentations, backbones). We hope our efforts will facilitate future
research and foster innovation in this increasingly critical domain. All code,
evaluations, and analyses of our benchmark are publicly available at
https://github.com/SCLBD/DeepfakeBench
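One way to read the "unified input, integrated framework, standardized metrics" design is as a detector registry whose entries all receive identically preprocessed frames and are scored with the same metric. The sketch below illustrates that pattern; the names (`register`, `evaluate_all`) and the use of frame-level AUC here are assumptions for illustration, not the repository's actual API.

```python
from typing import Callable, Dict, Iterable, Tuple

import numpy as np
from sklearn.metrics import roc_auc_score

# Shared registry: every detector maps a preprocessed frame to a fake-ness score.
DETECTORS: Dict[str, Callable[[np.ndarray], float]] = {}

def register(name: str):
    """Decorator adding a detector to the shared registry."""
    def wrap(fn: Callable[[np.ndarray], float]):
        DETECTORS[name] = fn
        return fn
    return wrap

@register("mean_pixel_baseline")
def mean_pixel(frame: np.ndarray) -> float:
    # Trivial stand-in for a real detector: score by mean intensity.
    return float(frame.mean())

def evaluate_all(dataset: Iterable[Tuple[np.ndarray, int]]) -> Dict[str, float]:
    """Run every registered detector on the same frames and report AUC,
    so all methods are compared under identical inputs and metrics."""
    frames, labels = zip(*dataset)
    return {name: roc_auc_score(labels, [fn(f) for f in frames])
            for name, fn in DETECTORS.items()}
```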