About Voice: A Longitudinal Study of Speaker Recognition Dataset Dynamics
Like face recognition, speaker recognition is widely used for voice-based
biometric identification in a broad range of industries, including banking,
education, recruitment, immigration, law enforcement, healthcare, and
well-being. However, while dataset evaluations and audits have improved data
practices in computer vision and face recognition, the data practices in
speaker recognition have gone largely unquestioned. Our research aims to
address this gap by exploring how dataset usage has evolved over time and what
implications this has for bias and fairness in speaker recognition systems.
Previous studies have demonstrated the presence of historical, representation,
and measurement biases in popular speaker recognition benchmarks. In this
paper, we present a longitudinal study of speaker recognition datasets used for
training and evaluation from 2012 to 2021. We survey close to 700 papers to
investigate community adoption of datasets and changes in usage over a crucial
period in which speaker recognition approaches transitioned to the widespread
adoption of deep neural networks. Our study identifies the most commonly used
datasets in the field, examines their usage patterns, and assesses the
attributes that affect bias, fairness, and other ethical concerns. Our findings
suggest areas for further research on the ethics and fairness of speaker
recognition technology.
Comment: 14 pages (23 with References and Appendix)