A comparison of linear and non-linear calibrations for speaker recognition
In recent work on both generative and discriminative score-to-log-likelihood-ratio calibration, it was shown that linear transforms give good
accuracy only for a limited range of operating points. Moreover, these methods
required tailoring of the calibration training objective functions in order to
target the desired region of best accuracy. Here, we generalize the linear
recipes to non-linear ones. We experiment with a non-linear, non-parametric,
discriminative PAV solution, as well as parametric, generative,
maximum-likelihood solutions that use Gaussian, Student's t, and
normal-inverse-Gaussian score distributions. Experiments on NIST SRE'12 scores
suggest that the non-linear methods provide wider ranges of optimal accuracy
and can be trained without having to resort to objective function tailoring.Comment: accepted for Odyssey 2014: The Speaker and Language Recognition
Worksho
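To make the generative recipe concrete, here is a minimal sketch (ours, not the authors' code) of maximum-likelihood Gaussian calibration: fit one Gaussian per class of scores and take the log density ratio as the calibrated LLR. With a shared variance the resulting map is linear; per-class variances give one of the simplest non-linear generalizations. The non-parametric PAV alternative could be approximated with, e.g., sklearn.isotonic.IsotonicRegression on the same data.

```python
# Minimal sketch of generative ML calibration with Gaussian score models.
# Names and defaults are illustrative; this is not the paper's implementation.
import numpy as np
from scipy.stats import norm

def gaussian_calibrate(tar_scores, non_scores, scores, shared_var=True):
    """Map raw scores to log-likelihood ratios via ML-fitted Gaussians."""
    mu_t, mu_n = tar_scores.mean(), non_scores.mean()
    if shared_var:
        # pooled ML variance -> linear score-to-LLR map
        var_t = var_n = np.concatenate(
            [tar_scores - mu_t, non_scores - mu_n]).var()
    else:
        # per-class variances -> quadratic (non-linear) map
        var_t, var_n = tar_scores.var(), non_scores.var()
    return (norm.logpdf(scores, mu_t, np.sqrt(var_t))
            - norm.logpdf(scores, mu_n, np.sqrt(var_n)))
```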
A Generative Model for Score Normalization in Speaker Recognition
We propose a theoretical framework for thinking about score normalization,
which confirms that normalization is not needed under (admittedly fragile)
ideal conditions. If, however, these conditions are not met, e.g. under
data-set shift between training and runtime, our theory reveals dependencies
between scores that could be exploited by strategies such as score
normalization. Indeed, it has been demonstrated repeatedly in experiments that various ad-hoc score normalization recipes do work. We present a first attempt at using probability theory to design a generative score-space normalization model, which gives improvements similar to those of ZT-norm on the text-dependent RSR2015 database.
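For reference, the ZT-norm baseline mentioned above can be sketched in a few lines; the cohort scores are assumed to be precomputed, and this plain version (ours) ignores the text-dependent specifics of RSR2015.

```python
# Hedged sketch of Z-norm and ZT-norm score normalization (illustrative only).
import numpy as np

def z_norm(score, z_cohort):
    """Standardize by the enrollment model's impostor-cohort statistics."""
    return (score - z_cohort.mean()) / z_cohort.std()

def zt_norm(score, z_cohort, t_cohort_znormed):
    """Z-norm first, then T-norm using cohort scores that were Z-normed."""
    z = z_norm(score, z_cohort)
    return (z - t_cohort_znormed.mean()) / t_cohort_znormed.std()
```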
A Speaker Verification Backend with Robust Performance across Conditions
In this paper, we address the problem of speaker verification in conditions
unseen or unknown during development. A standard method for speaker
verification consists of extracting speaker embeddings with a deep neural
network and processing them through a backend composed of probabilistic linear
discriminant analysis (PLDA) and global logistic regression score calibration.
This method is known to result in systems that work poorly on conditions
different from those used to train the calibration model. We propose to modify
the standard backend, introducing an adaptive calibrator that uses duration and
other automatically extracted side-information to adapt to the conditions of
the inputs. The backend is trained discriminatively to optimize binary
cross-entropy. When trained on a number of diverse datasets that are labeled
only with respect to speaker, the proposed backend consistently and, in some
cases, dramatically improves calibration, compared to the standard PLDA
approach, on a number of held-out datasets, some of which are markedly
different from the training data. Discrimination performance is also
consistently improved. We show that joint training of the PLDA and the adaptive
calibrator is essential -- the same benefits cannot be achieved when freezing
PLDA and fine-tuning the calibrator. To our knowledge, the results in this
paper are the first evidence in the literature that it is possible to develop a
speaker verification system with robust out-of-the-box performance on a large
variety of conditions.
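The adaptive calibrator can be pictured as an affine calibration whose scale and offset are predicted from side information such as log duration. The sketch below is our assumption of a minimal such architecture, not the paper's exact one (which, notably, also trains the PLDA stage jointly); it shows one gradient step on the binary cross-entropy objective.

```python
# Hypothetical minimal adaptive calibrator trained with binary cross-entropy.
import torch
import torch.nn as nn

class AdaptiveCalibrator(nn.Module):
    """Affine score calibration whose parameters depend on side information."""
    def __init__(self, side_dim):
        super().__init__()
        self.scale = nn.Linear(side_dim, 1)   # predicts the multiplicative term
        self.offset = nn.Linear(side_dim, 1)  # predicts the additive term

    def forward(self, raw_score, side_info):
        a = self.scale(side_info).squeeze(-1)
        b = self.offset(side_info).squeeze(-1)
        return a * raw_score + b  # calibrated LLR, used as a logit

model = AdaptiveCalibrator(side_dim=2)            # e.g. two log durations
loss_fn = nn.BCEWithLogitsLoss()                  # binary cross-entropy
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# one training step on a toy batch (random stand-in data)
scores = torch.randn(32)                          # raw backend scores
side = torch.randn(32, 2)                         # side information
labels = torch.randint(0, 2, (32,)).float()       # 1 = target trial
opt.zero_grad()
loss = loss_fn(model(scores, side), labels)
loss.backward()
opt.step()
```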
Bi-Gaussianized calibration of likelihood ratios
For a perfectly calibrated forensic evaluation system, the likelihood ratio of the likelihood ratio is the likelihood ratio. Conversion of uncalibrated log-likelihood ratios (scores) to calibrated log-likelihood ratios is often performed using logistic regression. The results, however, may be far from perfectly calibrated. We propose and demonstrate a new calibration method, “bi-Gaussianized calibration,” that warps scores toward perfectly calibrated log-likelihood-ratio distributions. Using both synthetic and real data, we demonstrate that bi-Gaussianized calibration leads to better calibration than does logistic regression, that it is robust to score distributions that violate the assumption of two Gaussians with the same variance, and that it is competitive with logistic-regression calibration in terms of performance measured using log-likelihood-ratio cost (Cllr). We also demonstrate advantages of bi-Gaussianized calibration over calibration using the pool-adjacent-violators (PAV) algorithm. Based on bi-Gaussianized calibration, we also propose a graphical representation that may help explain the meaning of likelihood ratios to triers of fact.
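The "perfectly calibrated" picture that bi-Gaussianized calibration warps scores toward can be checked numerically: if target LLRs follow N(+d^2/2, d^2) and non-target LLRs follow N(-d^2/2, d^2), the log-likelihood ratio of an LLR value x is x itself. A small verification (ours; the variance d^2 is an arbitrary choice):

```python
# Numerical check: for the bi-Gaussian target distributions, the
# likelihood ratio of the likelihood ratio is the likelihood ratio.
import numpy as np
from scipy.stats import norm

d2 = 4.0  # common variance of both calibrated LLR distributions (our choice)
x = np.linspace(-6.0, 6.0, 5)
llr_of_llr = (norm.logpdf(x, +d2 / 2, np.sqrt(d2))
              - norm.logpdf(x, -d2 / 2, np.sqrt(d2)))
print(np.allclose(llr_of_llr, x))  # True
```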
Feature Fusion for Fingerprint Liveness Detection
For decades, fingerprints have been the most widely used biometric trait in identity
recognition systems, thanks to their natural uniqueness, even in rare cases such as
identical twins. Recently, we have witnessed a growth in the use of fingerprint-based recognition systems in a large variety of devices and applications. This, in turn, has increased the potential gains for offenders capable of attacking these systems. One
of the main issues with the current fingerprint authentication systems is that, even
though they are quite accurate in terms of identity verification, they can be easily
spoofed by presenting to the input sensor an artificial replica of the fingertip skin’s
ridge-valley patterns.
Due to the criticality of this threat, it is crucial to develop countermeasure methods capable of detecting and preventing these kinds of attacks. The most effective counter-spoofing methods are those that try to distinguish between a "live" and a "fake" fingerprint before it is actually submitted to the recognition system. According
to the technology used, these methods are mainly divided into hardware- and software-based systems. Hardware-based methods rely on extra sensors to gain more pieces of information regarding the vitality of the fingerprint owner. By contrast,
software-based methods merely rely on analyzing the fingerprint images acquired
by the scanner. Software-based methods can then be further divided into dynamic,
aimed at analyzing sequences of images to capture those vital signs typical of a real
fingerprint, and static, which process a single fingerprint impression. Among these
different approaches, static software-based methods come with three main benefits.
First, they are cheaper, since they do not require the deployment of any additional
sensor to perform liveness detection. Second, they are faster since the information
they require is extracted from the same input image acquired for the identification
task. Third, they are potentially capable of tackling novel forms of attack through an
update of the software. The interest in this type of counter-spoofing method is at the basis of this dissertation, which addresses fingerprint liveness detection from a particular perspective, stemming from the following consideration. Generally speaking, this
problem has been tackled in the literature with many different approaches. Most of
them are based on first identifying the most suitable image features for the problem under analysis and then developing some classification system based on them. In particular, most of the published methods rely on a single type of feature to perform this task. Each of these individual features can be more or less discriminative and often highlights some peculiar characteristics of the data under analysis, often complementary to those of other features. Thus, one possible way to improve classification accuracy is to find effective ways to combine them, in order to mutually exploit their individual strengths and, at the same time, soften their weaknesses. However, such a "multi-view" approach has been relatively overlooked in the literature.
Based on this observation, the first part of this work investigates feature fusion methods capable of improving the generalization and robustness of fingerprint liveness detection systems and enhancing their classification strength.
Then, in the second part, it approaches feature fusion in a different way: first dividing the fingerprint image into smaller patches, then extracting evidence about the liveness of each of these patches and, finally, combining all these pieces of information to reach the final classification decision.
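The patch-based pipeline can be summarized schematically as below (a sketch of ours, with mean fusion as a placeholder for the combination rules studied in the dissertation):

```python
# Schematic patch-based liveness detection: tile, score per patch, fuse.
import numpy as np

def extract_patches(image, patch=32):
    """Tile a grayscale image into non-overlapping patch x patch blocks."""
    h, w = image.shape
    return [image[i:i + patch, j:j + patch]
            for i in range(0, h - patch + 1, patch)
            for j in range(0, w - patch + 1, patch)]

def liveness_decision(image, patch_scorer, threshold=0.5):
    """patch_scorer maps a patch to P(live); fusion here is a simple mean."""
    scores = [patch_scorer(p) for p in extract_patches(image)]
    return np.mean(scores) >= threshold
```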
The different approaches have been thoroughly analyzed and assessed by comparing their results (on a large number of datasets and using the same experimental protocol) with those of other works in the literature. The experimental results discussed in this dissertation show that the proposed approaches are capable of obtaining state-of-the-art results, thus demonstrating their effectiveness.
Analyzing and Applying Cryptographic Mechanisms to Protect Privacy in Applications
Privacy-Enhancing Technologies (PETs) emerged as a technology-based response to the increased collection and storage of data, as well as the associated threats to individuals' privacy, in modern applications. They rely on a variety of cryptographic mechanisms that make it possible to perform computations without directly obtaining knowledge of the plaintext information. However, many challenges have so far prevented effective real-world usage in many existing applications. For one, some mechanisms leak information or have been proposed outside of the security models established within the cryptographic community, leaving open how effective they are at protecting privacy in various applications. Additionally, a major challenge causing PETs to remain largely academic is their practicality, in terms of both efficiency and usability. Cryptographic mechanisms introduce significant overhead, which is often prohibitive, and, due to a lack of high-level tools, they are very hard for outsiders to integrate.
In this thesis, we move towards making PETs more effective and practical in protecting privacy in numerous applications. We take a two-sided approach of first analyzing the effective security (cryptanalysis) of candidate mechanisms and then building constructions and tools (cryptographic engineering) for practical use in specified emerging applications in the domain of machine learning that are crucial to modern use cases. In the process, we incorporate an interdisciplinary perspective, both in analyzing mechanisms and in collaboratively building privacy-preserving architectures based on requirements from experts in the application domains.
Cryptanalysis. While mechanisms like Homomorphic Encryption (HE) or Secure Multi-Party Computation (SMPC) provably leak no additional information, Encrypted Search Algorithms (ESAs) and Randomization-only Two-Party Computation (RoTPC) possess additional properties that require cryptanalysis to determine the privacy protection they actually provide.
ESAs allow for search on encrypted data, an important functionality in many applications. Most efficient ESAs exhibit some form of well-defined information leakage, which is cryptanalyzed via a breadth of so-called leakage attacks proposed in the literature. However, it is difficult to assess their practical effectiveness, given that previous evaluations were closed-source, used restricted data, and made assumptions about (among others) the query distribution, because real-world query data is very hard to find. For these reasons, we re-implement known leakage attacks in an open-source framework and perform a systematic empirical re-evaluation of them using a variety of new data sources that, for the first time, contain real-world query data. We obtain many new and more complete results, showing attacks that work much better or much worse than expected based on previous evaluations.
RoTPC mechanisms require cryptanalysis as they do not rely on established techniques and security models, instead obfuscating messages using only randomizations. A prominent example is the privacy-preserving scalar product protocol by Lu et al. (IEEE TPDS'13). We show that this protocol is formally insecure and that this translates into practical insecurity by presenting attacks that even make it possible to test for certain inputs, making the case for more scrutiny of RoTPC protocols used as PETs.
This part of the thesis is based on the following two publications:
[KKM+22] S. KAMARA, A. KATI, T. MOATAZ, T. SCHNEIDER, A. TREIBER, M. YONLI. “SoK: Cryptanalysis of Encrypted Search with LEAKER - A framework for LEakage AttacK Evaluation on Real-world data”. In: 7th IEEE European Symposium on Security and Privacy (EuroS&P’22). Full version: https://ia.cr/2021/1035. Code: https://encrypto.de/code/LEAKER. IEEE, 2022, pp. 90–108. Appendix A.
[ST20] T. SCHNEIDER, A. TREIBER. “A Comment on Privacy-Preserving Scalar Product Protocols as proposed in “SPOC””. In: IEEE Transactions on Parallel and Distributed Systems (TPDS) 31.3 (2020). Full version: https://arxiv.org/abs/1906.04862. Code: https://encrypto.de/code/SPOCattack, pp. 543–546. CORE Rank A*. Appendix B.
Cryptographic Engineering. Given the above cryptanalysis results, we investigate using the leakage-free and provably secure cryptographic mechanisms of HE and SMPC to protect privacy in machine learning applications. As much of the cryptographic community has focused on PETs for neural network applications, we focus on two other important applications and models: speaker recognition and sum-product networks. We particularly show the efficiency of our solutions in possible real-world scenarios and provide tools usable by non-domain experts.
In speaker recognition, a user's voice data is matched with reference data stored at the service provider. Using HE and SMPC, we build the first privacy-preserving speaker recognition system that includes the state-of-the-art technique of cohort score normalization using cohort pruning via SMPC. Then, we build a privacy-preserving speaker recognition system relying solely on SMPC, which we show outperforms previous solutions based on HE by a factor of up to 4000x. We show that both our solutions comply with specific standards for biometric information protection and, thus, are effective and practical PETs for speaker recognition.
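The normalization computed under SMPC is, in the clear, adaptive symmetric score normalization with cohort pruning; a plaintext sketch (ours; the cohort size k and the top-k pruning rule are illustrative assumptions) looks as follows:

```python
# Plaintext sketch of adaptive s-norm with cohort pruning (no cryptography).
import numpy as np

def as_norm(score, enrol_cohort, test_cohort, k=200):
    """Average of the score standardized against each pruned cohort."""
    e = np.sort(enrol_cohort)[-k:]  # keep the k highest enrollment-side scores
    t = np.sort(test_cohort)[-k:]   # keep the k highest test-side scores
    return 0.5 * ((score - e.mean()) / e.std()
                  + (score - t.mean()) / t.std())
```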
Sum-Product Networks (SPNs) are noteworthy probabilistic graphical models that, like neural networks, need efficient methods for privacy-preserving inference as a PET. We present CryptoSPN, which uses SMPC for privacy-preserving inference of SPNs and (due to a combination of machine learning and cryptographic techniques, and contrary to most works on neural networks) even hides the network structure. Our implementation is integrated into the prominent SPN framework SPFlow and evaluates medium-sized SPNs within seconds.
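In the clear, SPN inference is just an alternation of (log-domain) products and weighted sums over a tree of distribution leaves; CryptoSPN evaluates this computation under SMPC. A toy plaintext sketch (ours, for illustration only):

```python
# Toy log-domain SPN inference over two binary variables (illustrative only).
import numpy as np

def leaf(var, p):
    """Bernoulli leaf: log-likelihood of the evidence for one variable."""
    return lambda x: np.log(p if x[var] == 1 else 1.0 - p)

def product(*children):
    """Product node: child log-likelihoods add up."""
    return lambda x: sum(c(x) for c in children)

def sum_node(children, weights):
    """Sum node: log-sum-exp of log-weighted children (a mixture)."""
    return lambda x: np.logaddexp.reduce(
        [np.log(w) + c(x) for c, w in zip(children, weights)])

spn = sum_node(
    [product(leaf("x1", 0.9), leaf("x2", 0.8)),
     product(leaf("x1", 0.2), leaf("x2", 0.3))],
    weights=[0.6, 0.4])
print(np.exp(spn({"x1": 1, "x2": 0})))  # joint probability P(x1=1, x2=0)
```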
This part of the thesis is based on the following three publications:
[NPT+19] A. NAUTSCH, J. PATINO, A. TREIBER, T. STAFYLAKIS, P. MIZERA, M. TODISCO, T. SCHNEIDER, N. EVANS. “Privacy-Preserving Speaker Recognition with Cohort Score Normalisation”. In: 20th Conference of the International Speech Communication Association (INTERSPEECH’19). Online: https://arxiv.org/abs/1907.03454. International Speech Communication Association (ISCA), 2019, pp. 2868–2872. CORE Rank A. Appendix C.
[TNK+19] A. TREIBER, A. NAUTSCH, J. KOLBERG, T. SCHNEIDER, C. BUSCH. “Privacy-Preserving PLDA Speaker Verification using Outsourced Secure Computation”. In: Speech Communication 114 (2019). Online: https://encrypto.de/papers/TNKSB19.pdf. Code: https://encrypto.de/code/PrivateASV, pp. 60–71. CORE Rank B. Appendix D.
[TMW+20] A. TREIBER, A. MOLINA, C. WEINERT, T. SCHNEIDER, K. KERSTING. “CryptoSPN: Privacy-preserving Sum-Product Network Inference”. In: 24th European Conference on Artificial Intelligence (ECAI’20). Full version: https://arxiv.org/abs/2002.00801. Code: https://encrypto.de/code/CryptoSPN. IOS Press, 2020, pp. 1946–1953. CORE Rank A. Appendix E.
Overall, this thesis contributes to a broader security analysis of cryptographic mechanisms and provides new systems and tools to effectively protect privacy in various sought-after applications.