Direct Ensemble Estimation of Density Functionals
Estimating density functionals of analog sources is an important problem in
statistical signal processing and information theory. Traditionally, estimating
these quantities requires either making parametric assumptions about the
underlying distributions or using non-parametric density estimation followed by
integration. In this paper, we introduce a direct non-parametric approach that
bypasses the need for density estimation by using the error rates of k-NN
classifiers as data-driven basis functions that can be combined to estimate a
range of density functionals. However, this method is subject to a non-trivial
bias that dramatically slows the rate of convergence in higher dimensions. To
overcome this limitation, we develop an ensemble method for estimating the
value of the basis function which, under some minor constraints on the
smoothness of the underlying distributions, achieves the parametric rate of
convergence regardless of data dimension.
Comment: 5 pages
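To make the construction concrete, here is a minimal sketch of the general
idea: cross-validated k-NN error rates computed at several values of k serve
as the data-driven basis evaluations, and a weighted combination of them
forms the estimate. The uniform weights below are a placeholder assumption;
the paper instead chooses ensemble weights so that leading bias terms
cancel, which is what recovers the parametric rate.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def knn_error_rates(X, y, ks, folds=5):
    """Cross-validated k-NN error rates, one per k; these play the
    role of the data-driven basis-function evaluations."""
    return np.array([
        1.0 - cross_val_score(KNeighborsClassifier(n_neighbors=k),
                              X, y, cv=folds).mean()
        for k in ks
    ])

def ensemble_estimate(X, y, ks, weights=None):
    """Weighted combination of basis evaluations. Uniform weights are
    a stand-in; the paper solves for bias-cancelling weights."""
    errs = knn_error_rates(X, y, ks)
    w = np.full(len(ks), 1.0 / len(ks)) if weights is None else np.asarray(weights)
    return float(w @ errs)

# Toy usage: samples from two 3-D Gaussians labeled 0/1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 3)), rng.normal(0.5, 1.0, (200, 3))])
y = np.repeat([0, 1], 200)
print(ensemble_estimate(X, y, ks=[3, 5, 9, 17]))
```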
Investigating the Effects of Word Substitution Errors on Sentence Embeddings
A key initial step in several natural language processing (NLP) tasks
involves embedding phrases of text to vectors of real numbers that preserve
semantic meaning. To that end, several methods have been recently proposed with
impressive results on semantic similarity tasks. However, all of these
approaches assume that perfect transcripts are available when generating the
embeddings. While this is a reasonable assumption for analysis of written text,
it is limiting for analysis of transcribed text. In this paper we investigate
the effects of word substitution errors, such as those introduced by automatic
speech recognition (ASR), on several state-of-the-art sentence embedding
methods. To do this, we propose a new simulator that allows the experimenter to
induce ASR-plausible word substitution errors in a corpus at a desired word
error rate. We use this simulator to evaluate the robustness of several
sentence embedding methods. Our results show that pre-trained neural sentence
encoders are robust to ASR errors and continue to perform well on textual
similarity tasks after errors are introduced. Meanwhile, unweighted averages of word
vectors perform well with perfect transcriptions, but their performance
degrades rapidly on textual similarity tasks for text with word substitution
errors.
Comment: 4 pages, 2 figures. Copyright IEEE 2019. Accepted and to appear in
the Proceedings of the 44th International Conference on Acoustics, Speech,
and Signal Processing 2019 (IEEE-ICASSP-2019), May 12-17 in Brighton, U.K.
Personal use of this material is permitted. However, permission to
reprint/republish this material must be obtained from the IEEE.
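A minimal sketch of such a simulator appears below. The confusion table is a
hypothetical toy stand-in for an ASR-plausible substitution model (e.g., one
derived from phonetic similarity); words without an entry are left
untouched, so the realized word error rate can fall below the target.

```python
import random

def corrupt_transcript(words, confusions, target_wer, seed=0):
    """Replace words at (approximately) a target word error rate.
    `confusions` maps a word to plausible ASR substitutes; only words
    with an entry can be corrupted."""
    rng = random.Random(seed)
    return [
        rng.choice(confusions[w]) if w in confusions and rng.random() < target_wer
        else w
        for w in words
    ]

# Hypothetical confusion table and usage:
confusions = {"there": ["their"], "two": ["to", "too"], "ship": ["sheep"]}
print(corrupt_transcript("there are two ship".split(), confusions, target_wer=0.5))
```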
Simulating dysarthric speech for training data augmentation in clinical speech applications
Training machine learning algorithms for speech applications requires large,
labeled training data sets. This is problematic for clinical applications where
obtaining such data is prohibitively expensive because of privacy concerns or
lack of access. As a result, clinical speech applications are typically
developed using small data sets with only tens of speakers. In this paper, we
propose a method for simulating training data for clinical applications by
transforming healthy speech to dysarthric speech using adversarial training. We
evaluate the efficacy of our approach using both objective and subjective
criteria. We present the transformed samples to five experienced
speech-language pathologists (SLPs) and ask them to identify the samples as
healthy or dysarthric. The results reveal that the SLPs identify the
transformed speech as dysarthric 65% of the time. In a pilot classification
experiment, we show that by using the simulated speech samples to balance an
existing dataset, the classification accuracy improves by about 10% after data
augmentation.
Comment: Will appear in Proc. of ICASSP 201
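At a high level, the transformation can be pictured as a feature-level GAN:
a generator maps healthy speech features toward dysarthric-like ones while a
discriminator tries to tell transformed features from real dysarthric ones.
The PyTorch sketch below, with 40-dimensional frame features and small MLPs,
is an assumed toy setup rather than the paper's actual architecture.

```python
import torch
import torch.nn as nn

DIM = 40  # assumed feature dimension (e.g., mel filterbank frames)
G = nn.Sequential(nn.Linear(DIM, 128), nn.ReLU(), nn.Linear(128, DIM))
D = nn.Sequential(nn.Linear(DIM, 128), nn.ReLU(), nn.Linear(128, 1))
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

def train_step(healthy, dysarthric):
    # Discriminator: score real dysarthric as 1, transformed healthy as 0.
    fake = G(healthy)
    loss_d = bce(D(dysarthric), torch.ones(len(dysarthric), 1)) \
           + bce(D(fake.detach()), torch.zeros(len(healthy), 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: push transformed features to be scored as real dysarthric.
    loss_g = bce(D(G(healthy)), torch.ones(len(healthy), 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

print(train_step(torch.randn(8, DIM), torch.randn(8, DIM)))
```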
Learning Repeatable Speech Embeddings Using An Intra-class Correlation Regularizer
A good supervised embedding for a specific machine learning task is only
sensitive to changes in the label of interest and is invariant to other
confounding factors. We leverage the concept of repeatability from measurement
theory to describe this property and propose to use the intra-class correlation
coefficient (ICC) to evaluate the repeatability of embeddings. We then propose
a novel regularizer, the ICC regularizer, as a complementary component for
contrastive losses to guide deep neural networks to produce embeddings with
higher repeatability. We use simulated data to explain why the ICC regularizer
is more effective than the contrastive loss alone at minimizing intra-class
variance. We implement the ICC regularizer and apply it to three speech tasks:
speaker verification, voice style conversion, and a clinical application for
detecting dysphonic voice. The experimental results demonstrate that adding an
ICC regularizer can improve the repeatability of learned embeddings compared to
only using the contrastive loss; further, these embeddings lead to improved
performance in these downstream tasks.
Comment: Accepted by NeurIPS 202
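One way to realize such a regularizer is sketched below: a one-way ICC(1)
estimate computed per embedding dimension on a class-balanced batch, with
1 - ICC added to the contrastive loss to push training toward higher
repeatability. The balanced-batch assumption and the particular ICC variant
are simplifications of the paper's formulation.

```python
import torch

def icc_regularizer(emb, labels, eps=1e-8):
    """One-way ICC(1) per embedding dimension, averaged across
    dimensions. emb: (N, d) embeddings; labels: (N,) class ids.
    Assumes a balanced batch with at least two classes."""
    classes = labels.unique()
    k = (labels == classes[0]).sum()  # samples per class
    group_means = torch.stack([emb[labels == c].mean(0) for c in classes])
    grand_mean = emb.mean(0)
    msb = k * ((group_means - grand_mean) ** 2).sum(0) / (len(classes) - 1)
    within = sum(((emb[labels == c] - emb[labels == c].mean(0)) ** 2).sum(0)
                 for c in classes)
    msw = within / (len(emb) - len(classes))
    icc = (msb - msw) / (msb + (k - 1) * msw + eps)
    return 1.0 - icc.mean()  # small when repeatability is high

# total_loss = contrastive_loss + lam * icc_regularizer(emb, labels)
```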
A label-efficient two-sample test
Two-sample tests evaluate whether two samples are realizations of the same
distribution (the null hypothesis) or two different distributions (the
alternative hypothesis). We consider a new setting for this problem where
sample features are easily measured whereas sample labels are unknown and
costly to obtain. Accordingly, we devise a three-stage framework for
performing an effective two-sample test with only a small number of sample
label queries: first, a classifier is trained on a uniformly labeled subset
of samples to model the posterior probabilities of the labels; second, a
novel query scheme dubbed \emph{bimodal query} is used to query the labels of
samples from both classes; and last, the classical Friedman-Rafsky (FR) two-sample test is
performed on the queried samples. Theoretical analysis and extensive
experiments performed on several datasets demonstrate that the proposed test
controls the Type I error and has decreased Type II error relative to uniform
querying and certainty-based querying. Source code for our algorithms and
experimental results is available at
\url{https://github.com/wayne0908/Label-Efficient-Two-Sample}.
Comment: Accepted to the 38th Conference on Uncertainty in Artificial
Intelligence (UAI2022).
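A compact sketch of the three stages is given below, using a permutation
null in place of the paper's theoretical calibration; the logistic-regression
posterior model, the query budget values, and the `query_label` oracle
interface are illustrative assumptions.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform
from sklearn.linear_model import LogisticRegression

def fr_cross_count(X, y):
    """Friedman-Rafsky statistic: MST edges joining samples with
    different labels (few cross edges suggests different distributions).
    Assumes distinct points, since csgraph treats zero weights as absent."""
    mst = minimum_spanning_tree(squareform(pdist(X))).tocoo()
    return sum(y[i] != y[j] for i, j in zip(mst.row, mst.col))

def label_efficient_test(X, query_label, n_uniform=50, n_bimodal=100,
                         n_perm=200, seed=0):
    """Stage 1: uniform queries fit label posteriors (assumes both
    classes appear). Stage 2: bimodal query takes the most confident
    samples of each class. Stage 3: permutation FR test on that set."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), n_uniform, replace=False)
    clf = LogisticRegression().fit(X[idx], [query_label(i) for i in idx])
    post = clf.predict_proba(X)[:, 1]
    order = np.argsort(post)
    picked = np.r_[order[:n_bimodal // 2], order[-(n_bimodal // 2):]]
    yq = np.array([query_label(i) for i in picked])
    stat = fr_cross_count(X[picked], yq)
    null = [fr_cross_count(X[picked], rng.permutation(yq)) for _ in range(n_perm)]
    return np.mean([s <= stat for s in null])  # one-sided permutation p-value
```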