40 research outputs found

    A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings

    Recent work has managed to learn cross-lingual word embeddings without parallel data by mapping monolingual embeddings to a shared space through adversarial training. However, their evaluation has focused on favorable conditions, using comparable corpora or closely-related languages, and we show that they often fail in more realistic scenarios. This work proposes an alternative approach based on a fully unsupervised initialization that explicitly exploits the structural similarity of the embeddings, and a robust self-learning algorithm that iteratively improves this solution. Our method succeeds in all tested scenarios and obtains the best published results in standard datasets, even surpassing previous supervised systems. Our implementation is released as an open source project at https://github.com/artetxem/vecmap
    Comment: ACL 201

    Language-independent speaker anonymization using orthogonal Householder neural network

    Speaker anonymization aims to conceal a speaker's identity while preserving content information in speech. Current mainstream neural-network speaker anonymization systems disentangle speech into prosody-related, content, and speaker representations. The speaker representation is then anonymized by a selection-based speaker anonymizer that uses a mean vector over a set of randomly selected speaker vectors from an external pool of English speakers. However, the resulting anonymized vectors are subject to severe privacy leakage against powerful attackers, reduction in speaker diversity, and language mismatch problems for unseen language speaker anonymization. To generate diverse, language-neutral speaker vectors, this paper proposes an anonymizer based on an orthogonal Householder neural network (OHNN). Specifically, the OHNN acts like a rotation to transform the original speaker vectors into anonymized speaker vectors, which are constrained to follow the distribution over the original speaker vector space. A basic classification loss is introduced to ensure that anonymized speaker vectors from different speakers have unique speaker identities. To further protect speaker identities, an improved classification loss and similarity loss are used to push original-anonymized sample pairs away from each other. Experiments on VoicePrivacy Challenge datasets in English and the AISHELL-3 dataset in Mandarin demonstrate the proposed anonymizer's effectiveness
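
    The abstract above describes the OHNN as acting like a rotation on speaker vectors. The following is a minimal sketch of the underlying linear-algebra fact only, not of the paper's network: a product of Householder reflections is an orthogonal matrix, so applying it preserves vector norms. The function names and the choice of reflection vectors are illustrative assumptions; in a trained model the reflection vectors would presumably be learned parameters.

```python
# Sketch: composing Householder reflections yields an orthogonal matrix.
# All names here are hypothetical and not taken from the paper's code.

def householder(v):
    """Return H = I - 2 v v^T / (v^T v) for a nonzero vector v."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        raise ValueError("reflection vector must be nonzero")
    n = len(v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def orthogonal_from_reflections(vectors):
    """Multiply the Householder matrices of the given vectors in order."""
    n = len(vectors[0])
    q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for v in vectors:
        q = matmul(q, householder(v))
    return q

def apply(q, x):
    """Apply matrix q to vector x."""
    return [sum(q[i][j] * x[j] for j in range(len(x))) for i in range(len(q))]
```

    Because each reflection is orthogonal, any number of them can be stacked without the product leaving the orthogonal group, which is one common reason to parameterize such transforms this way.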

    Itzulpen automatiko gainbegiratu gabea (Unsupervised machine translation)

    192 p. Modern machine translation relies on strong supervision in the form of parallel corpora. Such a requirement greatly departs from the way in which humans acquire language, and poses a major practical problem for low-resource language pairs. In this thesis, we develop a new paradigm that removes the dependency on parallel data altogether, relying on nothing but monolingual corpora to train unsupervised machine translation systems. For that purpose, our approach first aligns separately trained word representations in different languages based on their structural similarity, and uses them to initialize either a neural or a statistical machine translation system, which is further trained through iterative back-translation. While previous attempts at learning machine translation systems from monolingual corpora had strong limitations, our work, along with other contemporaneous developments, is the first to report positive results in standard, large-scale settings, establishing the foundations of unsupervised machine translation and opening exciting opportunities for future research

    Modelling Digital Media Objects

    Nonparametric enrichment in computational and biological representations of distributions

    This thesis proposes nonparametric techniques to enhance unsupervised learning methods in computational or biological contexts. Representations of intractable distributions and their relevant statistics are enhanced by nonparametric components trained to handle challenging estimation problems. The first part introduces a generic algorithm for learning generative latent variable models. In contrast to traditional variational learning, no representation of the intractable posterior distributions is computed, making it agnostic to the model structure and the support of latent variables. Kernel ridge regression is used to consistently estimate the gradient for learning. In many unsupervised tasks, this approach outperforms advanced alternatives based on the expectation-maximisation algorithm and variational approximate inference. In the second part, I train a model of data known as the kernel exponential family density. The kernel, used to describe smooth functions, is augmented by a parametric component trained using an efficient meta-learning procedure; meta-learning prevents overfitting as would occur using conventional routines. After training, the contours of the kernel become adaptive to the local geometry of the underlying density. Compared to maximum-likelihood learning, our method better captures the shape of the density, which is the desired quantity in many downstream applications. The final part sees how nonparametric ideas contribute to understanding uncertainty computation in the brain. First, I show that neural networks can learn to represent uncertainty using the distributed distributional code (DDC), a representation similar to the nonparametric kernel mean embedding. I then derive several DDC-based message-passing algorithms, including computations of filtering and real-time smoothing. The latter is a common neural computation embodied in many postdictive phenomena of perception in multiple modalities. The main idea behind these algorithms is least-squares regression, where the training data are simulated from an internal model. The internal model can be concurrently updated to follow the statistics in sensory stimuli, enabling adaptive inference

    Unsupervised learning for text-to-speech synthesis

    This thesis introduces a general method for incorporating the distributional analysis of textual and linguistic objects into text-to-speech (TTS) conversion systems. Conventional TTS conversion uses intermediate layers of representation to bridge the gap between text and speech. Collecting the annotated data needed to produce these intermediate layers is a far from trivial task, possibly prohibitively so for languages in which no such resources are in existence. Distributional analysis, in contrast, proceeds in an unsupervised manner, and so enables the creation of systems using textual data that are not annotated. The method therefore aids the building of systems for languages in which conventional linguistic resources are scarce, but is not restricted to these languages. The distributional analysis proposed here places the textual objects analysed in a continuous-valued space, rather than specifying a hard categorisation of those objects. This space is then partitioned during the training of acoustic models for synthesis, so that the models generalise over objects' surface forms in a way that is acoustically relevant. The method is applied to three levels of textual analysis: to the characterisation of sub-syllabic units, word units and utterances. Entire systems for three languages (English, Finnish and Romanian) are built with no reliance on manually labelled data or language-specific expertise. Results of a subjective evaluation are presented

    Spectral Target Detection using Physics-Based Modeling and a Manifold Learning Technique

    Identification of materials from calibrated radiance data collected by an airborne imaging spectrometer depends strongly on the atmospheric and illumination conditions at the time of collection. This thesis demonstrates a methodology for identifying material spectra using the assumption that each unique material class forms a lower-dimensional manifold (surface) in the higher-dimensional spectral radiance space and that all image spectra reside on, or near, these theoretic manifolds. Using a physical model, a manifold characteristic of the target material exposed to varying illumination and atmospheric conditions is formed. A graph-based model is then applied to the radiance data to capture the intricate structure of each material manifold, followed by the application of the commute time distance (CTD) transformation to separate the target manifold from the background. Detection algorithms are then applied in the CTD subspace. This nonlinear transformation is based on a random walk on a graph and is derived from an eigendecomposition of the pseudoinverse of the graph Laplacian matrix. This work provides a geometric interpretation of the CTD transformation, its algebraic properties, the atmospheric and illumination parameters varied in the physics-based model, and the influence the target manifold samples have on the orientation of the coordinate axes in the transformed space. This thesis concludes by demonstrating improved detection results in the CTD subspace as compared to detection in the original spectral radiance space
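
    The abstract above relates the commute time distance to a random walk on a graph and to the pseudoinverse of the graph Laplacian. As a rough illustration, the sketch below computes the commute time between two nodes of a small weighted graph directly from entries of the Laplacian pseudoinverse. It does not reproduce the thesis's eigendecomposition-based transformation or its detection pipeline. It assumes a connected, undirected graph with a zero diagonal in the weight matrix, and it uses the identity L⁺ = (L + J/n)⁻¹ − J/n, where J is the all-ones matrix; all function names are hypothetical.

```python
# Sketch: commute time from the Laplacian pseudoinverse (connected graphs only).
# Function names are illustrative and not taken from the thesis.

def laplacian(weights):
    """Graph Laplacian L = D - W for a symmetric weight matrix with zero diagonal."""
    n = len(weights)
    return [[(sum(weights[i]) if i == j else 0.0) - weights[i][j]
             for j in range(n)] for i in range(n)]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; is the graph connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def laplacian_pseudoinverse(weights):
    """Pseudoinverse of L via (L + J/n)^-1 - J/n, valid for connected graphs."""
    n = len(weights)
    lap = laplacian(weights)
    shifted = [[lap[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inv = invert(shifted)
    return [[inv[i][j] - 1.0 / n for j in range(n)] for i in range(n)]

def commute_time(weights, i, j):
    """Expected steps for a random walk to go from i to j and back to i."""
    lp = laplacian_pseudoinverse(weights)
    volume = sum(sum(row) for row in weights)  # sum of all node degrees
    return volume * (lp[i][i] + lp[j][j] - 2.0 * lp[i][j])
```

    On a three-node path graph with unit weights, for example, the end nodes are farther apart in commute time than adjacent nodes, which reflects the way this distance grows when there are few short paths between two points.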