808 research outputs found

    Representing Low-Resource Languages and Dialects: Improved Neural Methods for Spoken Language Processing

    Languages are fundamental to human communication and serve as a means to express social and cultural values. However, many people treat languages as homogeneous entities, disregarding the fact that they are often composed of multiple varieties. These language varieties may be tied to certain geographical locations or the cultural identity of the speakers. Studying language variation can thus provide valuable insights into how language varieties relate to their linguistic communities. Most language varieties do not correspond to administrative boundaries, such as provinces or states within nations, and neighboring varieties often transition gradually. In this dissertation, we presented a new method to describe and model linguistic diversity. Specifically, we leveraged deep learning or artificial neural network models to quantify differences between the pronunciations of speakers from different language varieties. This new method assesses the differences between language varieties more accurately and efficiently compared to previously used methods. Additionally, we investigated the use of these neural network models to develop speech technology to help empower language varieties. We developed an audio-based search algorithm that can automatically identify occurrences of a spoken search term in a large collection of spoken materials, improving access to resources that would normally require manual annotation. Furthermore, we presented approaches to improve speech recognition performance for several language varieties from different language families. This technology could, for example, be used to generate subtitles for videos or television broadcasts. This can be a promising step towards the important goal of developing speech technology that is inclusive of the world’s languages.

    Mission, Performance Indicators, and Assessment in U. S. Honors: A View from the Netherlands

    A mission statement that identifies the goals and aims of an honors program is a key step in program development. The NCHC’s Basic Characteristics of a Fully Developed Honors Program states unequivocally that a successful honors program “has a clear mandate from the institution’s administration in the form of a mission statement or charter document that includes the objectives and responsibilities of honors and defines the place of honors in the administrative and academic structure of the institution.” According to Mrozinski, mission statements are public definitions of purpose published in a college’s catalog, website, or other planning documents and are generally required by accrediting bodies. Such mission statements have now become standard for honors programs and colleges.

    A New Acoustic-Based Pronunciation Distance Measure

    We present an acoustic distance measure for comparing pronunciations, and apply the measure to assess foreign accent strength in American-English by comparing speech of non-native American-English speakers to a collection of native American-English speakers. An acoustic-only measure is valuable as it does not require the time-consuming and error-prone process of phonetically transcribing speech samples, which is necessary for current edit distance-based approaches. We minimize speaker variability in the data set by employing speaker-based cepstral mean and variance normalization, and compute word-based acoustic distances using the dynamic time warping algorithm. Our results indicate a strong correlation of r = −0.71 (p < 0.0001) between the acoustic distances and human judgments of native-likeness provided by more than 1,100 native American-English raters. The convenient acoustic measure therefore performs only slightly below the state-of-the-art transcription-based measure, which reaches r = −0.77. We also report the results of several small experiments which show that the acoustic measure is sensitive not only to segmental differences, but also to intonational and durational differences. However, it is not immune to unwanted differences caused by using a different recording device.
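The two steps named in this abstract, speaker-based cepstral mean and variance normalization followed by dynamic time warping, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `cmvn` and `dtw_distance`, the Euclidean frame cost, and the normalization of the final cost by the summed sequence lengths are all assumptions, and the inputs stand in for per-frame feature vectors such as MFCCs.

```python
import math


def cmvn(frames):
    """Scale each feature dimension to zero mean and unit variance.

    `frames` is a list of equal-length feature vectors, assumed here to
    hold all frames of one speaker (hypothetical input layout).
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / n
        # A constant dimension has zero variance; fall back to 1.0
        # so the division below is always defined.
        stds.append(math.sqrt(var) or 1.0)
    return [[(f[d] - means[d]) / stds[d] for d in range(dims)] for f in frames]


def dtw_distance(a, b):
    """Dynamic time warping cost between two frame sequences.

    Uses Euclidean distance between frames and divides the accumulated
    cost by len(a) + len(b); that normalization is an assumption made
    here so that words of different durations stay comparable.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m] / (n + m)


# Toy usage: two one-dimensional "pronunciations" of the same word.
native = cmvn([[0.0], [1.0], [2.0], [1.0]])
learner = cmvn([[0.0], [0.5], [2.5], [1.0], [1.0]])
distance = dtw_distance(native, learner)
```

In a setup like the one described, a speaker's distance would presumably be averaged over many words and many native reference speakers before being correlated with the human ratings.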

    Tuning the lipid bilayer: the influence of small molecules on domain formation and membrane fusion

    All living organisms consist of cells surrounded by a membrane. This membrane consists of a lipid bilayer that contains transport, receptor, and channel proteins. Membranes are expected to contain domains called ‘rafts’. Proteins either associate with these rafts or are excluded from them, which brings certain proteins close together or keeps them apart. Because rafts are so small and dynamic, model systems have been developed to study interactions between lipids and protein localization. One of these model membranes is the giant unilamellar vesicle (GUV), which can be tens of micrometers in size and can therefore be studied with light microscopy. GUVs consisting of three saturated and unsaturated lipids and cholesterol separate into two phases: a dense liquid-ordered Lo phase and a more fluid Ld phase. In this thesis I have shown that small molecules such as sugars, hydrocarbons, and certain lipids influence the phase separation. The distribution (partitioning) of the model peptide WALP over the Lo and Ld domains of the membrane was studied. The second part of this thesis focuses on smaller vesicles, so-called large unilamellar vesicles (LUVs). Vesicles made of non-ionic surfactants were characterized. I show that these fat-like substances can form closed vesicles, called niosomes, whose properties are comparable to those of lipid-based vesicles. Membrane fusion is a delicate balance between disrupting the membrane and maintaining the permeability barrier. I succeeded in achieving membrane fusion without significant leakage of the vesicle contents by using an enzyme that cleaves the head groups from a portion of the lipids.

    Adapting Monolingual Models: Data can be Scarce when Language Similarity is High

    For many (minority) languages, the resources needed to train large models are not available. We investigate the performance of zero-shot transfer learning with as little data as possible, and the influence of language similarity in this process. We retrain the lexical layers of four BERT-based models using data from two low-resource target language varieties, while the Transformer layers are independently fine-tuned on a POS-tagging task in the model's source language. By combining the new lexical layers and fine-tuned Transformer layers, we achieve high task performance for both target languages. With high language similarity, 10MB of data appears sufficient to achieve substantial monolingual transfer performance. Monolingual BERT-based models generally achieve higher downstream task performance after retraining the lexical layer than multilingual BERT, even when the target language is included in the multilingual model.
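The recombination step described in this abstract can be pictured as merging two parameter sets: lexical (embedding) parameters from the model retrained on target-language text, and everything else from the model fine-tuned on the source-language task. The sketch below only illustrates that idea with plain dictionaries. The function `combine_layers`, the `"embeddings."` name prefix, and the toy parameter names are hypothetical and do not reflect the authors' code or any particular library's naming.

```python
def combine_layers(lexical_params, task_params, lexical_prefix="embeddings."):
    """Merge two name-to-parameter mappings into one model state.

    Parameters whose name starts with `lexical_prefix` come from
    `lexical_params` (retrained on the target language); all remaining
    parameters come from `task_params` (fine-tuned on the source-language
    task). The prefix is a hypothetical naming convention.
    """
    combined = {}
    for name, value in task_params.items():
        if not name.startswith(lexical_prefix):
            combined[name] = value
    for name, value in lexical_params.items():
        if name.startswith(lexical_prefix):
            combined[name] = value
    return combined


# Toy stand-ins for real weight tensors.
retrained_on_target = {
    "embeddings.word": "target-language embeddings",
    "encoder.layer0": "original encoder weights",
}
finetuned_on_source = {
    "embeddings.word": "source-language embeddings",
    "encoder.layer0": "POS-tuned encoder weights",
    "classifier": "POS tagging head",
}
zero_shot_model = combine_layers(retrained_on_target, finetuned_on_source)
```

The merged state pairs target-language embeddings with the task-tuned encoder and tagging head, which is the combination the abstract evaluates without any task data in the target language.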