951 research outputs found

    Training Gaussian Mixture Models at Scale via Coresets

    How can we train a statistical mixture model on a massive data set? In this work we show how to construct coresets for mixtures of Gaussians. A coreset is a weighted subset of the data, which guarantees that models fitting the coreset also provide a good fit for the original data set. We show that, perhaps surprisingly, Gaussian mixtures admit coresets of size polynomial in dimension and the number of mixture components, while being independent of the data set size. Hence, one can harness computationally intensive algorithms to compute a good approximation on a significantly smaller data set. More importantly, such coresets can be efficiently constructed in both distributed and streaming settings and do not impose restrictions on the data generating process. Our results rely on a novel reduction of statistical estimation to problems in computational geometry and new combinatorial complexity results for mixtures of Gaussians. Empirical evaluation on several real-world data sets suggests that our coreset-based approach enables a significant reduction in training time with negligible approximation error.

    Dimension reduction problems in the modelling of hydrogel thin films

    Acoustic Analysis of Montenegrin English L2 Vowels: Production and Perception

    This study provides an acoustic analysis of Montenegrin vowels in order to compare them with existing measurements of General American English (GAE) vowels. A production analysis of Montenegrin (MTN) learners of English is also carried out, showing which vowels are the most problematic in their L2 pronunciation. In addition, a two-way perception study was conducted with the participants. American native English speakers listened to 11 GAE vowels produced by Montenegrin speakers of English and tried to indicate which vowels they heard, while Montenegrin speakers of English did the same after listening to native GAE speakers. The study shows that some vowels are easy for Montenegrin speakers to produce and perceive. However, certain vowels (e.g., the ones that are present in English but not in Montenegrin) cause problems for participants in both the production and the perception analyses. This research helps determine the causes of miscomprehension between native speakers of GAE and Montenegrin EFL learners. These findings can help learners and teachers of ESL/EFL, and support better-quality instruction for Montenegrin learners, by giving them more information on the problematic differences between the vowel systems of Montenegrin and English.