    On the Distributional Representation of Ragas: Experiments with Allied Raga Pairs

    Raga grammar provides a theoretical framework that supports creativity and flexibility in improvisation while carefully maintaining the distinctiveness of each raga in the ears of a listener. A computational model for raga grammar can serve as a powerful tool to characterize grammaticality in performance. As in other forms of tonal music, a distributional representation capturing tonal hierarchy has been found to be useful in characterizing a raga’s distinctiveness in performance. In the continuous-pitch melodic tradition, several choices arise for the defining attributes of a histogram representation of pitches. These can be resolved by referring to one of the main functions of the representation, namely to embody the raga grammar and therefore the technical boundary of a raga in performance. Based on the analyses of a representative dataset of audio performances in allied ragas by eminent Hindustani vocalists, we propose a computational representation of distributional information, and further apply it to obtain insights about how this aspect of raga distinctiveness is manifested in practice over different time scales by very creative performers.
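
As a rough illustration of what a histogram representation of pitches can look like, the sketch below folds a pitch track into one octave relative to the tonic and counts frames per bin. It is not the representation proposed in the paper: the function name `pitch_histogram`, the 12-bin default, and the convention that unvoiced frames are marked with 0 Hz are all assumptions made for this example. The bin width in particular is one of the "several choices" the abstract mentions.

```python
import math


def pitch_histogram(f0_hz, tonic_hz, bins_per_octave=12):
    """Octave-folded, normalized pitch histogram (hypothetical helper).

    f0_hz: pitch values in Hz, one per analysis frame; values <= 0 are
           treated as unvoiced frames and skipped (an assumed convention).
    tonic_hz: tonic frequency in Hz used as the 0-cent reference.
    """
    counts = [0] * bins_per_octave
    bin_width = 1200.0 / bins_per_octave  # bin width in cents

    for f in f0_hz:
        if f <= 0:
            continue  # skip unvoiced frames
        cents = 1200.0 * math.log2(f / tonic_hz)
        folded = cents % 1200.0  # fold every octave onto [0, 1200)
        # Center each bin on its nominal pitch, wrapping the top half-bin to 0.
        index = int((folded + bin_width / 2) // bin_width) % bins_per_octave
        counts[index] += 1

    total = sum(counts)
    if total == 0:
        return [0.0] * bins_per_octave
    return [c / total for c in counts]


# Toy pitch track: tonic, tonic one octave up, a fifth above, one unvoiced frame.
hist = pitch_histogram([220.0, 440.0, 330.0, 0.0], tonic_hz=220.0)
```

A finer `bins_per_octave` keeps more of the continuous-pitch detail, at the cost of a sparser histogram for short excerpts.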

    Automatic prosodic analysis for computer aided pronunciation teaching

    Correct pronunciation of spoken language requires the appropriate modulation of acoustic characteristics of speech to convey linguistic information at a suprasegmental level. Such prosodic modulation is a key aspect of spoken language and is an important component of foreign language learning, for purposes of both comprehension and intelligibility. Computer aided pronunciation teaching involves automatic analysis of the speech of a non-native talker in order to provide a diagnosis of the learner's performance in comparison with the speech of a native talker. This thesis describes research undertaken to automatically analyse the prosodic aspects of speech for computer aided pronunciation teaching. It is necessary to describe the suprasegmental composition of a learner's speech in order to characterise significant deviations from a native-like prosody, and to offer some kind of corrective diagnosis. Phonological theories of prosody aim to describe the suprasegmental composition of speech.

    A weighted-constraint model of F0 movements

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Linguistics and Philosophy, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (p. 155-159). This dissertation develops a grammar of phonetic implementation of phonologically significant F0 (pitch) events, which is applicable across languages. Through production studies of various languages, we show that phonetic universals exist which govern phonetic realization of the phonological representations of tones. In the previous literature, there have been two conflicting views concerning tonal timing: tones are aligned with respect to segments (the Segmental Anchoring Hypothesis) or tones occur at a fixed interval from other tones (the Constant Duration Hypothesis). In this dissertation, the two hypotheses are tested in languages with various tonal phonologies: Seoul Korean (phrasal boundary tone), Tokyo Japanese (lexical pitch accent), Mandarin (lexical tone), and English (intonational pitch accent). In all languages, both tendencies to maintain segmental alignment and a target duration for pitch rises are simultaneously observed. We thus adopt a weighted-constraint model (Flemming, 2001) where segmental alignment and target duration are interpreted as weighted constraints. In this model, timing of tones is determined to minimize the summed cost of violation of these conflicting constraints. Mixed-effects models were fitted to the data to obtain the actual weights in each language. Relative weights of the constraints reflect cross-linguistic differences in the alignment of tones. The relative weights of constraints in the phonetic realization grammar are not random but systematic, reflecting the phonological nature of tones in each language.
    The experimental studies in this dissertation show that tonal alignment patterns depend on phonological status and context of tones. Lexically-contrastive tones (Japanese accented words, Mandarin lexical tone) or prominence-lending tones (English pitch accents) are more strictly aligned with respect to their anchoring points than phrasal boundary tones (Seoul Korean, Japanese unaccented words), if other conditions are equal. Tones show different alignment patterns depending on phonological context: tones are more strictly aligned in word-final context than in word-medial context in Japanese accented words, and in lexical-tone context than in neutral-tone context in Mandarin. In addition, languages show different phonetic realization patterns depending on whether contour tones are contrastive in the language (Mandarin and English) or not (Korean and Japanese). These results point to the fact that details of phonetic realization of tones are determined by language-specific phonetic realization grammar, rather than by default universal rules. By Hyesun Cho. Ph.D.
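
The idea of choosing tone timing by minimizing a summed, weighted violation cost can be sketched as follows. This is a toy illustration and not the dissertation's actual model: the quadratic form of each violation, the single free variable (the time of the pitch peak), and the names `timing_cost` and `optimal_peak_time` are assumptions introduced here. All times are in seconds.

```python
def timing_cost(peak, anchor, onset, target_duration, w_align, w_dur):
    """Summed weighted cost of two conflicting constraints (assumed quadratic).

    Alignment: the peak should coincide with its segmental anchor.
    Duration:  the rise (onset to peak) should last target_duration.
    """
    align_violation = (peak - anchor) ** 2
    dur_violation = ((peak - onset) - target_duration) ** 2
    return w_align * align_violation + w_dur * dur_violation


def optimal_peak_time(anchor, onset, target_duration, w_align, w_dur):
    """Peak time that minimizes timing_cost.

    Setting the derivative of the quadratic cost to zero gives a
    weighted average of the two preferred peak times.
    """
    if w_align < 0 or w_dur < 0 or (w_align + w_dur) == 0:
        raise ValueError("weights must be non-negative and not both zero")
    preferred_by_duration = onset + target_duration
    return (w_align * anchor + w_dur * preferred_by_duration) / (w_align + w_dur)


# Alignment weighted three times as heavily as duration (illustrative values).
best = optimal_peak_time(anchor=0.30, onset=0.10, target_duration=0.15,
                         w_align=3.0, w_dur=1.0)
```

With these illustrative weights the peak lands closer to the segmental anchor (0.30 s) than to the time the duration target alone would prefer (0.25 s), which is the kind of stricter alignment a larger alignment weight would produce.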

    HMM-based speech synthesis using an acoustic glottal source model

    Parametric speech synthesis has received increased attention in recent years following the development of statistical HMM-based speech synthesis. However, the speech produced using this method still does not sound as natural as human speech and there is limited parametric flexibility to replicate voice quality aspects, such as breathiness. The hypothesis of this thesis is that speech naturalness and voice quality can be more accurately replicated by an HMM-based speech synthesiser using an acoustic glottal source model, the Liljencrants-Fant (LF) model, to represent the source component of speech instead of the traditional impulse train. Two different analysis-synthesis methods were developed during this thesis, in order to integrate the LF-model into a baseline HMM-based speech synthesiser, which is based on the popular HTS system and uses the STRAIGHT vocoder. The first method, which is called Glottal Post-Filtering (GPF), consists of passing a chosen LF-model signal through a glottal post-filter to obtain the source signal and then generating speech by passing this source signal through the spectral envelope filter. The system which uses the GPF method (HTS-GPF system) is similar to the baseline system, but it uses a different source signal instead of the impulse train used by STRAIGHT. The second method, called Glottal Spectral Separation (GSS), generates speech by passing the LF-model signal through the vocal tract filter. The major advantage of the synthesiser which incorporates the GSS method, named HTS-LF, is that the acoustic properties of the LF-model parameters are automatically learnt by the HMMs. In this thesis, an initial perceptual experiment was conducted to compare the LF-model to the impulse train. The results showed that the LF-model was significantly better, both in terms of speech naturalness and replication of two basic voice qualities (breathy and tense).
    In a second perceptual evaluation, the HTS-LF system was better than the baseline system, although the difference between the two had been expected to be more significant. A third experiment was conducted to evaluate the HTS-GPF system and an improved HTS-LF system, in terms of speech naturalness, voice similarity and intelligibility. The results showed that the HTS-GPF system performed similarly to the baseline. However, the HTS-LF system was significantly outperformed by the baseline. Finally, acoustic measurements were performed on the synthetic speech to investigate the speech distortion in the HTS-LF system. The results indicated that a problem in replicating the rapid variations of the vocal tract filter parameters at transitions between voiced and unvoiced sounds is the most significant cause of speech distortion. This problem encourages future work to further improve the system.

    Pitch Contour Stylization Using an Optimal Piecewise Polynomial Approximation


    Review of Particle Physics

    The Review summarizes much of particle physics and cosmology. Using data from previous editions, plus 2,143 new measurements from 709 papers, we list, evaluate, and average measured properties of gauge bosons and the recently discovered Higgs boson, leptons, quarks, mesons, and baryons. We summarize searches for hypothetical particles such as supersymmetric particles, heavy bosons, axions, dark photons, etc. Particle properties and search limits are listed in Summary Tables. We give numerous tables, figures, formulae, and reviews of topics such as Higgs Boson Physics, Supersymmetry, Grand Unified Theories, Neutrino Mixing, Dark Energy, Dark Matter, Cosmology, Particle Detectors, Colliders, Probability and Statistics. Among the 120 reviews are many that are new or heavily revised, including a new review on Machine Learning, and one on Spectroscopy of Light Meson Resonances. The Review is divided into two volumes. Volume 1 includes the Summary Tables and 97 review articles. Volume 2 consists of the Particle Listings and also contains 23 reviews that address specific aspects of the data presented in the Listings. The complete Review (both volumes) is published online on the website of the Particle Data Group (pdg.lbl.gov) and in a journal. Volume 1 is available in print as the PDG Book. A Particle Physics Booklet with the Summary Tables and essential tables, figures, and equations from selected review articles is available in print, as a web version optimized for use on phones, and as an Android app.