5 research outputs found

Automated tone transcription

Author: Bird Steven
Publication date: 01/01/1994

In this paper I report on an investigation into the problem of assigning tones to pitch contours. The proposed model is intended to serve as a tool for phonologists working on instrumentally obtained pitch data from tone languages. Motivation and exemplification for the model are provided by data taken from my fieldwork on Bamileke Dschang (Cameroon). Following recent work by Liberman and others, I provide a parametrised F_0 prediction function P which generates F_0 values from a tone sequence, and I explore the asymptotic behaviour of downstep. Next, I observe that transcribing a sequence X of pitch (i.e. F_0) values amounts to finding a tone sequence T such that P(T) ≈ X. This is a combinatorial optimisation problem, for which two non-deterministic search techniques are provided: a genetic algorithm and a simulated annealing algorithm. Finally, two implementations, one for each technique, are described and then compared using both artificial and real data for sequences of up to 20 tones. These programs can be adapted to other tone languages by adjusting the F_0 prediction function.
Comment: 12 pages, 4 postscript figures, uses examples.sty, newapa.sty, latex-acl.sty, ipamacs.st
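The search problem described in this abstract (find a tone sequence T such that P(T) ≈ X) can be illustrated with a minimal simulated annealing sketch. Everything below is an assumption made for illustration: `predict_f0` is a toy stand-in for the paper's parametrised prediction function, and the parameter values, the two-tone inventory, and the cooling schedule are hypothetical rather than taken from the paper.

```python
import math
import random


def predict_f0(tones, base=100.0, span=60.0, decay=0.9):
    """Toy F0 prediction (hypothetical, not the paper's P).

    H sits above L, and every L lowers the register by `decay`,
    which gives a crude downdrift effect.
    """
    register = 1.0
    values = []
    for tone in tones:
        if tone == "L":
            register *= decay
            values.append(base + 0.3 * span * register)
        else:
            values.append(base + span * register)
    return values


def cost(tones, observed):
    """Sum of squared differences between predicted and observed F0."""
    return sum((p - x) ** 2 for p, x in zip(predict_f0(tones), observed))


def transcribe(observed, steps=5000, start_temp=1000.0, cooling=0.999, seed=0):
    """Search for a tone sequence whose predicted F0 approximates `observed`.

    `observed` must be non-empty. Returns the best sequence seen and its cost.
    """
    rng = random.Random(seed)
    current = [rng.choice("HL") for _ in observed]
    current_cost = cost(current, observed)
    best, best_cost = list(current), current_cost
    temp = start_temp
    for _ in range(steps):
        # Neighbour move: flip one randomly chosen tone.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] = "L" if candidate[i] == "H" else "H"
        candidate_cost = cost(candidate, observed)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= cooling  # geometric cooling schedule
    return best, best_cost


if __name__ == "__main__":
    observed = predict_f0(list("HLHHL"))  # artificial data from a known sequence
    tones, error = transcribe(observed)
    print("".join(tones), round(error, 3))
```

Adapting a sketch like this to another tone language would mean swapping out `predict_f0`, which mirrors the abstract's point that the prediction function is the language-specific part.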

arXiv.org e-Print Archive

CiteSeerX

A semi-automated workflow for producing time-aligned intermediate tonal representations

Authors: Grabowski Emily, McPherson Laura
Publication date: 02/03/2017

Tone can be one of the most daunting aspects of a language to document, particularly at the beginning of a project. Even if tonal categories can be determined in elicitation contexts, tone in running speech and narratives, the core focus of documentary linguistics, is notoriously difficult even for seasoned tonal specialists. When tone markings are included, they are the researcher’s analytical conclusions (e.g. H, L) rather than a representation of the speech melody itself. To address these issues and facilitate the inclusion of objective, replicable tonal annotations in language documentation, we have developed a semi-automated computational workflow to take raw phonetic data (fundamental frequency, or f0) and turn it into a more easily interpretable intermediate representation: a system of levels, already used as a descriptive lingua franca in reference grammars either numerically or using dashes (e.g. HL = 51 = ). The analyst can set the number of levels to reflect the desired level of detail. For instance, if the researcher suspects a two-tone system, she may set the number of levels at 4 or 5 to reflect processes such as declination, downdrift or upstep. For more complex tone systems, more levels might be employed. The workflow begins in Praat by creating a TextGrid annotation delimiting the tonal spans to be analyzed; for the most part, these will be vowels or syllable rimes so as to analyze only tone-bearing units. A Praat script extracts f0 from multiple points in each span. Next, a Python script converts these f0 values to semitones based on the speaker’s mean f0, excluding outliers. The speaker’s pitch range, excluding these outliers, is then divided into the desired number of levels, and each point designated by the researcher within a syllable (e.g. 20% and 80%) is assigned a number corresponding to that level.
The output of the Python script is a text file with time stamps for each annotation, which can be imported into ELAN, thus tying together searchable tonal information with the broader text transcription. In this talk, we describe the workflow and demonstrate some analytical uses for the tool, including comparison of elicitation and free speech, interspeaker variation, and first-look analysis of unanalyzed tonal data.
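The semitone conversion and level assignment described in this abstract can be sketched roughly as follows. This is not the authors' script: the function name `f0_to_levels`, the two-standard-deviation outlier rule, the choice to compute the mean over all values, and the decision to clip outliers to the nearest level are all assumptions made for illustration.

```python
import math
import statistics


def f0_to_levels(f0_values, n_levels=5, outlier_sd=2.0):
    """Map f0 measurements (Hz, all positive) to levels 1..n_levels.

    Level 1 is the bottom of the speaker's range and n_levels the top.
    """
    # Semitones relative to the speaker's mean f0.
    mean_f0 = statistics.fmean(f0_values)
    semitones = [12 * math.log2(f / mean_f0) for f in f0_values]

    # Hypothetical outlier rule: ignore points more than `outlier_sd`
    # standard deviations from the centre when measuring the pitch range.
    centre = statistics.fmean(semitones)
    sd = statistics.pstdev(semitones)
    inliers = [s for s in semitones if abs(s - centre) <= outlier_sd * sd]
    low, high = min(inliers), max(inliers)
    width = (high - low) / n_levels

    levels = []
    for s in semitones:
        if width == 0:
            # Flat contour: every point falls in the same level.
            levels.append(1)
            continue
        # Outliers are clipped to the edge of the range, not dropped.
        clipped = min(max(s, low), high)
        level = min(int((clipped - low) / width) + 1, n_levels)
        levels.append(level)
    return levels


if __name__ == "__main__":
    # Invented measurements, e.g. two points for each of three syllables.
    print(f0_to_levels([180.0, 175.0, 140.0, 120.0, 160.0, 110.0], n_levels=5))
```

Unvoiced frames (where the pitch tracker returns no value) would need to be filtered out before calling a function like this, since the logarithm requires positive input. Pairing consecutive levels, such as the 20% and 80% points of a syllable, would then give two-digit contour labels in the style of the abstract's "51" example.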

ScholarSpace at University of Hawai'i at Manoa

Multidimensional Exploration of Online Linguistic Field Data

Author: Bird Steven
Publication venue: ScholarWorks@UMass Amherst
Publication date: 07/10/2020

ScholarWorks@UMass Amherst

Argumentative zoning: information extraction from scientific text

Author: Teufel Simone
Publication venue: The University of Edinburgh
Publication date: 01/01/1999

Let me tell you, writing a thesis is not always a barrel of laughs, and strange things can happen, too. For example, at the height of my thesis paranoia, I had a recurrent dream in which my cat Amy gave me detailed advice on how to restructure the thesis chapters, which was awfully nice of her. But I also had a lot of human help throughout this time, whether things were going fine or berserk. Most of all, I want to thank Marc Moens: I could not have had a better or more knowledgeable supervisor. He always took time for me, however busy he might have been, reading chapters thoroughly in two days. He had both the calmness of mind to give me lots of freedom in research, and the right judgement to guide me away, tactfully but determinedly, from the occasional catastrophe or other waiting along the way. He was great fun to work with and also became a good friend. My work has profited from the interdisciplinary, interactive and enlightened atmosphere at the Human Communication Centre and the Centre for Cognitive Science (which is now called something else). The Language Technology Group was a great place to work in, as my research was grounded in practical applications develope

CiteSeerX

Edinburgh Research Archive