133 research outputs found
Comparing human and machine vowel classification
In this study we compare human vowel identification with a machine learning approach. A perception experiment on 14 Hungarian vowels, presented in isolation and embedded in a carrier word, was conducted, and a C4.5 decision tree was trained on the same material. A comparison of the subjects’ and the classifier’s identification results showed that in three of four conditions (isolated vowel quantity, isolated vowel identity, and embedded vowel identity) the classifier outperformed the subjects, and in one condition (embedded vowel quantity) it performed equally well. This outcome can be explained by perceptual limits of the subjects and by stimulus properties. The classifier’s performance dropped significantly when the continuous spectral information was replaced by binary 3-Bark thresholds, as proposed in the phonetic literature [8]. Parts of the resulting decision trees can be interpreted phonetically, which could qualify this classifier as a tool for phonetic research.
Quantity distinction in the Hungarian vowel system - just theory or also reality?
According to most current theories, the Hungarian vowel system comprises 14 vowels forming seven pairs, each differentiated by quantity. However, phenomena on both the phonological and the phonetic level suggest that the quantity opposition needs to be evaluated separately for low, mid, and high vowels. To test this, we conducted a perception test in which native listeners identified embedded and isolated vowels spoken by a native Hungarian speaker. The results show that the perception of vowel length and the perception of vowel quality (i.e. the formant structure) interact closely in Hungarian. Low vowels, whose short and long realisations differ in quality, i.e. in vowel height, were seldom identified incorrectly. For embedded high vowels, the subjects did not clearly treat duration as a crucial cue for identification, nor were these vowels clearly differentiated by the speaker. Mid vowels showed a mixed behaviour: they were differentiated in duration and formant structure in production, but the listeners used this information only in part. The fact that the vowel quantity distinction in Hungarian is maintained only where there is a perceivable quality difference shows that the role of quantity is not as dominant as has long been assumed.
Deaccentuation in Hungarian and its logical background
New information in a sentence is expressed by prosodic prominence in many languages. However, the reverse is less obvious: given information and the lack of emphasis do not necessarily go hand in hand. This is especially true of languages that are not flexible with respect to their sentence-internal accentuation patterns, i.e. in which a nucleus shift is not (always) possible. Based on predictions in the literature, we investigated whether deaccentuation is obligatory in Hungarian in certain sentence positions. A production experiment and a perception experiment, the latter based on naturalness judgements, showed that deaccentuation of the verb is obligatory if a focus other than the verb is present in the sentence. Sentence-initial content words were always accented, whether or not they expressed given information, and mismatching patterns did not elicit low naturalness scores in the perception experiment. Our results show that Hungarian uses deaccentuation differently from Indo-European languages: it serves as an expression of logical structure rather than of information structure.
The place of speech perception in the overall comprehension process (A beszédpercepció helye a teljes megértési folyamatban)
The input to bottom-up models of spoken language processing is the acoustic signal, which is mapped onto abstract phonological units before entering higher cognitive processes. However, it is not clear how the physically measurable, continuous, and highly variable acoustic signal is transformed into a sequence of abstract, discrete units drawn from a small closed set. The paper first describes the perception of the segmental and suprasegmental levels from a psychoacoustic perspective. It then presents six models of speech perception that remain influential today and that seem, at first sight, partly to contradict one another. They take either the acoustic signal or articulatory processes as their input and yield the identification of phonemes carrying distinctive features as their output. The theories are: motor theory, quantal theory and LAFF, direct realist theory, H&H theory, exemplar theory, and the TRACE model. Finally, the models are summarised and the relationship between speech perception and higher-level linguistic processes is discussed.