4,587 research outputs found

    A cross-linguistic database of phonetic transcription systems

    Contrary to what non-practitioners might expect, the systems of phonetic notation used by linguists are highly idiosyncratic. Not only do various linguistic subfields disagree on the specific symbols they use to denote the speech sounds of languages, but considerable variation can also be found in large databases of sound inventories. Inspired by recent efforts to link cross-linguistic data across different resources with the help of reference catalogues (Glottolog, Concepticon), we present initial efforts to link different phonetic notation systems to a catalogue of speech sounds. This is achieved with the help of a database accompanied by a software framework that uses a limited but easily extendable set of non-binary feature values to allow for quick and convenient registration of different transcription systems, while at the same time linking to additional datasets with restricted inventories. Linking transcription systems to one another lets us translate conveniently between them, while linking sounds to databases gives users quick access to various kinds of metadata, including feature values, statistics on phoneme inventories, and information on prosody and sound classes. To demonstrate the feasibility of this enterprise, we provide an initial version of our cross-linguistic database of phonetic transcription systems (CLTS), which currently registers five transcription systems and links to fifteen datasets, as well as a web application that lets users conveniently test the power of automatic translation across transcription systems.
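
The idea of translating between transcription systems through shared feature values can be sketched in a few lines of Python. This is a toy illustration only: the symbol tables, the feature names, and the `translate` helper are hypothetical and are not taken from the CLTS data or its software framework.

```python
# Toy illustration only: the symbols and feature names below are hypothetical
# examples, not the actual CLTS data or API.

# Each transcription system maps its own symbols to a set of feature values.
SYSTEM_A = {
    "tʃ": frozenset({"voiceless", "post-alveolar", "affricate"}),
    "ʃ": frozenset({"voiceless", "post-alveolar", "fricative"}),
    "p": frozenset({"voiceless", "bilabial", "stop"}),
}
SYSTEM_B = {
    "č": frozenset({"voiceless", "post-alveolar", "affricate"}),
    "š": frozenset({"voiceless", "post-alveolar", "fricative"}),
    "p": frozenset({"voiceless", "bilabial", "stop"}),
}


def translate(symbols, source, target):
    """Translate symbols by looking up each sound's features in the target system."""
    # Invert the target table so that a feature set leads back to a symbol.
    by_features = {features: symbol for symbol, features in target.items()}
    result = []
    for symbol in symbols:
        features = source.get(symbol)
        # Sounds missing from either system are marked with "?" rather than guessed.
        result.append(by_features.get(features, "?"))
    return result


print(translate(["tʃ", "p", "ʃ"], SYSTEM_A, SYSTEM_B))
```

Because both systems are linked to the same feature sets, adding a third system in this sketch would only require one new table rather than a separate mapping for every pair of systems.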

    Temporal articulatory stability, phonological variation, and lexical contrast preservation in diaspora Tibetan

    This dissertation examines how lexical tone can be represented with articulatory gestures, and the ways a gestural perspective can inform synchronic and diachronic analysis of the phonology and phonetics of a language. Tibetan is chosen as an example of a language with interacting laryngeal and tonal phonology, a history of tonogenesis and dialect diversification, and recent contact-induced realignment of the tonal and consonantal systems. Despite variation in voice onset time (VOT) and in the presence or absence of the lexical tone contrast, speakers retain a consistent relative timing of consonant and vowel gestures. Recent research has attempted to integrate tone into the framework of Articulatory Phonology through the addition of tone gestures. Unlike other theories of the phonetics-phonology interface, Articulatory Phonology incorporates relative timing as a key parameter. This allows the system to represent contrasts instantiated not just in the presence or absence of gestures, but also in how gestures are timed with each other. Building on the different predictions of various timing relations, along with the historical developments in the language, hypotheses are generated and tested with acoustic and articulatory experiments. Following an overview of relevant theory, the second chapter surveys past literature on the history of sound change and the present phonological diversity of Tibetic dialects. Whereas Old Tibetan lacked lexical tone, contrasted voiced and voiceless obstruents, and exhibited complex clusters, a series of overlapping sound changes has led to some modern varieties that are tonal, lack clusters, and vary in the expression of voicing and aspiration. Furthermore, speakers in the Tibetan diaspora use a variety that has grown out of the contact between diverse Tibetic dialects.
The state of the language and the dynamics of diaspora have created a situation ripe for sound change, including the recombination of elements from different dialects and, potentially, the loss of tone contrasts. The nature of diaspora Tibetan is investigated through an acoustic corpus study. Recordings made in Kathmandu, Nepal, are being transcribed and force-aligned into a usable audio corpus. Speakers in the corpus come from diverse backgrounds across and outside traditional Tibetan-speaking regions, but the analysis presented here focuses on speakers who grew up in diaspora, with a mixed input of Standard Tibetan (spyi skad) and other Tibetan varieties. Especially notable among these speakers is the high variability of voice onset time (VOT) and its interaction with tone. An analysis of these data in terms of the relative timing of oral, laryngeal, and tone gestures generates hypotheses to be tested with articulatory data. The articulatory study is conducted using electromagnetic articulography (EMA) with six Tibetan-speaking participants. The key finding is that the relative timing of consonant and vowel gestures is consistent across phonological categories and across speakers who do and do not contrast tone. This result leads to the conclusion that the relative timing of speech gestures is conserved and acquired independently. Speakers acquire and generalize a limited inventory of timing patterns, and can use timing patterns even when the conditioning environment for the development of those patterns, namely tone, has been lost.

    Native Speaker Perceptions of Accented Speech: The English Pronunciation of Macedonian EFL Learners

    The paper reports on the results of a study that aimed to describe the vocalic and consonantal features of the English pronunciation of Macedonian EFL learners as perceived by native speakers of English, and to find out whether native speakers of different standard varieties of English perceive the same segments as non-native. A specially designed web application was used to gather two types of data: a) quantitative (frequency of segment variables and global foreign accent ratings on a 5-point scale), and b) qualitative (open-ended questions). The analysis of the results points to the three most frequent markers of foreign accent in the English speech of Macedonian EFL learners: final obstruent devoicing, vowel shortening, and substitution of English dental fricatives with Macedonian dental plosives. It also reveals additional phonetic aspects that are poorly explained in the available reference literature, such as allophonic distributional differences between the two languages and intonational mismatch.

    Children's computation of complex linguistic forms: a study of frequency and imageability effects.

    This study investigates the storage vs. composition of inflected forms in typically developing children. Children aged 8-12 were tested on the production of regular and irregular past-tense forms. Storage (vs. composition) was examined by probing for past-tense frequency effects and imageability effects (both of which are diagnostic tests for storage) while controlling for a number of confounding factors. We also examined sex as a factor. Irregular inflected forms, which must depend on stored representations, always showed evidence of storage (frequency and/or imageability effects), not only across all children, but also separately in both sexes. In contrast, for regular forms, which could be either stored or composed, only girls showed evidence of storage. This pattern is similar to that found in previously acquired adult data from the same task, with the notable exception that development affects which factors influence the storage of regulars in females: imageability plays a larger role in girls, and frequency in women. Overall, the results suggest that irregular inflected forms are always stored (in children and adults, and in both sexes), whereas regulars can be either composed or stored, with their storage a function of various item- and subject-level factors.

    Challenges of Annotation and Analysis in Computer-Assisted Language Comparison: A Case Study on Burmish Languages

    The use of computational methods in comparative linguistics is growing in popularity. The increasing deployment of such methods draws into focus those areas in which they remain inadequate, as well as those areas where classical approaches to language comparison are nontransparent and inconsistent. In this paper we illustrate specific challenges which both computational and classical approaches encounter when studying South-East Asian languages. With the help of data from the Burmish language family, we point to the challenges resulting from missing annotation standards and insufficient methods for analysis, and we illustrate how to tackle these problems within a computer-assisted framework in which computational approaches are used to pre-analyse the data while linguists attend to the detailed analyses.

    Using Graph Mining Method in Analyzing Turkish Loanwords Derived from Arabic Language

    Loanwords are words transferred from one language to another that become an essential part of the borrowing language. Loanwords pass from the source language to the recipient language for many reasons, such as invasions, occupations, or trade. Detecting these loanwords is a complicated task because there are no fixed rules for how words are transferred between languages, and accuracy is therefore low. This work tries to improve the accuracy of detecting loanwords, taking Turkish loanwords from Arabic as a case study. The proposed system finds all possible loanwords using any set of characters, whether alphabetically or randomly arranged. It then handles distortion in pronunciation and solves the problem of letters that exist in Arabic but are missing in Turkish. A graph mining technique, used here for the first time for this purpose, is introduced to identify Turkish loanwords from Arabic. The problem of letter differences between the two languages is solved by using a reference language (English) to unify the style of writing. The proposed system was tested on 1256 words that were manually annotated. The results showed an f-measure of 0.99, which is a high value for such a system. Together, these contributions reduce the time and effort needed to identify loanwords efficiently and accurately. Moreover, researchers do not need knowledge of the recipient and source languages. In addition, the method can be generalized to any two languages using the same steps followed in obtaining Turkish loanwords from Arabic.
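
The step of rewriting both languages in a shared reference alphabet can be illustrated with a minimal Python sketch. It does not reproduce the paper's graph mining technique, which the abstract does not describe in detail; it only shows the normalization idea, followed by a plain string-similarity comparison from the standard library as a stand-in. The letter tables, the vowel-dropping step, and all function names are assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny letter tables; a real system would cover
# both alphabets in full.
ARABIC_TO_LATIN = {"ك": "k", "ت": "t", "ا": "a", "ب": "b", "م": "m", "ل": "l"}
TURKISH_TO_LATIN = {"ç": "c", "ş": "s", "ğ": "g", "ı": "i", "ö": "o", "ü": "u"}
VOWELS = set("aeiou")


def to_reference(word, table):
    """Rewrite a word in the shared Latin reference alphabet."""
    return "".join(table.get(ch, ch) for ch in word)


def skeleton(reference_word):
    """Drop vowels (an assumption: Arabic script often leaves short vowels unwritten)."""
    return "".join(ch for ch in reference_word if ch not in VOWELS)


def similarity(arabic_word, turkish_word):
    """Compare the consonant skeletons of two words; 1.0 means identical."""
    a = skeleton(to_reference(arabic_word, ARABIC_TO_LATIN))
    t = skeleton(to_reference(turkish_word.lower(), TURKISH_TO_LATIN))
    return SequenceMatcher(None, a, t).ratio()


# Arabic "كتاب" (book) against Turkish "kitap" (book) and "göz" (eye).
print(similarity("كتاب", "kitap"), similarity("كتاب", "göz"))
```

In this sketch the pair with a shared origin scores higher than the unrelated pair even though the final consonant differs, which is the kind of pronunciation distortion the abstract mentions.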

    Script Effects as the Hidden Drive of the Mind, Cognition, and Culture

    This open access volume reveals the hidden power of the script we read in and how it shapes and drives our minds, ways of thinking, and cultures. Expanding on the Linguistic Relativity Hypothesis (i.e., the idea that language affects the way we think), this volume proposes the “Script Relativity Hypothesis” (i.e., the idea that the script in which we read affects the way we think) by offering a unique perspective on the effect of script (alphabets, morphosyllabaries, or multi-scripts) on our attention, perception, and problem-solving. Once we become literate, fundamental changes occur in our brain circuitry to accommodate the new demand for resources. The powerful effects of literacy have been demonstrated by research on literate versus illiterate individuals, as well as cross-scriptal transfer, indicating that literate brain networks function differently, depending on the script being read. This book identifies the locus of differences between the Chinese, Japanese, and Koreans, and between the East and the West, as the neural underpinnings of literacy. To support the “Script Relativity Hypothesis”, it reviews a vast corpus of empirical studies, including anthropological accounts of human civilization, social psychology, cognitive psychology, neuropsychology, applied linguistics, second language studies, and cross-cultural communication. It also discusses the impact of reading from screens in the digital age, as well as the impact of bi-script or multi-script use, which is a growing trend around the globe. As a result, our minds, ways of thinking, and cultures are now growing closer together, not farther apart. 
Examines the origin, emergence, and co-evolution of written language, the human mind, and culture within the purview of script effects. Investigates how the scripts we read over time shape our cognition, mind, and thought patterns. Provides a new outlook on the four representative writing systems of the world. Discusses the consequences of literacy for the functioning of the mind.

    Networks in the mind – what communities reveal about the structure of the lexicon

    The mental lexicon stores words and information about words. Many researchers see the lexicon as a network in which lexical units are nodes and the different links between the units are connections. Based on the analysis of a word association network, in this article we show that different kinds of associative connections exist in the mental lexicon. Our analysis is based on a word association database from Hungarian, an agglutinative language. We use communities (closely knit groups) of the lexicon to provide evidence for the existence and coexistence of different connections. We search for communities in the database using two different algorithms, enabling us to see both the overlapping (a word belongs to multiple communities) and the non-overlapping (a word belongs to only one community) community structures. Our results show that the network of the lexicon is organized by semantic, phonetic, syntactic, and grammatical connections, but that encyclopedic knowledge and individual experiences also shape the associative structure. We also show that words may be connected by not just one but several types of connections at the same time.
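
To make the notion of overlapping communities concrete, here is a small Python sketch. The abstract does not name the two algorithms used, so the choice of clique percolation with k=3 is an assumption, as are the toy word list (English, for readability, not taken from the Hungarian database) and the function names.

```python
from itertools import combinations

# Hypothetical toy association network; not data from the Hungarian database.
EDGES = [
    ("dog", "cat"), ("dog", "bark"), ("cat", "bark"),      # animal-related triangle
    ("bark", "tree"), ("bark", "leaf"), ("tree", "leaf"),  # plant-related triangle
]


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> neighbours mapping."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def triangle_communities(edges):
    """Clique percolation with k=3: triangles that share an edge form one community."""
    adjacency = build_adjacency(edges)
    triangles = [
        frozenset(trio)
        for trio in combinations(sorted(adjacency), 3)
        if all(v in adjacency[u] for u, v in combinations(trio, 2))
    ]
    communities = []  # each community is a list of triangles
    for triangle in triangles:
        # Two triangles are adjacent when they share two nodes, i.e. one edge.
        touching = [c for c in communities if any(len(triangle & t) == 2 for t in c)]
        merged = [triangle]
        for c in touching:
            merged.extend(c)
            communities.remove(c)
        communities.append(merged)
    # Flatten each group of triangles into a set of words.
    return [frozenset().union(*c) for c in communities]


for community in triangle_communities(EDGES):
    print(sorted(community))
```

In this toy network the word "bark" ends up in both communities, which is the overlapping case described above; a partitioning algorithm would instead have to assign it to exactly one group.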