10 research outputs found

    Applying the Levenshtein Distance to Catalan dialects: A brief comparison of two dialectometric approaches

    Abstract. In recent years, dialectometry has gained interest among Catalan dialectologists. As a consequence, a specific dialectometric approach has been developed at the University of Barcelona, which aims to increase the accuracy of final groupings by separating the predictable components of the language from the unpredictable ones. Another popular method for obtaining dialect distances is the Levenshtein Distance (LD), which has never been applied to a Catalan corpus so far. The goal of this paper is to present the results of applying the LD to a corpus of Catalan linguistic data, and to compare the results of this analysis with both the results from Barcelona and the traditional classifications of Catalan dialectology.

    Studying dialects to understand human language

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Includes bibliographical references (leaves 65-71). This thesis investigates the study of dialect variations as a way to understand how humans might process speech. It evaluates some of the important research in dialect identification and draws conclusions about how their results can give insights into human speech processing. A study clustering dialects using k-means clustering is done. Self-organizing maps are proposed as a tool for dialect research, and a self-organizing map is implemented for the purposes of testing this. Several areas for further research are identified, including how dialects are stored in the brain, more detailed descriptions of how dialects vary, including contextual effects, and more sophisticated visualization tools. Keywords: dialect, accent, identification, recognition, self-organizing maps, words, lexical sets, clustering. By Akua Afriyie Nti. M.Eng.

    Norwegian Dialects Examined Perceptually and Acoustically

    Gooskens (2003) described an experiment which determined linguistic distances between 15 Norwegian dialects as perceived by Norwegian listeners. The results are compared to Levenshtein distances, calculated on the basis of transcriptions (of the words) of the same recordings as used in the perception experiment. The Levenshtein distance is equal to the sum of the weights of the insertions, deletions and substitutions needed to change one pronunciation into another. The success of the method depends on the reliability of the transcriber. The aim of this paper is to find an acoustic distance measure between dialects which approximates the perceptual distance measure. We use and compare different representations of the acoustic signal: Barkfilter spectrograms, cochleagrams and formant tracks. We now apply the Levenshtein algorithm to spectra or formant value bundles instead of transcription segments. Of these acoustic representations, the formant track representation gave the best results. However, the transcription-based Levenshtein distances still correlate more closely. In the acoustic signal the speaker-dependent influence is kept to some extent, while a transcriber abstracts from voice quality. Using more samples per dialect word (instead of only one as in our research) should improve the accuracy of the measurements.
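
    The weighted Levenshtein distance described in this abstract can be sketched as a short dynamic-programming routine. This is a minimal illustration, not the authors' implementation: the function name, the default unit weights, and the example substitution weights are assumptions, and the word pair in the usage note is made up rather than taken from the Norwegian data.

```python
def weighted_levenshtein(source, target, ins_cost=1.0, del_cost=1.0, sub_cost=None):
    """Minimum total weight of the edits that turn `source` into `target`.

    `source` and `target` are sequences of segments (strings or lists).
    `sub_cost(a, b)` returns the weight of substituting a with b;
    by default it is 0 for identical segments and 1 otherwise (an assumption).
    """
    if sub_cost is None:
        sub_cost = lambda a, b: 0.0 if a == b else 1.0

    # prev[j] is the distance between the processed prefix of source and target[:j].
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        curr = [i * del_cost]  # turning source[:i] into the empty sequence
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + del_cost,             # delete s
                curr[j - 1] + ins_cost,         # insert t
                prev[j - 1] + sub_cost(s, t),   # substitute s with t (or match)
            ))
        prev = curr
    return prev[-1]


# Hypothetical weighting: vowel-for-vowel substitutions cost half as much.
VOWELS = set("aeiou")

def vowel_aware_cost(a, b):
    if a == b:
        return 0.0
    return 0.5 if a in VOWELS and b in VOWELS else 1.0
```

    With the illustrative weights above, `weighted_levenshtein("melk", "milk", sub_cost=vowel_aware_cost)` counts a single vowel substitution, whereas the default weights treat every substitution alike. The same routine accepts lists of arbitrary segments, so the per-segment cost function is where a transcription-based or acoustic comparison would differ.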

    Subsidia: Tools and Resources for Speech Sciences

    This book, the result of collaboration among researchers who are experts in their respective fields, aims to support the scientific community by compiling and describing a set of highly useful materials for continued progress in research

    concepts - methods - visualization

    While Darwin’s grand view of evolution has undergone many changes and shown up in many facets, there remains one outstanding common feature in its 150-year history: since the very beginning, branching trees have been the dominant scheme for representing evolutionary processes. Only recently, network models have gained ground reflecting contact-induced mixing or hybridization in evolutionary scenarios. In biology, research on prokaryote evolution indicates that lateral gene transfer is a major feature in the evolution of bacteria. In the field of linguistics, the mutual lexical and morphosyntactic borrowing between languages seems to be much more central for language evolution than the family tree model is likely to concede. In the humanities, networks are employed as an alternative to established phylogenetic models, to express the hybridization of cultural phenomena, concepts or the social structure of science. However, an interdisciplinary display of network analyses for evolutionary processes remains lacking. Therefore, this volume includes approaches studying the evolutionary dynamics of science, languages and genomes, all of which were based on methods incorporating network approaches

    Gaelic dialects present and past: a study of modern and medieval dialect relationships in the Gaelic languages

    This thesis focuses on the historical development of dialectal variation in the Gaelic languages with special reference to Irish. As a point of departure, competing scholarly theories concerning the historical relationships between Goidelic dialects are laid out. Next, these theories are tested using dialectometric methods of linguistic analysis. Dialectometry clearly suggests the Irish of Ulster is the most linguistically distinctive of Irish dialects. This perspective on the modern dialects is utilised in subsequent chapters to clarify our understanding of the history of Gaelic dialectal variation, especially during the Old Irish period (AD 600–900). Theoretical and methodological frameworks that have been used in the study of the historical dialectology of Gaelic are next outlined. It is argued that these frameworks may not be the most appropriate for investigating dialectal variation during the Old Irish period. For the first time, principles from historical sociolinguistics are here applied in investigating the language of the Old Irish period. In particular, the social and institutional structures which supported the stability of Old Irish as a text language during the 8th and 9th centuries are scrutinised from this perspective. The role of the ecclesiastical and political centre of Armagh as the principal and central actor in the relevant network structures is highlighted. Focus then shifts to the processes through which ‘standard’ languages emerge, with special reference to Old Irish. The evidence of a small number of texts upon which modern understandings of Old Irish were based is assessed; it is argued that these texts most likely emerged from monasteries in the northeast of Ireland and the southwest of Scotland. Secondly, the processes through which the standard of the Old Irish period is likely to have come about are investigated. It is concluded that the standard language of the period arose primarily through the agency of monastic schools in the northeast of Ireland, particularly Armagh and Bangor. It is argued that this fact, and the subsequent prominence of Armagh as a stable and supremely prestigious centre of learning throughout the period, offers a sociolinguistically robust explanation for the apparent lack of dialectal variation in the language. Finally, the socio-political situation of the Old Irish period is discussed. Models of new-dialect formation are applied to historical evidence, and combined with later linguistic evidence, in an attempt to enunciate dialectal divisions which may have existed during the period