75 research outputs found

    Reduction of Survey Sites in Dialectology: A New Methodology Based on Clustering

    Many language change studies aim for a partial revisitation, i.e., selecting survey sites from previous dialect studies. The central issue of survey site reduction, however, has often been addressed only qualitatively. Cluster analysis offers an innovative means of identifying the most representative survey sites among a set of original survey sites. In this paper, we present a general methodology for finding representative sites for an intended study, potentially applicable to any collection of data about dialects or linguistic variation. We elaborate the quantitative steps of the proposed methodology in the context of the “Linguistic Atlas of Japan” (LAJ). Next, we demonstrate the full application of the methodology on the “Linguistic Atlas of German-speaking Switzerland” (German: “Sprachatlas der Deutschen Schweiz”, SDS), with the explicit aim of selecting survey sites corresponding to the aims of the current project “Swiss German Dialects Across Time and Space” (SDATS), which revisits SDS 70 years later. We find that, depending on the circumstances and requirements of a study, the proposed methodology, which introduces cluster analysis into the survey site reduction process, allows for greater objectivity than traditional approaches. We suggest, however, that the suitability of any set of candidate survey sites resulting from the proposed methodology be rigorously reviewed by experts due to potential incongruences, such as the overlap of objectives and variables across the original and intended studies and ongoing dialect change.

    Computational Sociolinguistics: A Survey

    Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction, and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges. Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

    A quantitative approach to social and geographical dialect variation

    Families and resemblances

    Using Gabmap

    Gabmap is a freely available, open-source web application that analyzes data on language variation, e.g. varying words for the same concepts, varying pronunciations for the same words, or varying frequencies of syntactic constructions in transcribed conversations. Gabmap is an integrated part of CLARIN (see e.g. http://portal.clarin.nl). This article summarizes Gabmap's basic functionality, adding material on some new features and reporting on the range of uses to which Gabmap has been put. Gabmap is modestly successful, and its popularity underscores the fact that the study of language variation has crossed a watershed concerning the acceptability of automated language analysis. Automated analysis not only improves researchers’ efficiency, it also improves the replicability of their analyses and allows them to focus on inferences to be drawn from analyses and other more abstract aspects of that study.

    Métodos de la dialectología cuantitativa

    The quantification of geolinguistic variation has brought a spectacular rise in publications on the subject, which indicates a renewed vitality of the discipline. One of the greatest advances in dialectology of the last century, dialectometry, has become a reality in practically all cultivated languages (Goebl 1992; Nerbonne 2013). The variety of quantitative techniques used in dialectometry offers researchers a wide range of possibilities for analyzing dialectal data. But any quantitative analysis needs a broad database, which distances the dialectologist from the practices of so-called '(single) feature based' dialectology and makes the analyzed sample more objective. This paper presents the steps to follow in carrying out research in quantitative dialectology. The methodology begins with the choice of a linguistic atlas to supply the database (which can be phonetic, orthographic, and/or labeled). Applying a distance measure then yields a distance matrix. The quantitative techniques applied to the distance matrix range from the quantification of the distance between dialectal varieties (interpunctual dialectometry) and the hierarchical classification of dialectal varieties to the analysis of the dialectal continuum (with multidimensional scaling, MDS), the analysis of the correlation between geographical and linguistic distance, the detection of linguistic characteristics, etc. Multivariate methods are also presented for identifying patterns of variation, studying variables that show similar geographic patterns, and analyzing the probability of membership in particular dialect groups. Quantification has become a mandatory step for experts who study linguistic variation.
