16,257 research outputs found
Holistic corpus-based dialectology
This paper sketches future directions for corpus-based dialectology. We advocate a holistic approach to the study of geographically conditioned linguistic variability, and we present a suitable methodology, 'corpus-based dialectometry', in exactly this spirit. Specifically, we argue that in order to live up to the potential of the corpus-based method, practitioners need to (i) abandon their exclusive focus on individual linguistic features in favor of the study of feature aggregates, (ii) draw on computationally advanced multivariate analysis techniques (such as multidimensional scaling, cluster analysis, and principal component analysis), and (iii) aid interpretation of empirical results by marshalling state-of-the-art data visualization techniques. To exemplify this line of analysis, we present a case study which explores joint frequency variability of 57 morphosyntax features in 34 dialects all over Great Britain.
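The aggregate step described in this abstract can be illustrated with a minimal sketch: pool several feature frequencies per dialect into one vector, measure pairwise distances, and cluster. The site names, frequencies, Euclidean distance, and single-linkage merging below are all assumptions made for illustration; the abstract does not state which distance measure or clustering variant the authors use.

```python
import math
from itertools import combinations

# Hypothetical normalized frequencies of three invented features
# at four invented dialect sites (not data from the paper).
freqs = {
    "site_A": [12.0, 3.0, 7.5],
    "site_B": [11.0, 3.5, 7.0],
    "site_C": [2.0, 9.0, 1.0],
    "site_D": [2.5, 8.5, 1.5],
}

def euclidean(u, v):
    """Aggregate distance: all features contribute to one number."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def distance_matrix(table):
    """Pairwise distances, keyed by alphabetically ordered site pairs."""
    names = sorted(table)
    return {(a, b): euclidean(table[a], table[b])
            for a, b in combinations(names, 2)}

def single_linkage(table, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    dist = distance_matrix(table)
    clusters = [{name} for name in sorted(table)]

    def gap(c1, c2):
        # Single linkage: distance between the closest members.
        return min(dist[tuple(sorted((a, b)))] for a in c1 for b in c2)

    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: gap(clusters[p[0]], clusters[p[1]]))
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters

groups = single_linkage(freqs, 2)
print(groups)
```

With these invented numbers, the two sites with similar frequency profiles end up in the same group, which is the kind of pattern an aggregate analysis looks for and a single-feature analysis can miss.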
Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks
We propose a method for embedding two-dimensional locations in a continuous
vector space using a neural network-based model incorporating mixtures of
Gaussian distributions, presenting two model variants for text-based
geolocation and lexical dialectology. Evaluated over Twitter data, the proposed
model outperforms conventional regression-based geolocation and provides a
better estimate of uncertainty. We also show the effectiveness of the
representation for predicting words from location in lexical dialectology, and
evaluate it using the DARE dataset.
Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), September 2017, Copenhagen, Denmark
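As a rough illustration of the idea behind a mixture density output, the sketch below scores a two-dimensional location under a mixture of Gaussians and returns the negative log-likelihood, the quantity such a model is typically trained to minimize. The diagonal covariance, the function names, and all parameter values are assumptions; in a real model a neural network would produce the weights, means, and standard deviations, and the abstract does not give the authors' exact parameterization.

```python
import math

def gaussian2d_diag(x, y, mu, sigma):
    """Density of a 2-D Gaussian with diagonal covariance (an assumption)."""
    zx = (x - mu[0]) / sigma[0]
    zy = (y - mu[1]) / sigma[1]
    norm = 2.0 * math.pi * sigma[0] * sigma[1]
    return math.exp(-0.5 * (zx * zx + zy * zy)) / norm

def mixture_nll(point, weights, means, sigmas):
    """Negative log-likelihood of one 2-D point under the mixture."""
    density = sum(w * gaussian2d_diag(point[0], point[1], m, s)
                  for w, m, s in zip(weights, means, sigmas))
    return -math.log(density)

# Hypothetical mixture parameters; weights must sum to 1.
weights = [0.7, 0.3]
means = [(0.0, 0.0), (5.0, 5.0)]
sigmas = [(1.0, 1.0), (2.0, 2.0)]

near = mixture_nll((0.2, -0.1), weights, means, sigmas)
far = mixture_nll((20.0, 20.0), weights, means, sigmas)
print(near, far)
```

A point close to a component mean receives a lower negative log-likelihood than a distant one. Because the output is a full density rather than a single coordinate pair, the spread of the components also conveys how uncertain the prediction is.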
Databases, Dictionaries and Dialectology. Dental instability in Early Middle English: A case study
Exploratory proposal to encode Germanicist, Nordicist, and other phonetic characters in the UCS
This is a preliminary document that presents various Latin characters for specialist phonetic use that may be eligible for addition to the international character encoding standard Unicode. The set of Teuthonista characters was later proposed separately and published in Unicode Standard version 7.0 in June 2014. (The later proposal for Teuthonista which was approved is: .
Review of 'Bod kyi yul skad rnam bshad [General Introduction to the Tibetan Dialects]' by Sum-bha Don-grub Tshe-ring [Sumbha Dondrub Tshering]
Scrunch, growze, or chobble?: investigating regional variation in sound symbolism in the Survey of English Dialects
This paper draws on data extracted from Upton et al.’s (1994) Survey of English Dialects: The Dictionary and Grammar to investigate the regional distribution across England of sound symbolic phonesthemes, that is, word-initial consonant clusters which appear to carry with them a non-arbitrary relationship between sound and meaning. By using such empirical data and employing systematic quantitative analysis, this study avoids the criticism often aimed at sound symbolism research that its evidence is speculative and anecdotal. Operating at the intersection of sound symbolism and dialectology, the research addresses a field currently understudied due to the scholarly attention paid to the morphological status of phonesthemes and their universality across languages. The results suggest that phonesthemes are to some extent subject to regional variation: certain phonesthemes are more common in some areas of England than alternatives which appear to carry the same sound-meaning relationship, often producing clear distributional patterns. These patterns are then discussed, and explanations offered, in light of existing dialectological and variationist theoretical constructs. The findings underline the contribution that such exploration can make to both the sound symbolism and dialectology fields, and highlight the continuing opportunities for innovative research offered by the Survey of English Dialects material.
Employing geographical principles for sampling in state of the art dialectological projects
The aims of this paper are twofold. First, we locate the most effective human geographical methods for sampling across space in large-scale dialectological projects. We propose two geographical concepts as a basis for sampling decisions: geo-demographic classification, a multidimensional method used for the socio-economic grouping of areas, and an updated version of functional regions that can be used in sociolinguistic research. Second, we report on the results of a pilot project that applies these models to collect data regarding the acceptability of vernacular morpho-syntactic forms in the North-East of England. Following the method of natural breaks advocated for dialectology by Horvath and Horvath (2002), we interpret breaks in the probabilistic patterns as areas of dialect transitions. This study contributes to the debate about the role and limitations of spatiality in linguistic analysis. It intends to broaden our knowledge about the interfaces between human geography and dialectology.
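The natural-breaks idea mentioned in this abstract can be sketched as follows: sort the values, then choose class boundaries that minimize the total within-class squared deviation. The acceptance rates are invented, and the exhaustive search is a simplification chosen for readability; it is not the procedure of Horvath and Horvath (2002), which the abstract does not spell out.

```python
from itertools import combinations

def sse(values):
    """Sum of squared deviations from the class mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def natural_breaks(values, k):
    """Split sorted values into k classes with minimal within-class SSE.

    Exhaustive search over all cut positions; fine for small samples only.
    """
    data = sorted(values)
    best_cost, best_classes = None, None
    for cuts in combinations(range(1, len(data)), k - 1):
        bounds = (0, *cuts, len(data))
        classes = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sse(c) for c in classes)
        if best_cost is None or cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes

# Hypothetical acceptance rates of a vernacular form at eight invented sites.
rates = [0.05, 0.08, 0.10, 0.45, 0.50, 0.52, 0.90, 0.93]
classes = natural_breaks(rates, 3)
print(classes)
```

The gaps between the resulting classes are where values jump rather than drift, which is the sort of discontinuity the abstract proposes to read as a dialect transition.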
