Search CORE

2,739 research outputs found

Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks

Author: Baldwin Timothy
Cohn Trevor
Rahimi Afshin
Publication date: 01/01/2017

We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology. Evaluated over Twitter data, the proposed model outperforms conventional regression-based geolocation and provides a better estimate of uncertainty. We also show the effectiveness of the representation for predicting words from location in lexical dialectology, and evaluate it using the DARE dataset. Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), September 2017, Copenhagen, Denmark.

arXiv.org e-Print Archive

Crossref

University of Queensland eSpace

The spoken Omani Arabic of ‘Ibrī : A “Crossing Point” in Gulf dialects

Author: Lombezzi Letizia
Publication venue: Institut de recherches et d'études sur les mondes arabes et musulmans, Aix-Marseille Universite
Publication date: 01/01/2019

‘Ibrī is located halfway between Muscat and Dubai, very close to the Emirates border. This proximity benefits young male citizens who look for job opportunities in the wealthy Emirates. Indeed, it is easy to find work across the border: in Dubai, in the business sector; in Buraymi or Al-‘Ain, in administration or health-sector professions (the health sector also employs female nurses); and in various locations across the Emirates for those serving as military or police staff (airport and border police include female staff too). ‘Ibrī speakers, the majority of whom return home after work, have daily contact with their Gulf neighbours. This way of life makes the speech of ‘Ibrī inhabitants critical for developing two levels of analysis: (1) features of ‘Ibrī Spoken Arabic, within the general frame of Omani Arabic; (2) traces of contamination among Gulf variants, due to both recent and historically motivated ‘contacts and changes.’ Several pairs of variables must be taken into account: social, referring to badawiyy or ḥaḍariyy; and geographical, referring to the inner part of the country, or to its west/east and north/south sides. In principle, the area of ‘Ibrī should be “ḥaḍariyy of the north”. Nevertheless, we find elements that go beyond this classification. Phonology, for example, shows a series of combinatorial possibilities that hardly fit a schematic and annotated classification; we may also find the gahwah syndrome in occasional ‘Ibrī speech. Based on the data I collected in the city, I offer here a general morpho-phonological description of the local register. I also provide unpublished Omani texts, composed by teachers of “dialect”, with examples of syntax and lexicon. I intend to demonstrate how strong the mismatch between political and linguistic borders is in the Gulf area.

Crossref

Archivio della Ricerca - Università degli Studi di Siena

Archivio della ricerca- Università di Roma La Sapienza

Databases, Dictionaries and Dialectology. Dental instability in Early Middle English: A case study

Author: Laing Margaret
Lass Roger
Publication date: 01/01/2009

Edinburgh Research Explorer

Holistic corpus-based dialectology

This paper sketches future directions for corpus-based dialectology. We advocate a holistic approach to the study of geographically conditioned linguistic variability, and we present a suitable methodology, 'corpus-based dialectometry', in exactly this spirit. Specifically, we argue that in order to live up to the potential of the corpus-based method, practitioners need to (i) abandon their exclusive focus on individual linguistic features in favor of the study of feature aggregates, (ii) draw on computationally advanced multivariate analysis techniques (such as multidimensional scaling, cluster analysis, and principal component analysis), and (iii) aid interpretation of empirical results by marshalling state-of-the-art data visualization techniques. To exemplify this line of analysis, we present a case study which explores joint frequency variability of 57 morphosyntax features in 34 dialects all over Great Britain.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Directory of Open Access Journals

The University of Manchester - Institutional Repository

Spatial evolution of human dialects

Author: Burridge James
Publication venue: 'American Physical Society (APS)'
Publication date: 01/07/2017

The geographical pattern of human dialects is a result of history. Here, we formulate a simple spatial model of language change which shows that the final result of this historical evolution may, to some extent, be predictable. The model shows that the boundaries of language dialect regions are controlled by a length minimizing effect analogous to surface tension, mediated by variations in population density which can induce curvature, and by the shape of coastline or similar borders. The predictability of dialect regions arises because these effects will drive many complex, randomized early states toward one of a smaller number of stable final configurations. The model is able to reproduce observations and predictions of dialectologists. These include dialect continua, isogloss bundling, fanning, the wave-like spread of dialect features from cities, and the impact of human movement on the number of dialects that an area can support. The model also provides an analytical form for Séguy's Curve giving the relationship between geographical and linguistic distance, and a generalisation of the curve to account for the presence of a population centre. A simple modification allows us to analytically characterize the variation of language use by age in an area undergoing linguistic change.

arXiv.org e-Print Archive

Directory of Open Access Journals

Portsmouth University Research Portal (Pure)

Employing geographical principles for sampling in state of the art dialectological projects

Author: Allen
Anderwald
Auer
Beal
Bondi
Britain
Buchstaller
Cheshire
Childs
Cloke
Coombes
Cowart
Eckert
Edwards
Ellis
Fasold
Fitzpatrick
Giddens
Glauser
Harris
Heslop
Hughes
Kolb
Labov
Miller
Montgomery
Murray
Orton
Poussa
Quirk
Shucksmith
Soja
Stuart-Smith
Traugott
Viereck
Visser
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2013

The aims of this paper are twofold. First, we identify the most effective human geographical methods for sampling across space in large-scale dialectological projects. We propose two geographical concepts as a basis for sampling decisions: geo-demographic classification, a multidimensional method used for the socio-economic grouping of areas, and an updated version of functional regions that can be used in sociolinguistic research. Second, we report on the results of a pilot project that applies these models to collect data regarding the acceptability of vernacular morpho-syntactic forms in the North-East of England. Following the method of natural breaks advocated for dialectology by Horvath and Horvath (2002), we interpret breaks in the probabilistic patterns as areas of dialect transition. This study contributes to the debate about the role and limitations of spatiality in linguistic analysis. It intends to broaden our knowledge about the interfaces between human geography and dialectology.

Northumbria Research Link

Crossref

SSOAR - Social Science Open Access Repository

Variation and linguistic theory

Author: Honeybone P.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2011

Edinburgh Research Explorer

Sociolinguistics in the Netherlands

Author: Gorter D.
Publication venue: Niemeyer
Publication date: 01/01/2003

KNAW Repository

International Migration, Integration and Social Cohesion online publications

Computational Sociolinguistics: A Survey

Author: de Jong Franciska
Doğruöz A. Seza
Nguyen Dong
Rosé Carolyn P.
Publication date: 01/01/2016

Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges. Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography

EUR Research Repository

University of Twente Research Information

Dialect contact and past BE in the English Fens

Author: Britain David J
Publication venue: Essex Research Reports in Linguistics
Publication date: 01/01/2001

University of Essex Research Repository