26 research outputs found

    Macro- and Microevolution of Languages: Exploring Linguistic Divergence with Approaches from Evolutionary Biology

    There are more than 7000 languages in the world, and many of these have emerged through linguistic divergence. While questions related to the drivers of linguistic diversity have been studied before, including studies with quantitative methods, there is no consensus as to which factors drive linguistic divergence, and how. In the thesis, I have studied linguistic divergence with a multidisciplinary approach, applying the framework and quantitative methods of evolutionary biology to language data. With quantitative methods, large datasets may be analyzed objectively, while approaches from evolutionary biology make it possible to revisit old questions (related to, for example, the shape of the phylogeny) with new methods, and adopt novel perspectives to pose novel questions. My chief focus was on the effects exerted on the speakers of a language by environmental and cultural factors. My approach was thus an ecological one, in the sense that I was interested in how the local environment affects humans and whether this human-environment connection plays a possible role in the divergence process. I studied this question in relation to the Uralic language family and to the dialects of Finnish, thus covering two different levels of divergence. However, as the Uralic languages have not previously been studied using quantitative phylogenetic methods, nor have population genetic methods been previously applied to any dialect data, I first evaluated the applicability of these biological methods to language data. I found the biological methodology to be applicable to language data, as my results were rather similar to traditional views as to both the shape of the Uralic phylogeny and the division of Finnish dialects. I also found environmental conditions, or changes in them, to be plausible inducers of linguistic divergence: whether in the first steps in the divergence process, i.e. dialect divergence, or on a large scale with the entire language family. 
My findings concerning Finnish dialects led me to conclude that the functional connection between linguistic divergence and environmental conditions may arise through human cultural adaptation to varying environmental conditions. This is also one possible explanation on the scale of the Uralic language family as a whole. The results of the thesis bring insights on several different issues in both a local and a global context. First, they shed light on the emergence of the Finnish dialects. If the approach used in the thesis is applied to the dialects of other languages, broader generalizations may be drawn as to the inducers of linguistic divergence. This again brings us closer to understanding the global patterns of linguistic diversity. Secondly, the quantitative phylogeny of the Uralic languages, with estimated times of language divergences, yields another hypothesis as to the shape and age of the language family tree. In addition, the Uralic languages can now be added to the growing list of language families studied with quantitative methods. This will allow broader inferences as to global patterns of language evolution, and more language families can be included in constructing the tree of the world’s languages. Studying history through language, however, is only one way to illuminate the human past. Therefore, thirdly, the findings of the thesis, when combined with studies of other language families, and those for example in genetics and archaeology, bring us again closer to an understanding of human history.
Many of the world's more than 7000 languages have arisen through a process of divergence, in which a single language develops over time, under the influence of various factors, into several languages. The factors affecting linguistic divergence have been studied before, including with computational methods. It is still unclear, however, which factors can affect linguistic divergence, and how. In my doctoral thesis I study the factors affecting linguistic divergence.
My approach is multidisciplinary, as I apply the computational methods and theories of evolutionary biology to language data. Computational methods make it possible to analyze large datasets objectively, while the evolutionary-biological approach lets me formulate new kinds of research questions and use new methods to answer previously posed questions (for example, concerning the shape of the family tree). In my research I focused on examining linguistic divergence from the standpoint of human ecology. In other words, I was interested in the effect of environmental and/or cultural factors on the speakers of a language, and in whether this connection may be involved in the process of linguistic divergence. I studied the question at two different stages of the process: at its beginning, before divergence has fully taken place, and after it has already occurred. The divergence of dialects corresponds to the initial stage of the process, and I studied it using Finnish dialect data. I studied completed divergences from family trees that I constructed from lexical data of the Uralic languages. Because the Uralic languages have not previously been studied with comparable computational methods, and the population genetic methods I used have not previously been applied to any dialect data, I first tested the suitability of these methods for analyzing my data. I found the biological methods to be suitable for analyzing language data, as my results corresponded to traditional views of both the shape of the Uralic family tree and the division of Finnish dialects. In addition, I found that differences in environmental conditions possibly affect linguistic divergence. This was observable both in the early stages of the divergence process, between dialects, and when examining divergences across the whole language group.
Because humans are known often to adapt to prevailing environmental conditions through cultural adaptations, I concluded from the results of my dialect studies that it is precisely the cultural adaptation of speakers to local environmental conditions that might act as a factor separating speaker populations, and thus as the link between environmental differences and linguistic divergence. This could possibly also explain the divergences of the Uralic languages. The results of my doctoral research bring new perspectives on linguistic divergence at both the local and the global level. My observation of the possible effect of environmental differences on the formation of the Finnish dialects raises the question of whether my finding can be generalized to other languages and their dialects. Because dialect divergence is the first stage in the process of language divergence, studying the formation of dialects also makes it possible to determine which factors have brought about the global diversity of languages. For this reason, comparable studies are also needed on the dialects of other languages. In my thesis I also present a computationally constructed family tree of the Uralic languages, which can be compared with trees of other language groups built with comparable methods. Through this comparison it is possible to find out whether there are global regularities in the shape of language family trees, from which further conclusions can be drawn about the regularities governing languages. Unravelling the history and prehistory of humankind is a challenging puzzle, in which fitting together pieces from different disciplines can bring us closer to a general understanding of the past. My doctoral research is a small part of this whole, but by combining my observations with observations made on other language groups, as well as with results from, for example, archaeology and genetics, we are once again a step closer to this goal.

    Borrowability of kinship terms in Uralic languages

    Kinship terms are assumed to be universal and central to social life, and consequently they are not particularly prone to borrowing. Borrowing of kinship terms does happen, however, and this provides us with a lens through which to evaluate the nature and intensity of contact situations. In this study, we provide a general overview of the borrowability of kinship terms into the Uralic languages. We collected kinship terms from twenty Uralic languages and used a list of 146 kin categories in total as the basis for our data collection. We found that affinal kin categories such as those denoting spouses, spouse’s siblings, and sibling’s spouses had the largest number of loanwords. However, among the kin categories with the largest number of loanwords were also consanguineal categories such as those of ‘mother’ and ‘father’. We also found that the Uralic languages vary notably in how large a percentage of their kinship terminology has been borrowed: the Mordvin languages have borrowed the most, more than 40 percent of their kinship terms, while for many Samoyedic languages no loanwords were detected in their kinship terminology. In addition to the quantitative approach, we also delve into the kin categories with the largest number of loanwords and discuss the patterns of these loanwords in certain languages, and the occurrence of semantic change as a factor explaining the large number of loanwords among terms for ‘husband’ and ‘wife’. All in all, borrowing of kin terms is a context-dependent process and it is challenging to make global generalizations. Nevertheless, we propose that borrowed kin terms could provide us with the best possible material through which individual contact situations of the past could be studied. This study also summarizes the borrowed kin terms in the Uralic languages, brings the topic into the spotlight, and pinpoints cases where more research is needed.

    A comprehensive spatial model for historical travel effort - a case study in Finland

    Contributing to multidisciplinary studies of human population history, this paper presents an analysis chain to comprehensively model the historical travel environment in Finland, based on a study of spatial patterns of overall accessibility within the country. We created a spatial historical travel environment model over the whole country using high-quality terrain and landscape spatial data, combined with information from historical sources that characterize the landscape in terms of travel effort given the environmental and human-related factors current up until the late 19th century. Spatial analyses of historical travel effort based on the travel environment model indicate travel speeds for different parts of the country, ranging from 0.6 to 5.3 km/h. This is nearly a tenfold range, potentially highly significant for studies relying on historical travel effort and contacts between population groups in Finland. The results show that the overall travel effort in southern Finland is significantly smaller than in the north: almost all areas in southern Finland have average travel speeds above 3 km/h, whereas average travel speeds below 2.5 km/h are typical in the north. A more detailed study using random 100 km transects highlights the variability of the least-cost routes in different landscapes and between different source data combinations in each cost surface. The paper identifies great potential in combining the existing spatial data archives with archaeological, linguistic, and genetic data in a GIS analysis, to study the travel effort and its impact on the observed spatial patterns of languages, genetic traits, and archaeological findings.

    Town population size and structuring into villages and households drive infectious disease risks in pre-healthcare Finland

    Social life is often considered to carry a cost in terms of increased parasite or pathogen risk. However, evidence for this in the wild remains equivocal, possibly because populations and social groups are often structured, which affects the local transmission and extinction of diseases. We test how the structuring of towns into villages and households influenced the risk of dying from three easily diagnosable infectious diseases (smallpox, pertussis and measles) using a novel dataset covering almost all of Finland in the pre-healthcare era (1800-1850). Consistent with previous results, the risk of dying from all three diseases increased with the local population size. However, the division of towns into a larger number of villages decreased the risk of dying from smallpox and to some extent from pertussis, but it slightly increased the risk for measles. Dividing towns into a larger number of households increased the length of the epidemic for all three diseases and led to the expected slower spread of the infection. However, this could be seen only when local population sizes were small. Our results indicate that the effect of population structure on epidemics, disease or parasite risk varies between pathogens and population sizes, hence lowering the ability to generalize the consequences of epidemics in spatially structured populations, and to map the costs of social life via parasites and diseases.

    Clustering lexical variation of Finnic languages based on Atlas Linguarum Fennicarum.

    The article focuses on lexical relations of the Finnic languages. Here we studied whether lexical data is suitable for detecting the coarse-grained and fine-grained substructure within the Finnic group. We evaluated this by clustering old lexical variation from a dialectal dataset covering the whole Finnic speaker area (Atlas Linguarum Fennicarum; ALFE) using quantitative methods adopted from population genetics, and by comparing our results to groups suggested by earlier linguistic literature. We found the main lexical division between north-eastern and south-western Finnic. According to our lexical analysis, the Finnic languages are Finnish, North Estonian, South Estonian, Livonian, Karelian, Veps, and Votic-Ingrian. These groups matched well with the earlier suggested divisions, and we concluded that lexical data could be utilised more often in defining linguistic sub-structures, especially in linguistic situations that involve dialect continua. Peer reviewed

    Applying Population Genetic Approaches within Languages: Finnish Dialects as Linguistic Populations

    The adoption of evolutionary approaches to study language change as a type of non-biological evolution has gained increasing interest and introduced a variety of quantitative tools to linguistics. The focus has thus far mainly been on language families, or ‘linguistic macroevolution,’ and taken the shape of linguistic phylogenetics. Here we explore whether evolutionary methods could be applicable for studying intra-lingual variation (‘linguistic microevolution’) by testing a population genetic clustering method for analyzing the ‘population structure’ of Finnish dialects. We compare the results with traditional dialect divisions established in the literature and with K-medoids clustering, which is free from biological assumptions. The results are encouragingly similar to each other and agree with traditional views, suggesting that population genetic tools could be a useful addition to the dialectological toolkit. We also show how the results of the model-based clustering could serve as a basis for further study.

    Best practices in justifying calibrations for dating language families

    The use of computational methods to assign absolute datings to language divergence is receiving renewed interest, as modern approaches based on Bayesian statistics offer alternatives to the discredited techniques of glottochronology. The datings provided by these new analyses depend crucially on the use of calibration, but the methodological issues surrounding calibration have received comparatively little attention. Especially underappreciated is the extent to which traditional historical linguistic scholarship can contribute to the calibration process via loanword analysis. Aiming at a wide audience, we provide a detailed discussion of calibration theory and practice, evaluate previously used calibrations, recommend best practices for justifying calibrations, and provide a concrete example of these practices via a detailed derivation of calibrations for the Uralic language family. This article aims to inspire a higher quality of scholarship surrounding all statistical approaches to language dating, and especially closer engagement between practitioners of statistical methods and traditional historical linguists, with the former thinking more carefully about the arguments underlying their calibrations and the latter more clearly identifying results of their work which are relevant to calibration, or even suggesting calibrations directly.

    Socio-cultural similarity with host population rather than ecological similarity predicts success and failure of human migrations

    Demographers argue that human migration patterns are shaped by people moving to better environments. More recently, however, evolutionary theorists have argued that people move to similar environments to which they are culturally adapted. While previous studies analysing which factors affect migration patterns have focused almost exclusively on successful migrations, here we take advantage of a natural experiment during World War II in which an entire population was forcibly displaced but was then allowed to return home, in order to compare successful with unsuccessful migrations. We test two competing hypotheses: (1) individuals who relocate to environments that are superior to their place of origin will be more likely to remain (the Better Environment Hypothesis), or (2) individuals who relocate to environments that are similar to their place of origin will be more likely to remain (the Similar Environment Hypothesis). Using detailed records of the social, cultural, linguistic and ecological conditions of the origin and destination locations, we find that cultural similarity (e.g. linguistic similarity and marrying within one’s own minority ethnic group), rather than ecological differences, is the best predictor of successful migrations. These results suggest that social relationships, empowered by cultural similarity with the host population, play a critical role in successful migrations and provide limited support for the Similar Environment Hypothesis. Overall, these results demonstrate the importance of comparing unsuccessful with successful migrations in efforts to understand the engines of human dispersal and suggest that the primary obstacles to human migrations and successful range expansion are sociocultural rather than ecological.

    Crouching TIGER, hidden structure: Exploring the nature of linguistic data using TIGER values

    In recent years, techniques such as Bayesian inference of phylogeny have become a standard part of the quantitative linguistic toolkit. While these tools successfully model the tree-like component of a linguistic dataset, real-world datasets generally include a combination of tree-like and nontree-like signals. Alongside developing techniques for modeling nontree-like data, an important requirement for future quantitative work is to build a principled understanding of this structural complexity of linguistic datasets. Some techniques exist for exploring the general structure of a linguistic dataset, such as NeighborNets, delta scores, and Q-residuals; however, these methods are not without limitations or drawbacks. In general, the question of what kinds of historical structure a linguistic dataset can contain and how these might be detected or measured remains critically underexplored from an objective, quantitative perspective. In this article, we propose TIGER values, a metric that estimates the internal consistency of a genetic dataset, as an additional metric for assessing how tree-like a linguistic dataset is. We use TIGER values to explore simulated language data ranging from very tree-like to completely unstructured, and also use them to analyze a cognate-coded basic vocabulary dataset of Uralic languages. As a point of comparison for the TIGER values, we also explore the same data using delta scores, Q-residuals, and NeighborNets. Our results suggest that TIGER values are capable both of ranking tree-like datasets according to their degree of treelikeness and of distinguishing datasets with tree-like structure from datasets with a nontree-like structure. Consequently, we argue that TIGER values serve as a useful metric for measuring the historical heterogeneity of datasets. Our results also highlight the complexities in measuring treelikeness from linguistic data, and how the metrics approach this question from different perspectives.