Lexical evolution rates by automated stability measure

Author: Bakker D
Blanchard Ph
Dyen I Kruskal J B Black P
D’Urville D
Filippo Petroni
Greenhill S J Blust R Gray R D
Kroeber A
Maurizio Serva
Oswalt R
Petroni F
Serva M
Starostin S
Swadesh M
Thomas D D
Publication venue: 'IOP Publishing'
Publication date: 09/12/2009

Phylogenetic trees can be reconstructed from the matrix that contains the distances between all pairs of languages in a family. Recently, we proposed a new method which uses normalized Levenshtein distances between words with the same meaning and averages over all the items of a given list. Decisions about the number of items in the input lists for language comparison have been debated since the beginning of glottochronology. The point is that the words associated with some of the meanings undergo rapid lexical evolution. Therefore, a large vocabulary comparison is only apparently more accurate than a smaller one, since many of the words do not carry any useful information. In principle, one should find the optimal length of the input lists by studying the stability of the different items. In this paper we tackle the problem with an automated methodology based only on our normalized Levenshtein distance. With this approach, the program of an automated reconstruction of language relationships is completed.
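
The distance described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the edit distance between two words is normalized by the length of the longer word, and that word lists are given as dictionaries keyed by meaning. The function names and the dictionary layout are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer word's length (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def language_distance(list_a: dict, list_b: dict) -> float:
    """Average normalized distance over the meanings present in both lists."""
    shared = [m for m in list_a if m in list_b]
    if not shared:
        raise ValueError("the two word lists share no meanings")
    total = sum(normalized_distance(list_a[m], list_b[m]) for m in shared)
    return total / len(shared)
```

Computing `language_distance` for every pair of languages in a family gives the distance matrix from which a tree can be reconstructed.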

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Cagliari

IRIS Università Politecnica delle Marche

Assimilation in Multilingual Cities

Author: A Cattaneo
BR Chiswick
C Dustmann
D Cutler
EP Lazear
F Tubergen van
Gregory Verdugo
I Dyen
IE Isphording
J Beckhusen
J Church
Javier Ortega
P Kraus
S Rendon
T Bauer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

We characterise how the assimilation patterns of minorities into the strong and the weak language differ in a situation of asymmetric bilingualism. Using large variations in language composition in Canadian cities from the 2001 and 2006 Censuses, we show that the differences in the knowledge of English by immigrant allophones (i.e. immigrants with a mother tongue other than English or French) in English-majority cities are mainly due to sorting across cities. In French-majority cities, by contrast, learning plays an important role in explaining differences in knowledge of French. In addition, the presence of large anglophone minorities deters assimilation into French much more than the presence of francophone minorities deters assimilation into English. Finally, we find that language distance plays a much more important role in explaining assimilation into French, and that assimilation into French is much more sensitive to individual characteristics than assimilation into English. Some of these asymmetric assimilation patterns extend to anglophone and francophone immigrants, but no evidence of learning is found in this case.

City Research Online

Crossref

Kingston University Research Repository

HAL-Paris1

QAPgrid: A Two Level QAP-Based Approach for Large-Scale Data Analysis and Visualization

Author: A Capp
A Elshafei
A Mendes
C Fleurent
C Rabak
D Bryant
E Loiola
E Taillard
F Glover
G Miranda
I Dyen
J Carrizo
J Dickey
J Kohler
L Buriol
L Gambardella
M Eisen
M Inostroza-Ponta
Mario Inostroza-Ponta
N Demirel
P Franca
P Merz
P Moscato
P Shannon
Pablo Moscato
R Abbiw-Jackson
R Ahuja
R Battiti
R Berretta
R Burkard
Regina Berretta
S Margison
S Sahni
T James
TC Koopmans
Vladimir Brusic
W Li
Z Drezner
Publication venue: Public Library of Science
Publication date: 01/01/2011

Background: The visualization of large volumes of data is a computationally challenging task that often promises rewarding new insights. There is great potential in the application of new algorithms and models from combinatorial optimisation. Datasets often contain “hidden regularities”, and a combined identification and visualization method should reveal these structures and present them in a way that helps analysis. While several methodologies exist, including those that use non-linear optimization algorithms, severe limitations exist even when working with only a few hundred objects. Methodology/Principal Findings: We present a new data visualization approach (QAPgrid) that reveals patterns of similarities and differences in large datasets of objects for which a similarity measure can be computed. Objects are assigned to positions on an underlying square grid in a two-dimensional space. We use the Quadratic Assignment Problem (QAP) as a mathematical model to provide an objective function for the assignment of objects to positions on the grid. We employ a Memetic Algorithm (a powerful metaheuristic) to tackle the large instances of this NP-hard combinatorial optimization problem, and we show its performance on the visualization of real data sets. Conclusions/Significance: Overall, the results show that the QAPgrid algorithm is able to produce a layout that represents the relationships between objects in the data set. Furthermore, it also represents the relationships between the clusters that are fed into the algorithm. We apply QAPgrid to the 84 Indo-European languages instance, producing a near-optimal layout. Next, we produce a layout of 470 world universities with an observed high degree of correlation with the score used by the Shanghai Jiao Tong University Academic Ranking of World Universities, without the need for an ad hoc weighting of attributes. Finally, our Gene Ontology-based study on Saccharomyces cerevisiae fully demonstrates the scalability and precision of our method as a novel alternative tool for functional genomics.
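
The QAP objective mentioned in this abstract can be illustrated with a short sketch. It is not the QAPgrid implementation: it assumes the cost of a layout is the sum, over all pairs of objects, of their similarity multiplied by the Manhattan distance between their grid cells, so that minimizing the cost places similar objects close together. The function names and the choice of Manhattan distance are assumptions, and the sketch only evaluates a given layout; the search itself (a Memetic Algorithm in the paper) is not shown.

```python
from itertools import combinations


def grid_positions(side: int) -> list:
    """Cells of a side x side square grid as (row, column) pairs."""
    return [(r, c) for r in range(side) for c in range(side)]


def qap_cost(similarity, assignment, positions) -> float:
    """QAP-style cost of a layout.

    similarity[i][j] : similarity between objects i and j (symmetric)
    assignment[i]    : index into positions of the cell holding object i
    """
    total = 0.0
    for i, j in combinations(range(len(assignment)), 2):
        r1, c1 = positions[assignment[i]]
        r2, c2 = positions[assignment[j]]
        # Manhattan distance between the two cells (assumed metric).
        total += similarity[i][j] * (abs(r1 - r2) + abs(c1 - c2))
    return total
```

For three objects on a 2 x 2 grid where only objects 0 and 1 are similar, a layout that puts them in adjacent cells has a lower cost than one that puts them in opposite corners.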

University of Newcastle's Digital Repository

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Using hybridization networks to retrace the evolution of Indo-European languages

Author: A Boc
A Schleicher
AC Baugh
AF Buffington
AM Sciullo Di
Anna Maria Di Sciullo
B Pierre
BB Kachru
BH Hodgson
C Bowern
C Darwin
C Renfrew
D Bryant
D Kenrick
D Lightfoot
DH Huson
E Carlin
Etienne Lord
François-Joseph Lapointe
G Bonnet
G Longobardi
GA Bournoutian
Gilbert Labelle
H Geisler
HB Rolf Jr
HJ Bandelt
HJ Holm
I Dyen
I Roberts
J Clackson
J Diamond
J Schmidt
J-M List
JM List
JRF Piette
K Rexová
L Iersel Van
L Nakhleh
L Steiner
Louise Laforest
M Delz
M Donohue
M Gimbutas
M Kolga
M Köllner
M Pagel
M Swadesh
M Willems
Matthieu Willems
N Saitou
N Smith
OW Robinson
P Heggarty
P Legendre
QD Atkinson
R Ark Van der
R Bouckaert
RD Gray
RL Trask
S Greenhill
S Nelson-Sathi
S Thomason
S Wichmann
T Finkenstaedt
T Vogt
V Colonna
V Makarenkov
V Vellupilai
VI Levenshtein
Vladimir Makarenkov
WM Fitch
WS-Y Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Languages, Fees and the International Scope of Patenting

Crossref

Transformed steroids. 104. The preparation of [16,17]oxazolidinones of 20-ketosteroids by intramolecular isomerization of 16,17α-N-carboethoxyepiminopregnenolone

Author: A. M. Turuta
A. V. Kamernitskii
D. Kalsines
M. E. Dyen
Z. I. Istomina
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Why do People Learn Foreign Languages?

Author: A Dalby
I Dyen
Ignacio Ortuño Ortín
Philippe Van Parijs
Shlomo Weber
V Ginsburgh
Victor A. Ginsburgh
Publication venue: 'Elsevier BV'
Publication date: 01/01/2004

Crossref

Dated ancestral trees from binary trait data and their application to the diversification of languages

Author: Blust R
Drummond A. J.
Dyen I.
Embleton S.
Garrett A.
Kimura M.
Nakhleh L.
Pagel M.
Saitou N.
Sankoff D.
Swadesh M.
Warnow T.
Yang Z.
Publication date: 01/01/2008

Binary trait data record the presence or absence of distinguishing traits in individuals. We treat the problem of estimating ancestral trees with time depth from binary trait data. Simple analysis of such data is problematic. Each homology class of traits has a unique birth event on the tree, and the birth event of a trait that is visible at the leaves is biased towards the leaves. We propose a model-based analysis of such data and present a Markov chain Monte Carlo algorithm that can sample from the resulting posterior distribution. Our model is based on using a birth-death process for the evolution of the elements of sets of traits. Our analysis correctly accounts for the removal of singleton traits, which are commonly discarded in real data sets. We illustrate Bayesian inference for two binary trait data sets which arise in historical linguistics. The Bayesian approach allows for the incorporation of information from ancestral languages. The marginal prior distribution of the root time is uniform. We present a thorough analysis of the robustness of our results to model misspecification, through analysis of predictive distributions for external data, and fitting data that are simulated under alternative observation models. The reconstructed ages of tree nodes are relatively robust, whereas posterior probabilities for topology are not reliable. Copyright (c) 2008 Royal Statistical Society.

Crossref

Oxford University Research Archive

Research Papers in Economics

MPG.PuRe

Frequency of word-use predicts rates of lexical evolution throughout Indo-European history

Author: Andrew Meade
C Renfrew
D Nettle
G Leech
GK Zipf
I Dyen
J Burger
J Milroy
JB Kruskal
JF Fontanari
JH Greenberg
JR Anderson
M Gimbutas
M Kaiser
M Kimura
M Pagel
M Swadesh
MA Steel
Mark Pagel
N Metropolis
NC Ellis
Quentin D. Atkinson
R Boyd
RD Gray
RG Gordon
S Kirby
S Sharoff
SG Thomason
W Croft
W Labov
W Mackay
WN Francis
Z Yang
Publication date: 01/01/2007

Greek speakers say 'ουρά', Germans 'Schwanz', and the French 'queue' to describe what English speakers call a 'tail', but all of these languages use a related form of 'two' to describe the number after one. Among over one hundred Indo-European languages and dialects, the words for some meanings, such as 'tail', evolve rapidly, being expressed across languages by dozens of unrelated words, whilst others evolve much more slowly, such as the number 'two', for which all Indo-European language speakers use the same related word-form. No general linguistic mechanism has been advanced to explain this striking variation in rates of lexical replacement among meanings. Here we use four large and divergent language corpora (English, Spanish, Russian and Greek) and a comparative database of 200 fundamental vocabulary meanings in 87 Indo-European languages to show that the frequency with which these words are used in modern language predicts their rate of replacement over thousands of years of Indo-European language evolution. Across all 200 meanings, frequently used words evolve at slower rates and infrequently used words evolve more rapidly. This relationship holds separately and identically across parts of speech for each of the four language corpora and accounts for approximately 50% of the variation observed in historical rates of lexical replacement. We propose that the frequency with which specific words are used in everyday language exerts a general and law-like influence on their rates of evolution. Our findings are consistent with social models of word change that emphasise the role of selection, and suggest that owing to the ways that humans use language, some words will evolve slowly and others rapidly across all languages.
Citation: Pagel, M., Atkinson, Q. D. & Meade, A. (2007). 'Frequency of word use predicts rates of lexical evolution throughout Indo-European history', Nature, 449, 717-720. [Available at http://www.nature.com/nature/index.html].
N.B. Dr Atkinson is now based at the Institute of Cognitive and Evolutionary Anthropology, University of Oxford.

Central Archive at the University of Reading

Crossref

Aberystwyth Research Portal

Oxford University Research Archive

A mixed-integer programming approach to the clustering problem

Author: Arthanari T.S.
Baron D.N.
Burbank F.
Charnes A.
Dyen I.
Everitt B.
F. Glover
Glover F.
Goronzy F.
Hartigan J.A.
Hodson F.R.
Mulvey J.M.
N. Freed
Rao M.R.
Vinod H.D.
Publication venue: 'Informa UK Limited'

Crossref