    Fighting with the Sparsity of Synonymy Dictionaries

    Graph-based synset induction methods, such as MaxMax and Watset, induce synsets by performing a global clustering of a synonymy graph. However, such methods are sensitive to the structure of the input synonymy graph: sparseness of the input dictionary can substantially reduce the quality of the extracted synsets. In this paper, we propose two different approaches designed to alleviate the incompleteness of the input dictionaries. The first one performs a pre-processing of the graph by adding missing edges, while the second one performs a post-processing by merging similar synset clusters. We evaluate these approaches on two datasets for the Russian language and discuss their impact on the performance of synset induction methods. Finally, we perform an extensive error analysis of each approach and discuss prominent alternative methods for coping with the problem of the sparsity of the synonymy dictionaries. Comment: In Proceedings of the 6th Conference on Analysis of Images, Social Networks, and Texts (AIST'2017): Springer Lecture Notes in Computer Science (LNCS

    Models of atypical development must also be models of normal development

    Functional magnetic resonance imaging studies of developmental disorders and normal cognition that include children are becoming increasingly common and represent part of a newly expanding field of developmental cognitive neuroscience. These studies have illustrated the importance of the process of development in understanding brain mechanisms underlying cognition, and of including children in the study of the etiology of developmental disorders.

    Diversity, competition, extinction: the ecophysics of language change

    As Charles Darwin indicated early on, languages behave and change very much like living species. They display high diversity, differentiate in space and time, emerge and disappear. A large body of literature has explored the role of information exchanges and communicative constraints in groups of agents under selective scenarios. These models have been very helpful in providing a rationale for how complex forms of communication emerge under evolutionary pressures. However, other patterns of large-scale organization can be described using mathematical methods that ignore communicative traits. These approaches consider shorter time scales and have been developed by exploiting both theoretical ecology and statistical physics methods. The models are reviewed here and include extinction, invasion, origination, spatial organization, coexistence and diversity as key concepts and are very simple in their defining rules. Such simplicity is used in order to capture the most fundamental laws of organization and those universal ingredients responsible for qualitative traits. The similarities between observed and predicted patterns indicate that an ecological theory of language is emerging, supporting (on a quantitative basis) its ecological nature, although key differences are also present. Here we critically review some recent advances and outline their implications and limitations as well as open problems for future research. Comment: 17 Pages. A review on current models from statistical Physics and Theoretical Ecology applied to study language dynamic

    Word association research and the L2 lexicon

    Since its modern inception in the late nineteenth century, research on word associations has developed into a large and diverse area of study, including work with both applied linguistic and psycholinguistic orientations. However, despite significant recent interest in the use of word association to investigate second language (L2) vocabulary knowledge and testing, there has until now been no systematic attempt to review the wider word association research tradition for the benefit of second language-oriented researchers and practitioners. This paper seeks to address this, drawing together linguistic research from the past 150 years, with a focus on research published since 2000. We evaluate the current state of L2 word association research, before identifying methodological and theoretical themes from a broader range of disciplinary approaches. Emerging from this, new paradigms are identified which have potential to catalyse a new phase of work for second-language word association scholars, and which indicate priority foci for future work

    Detecting non-tree-like signal using multiple tree topologies

    Recent applications of phylogenetic methods to historical linguistics have been criticized for assuming a tree structure in which ancestral languages differentiate and split up into daughter languages, while language evolution is inherently non-tree-like (François 2014; Blench 2015: 32–33). This article attempts to contribute to this debate by discussing the use of the multiple topologies method (Pagel & Meade 2006a) implemented in BayesPhylogenies (Pagel & Meade 2004). This method is applied to lexical datasets from four different language families: Austronesian (Gray, Drummond & Greenhill 2009), Sinitic (Ben Hamed & Wang 2006), Indo-European (Bouckaert et al. 2012), and Japonic (Lee & Hasegawa 2011). Evidence for multiple topologies is found in all families except, surprisingly, Austronesian. It is suggested that reticulation may arise from a number of processes, including dialect chain break-up, borrowing (both shortly after language splits and later on), incomplete lineage sorting, and characteristics of lexical datasets. It is shown that the multiple topologies method is a useful tool to study the dynamics of language evolution