Fighting with the Sparsity of Synonymy Dictionaries
Graph-based synset induction methods, such as MaxMax and Watset, induce
synsets by performing a global clustering of a synonymy graph. However, such
methods are sensitive to the structure of the input synonymy graph: sparseness
of the input dictionary can substantially reduce the quality of the extracted
synsets. In this paper, we propose two different approaches designed to
alleviate the incompleteness of the input dictionaries. The first one performs
a pre-processing of the graph by adding missing edges, while the second one
performs a post-processing by merging similar synset clusters. We evaluate
these approaches on two datasets for the Russian language and discuss their
impact on the performance of synset induction methods. Finally, we perform an
extensive error analysis of each approach and discuss prominent alternative
methods for coping with the problem of the sparsity of the synonymy
dictionaries.
Comment: In Proceedings of the 6th Conference on Analysis of Images, Social
Networks, and Texts (AIST'2017): Springer Lecture Notes in Computer Science
(LNCS)
The role of HG in the analysis of temporal iteration and interaural correlation
Towards a Unified Model of Language Acquisition
In this theoretical paper, we first review and rebut standard criticisms against distributional approaches to language acquisition. We then present two closely related models that use distributional analysis. The first deals with the acquisition of vocabulary, the second with grammatical development. We show how these two models can be combined with a semantic network grown using Hebbian learning, and briefly illustrate the advantages of this combination. An important feature of this hybrid system is that it combines two different types of distributional learning, the first based on order, and the second based on co-occurrences within a context.
Models of atypical development must also be models of normal development
Functional magnetic resonance imaging studies of developmental disorders and normal cognition that include children are becoming increasingly common and represent part of a newly expanding field of developmental cognitive neuroscience. These studies have illustrated the importance of the process of development in understanding brain mechanisms underlying cognition and of including children in the study of the etiology of developmental disorders.
Diversity, competition, extinction: the ecophysics of language change
As Charles Darwin noted early on, languages behave and change very much
like living species. They display high diversity, differentiate in space and
time, emerge and disappear. A large body of literature has explored the role of
information exchanges and communicative constraints in groups of agents under
selective scenarios. These models have been very helpful in providing a
rationale on how complex forms of communication emerge under evolutionary
pressures. However, other patterns of large-scale organization can be described
using mathematical methods ignoring communicative traits. These approaches
consider shorter time scales and have been developed by exploiting both
theoretical ecology and statistical physics methods. The models are reviewed
here and include extinction, invasion, origination, spatial organization,
coexistence and diversity as key concepts and are very simple in their defining
rules. Such simplicity is used in order to catch the most fundamental laws of
organization and those universal ingredients responsible for qualitative
traits. The similarities between observed and predicted patterns indicate that
an ecological theory of language is emerging, supporting (on a quantitative
basis) its ecological nature, although key differences are also present. Here
we critically review some recent advances and outline their implications
and limitations as well as open problems for future research.
Comment: 17 Pages. A review of current models from statistical physics and
theoretical ecology applied to the study of language dynamics
Word association research and the L2 lexicon
Since its modern inception in the late nineteenth century, research on word associations has developed into a large and diverse area of study, including work with both applied linguistic and psycholinguistic orientations. However, despite significant recent interest in the use of word association to investigate second language (L2) vocabulary knowledge and testing, there has until now been no systematic attempt to review the wider word association research tradition for the benefit of second language-oriented researchers and practitioners. This paper seeks to address this, drawing together linguistic research from the past 150 years, with a focus on research published since 2000. We evaluate the current state of L2 word association research, before identifying methodological and theoretical themes from a broader range of disciplinary approaches. Emerging from this, new paradigms are identified which have potential to catalyse a new phase of work for second-language word association scholars, and which indicate priority foci for future work
Detecting non-tree-like signal using multiple tree topologies
Recent applications of phylogenetic methods to historical linguistics have been criticized for assuming a tree structure in which ancestral languages differentiate and split up into daughter languages, while language evolution is inherently non-tree-like (François 2014; Blench 2015: 32–33). This article attempts to contribute to this debate by discussing the use of the multiple topologies method (Pagel & Meade 2006a) implemented in BayesPhylogenies (Pagel & Meade 2004). This method is applied to lexical datasets from four different language families: Austronesian (Gray, Drummond & Greenhill 2009), Sinitic (Ben Hamed & Wang 2006), Indo-European (Bouckaert et al. 2012), and Japonic (Lee & Hasegawa 2011). Evidence for multiple topologies is found in all families except, surprisingly, Austronesian. It is suggested that reticulation may arise from a number of processes, including dialect chain break-up, borrowing (both shortly after language splits and later on), incomplete lineage sorting, and characteristics of lexical datasets. It is shown that the multiple topologies method is a useful tool to study the dynamics of language evolution.