
    THE PROBLEM OF INTERPRETATION OF PHYLOGENETIC TREES

    Abstract. Phylogenetic algorithms have been used in a number of papers to describe the evolution of language families. In this paper the neighbor-joining algorithm is applied to the database of the Automated Similarity Judgment Program, and the results are compared with the commonly accepted classification of languages. A number of families are considered in detail: the North Caucasian languages, the Turkic languages, and Maya. In addition to recognized families, the hypothetical Nostratic macrofamily is also considered. When phylogenetic algorithms are applied to databases, some errors occur. Possible causes of these errors are analyzed, and the claim that errors are inevitable for phylogenetic algorithms is justified. The following main sources of error are identified. First, languages in databases are represented as high-dimensional vectors, whereas a tree is a one-dimensional structure; when the dimension is reduced, loss of information is mathematically unavoidable. One of the most popular phylogenetic algorithms, the neighbor-joining algorithm, has been tested, and it is shown to give an error in 13% of cases. Second, phylogenetic algorithms are unstable: small (random) changes in the data can lead to a significant rearrangement of the trees. A few recommendations for correctly interpreting the results obtained with phylogenetic algorithms are proposed.
    Keywords: Phylogenetic Algorithm, Evolution Trees, ASJP Database, North Caucasian Languages, Turkic Languages

    Knowledge-Driven Event Extraction in Russian: Corpus-Based Linguistic Resources

    Automatic event extraction from text is an important step in knowledge acquisition and knowledge base population. Manual work is indispensable in developing an extraction system, whether in corpus annotation or in creating vocabularies and patterns for a knowledge-based system. Recent work has focused on adapting existing systems (built for extraction from English texts) to new domains. Event extraction in other languages has not been studied, owing to the lack of the resources and algorithms needed for natural language processing. In this paper we define a set of linguistic resources that are necessary for developing a knowledge-based event extraction system for Russian: a vocabulary of subordination models, a vocabulary of event triggers, and a vocabulary of Frame Elements, which are the basic building blocks of semantic patterns. We propose a set of methods for creating such vocabularies for Russian and other languages using the Google Books NGram Corpus. The methods are evaluated in the development of an event extraction system for Russian.

    The difference in positivity of the Russian and English lexicon: The big data approach

    Cross-cultural psychological studies have long noted differences in the degree of cognitive positivity, or optimism, across cultures. However, the question of whether this difference shows up at the level of the lexicon remains unexplored. Linguistic positivity bias has been confirmed for a number of languages: most words in a language have a positive connotation. This raises the question of whether linguistic positivity bias is the same across languages. In a sense, the issue is similar to the hypothesis of linguistic relativity, which suggests that language affects the human cognitive system. The problem has been examined in only one work (Dodds et al. 2015), which gives positivity bias values for different languages but bases the comparison for each pair of languages on just one pair of dictionaries. In the present study, we substantially broaden the basis for comparison by using four English and five Russian dictionaries. We carry out the comparative study both at the level of vocabularies and at the level of texts of different genres. A new, previously untried idea is to compare the positivity ratings of translated texts. In addition, English and Russian sentiment dictionaries are compared on the basis of the scores of translation-stable words. The results suggest that Russian is slightly more positive than English at the level of vocabulary.

    Observation of associated near-side and away-side long-range correlations in √s_NN = 5.02 TeV proton-lead collisions with the ATLAS detector

    Two-particle correlations in relative azimuthal angle (Δϕ) and pseudorapidity (Δη) are measured in √s_NN = 5.02 TeV p+Pb collisions using the ATLAS detector at the LHC. The measurements are performed using approximately 1 μb⁻¹ of data as a function of transverse momentum (p_T) and the transverse energy (ΣE_T^Pb) summed over 3.1 < η < 4.9 in the direction of the Pb beam. The correlation function, constructed from charged particles, exhibits a long-range (2 < |Δη| < 5) “near-side” (Δϕ ∼ 0) correlation that grows rapidly with increasing ΣE_T^Pb. A long-range “away-side” (Δϕ ∼ π) correlation, obtained by subtracting the expected contributions from recoiling dijets and other sources estimated using events with small ΣE_T^Pb, is found to match the near-side correlation in magnitude, shape (in Δη and Δϕ) and ΣE_T^Pb dependence. The resultant Δϕ correlation is approximately symmetric about π/2, and is consistent with a dominant cos 2Δϕ modulation for all ΣE_T^Pb ranges and particle p_T.