5 research outputs found

    Применение модели освоения языка к решению задачи обработки малых языков [Applying a language acquisition model to the problem of small-language processing]

    This paper addresses the problem of building a computer model of a small language. The task is relevant for three reasons: the need to eliminate information inequality between speakers of different languages; the need for new tools for studying poorly understood languages and for innovative approaches to language modeling in low-resource settings; and the need to support and develop small languages. At the problem-description stage, the work pursues three main objectives: to justify language modeling under resource scarcity as a distinct task in natural language processing, to review the relevant literature, and to develop the concept of a language acquisition model that works with relatively few available resources. The methods involve computer modeling with neural networks, semi-supervised learning, and reinforcement learning. The paper reviews the literature on modeling how a child learns the vocabulary, morphology, and grammar of a native language. Based on the current understanding of language acquisition and on existing computer models of this process, it proposes an architecture for a small-language processing system that is trained by modeling ontogenesis. The main components of the system and the principles of their interaction are identified. At the core of the system is a module built on modern dialogue language models and trained on a high-resource language (e.g., English). During training, an intermediate layer represents utterances in an abstract form, for example in the symbols of formal semantics. The relationship between the formal notation of utterances and their translation into the target low-resource language is learned by modeling a child's acquisition of the vocabulary and grammar of the language. 
One of the components represents the non-linguistic context in which language learning takes place. This article explores the problem of modeling small languages. A detailed justification of the relevance of modeling small languages is given: the social significance of the problem is noted, and the benefits for linguistics, ethnography, ethnology, and cultural anthropology are shown. The ineffectiveness, under resource scarcity, of approaches applied to large languages is noted. A model of language learning through ontogenesis simulation is proposed, based both on results from computer modeling and on psycholinguistic data.

    Introducing Meta-analysis in the Evaluation of Computational Models of Infant Language Development

    Computational models of child language development can help us understand the cognitive underpinnings of the language learning process, which occurs along several linguistic levels at once (e.g., prosodic and phonological). However, in light of the replication crisis, modelers face the challenge of selecting representative and consolidated infant data. Thus, it is desirable to have evaluation methodologies that could account for robust empirical reference data, across multiple infant capabilities. Moreover, there is a need for practices that can compare developmental trajectories of infants to those of models as a function of language experience and development. The present study aims to take concrete steps to address these needs by introducing the concept of comparing models with large-scale cumulative empirical data from infants, as quantified by meta-analyses conducted across a large number of individual behavioral studies. We formalize the connection between measurable model and human behavior, and then present a conceptual framework for meta-analytic evaluation of computational models. We exemplify the meta-analytic model evaluation approach with two modeling experiments on infant-directed speech preference and native/non-native vowel discrimination. Peer reviewed

    TOWARDS THE GROUNDING OF ABSTRACT CATEGORIES IN COGNITIVE ROBOTS

    The grounding of language in humanoid robots is a fundamental problem, especially in social scenarios that involve interaction between robots and human beings. Indeed, natural language is the most natural interface for humans to interact and exchange information about concrete entities such as KNIFE and HAMMER and abstract concepts such as MAKE and USE. This research domain is important not only for the advances it can produce in the design of human-robot communication systems, but also for its implications for cognitive science. Abstract words are used in daily conversations to describe events and situations that occur in the environment. Many scholars have suggested that the distinction between concrete and abstract words is a continuum along which all entities vary in their level of abstractness. The work presented herein aimed to ground abstract concepts, like concrete ones, in perception and action systems. This made it possible to investigate how different behavioural and cognitive capabilities can be integrated in a humanoid robot in order to bootstrap the development of higher-order skills such as the acquisition of abstract words. To this end, three neuro-robotics models were implemented. The first neuro-robotics experiment consisted of training a humanoid robot to perform a set of motor primitives (e.g. PUSH, PULL) that, when combined hierarchically, led to the acquisition of higher-order words (e.g. ACCEPT, REJECT). The implementation of this model, based on feed-forward artificial neural networks, permitted the assessment of the training methodology adopted for the grounding of language in humanoid robots. In the second experiment, the architecture used in the first study was reimplemented with recurrent artificial neural networks, which enabled the temporal specification of the action primitives to be executed by the robot. 
This increased the number of action combinations that can be taught to the robot for the generation of more complex movements. For the third experiment, a model based on recurrent neural networks that integrated multi-modal inputs (i.e. language, vision and proprioception) was implemented for the grounding of abstract action words (e.g. USE, MAKE). The abstract representations of actions ("one-hot" encoding) used in the other two experiments were replaced with the joint values recorded from the iCub robot sensors. Experimental results showed that motor primitives have different activation patterns depending on the action sequence in which they are embedded. Furthermore, the simulations suggested that acquiring concepts related to abstract action words requires the reactivation of internal representations similar to those activated during the acquisition of the basic concepts, which are directly grounded in perceptual and sensorimotor knowledge and contained in the hierarchical structure of the words used to ground the abstract action words. This study was financed by the EU project RobotDoC (235065) from the Seventh Framework Programme (FP7), Marie Curie Actions Initial Training Network.

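    La estructuración temática en inglés y español: anotación contrastiva de un corpus bilingüe para aplicaciones lingüísticas y computacionales [Thematic structuring in English and Spanish: contrastive annotation of a bilingual corpus for linguistic and computational applications]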

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Filología, Departamento de Filología Inglesa, defended on 04-12-2015. Thematization is recognized as a fundamental phenomenon in the construction of messages and texts by different linguistic schools. This location within a text privileges the elements that guide the reader in the orientation and interpretation of discourse at different levels. Thematizing a linguistic unit by locating it in the first-initial position of a clause, paragraph, or text confers upon it a special status: a signal of the organizational strategy that characterizes different text types, playing a role as a variable in the distinction of registers, text types and genres. However, in spite of the importance of the study of thematization for message and textual structuring, to date there are no linguistic studies that have undertaken the task of validating its aspects in a comparative manner, either for linguistic or computational purposes. This study therefore fills a research gap by implementing a methodology based on contrastive corpus annotation, which makes it possible to empirically validate aspects of the phenomenon of Thematization in English and Spanish. It also seeks to develop a bilingual English-Spanish comparable corpus of newspaper texts automatically annotated with thematic features at clausal and discourse levels. The empirically validated categories (Thematic Field and its elements: Textual Theme, Interpersonal Theme, PreHead and Head) are used to annotate a larger corpus of three newspaper genres (news reports, editorials and letters to the editor) in terms of thematic choices. This characterization reveals interesting results, such as the use of genre-specific strategies in thematic position. In addition, the thesis investigates the possibility of automating the annotation of thematic features in the bilingual corpus through the development of a set of JAVA rules implemented in GATE. 
It also shows the efficacy of this method in comparison with the manual annotation results... Depto. de Estudios Ingleses: Lingüística y Literatura, Fac. de Filología