10 research outputs found

    TOWARDS WORD SENSES AND LINKS BETWEEN THEM

    In this study, we demonstrate an unsupervised approach for constructing a semantic network uniting word senses (or word concepts) rather than the coarse-grained concepts. The reported study was funded by RFBR (project no. 16-37-00354 mol_a) and by RFH (project no. 16-04-12019, "Integration of the RussNet and YARN thesauri").

    CROWDSOURCING AS A HUMAN-COMPUTER SYSTEM WITH FEEDBACK

    Crowdsourcing is an established approach for such problems as data gathering, annotation, and cleaning. Given a set of simple and verifiable tasks, many participants execute them voluntarily or on a paid basis. Since the resources are constrained, it is crucial to evaluate the effort of each participant and to focus the crowdsourcing process. We discuss the representation of crowdsourcing as a human-computer system with feedback and propose a reference model of such a system. The proposed approach is being implemented within the open project Yet Another RussNet [1]. The work was supported by RFH grant no. 13-04-12020, "A New Open Electronic Thesaurus of Russian".

    Collaborative dataflow computing: relational models and algorithms

    Recently, microtask crowdsourcing has become a popular approach for addressing various data mining problems, particularly in the analysis of unstructured data. Crowdsourcing workflows for such problems are composed of several data processing stages, which require a consistent representation to make the work reproducible. This paper is devoted to the problem of reproducibility and formalization of the microtask crowdsourcing process. It proposes a computational model for microtask crowdsourcing based on an extended relational model and a dataflow computational model. The proposed collaborative dataflow computational model is designed for processing input data, given as relations, by executing annotation stages and automatic synchronization stages in parallel. Data processing stages and the connections between them are expressed as collaborative computation workflows represented as weakly connected directed acyclic graphs. A synchronous algorithm for executing such workflows is described. The computational model has been evaluated by applying it to two tasks from computational linguistics: refining concept lexicalization in electronic thesauri and establishing hierarchical relations between such concepts. The “Add–Remove–Confirm” procedure is designed for adding the missing lexemes to the concepts while removing the odd ones. The “Genus–Species–Match” procedure is designed for establishing “is-a” relations between the concepts provided with the corresponding word pairs. Experiments on the materials of an open electronic thesaurus of Russian, involving both volunteers from popular online social networks and paid workers from crowdsourcing marketplaces (rewarded with micropayments), confirm the applicability of these procedures for enhancing lexical resources.
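
    The workflow idea in this abstract, stages connected as a directed acyclic graph and executed in dependency order, can be illustrated with a minimal sketch. It is not the algorithm from the paper: the stage names, the hard-coded `SIMULATED_VOTES`, the majority-vote synchronization, and the `run_workflow` helper are all assumptions made for illustration, and a real system would collect answers from workers rather than from a dictionary.

```python
from collections import Counter
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical input relation: (concept, lexeme) pairs to be verified.
SOURCE = [("car", "automobile"), ("car", "auto"), ("car", "train")]

# Hypothetical, hard-coded worker answers standing in for a real crowd.
SIMULATED_VOTES = {
    ("car", "automobile"): ["keep", "keep", "keep"],
    ("car", "auto"): ["keep", "remove", "keep"],
    ("car", "train"): ["remove", "remove", "keep"],
}


def annotate(inputs):
    """Annotation stage: collect a list of worker votes for each pair."""
    (pairs,) = inputs
    return {pair: SIMULATED_VOTES[pair] for pair in pairs}


def synchronize(inputs):
    """Synchronization stage: reduce the votes for each pair by majority."""
    (votes,) = inputs
    return {pair: Counter(v).most_common(1)[0][0] for pair, v in votes.items()}


def confirm(inputs):
    """Final stage: keep only the pairs whose aggregated label is 'keep'."""
    (labels,) = inputs
    return [pair for pair, label in labels.items() if label == "keep"]


def run_workflow(stages, deps, source):
    """Run every stage after all of its upstream stages have finished.

    stages maps a stage name to a function; deps maps a stage name to the
    set of stage names it depends on. Stages without dependencies receive
    the source data. TopologicalSorter raises CycleError on a cyclic graph.
    """
    results = {}
    for name in TopologicalSorter(deps).static_order():
        upstream = [results[d] for d in sorted(deps[name])]
        results[name] = stages[name](upstream or [source])
    return results


STAGES = {"annotate": annotate, "synchronize": synchronize, "confirm": confirm}
DEPS = {"annotate": set(), "synchronize": {"annotate"}, "confirm": {"synchronize"}}

results = run_workflow(STAGES, DEPS, SOURCE)
print(results["confirm"])
```

    In this toy run the pair ("car", "train") is voted out and the other two pairs are kept. Describing the graph as a plain dictionary of dependencies keeps the workflow definition separate from the stage logic, which is one simple way to make such a pipeline repeatable.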

    What can crowd computing do for the next generation of AI systems?

    The unprecedented rise in the adoption of artificial intelligence techniques and automation in many contexts is concomitant with shortcomings of such technology with respect to robustness, interpretability, usability, and trustworthiness. Crowd computing offers a viable means to leverage human intelligence at scale for data creation, enrichment, and interpretation, demonstrating a great potential to improve the performance of AI systems and increase the adoption of AI in general. Existing research and practice have mainly focused on leveraging crowd computing for training data creation. However, this perspective is rather limiting in terms of how AI can fully benefit from crowd computing. In this vision paper, we identify opportunities in crowd computing to propel better AI technology, and argue that to make such progress, fundamental problems need to be tackled from both computation and interaction standpoints. We discuss important research questions in both these themes, with an aim to shed light on the research needed to pave a future where humans and AI can work together seamlessly, while benefiting from each other.

    Improving hypernymy extraction with distributional semantic classes

    In this paper, we show how distributionally-induced semantic classes can be helpful for extracting hypernyms. We present methods for inducing sense-aware semantic classes using distributional semantics and for using these induced semantic classes to filter noisy hypernymy relations. Denoising of hypernyms is performed by labeling each semantic class with its hypernyms. On the one hand, this allows us to filter out wrong extractions using the global structure of distributionally similar senses. On the other hand, we infer missing hypernyms via label propagation to cluster terms. We conduct a large-scale crowdsourcing study showing that processing of automatically extracted hypernyms using our approach improves the quality of the hypernymy extraction in terms of both precision and recall. Furthermore, we show the utility of our method in the domain taxonomy induction task, achieving state-of-the-art results on a SemEval'16 task on taxonomy induction.
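
    The denoising step described above, labeling a semantic class with its hypernyms, filtering extractions that disagree with the class, and propagating the labels to terms that lack them, can be sketched as follows. This is a simplified illustration and not the authors' method: the `min_share` threshold, the function names, and the toy data are assumptions, and the sketch works on plain terms rather than on induced word senses.

```python
from collections import Counter


def label_class(members, hypernyms, min_share=0.5):
    """Return the hypernyms held by at least min_share of the class members."""
    counts = Counter(h for m in members for h in set(hypernyms.get(m, ())))
    return {h for h, c in counts.items() if c / len(members) >= min_share}


def denoise(classes, hypernyms, min_share=0.5):
    """Filter and complete per-term hypernyms using class-level labels.

    Returns term -> (kept, inferred): kept holds the extracted hypernyms
    that agree with the class labels, inferred holds the class labels that
    were missing for the term and are propagated to it.
    """
    cleaned = {}
    for members in classes:
        labels = label_class(members, hypernyms, min_share)
        for term in members:
            own = set(hypernyms.get(term, ()))
            cleaned[term] = (own & labels, labels - own)
    return cleaned


# Hypothetical semantic class and noisy extracted hypernyms.
classes = [["apple", "pear", "mango"]]
noisy = {"apple": {"fruit", "company"}, "pear": {"fruit"}, "mango": set()}

cleaned = denoise(classes, noisy)
for term, (kept, inferred) in cleaned.items():
    print(term, sorted(kept), sorted(inferred))
```

    Here "company" is dropped for "apple" because only one of the three class members supports it, while "fruit" is inferred for "mango", which had no extracted hypernym of its own.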

    Unsupervised, knowledge-free, and interpretable word sense disambiguation

    Interpretability of a predictive model is a powerful feature that gains the trust of users in the correctness of the predictions. In word sense disambiguation (WSD), knowledge-based systems tend to be much more interpretable than knowledge-free counterparts as they rely on the wealth of manually-encoded elements representing word senses, such as hypernyms, usage examples, and images. We present a WSD system that bridges the gap between these two so far disconnected groups of methods. Namely, our system, providing access to several state-of-the-art WSD models, aims to be interpretable as a knowledge-based system while it remains completely unsupervised and knowledge-free. The presented tool features a Web interface for all-word disambiguation of texts that makes the sense predictions human readable by providing interpretable word sense inventories, sense representations, and disambiguation results. We provide a public API, enabling seamless integration