10 research outputs found

    Π˜ΡΡ‚ΠΎΡ€ΠΈΡ, направлСния ΠΈ Π½Π΅ΠΊΠΎΡ‚ΠΎΡ€Ρ‹Π΅ ΠΏΡ€ΠΎΠ±Π»Π΅ΠΌΡ‹ соврСмСнных исслСдований краудсорсинга ΠΊΠ°ΠΊ Π½Π°ΡƒΡ‡Π½ΠΎ-практичСской дисциплины

    Full text link
    БСгодня краудсорсинг являСтся ΡˆΠΈΡ€ΠΎΠΊΠΎ ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠ΅ΠΌΡ‹ΠΌ способом Ρ€Π΅ΡˆΠ΅Π½ΠΈΡ ΠΌΠ½ΠΎΠ³ΠΈΡ… Π·Π°Π΄Π°Ρ‡ сбора ΠΈ Π°Π³Ρ€Π΅Π³Π°Ρ†ΠΈΠΈ Π΄Π°Π½Π½Ρ‹Ρ…. Π’ Π΄Π°Π½Π½ΠΎΠΉ Ρ€Π°Π±ΠΎΡ‚Π΅ ΠΏΡ€ΠΎΠ²Π΅Π΄Π΅Π½ ΠΎΠ±Π·ΠΎΡ€ исслСдований краудсорсинга ΠΊΠ°ΠΊ Π½Π°ΡƒΡ‡Π½ΠΎ-практичСской дисциплины. Π’Ρ‹Π΄Π΅Π»Π΅Π½Ρ‹ направлСния исслСдований ΠΈ сформулированы Π½Π΅ΠΊΠΎΡ‚ΠΎΡ€Ρ‹Π΅ Π°ΠΊΡ‚ΡƒΠ°Π»ΡŒΠ½Ρ‹Π΅ ΠΏΡ€ΠΎΠ±Π»Π΅ΠΌΡ‹ Π΄Π°Π½Π½ΠΎΠΉ дисциплины.Today, crowdsourcing became a popular approach for various data collecting and mining tasks. In this work, several modern crowdsourcing studies in different research trends have been discussed and some problems within these trends have been mentioned

    QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites

    Full text link
    In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with the suitable expertise. This problem domain has been the subject of much research and includes both language-agnostic as well as language conscious solutions. We bring to bear a key language-agnostic insight: that users gain expertise and therefore tend to ask as well as answer more difficult questions over time. We use this insight within the popular competition (directed) graph model to estimate question difficulty and user expertise by identifying key hierarchical structure within said model. An important and novel contribution here is the application of "social agony" to this problem domain. Difficulty levels of newly posted questions (the cold-start problem) are estimated by using our QDEE framework and additional textual features. We also propose a model to route newly posted questions to appropriate users based on the difficulty level of the question and the expertise of the user. Extensive experiments on real world CQAs such as Yahoo! Answers and Stack Overflow data demonstrate the improved efficacy of our approach over contemporary state-of-the-art models. The QDEE framework also allows us to characterize user expertise in novel ways by identifying interesting patterns and roles played by different users in such CQAs.Comment: Accepted in the Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM 2018). June 2018. Stanford, CA, US

    The Size Conundrum: Why Online Knowledge Markets Can Fail at Scale

    Full text link
    In this paper, we interpret the community question answering websites on the StackExchange platform as knowledge markets, and analyze how and why these markets can fail at scale. A knowledge market framing allows site operators to reason about market failures, and to design policies to prevent them. Our goal is to provide insights on large-scale knowledge market failures through an interpretable model. We explore a set of interpretable economic production models on a large empirical dataset to analyze the dynamics of content generation in knowledge markets. Amongst these, the Cobb-Douglas model best explains empirical data and provides an intuitive explanation for content generation through concepts of elasticity and diminishing returns. Content generation depends on user participation and also on how specific types of content (e.g. answers) depends on other types (e.g. questions). We show that these factors of content generation have constant elasticity---a percentage increase in any of the inputs leads to a constant percentage increase in the output. Furthermore, markets exhibit diminishing returns---the marginal output decreases as the input is incrementally increased. Knowledge markets also vary on their returns to scale---the increase in output resulting from a proportionate increase in all inputs. Importantly, many knowledge markets exhibit diseconomies of scale---measures of market health (e.g., the percentage of questions with an accepted answer) decrease as a function of number of participants. The implications of our work are two-fold: site operators ought to design incentives as a function of system size (number of participants); the market lens should shed insight into complex dependencies amongst different content types and participant actions in general social networks.Comment: The 27th International Conference on World Wide Web (WWW), 201

    The role of diverse strategies in the sustainability of online communities

    Get PDF
    Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tends to stick on the same strategy over time in a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A balance between the population of users taking A and B strategies that approximates 2:1, is found to be optimal to the sustainable growth of communities

    The Role of Diverse Strategies in Sustainable Knowledge Production

    Get PDF
    abstract: Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tends to stick on the same strategy over time in a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A balance between the population of users taking A and B strategies that approximates 2:1, is found to be optimal to the sustainable growth of communities.The article is published at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.014915

    Neural Language Models for Data-Driven Programming Support

    Full text link
    Programming can be hard to learn and master. Search engines and social Q&A websites offer tremendous help to programmers, but great expertise (e.g., β€œGoogle-fu”) is required to efficiently use these resources and successfully solve complex problems. An integrated system that can recognize a programmer’s tasks and provide contextualized solutions is thus desirable, and ideally programmers can interact with the system using natural input channels, in a way similar to how they communicate with a human expert. To enable such an integrated system, neural language models constitute a promising solution. These models encode programming language in the same high-dimensional space with data of other modalities, and can be trained in an end-to-end fashion. By leveraging the massive data about programming knowledge that are available online, including social Q&A websites, tutorials, blogs, and open-source code repositories, we can train neural language models to support a variety of user intentions, including the long-tail ones. We propose three studies related to using neural language models to solve programming problems in practice. First, we introduce CodeMend, an intelligent programming assistant that supports interactive programming. The system employs a bimodal embedding model to encode programming language and natural language in the same vector space. We demonstrate that this model can effectively understand the code context and associate it with user input to suggest relevant code modifications. We also develop novel user interface to render search results in a way that makes the problem solving process more efficient. Second, we propose a deep learning pipeline that converts data visualization images to source code. The pipeline is built by using computer vision techniques and recurrent neural networks, and it supports the user to get source code generated based on visual examples. We develop novel techniques that augment existing a limited set of training samples via code parameterization and random variation. We also propose strategies that can adapt the general-purpose neural language model to fit the task of predicting source code. Third, we introduce LAMVI, a set of visualization tools for diagnosing issues with neural language models. It tracks the ranks of individual candidate outputs for user-selected queries, and supports the exploration of the corresponding hidden-layer activations. It also tracks influential training instances, and provides guidance for taking actions for tuning the model. The system is evaluated on simulated datasets facilitates the user to efficiently adapt mature neural language models to new datasets or new tasks. Collectively, these three components form an integral solution to computer-assisted problem solving for programmers driven by big data, and may have impact on various different domains, including natural language processing, machine learning, software engineering, and interactive data visualization.PHDInformationUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/138509/1/ronxin_1.pd

    Π€ΠΈΠ·ΠΈΠΊΠ°. Π’Π΅Ρ…Π½ΠΎΠ»ΠΎΠ³ΠΈΠΈ. Π˜Π½Π½ΠΎΠ²Π°Ρ†ΠΈΠΈ: сборник Π½Π°ΡƒΡ‡Π½Ρ‹Ρ… Ρ‚Ρ€ΡƒΠ΄ΠΎΠ² : Выпуск 1

    Full text link
    Π‘Π±ΠΎΡ€Π½ΠΈΠΊ Π½Π°ΡƒΡ‡Π½Ρ‹Ρ… Ρ‚Ρ€ΡƒΠ΄ΠΎΠ² Β«Π€ΠΈΠ·ΠΈΠΊΠ°. Π’Π΅Ρ…Π½ΠΎΠ»ΠΎΠ³ΠΈΠΈ. Π˜Π½Π½ΠΎΠ²Π°Ρ†ΠΈΠΈΒ» раскрываСт Π°ΠΊΡ‚ΡƒΠ°Π»ΡŒΠ½Ρ‹Π΅ ΠΏΡ€ΠΎΠ±Π»Π΅ΠΌΡ‹ соврСмСнной Ρ„ΠΈΠ·ΠΈΠΊΠΈ, ΠΈΠ½Π½ΠΎΠ²Π°Ρ†ΠΈΠΎΠ½Π½Ρ‹Ρ… ΠΈ ΠΈΠ½Ρ„ΠΎΡ€ΠΌΠ°Ρ†ΠΈΠΎΠ½Π½Ρ‹Ρ… Ρ‚Π΅Ρ…Π½ΠΎΠ»ΠΎΠ³ΠΈΠΉ, Π° Ρ‚Π°ΠΊΠΆΠ΅ ΡΠΎΡ†ΠΈΠ°Π»ΡŒΠ½Ρ‹Ρ… Π½Π°ΡƒΠΊ. Π’ Π΄Π°Π½Π½ΠΎΠΌ выпускС ΠΎΠ±ΡŠΠ΅Π΄ΠΈΠ½Π΅Π½Ρ‹ Ρ€Π΅Π·ΡƒΠ»ΡŒΡ‚Π°Ρ‚Ρ‹ Π½Π°ΡƒΡ‡Π½ΠΎΠ³ΠΎ Π°Π½Π°Π»ΠΈΠ·Π° ΠΈ эмпиричСских исслСдований, прСдставлСнных извСстными ΡƒΡ‡Π΅Π½Ρ‹ΠΌΠΈ, Π²Π΅Π΄ΡƒΡ‰ΠΈΠΌΠΈ спСциалистами ΠΈ ΠΌΠΎΠ»ΠΎΠ΄Ρ‹ΠΌΠΈ исслСдоватСлями. Π‘Π±ΠΎΡ€Π½ΠΈΠΊ Π±ΡƒΠ΄Π΅Ρ‚ интСрСсСн Π½Π°ΡƒΡ‡Π½Ρ‹ΠΌ дСятСлям ΠΈ ΠΏΡ€Π°ΠΊΡ‚ΠΈΠΊΡƒΡŽΡ‰ΠΈΠΌ спСциалистам Π² области Ρ„ΠΈΠ·ΠΈΠΊΠΈ, Ρ…ΠΈΠΌΠΈΠΈ, ΠΈΠ½Ρ„ΠΎΡ€ΠΌΠ°Ρ†ΠΈΠΎΠ½Π½Ρ‹Ρ… Ρ‚Π΅Ρ…Π½ΠΎΠ»ΠΎΠ³ΠΈΠΉ, Ρ„ΠΈΠ»ΠΎΠ»ΠΎΠ³ΠΈΠΈ, социологии, истории, экологии
    corecore