Search CORE

10 research outputs found

История, направления и некоторые проблемы современных исследований краудсорсинга как научно-практической дисциплины

Author: Усталов Д. А.
Publication venue: Уральский федеральный университет
Publication date: 01/01/2015
Field of study

Сегодня краудсорсинг является широко используемым способом решения многих задач сбора и агрегации данных. В данной работе проведен обзор исследований краудсорсинга как научно-практической дисциплины. Выделены направления исследований и сформулированы некоторые актуальные проблемы данной дисциплины.Today, crowdsourcing became a popular approach for various data collecting and mining tasks. In this work, several modern crowdsourcing studies in different research trends have been discussed and some problems within these trends have been mentioned

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites

Author: Moosavi Sobhan
Parthasarathy Srinivasan
Ramnath Rajiv
Sun Jiankai
Publication venue
Publication date: 20/04/2018
Field of study

In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with the suitable expertise. This problem domain has been the subject of much research and includes both language-agnostic as well as language conscious solutions. We bring to bear a key language-agnostic insight: that users gain expertise and therefore tend to ask as well as answer more difficult questions over time. We use this insight within the popular competition (directed) graph model to estimate question difficulty and user expertise by identifying key hierarchical structure within said model. An important and novel contribution here is the application of "social agony" to this problem domain. Difficulty levels of newly posted questions (the cold-start problem) are estimated by using our QDEE framework and additional textual features. We also propose a model to route newly posted questions to appropriate users based on the difficulty level of the question and the expertise of the user. Extensive experiments on real world CQAs such as Yahoo! Answers and Stack Overflow data demonstrate the improved efficacy of our approach over contemporary state-of-the-art models. The QDEE framework also allows us to characterize user expertise in novel ways by identifying interesting patterns and roles played by different users in such CQAs.Comment: Accepted in the Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM 2018). June 2018. Stanford, CA, US

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

The Size Conundrum: Why Online Knowledge Markets Can Fail at Scale

Author: Dev Himel
Geigle Chase
Hu Qingtao
Sundaram Hari
Zheng Jiahui
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018
Field of study

In this paper, we interpret the community question answering websites on the StackExchange platform as knowledge markets, and analyze how and why these markets can fail at scale. A knowledge market framing allows site operators to reason about market failures, and to design policies to prevent them. Our goal is to provide insights on large-scale knowledge market failures through an interpretable model. We explore a set of interpretable economic production models on a large empirical dataset to analyze the dynamics of content generation in knowledge markets. Amongst these, the Cobb-Douglas model best explains empirical data and provides an intuitive explanation for content generation through concepts of elasticity and diminishing returns. Content generation depends on user participation and also on how specific types of content (e.g. answers) depends on other types (e.g. questions). We show that these factors of content generation have constant elasticity---a percentage increase in any of the inputs leads to a constant percentage increase in the output. Furthermore, markets exhibit diminishing returns---the marginal output decreases as the input is incrementally increased. Knowledge markets also vary on their returns to scale---the increase in output resulting from a proportionate increase in all inputs. Importantly, many knowledge markets exhibit diseconomies of scale---measures of market health (e.g., the percentage of questions with an accepted answer) decrease as a function of number of participants. The implications of our work are two-fold: site operators ought to design incentives as a function of system size (number of participants); the market lens should shed insight into complex dependencies amongst different content types and participant actions in general social networks.Comment: The 27th International Conference on World Wide Web (WWW), 201

arXiv.org e-Print Archive

Crossref

The role of diverse strategies in the sustainability of online communities

Author: Baggio Jacopo A.
Janssen Marco A.
Wu Lingfei
Publication venue: Hosted by Utah State University Libraries
Publication date: 07/12/2016
Field of study

Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tends to stick on the same strategy over time in a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A balance between the population of users taking A and B strategies that approximates 2:1, is found to be optimal to the sustainable growth of communities

DigitalCommons@USU

The Role of Diverse Strategies in Sustainable Knowledge Production

Author
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 02/03/2016
Field of study

abstract: Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tends to stick on the same strategy over time in a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A balance between the population of users taking A and B strategies that approximates 2:1, is found to be optimal to the sustainable growth of communities.The article is published at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.014915

ASU Digital Repository

Neural Language Models for Data-Driven Programming Support

Author: Rong Xin
Publication venue
Publication date
Field of study

Programming can be hard to learn and master. Search engines and social Q&A websites offer tremendous help to programmers, but great expertise (e.g., “Google-fu”) is required to efficiently use these resources and successfully solve complex problems. An integrated system that can recognize a programmer’s tasks and provide contextualized solutions is thus desirable, and ideally programmers can interact with the system using natural input channels, in a way similar to how they communicate with a human expert. To enable such an integrated system, neural language models constitute a promising solution. These models encode programming language in the same high-dimensional space with data of other modalities, and can be trained in an end-to-end fashion. By leveraging the massive data about programming knowledge that are available online, including social Q&A websites, tutorials, blogs, and open-source code repositories, we can train neural language models to support a variety of user intentions, including the long-tail ones. We propose three studies related to using neural language models to solve programming problems in practice. First, we introduce CodeMend, an intelligent programming assistant that supports interactive programming. The system employs a bimodal embedding model to encode programming language and natural language in the same vector space. We demonstrate that this model can effectively understand the code context and associate it with user input to suggest relevant code modifications. We also develop novel user interface to render search results in a way that makes the problem solving process more efficient. Second, we propose a deep learning pipeline that converts data visualization images to source code. The pipeline is built by using computer vision techniques and recurrent neural networks, and it supports the user to get source code generated based on visual examples. We develop novel techniques that augment existing a limited set of training samples via code parameterization and random variation. We also propose strategies that can adapt the general-purpose neural language model to fit the task of predicting source code. Third, we introduce LAMVI, a set of visualization tools for diagnosing issues with neural language models. It tracks the ranks of individual candidate outputs for user-selected queries, and supports the exploration of the corresponding hidden-layer activations. It also tracks influential training instances, and provides guidance for taking actions for tuning the model. The system is evaluated on simulated datasets facilitates the user to efficiently adapt mature neural language models to new datasets or new tasks. Collectively, these three components form an integral solution to computer-assisted problem solving for programmers driven by big data, and may have impact on various different domains, including natural language processing, machine learning, software engineering, and interactive data visualization.PHDInformationUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/138509/1/ronxin_1.pd

Deep Blue Documents at the University of Michigan

Физика. Технологии. Инновации: сборник научных трудов : Выпуск 1

Author
Publication venue: УрФУ
Publication date: 01/01/2015
Field of study

Сборник научных трудов «Физика. Технологии. Инновации» раскрывает актуальные проблемы современной физики, инновационных и информационных технологий, а также социальных наук. В данном выпуске объединены результаты научного анализа и эмпирических исследований, представленных известными учеными, ведущими специалистами и молодыми исследователями. Сборник будет интересен научным деятелям и практикующим специалистам в области физики, химии, информационных технологий, филологии, социологии, истории, экологии

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin