10 research outputs found
History, Directions, and Some Problems of Modern Crowdsourcing Research as a Scientific and Practical Discipline
Today, crowdsourcing is a widely used approach to many data collection and aggregation tasks. This work reviews research on crowdsourcing as a scientific and practical discipline, identifies its research directions, and formulates some of the discipline's open problems.
QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites
In this paper, we present a framework for Question Difficulty and Expertise
Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo!
Answers and Stack Overflow, which tackles a fundamental challenge in
crowdsourcing: how to appropriately route and assign questions to users with
the suitable expertise. This problem domain has been the subject of much
research and includes both language-agnostic and language-conscious
solutions. We bring to bear a key language-agnostic insight: that users gain
expertise and therefore tend to ask as well as answer more difficult questions
over time. We use this insight within the popular competition (directed) graph
model to estimate question difficulty and user expertise by identifying key
hierarchical structure within said model. An important and novel contribution
here is the application of "social agony" to this problem domain. Difficulty
levels of newly posted questions (the cold-start problem) are estimated by
using our QDEE framework and additional textual features. We also propose a
model to route newly posted questions to appropriate users based on the
difficulty level of the question and the expertise of the user. Extensive
experiments on real-world CQAs such as Yahoo! Answers and Stack Overflow data
demonstrate the improved efficacy of our approach over contemporary
state-of-the-art models. The QDEE framework also allows us to characterize user
expertise in novel ways by identifying interesting patterns and roles played by
different users in such CQAs.
Comment: Accepted in the Proceedings of the 12th International AAAI Conference
on Web and Social Media (ICWSM 2018). June 2018. Stanford, CA, US
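The "social agony" idea mentioned in this abstract can be illustrated with a small sketch. It assumes the common definition in which an edge from u to v under a ranking r costs max(r(u) - r(v) + 1, 0), and it assumes that competition-graph edges point from the participant who "loses" to the one who "wins". The user names, edges, and rankings are hypothetical, and this is not the QDEE implementation.

```python
def edge_agony(rank_u, rank_v):
    """Agony of one directed edge u -> v: zero when v is ranked strictly above u."""
    return max(rank_u - rank_v + 1, 0)


def total_agony(edges, rank):
    """Sum of edge agonies for a candidate ranking (lower total = cleaner hierarchy)."""
    return sum(edge_agony(rank[u], rank[v]) for u, v in edges)


# Hypothetical competition graph: each edge points from "loser" to "winner".
edges = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("carol", "bob"),  # an upset that no strict hierarchy can fully explain
]

layered = {"alice": 0, "bob": 1, "carol": 2}  # candidate expertise levels
flat = {"alice": 0, "bob": 0, "carol": 0}     # everyone at the same level

layered_cost = total_agony(edges, layered)
flat_cost = total_agony(edges, flat)
```

Here the layered ranking pays only for the single edge that points down the hierarchy, while the flat ranking pays one unit per edge, so the layered ranking is the better hierarchy under this measure.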
The Size Conundrum: Why Online Knowledge Markets Can Fail at Scale
In this paper, we interpret the community question answering websites on the
StackExchange platform as knowledge markets, and analyze how and why these
markets can fail at scale. A knowledge market framing allows site operators to
reason about market failures, and to design policies to prevent them. Our goal
is to provide insights on large-scale knowledge market failures through an
interpretable model. We explore a set of interpretable economic production
models on a large empirical dataset to analyze the dynamics of content
generation in knowledge markets. Amongst these, the Cobb-Douglas model best
explains empirical data and provides an intuitive explanation for content
generation through concepts of elasticity and diminishing returns. Content
generation depends on user participation and also on how specific types of
content (e.g. answers) depend on other types (e.g. questions). We show that
these factors of content generation have constant elasticity---a percentage
increase in any of the inputs leads to a constant percentage increase in the
output. Furthermore, markets exhibit diminishing returns---the marginal output
decreases as the input is incrementally increased. Knowledge markets also vary
on their returns to scale---the increase in output resulting from a
proportionate increase in all inputs. Importantly, many knowledge markets
exhibit diseconomies of scale---measures of market health (e.g., the percentage
of questions with an accepted answer) decrease as a function of number of
participants. The implications of our work are two-fold: site operators ought
to design incentives as a function of system size (number of participants); the
market lens should shed insight into complex dependencies amongst different
content types and participant actions in general social networks.
Comment: The 27th International Conference on World Wide Web (WWW), 201
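The properties named in this abstract (constant elasticity, diminishing returns, returns to scale) can be checked numerically on a generic Cobb-Douglas function. The sketch below is illustrative only: the two inputs, the exponents 0.6 and 0.3, and the input levels are hypothetical and are not values estimated in the paper.

```python
import math


def cobb_douglas(inputs, exponents, scale=1.0):
    """Output Y = scale * prod(x_i ** a_i)."""
    out = scale
    for x, a in zip(inputs, exponents):
        out *= x ** a
    return out


def elasticity(inputs, exponents, index, rel_step=1e-6):
    """Numerical elasticity d ln(Y) / d ln(x_index) at the given input levels."""
    bumped = list(inputs)
    bumped[index] *= 1 + rel_step
    y0 = cobb_douglas(inputs, exponents)
    y1 = cobb_douglas(bumped, exponents)
    return (math.log(y1) - math.log(y0)) / math.log(1 + rel_step)


def marginal_output(inputs, exponents, index):
    """Extra output from one more unit of a single input."""
    bumped = list(inputs)
    bumped[index] += 1
    return cobb_douglas(bumped, exponents) - cobb_douglas(inputs, exponents)


# Hypothetical exponents for two inputs (say, questions and answerers).
exponents = [0.6, 0.3]

# Constant elasticity: the same value at very different input levels.
e_small = elasticity([100.0, 50.0], exponents, 0)
e_large = elasticity([1000.0, 5.0], exponents, 0)

# Diminishing returns: one more unit adds less when the input is already large.
m_small = marginal_output([100.0, 50.0], exponents, 0)
m_large = marginal_output([1000.0, 50.0], exponents, 0)

# Returns to scale: doubling every input multiplies output by 2 ** sum(exponents).
scale_ratio = cobb_douglas([200.0, 100.0], exponents) / cobb_douglas([100.0, 50.0], exponents)
```

With exponents summing to less than 1, doubling all inputs less than doubles the output, which is the decreasing-returns-to-scale case.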
The role of diverse strategies in the sustainability of online communities
Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tend to stick to the same strategy over time within a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A ratio of approximately 2:1 between the populations of users taking strategies A and B is found to be optimal for the sustainable growth of communities.
The Role of Diverse Strategies in Sustainable Knowledge Production
Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze Stack Exchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the Stack Exchange system and quantify individual answering strategies using the linking dynamics on attention networks. We identify two answering strategies. Strategy A aims at performing maintenance by doing simple tasks, whereas strategy B aims at investing time in doing challenging tasks. Both strategies are important: empirical evidence shows that strategy A decreases the median waiting time for answers and strategy B increases the acceptance rate of answers. In investigating the strategic persistence of users, we find that users tend to stick to the same strategy over time within a community, but switch from one strategy to the other across communities. This finding reveals the different sets of knowledge and skills between users. A ratio of approximately 2:1 between the populations of users taking strategies A and B is found to be optimal for the sustainable growth of communities.
The article is published at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.014915
Neural Language Models for Data-Driven Programming Support
Programming can be hard to learn and master. Search engines and social Q&A websites offer tremendous help to programmers, but great expertise (e.g., “Google-fu”) is required to use these resources efficiently and solve complex problems successfully. An integrated system that can recognize a programmer’s tasks and provide contextualized solutions is thus desirable, and ideally programmers can interact with the system using natural input channels, in a way similar to how they communicate with a human expert. To enable such an integrated system, neural language models constitute a promising solution. These models encode programming language in the same high-dimensional space as data of other modalities, and can be trained in an end-to-end fashion. By leveraging the massive data about programming knowledge that are available online, including social Q&A websites, tutorials, blogs, and open-source code repositories, we can train neural language models to support a variety of user intentions, including the long-tail ones. We propose three studies related to using neural language models to solve programming problems in practice. First, we introduce CodeMend, an intelligent programming assistant that supports interactive programming. The system employs a bimodal embedding model to encode programming language and natural language in the same vector space. We demonstrate that this model can effectively understand the code context and associate it with user input to suggest relevant code modifications. We also develop a novel user interface that renders search results in a way that makes the problem-solving process more efficient. Second, we propose a deep learning pipeline that converts data visualization images to source code. The pipeline is built using computer vision techniques and recurrent neural networks, and it lets the user obtain source code generated from visual examples.
We develop novel techniques that augment an existing limited set of training samples via code parameterization and random variation. We also propose strategies that can adapt the general-purpose neural language model to fit the task of predicting source code. Third, we introduce LAMVI, a set of visualization tools for diagnosing issues with neural language models. It tracks the ranks of individual candidate outputs for user-selected queries, and supports the exploration of the corresponding hidden-layer activations. It also tracks influential training instances, and provides guidance on actions to take when tuning the model. The system, evaluated on simulated datasets, helps the user efficiently adapt mature neural language models to new datasets or new tasks. Collectively, these three components form an integral solution to computer-assisted problem solving for programmers driven by big data, and may have an impact on various domains, including natural language processing, machine learning, software engineering, and interactive data visualization.
PhD, Information, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/138509/1/ronxin_1.pd
Physics. Technologies. Innovations: Collection of Scientific Papers, Issue 1
The collection of scientific papers "Physics. Technologies. Innovations" addresses current problems in modern physics, innovative and information technologies, and the social sciences. This issue brings together the results of scientific analysis and empirical research presented by well-known scientists, leading specialists, and young researchers. The collection will be of interest to researchers and practitioners in physics, chemistry, information technology, philology, sociology, history, and ecology