Language Use Matters: Analysis of the Linguistic Structure of Question Texts Can Characterize Answerability in Quora
Quora is one of the most popular community Q&A sites of recent times.
However, many question posts on this Q&A site often do not get answered. In
this paper, we quantify various linguistic activities that discriminates an
answered question from an unanswered one. Our central finding is that the way
users use language while writing the question text can be a very effective
means to characterize answerability. This characterization helps us predict
early whether a question that has remained unanswered for a time period t will
eventually be answered, achieving an accuracy of 76.26% (t = 1 month)
and 68.33% (t = 3 months). Notably, features representing the language use
patterns of the users are most discriminative and alone account for an accuracy
of 74.18%. We also compare our method with similar prior works (Dror et
al., Yang et al.), achieving a maximum improvement of ~39% in accuracy.
Comment: 1 figure, 3 tables, ICWSM 2017 as a poster
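The abstract's core idea, that the wording of a question alone can signal answerability, can be sketched as a toy scoring function. The feature names and weights below are invented for illustration; the paper's actual feature set and trained model are not reproduced here.

```python
import math

# Hypothetical linguistic features of a question text (illustrative only,
# not the paper's feature set).
def features(text):
    words = text.split()
    return {
        "length": len(words),  # very long questions may deter answerers
        "wh_start": words[0].lower() in {"what", "why", "how", "when", "who"},
        "polite": any(w.lower().strip("?.,") in {"please", "thanks"} for w in words),
        "question_mark": text.strip().endswith("?"),
    }

# A toy logistic score combining the features; the weights are made up
# for illustration, not learned from Quora data.
def answerability_score(text):
    f = features(text)
    z = (-0.02 * f["length"]
         + 0.8 * f["wh_start"]
         + 0.5 * f["polite"]
         + 0.6 * f["question_mark"])
    return 1.0 / (1.0 + math.exp(-z))

print(answerability_score("How do I start learning Python, please?"))
```

In the paper such language-use features are fed to a trained classifier; the point of the sketch is only that simple surface cues of the question text suffice to produce a ranking signal.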
Cultures in Community Question Answering
CQA services are collaborative platforms where users ask and answer
questions. We investigate the influence of national culture on people's online
questioning and answering behavior. For this, we analyzed a sample of 200
thousand users in Yahoo Answers from 67 countries. We empirically measure a set
of cultural metrics drawn from Geert Hofstede's cultural dimensions and Robert
Levine's Pace of Life, and show that behavioral cultural differences exist in
community question answering platforms. We find that national cultures differ
in Yahoo Answers along a number of dimensions such as temporal predictability
of activities, contribution-related behavioral patterns, privacy concerns, and
power inequality.
Comment: Published in the proceedings of the 26th ACM Conference on Hypertext and Social Media (HT '15)
The Social World of Content Abusers in Community Question Answering
Community-based question answering platforms can be rich sources of
information on a variety of specialized topics, from finance to cooking. The
usefulness of such platforms depends heavily on user contributions (questions
and answers), but also on respecting the community rules. As a crowd-sourced
service, such platforms rely on their users for monitoring and flagging content
that violates community rules.
Common wisdom is to eliminate the users who receive many flags. Our analysis
of a year of traces from a mature Q&A site shows that the number of flags does
not tell the full story: on one hand, users with many flags may still
contribute positively to the community. On the other hand, users who never get
flagged are found to violate community rules and get their accounts suspended.
This analysis, however, also shows that abusive users are betrayed by their
network properties: we find strong evidence of homophilous behavior and use
this finding to detect abusive users who go under the community radar. Based on
our empirical observations, we build a classifier that is able to detect
abusive users with an accuracy as high as 83%.
Comment: Published in the proceedings of the 24th International World Wide Web Conference (WWW 2015)
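The homophily evidence the abstract mentions can be illustrated with a minimal check on a toy flag network: do abusive users connect to other abusive users more often than random labeling would predict? The labels and edges below are invented for illustration.

```python
# Toy interaction network; labels and edges are made up.
edges = [("a", "b"), ("a", "c"), ("b", "c"),   # abusive cluster
         ("d", "e"), ("d", "f"), ("e", "f"),   # benign cluster
         ("c", "d")]                           # one cross-group tie
abusive = {"a", "b", "c"}

# Observed fraction of edges whose endpoints share a label.
same = sum((u in abusive) == (v in abusive) for u, v in edges)
observed = same / len(edges)

# Expected same-label fraction if endpoints were labeled independently
# at random with the empirical abusive rate p.
nodes = {n for e in edges for n in e}
p = len(abusive) / len(nodes)
expected = p * p + (1 - p) * (1 - p)

print(observed, expected)  # observed > expected indicates homophily
```

The paper exploits exactly this kind of gap: when abusive users cluster together, the network neighborhood of a user becomes a detection feature even when the user's own flag count is unremarkable.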
Learning to predict closed questions on stack overflow
The paper deals with the problem of predicting whether a user's question will be closed by a moderator on Stack Overflow, a popular question answering service devoted to software programming. The task, along with data and evaluation metrics, was offered as an open machine learning competition on the Kaggle platform. To solve this problem, we employed a wide range of classification features related to users, their interactions, and post content. Classification was carried out using several machine learning methods. According to the results of the experiment, the most important features are characteristics of the user and topical features of the question. The best results were obtained using Vowpal Wabbit, an implementation of online learning based on stochastic gradient descent. Our results are among the best in the overall ranking, although they were obtained after the official competition was over.
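The learning scheme Vowpal Wabbit implements, logistic regression updated one example at a time by stochastic gradient descent, can be sketched in a few lines. This toy re-implements only the core update rule, not Vowpal Wabbit itself (it omits feature hashing, adaptive learning rates, and the rest); the features in the example stream are invented.

```python
import math

class OnlineLogistic:
    """Minimal logistic regression trained one example at a time by
    stochastic gradient descent."""

    def __init__(self, lr=0.1):
        self.w = {}   # sparse weight vector: feature name -> weight
        self.lr = lr

    def predict(self, x):
        z = sum(self.w.get(f, 0.0) * v for f, v in x.items())
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):  # y in {0, 1}: 1 = question was closed
        err = self.predict(x) - y  # gradient of log-loss w.r.t. the logit
        for f, v in x.items():
            self.w[f] = self.w.get(f, 0.0) - self.lr * err * v

# Toy stream: questions carrying an "offtopic" feature tend to be closed.
model = OnlineLogistic()
stream = [({"offtopic": 1.0}, 1), ({"code_included": 1.0}, 0)] * 200
for x, y in stream:
    model.update(x, y)

print(model.predict({"offtopic": 1.0}))
```

The single-pass, per-example update is what makes this approach practical on competition-scale data: the model never needs the full dataset in memory.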
Determinants of quality, latency, and amount of Stack Overflow answers about recent Android APIs
Stack Overflow is a popular crowdsourced question and answer website for programming-related issues. It is an invaluable resource for software developers; on average, questions posted there get answered in minutes to an hour. Questions about well-established topics, e.g., the coercion operator in C++, or the difference between canonical and class names in Java, get asked often in one form or another, and answered very quickly. On the other hand, questions on previously unseen or niche topics take a while to get a good answer. This is particularly the case with questions about current updates to, or the introduction of, new application programming interfaces (APIs). In a hyper-competitive online market, getting good answers to current programming questions sooner could increase the chances of an app getting released and used. So, can developers do anything to get good answers to questions about new APIs sooner? Here, we empirically study Stack Overflow questions pertaining to new Android APIs and their associated answers. We contrast the interest in these questions, their answer quality, and the timeliness of their answers with questions about old APIs. We find that Stack Overflow answerers in general prioritize with respect to currentness: questions about new APIs do get more answers, but good quality answers take longer. We also find that incentives in terms of question bounties, if used appropriately, can significantly shorten the time to an answer and increase answer quality. Interestingly, no operationalization of bounty amount shows significance in our models. In practice, our findings confirm the value of bounties in enhancing expert participation. In addition, they show that the Stack Overflow style of crowdsourcing, for all its glory in providing answers about established programming knowledge, is less effective with new API questions.
The Size Conundrum: Why Online Knowledge Markets Can Fail at Scale
In this paper, we interpret the community question answering websites on the
StackExchange platform as knowledge markets, and analyze how and why these
markets can fail at scale. A knowledge market framing allows site operators to
reason about market failures, and to design policies to prevent them. Our goal
is to provide insights on large-scale knowledge market failures through an
interpretable model. We explore a set of interpretable economic production
models on a large empirical dataset to analyze the dynamics of content
generation in knowledge markets. Amongst these, the Cobb-Douglas model best
explains empirical data and provides an intuitive explanation for content
generation through concepts of elasticity and diminishing returns. Content
generation depends on user participation and also on how specific types of
content (e.g. answers) depend on other types (e.g. questions). We show that
these factors of content generation have constant elasticity---a percentage
increase in any of the inputs leads to a constant percentage increase in the
output. Furthermore, markets exhibit diminishing returns---the marginal output
decreases as the input is incrementally increased. Knowledge markets also vary
on their returns to scale---the increase in output resulting from a
proportionate increase in all inputs. Importantly, many knowledge markets
exhibit diseconomies of scale---measures of market health (e.g., the percentage
of questions with an accepted answer) decrease as a function of number of
participants. The implications of our work are two-fold: site operators ought
to design incentives as a function of system size (number of participants); the
market lens should shed insight into complex dependencies amongst different
content types and participant actions in general social networks.
Comment: The 27th International Conference on World Wide Web (WWW), 2018
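The Cobb-Douglas form the abstract names makes the elasticity and diminishing-returns claims concrete. A minimal sketch, with exponents chosen for illustration rather than taken from the paper's fitted models:

```python
# Cobb-Douglas content generation: answers as a function of question
# volume and active answerers. Exponents are illustrative.
A, alpha, beta = 1.0, 0.6, 0.7   # alpha, beta < 1 -> diminishing returns

def answers(questions, answerers):
    return A * questions**alpha * answerers**beta

# Constant elasticity: a 1% increase in questions yields ~alpha% more answers,
# regardless of the starting level.
base = answers(1000, 100)
bumped = answers(1010, 100)
print((bumped / base - 1) * 100)  # ~ 0.6 (percent)

# Returns to scale are governed by alpha + beta: here 1.3 > 1 would mean
# economies of scale; the paper finds many markets behave as if the sum
# falls below 1 at large size, i.e., diseconomies of scale.
```

This is why the framing is useful to site operators: the fitted exponents summarize, in two numbers per market, both how sensitive output is to each input and whether growing the community helps or hurts market health.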
Identifying Unclear Questions in Community Question Answering Websites
Thousands of complex natural language questions are submitted to community
question answering websites on a daily basis, making them one of the most
important information sources today. However, submitted questions are often
unclear and cannot be answered without further clarification
questions by expert community members. This study is the first to investigate
the complex task of classifying a question as clear or unclear, i.e., if it
requires further clarification. We construct a novel dataset and propose a
classification approach that is based on the notion of similar questions. This
approach is compared to state-of-the-art text classification baselines. Our
main finding is that the similar questions approach is a viable alternative
that can be used as a stepping stone towards the development of supportive user
interfaces for question formulation.
Comment: Proceedings of the 41st European Conference on Information Retrieval (ECIR '19), 2019
The big five: Discovering linguistic characteristics that typify distinct personality traits across Yahoo! answers members
In psychology, it is widely believed that five big factors determine the different personality traits: Extraversion, Agreeableness, Conscientiousness, and Neuroticism, as well as Openness. In recent years, researchers have started to examine how these factors are manifested across social networks like Facebook and Twitter. However, to the best of our knowledge, other kinds of social networks, such as social/informational question-answering communities (e.g., Yahoo! Answers), have been left unexplored. Therefore, this work explores several predictive models to automatically recognize these factors across Yahoo! Answers members. As a means of devising powerful generalizations, these models were combined with assorted linguistic features. Since we could not ask community members to volunteer for taking the personality test, we built a study corpus by conducting a discourse analysis based on deconstructing the test into 112 adjectives. Our results reveal that it is plausible to lessen the dependency upon answered tests and that effective models for distinct factors are sharply different. Also, sentiment analysis and dependency parsing proved to be fundamental for dealing with extraversion, agreeableness, and conscientiousness. Furthermore, medium and low levels of neuroticism were found to be related to initial stages of depression and anxiety disorders. This work was partially supported by the FONDECYT project “Bridging the Gap between Askers and Answerers in Community Question Answering Services” (11130094), funded by the Chilean Government.
Dual Language and ENL Comprehension: A First Grade Study for Students at Risk for Delayed English Language Development
This research began by asking how dual language programming impacts English comprehension for ENL students. Research was conducted within one first grade dual language cohort with five bilingual students. The data were collected by interviewing teachers and students, utilizing historical comprehension data, observing read alouds, and assessing student comprehension. Findings revealed that comprehension in a participant’s first language was positively related to English comprehension. However, individual student differences impacted the extent of the correlation. Furthermore, dual language teachers implemented common instructional practices to scaffold ENL student comprehension. Therefore, the data implied that native language instruction is integral, that student backgrounds and differences need to be analyzed, and that dual language educators need adequate professional development to best aid ENL comprehension.