Identifying Unclear Questions in Community Question Answering Websites
Thousands of complex natural language questions are submitted to community
question answering websites on a daily basis, rendering them as one of the most
important information sources these days. However, oftentimes submitted
questions are unclear and cannot be answered without further clarification
questions by expert community members. This study is the first to investigate
the complex task of classifying a question as clear or unclear, i.e., if it
requires further clarification. We construct a novel dataset and propose a
classification approach that is based on the notion of similar questions. This
approach is compared to state-of-the-art text classification baselines. Our
main finding is that the similar questions approach is a viable alternative
that can be used as a stepping stone towards the development of supportive user
interfaces for question formulation.
Comment: Proceedings of the 41st European Conference on Information Retrieval (ECIR '19), 201
The Social World of Content Abusers in Community Question Answering
Community-based question answering platforms can be rich sources of
information on a variety of specialized topics, from finance to cooking. The
usefulness of such platforms depends heavily on user contributions (questions
and answers), but also on respecting the community rules. As a crowd-sourced
service, such platforms rely on their users for monitoring and flagging content
that violates community rules.
Common wisdom is to eliminate the users who receive many flags. Our analysis
of a year of traces from a mature Q&A site shows that the number of flags does
not tell the full story: on one hand, users with many flags may still
contribute positively to the community. On the other hand, users who never get
flagged are found to violate community rules and get their accounts suspended.
This analysis, however, also shows that abusive users are betrayed by their
network properties: we find strong evidence of homophilous behavior and use
this finding to detect abusive users who go under the community radar. Based on
our empirical observations, we build a classifier that is able to detect
abusive users with an accuracy as high as 83%.
Comment: Published in the proceedings of the 24th International World Wide Web Conference (WWW 2015)
ComQA: A Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters
To bridge the gap between the capabilities of the state-of-the-art in factoid
question answering (QA) and what users ask, we need large datasets of real user
questions that capture the various question phenomena users are interested in,
and the diverse ways in which these questions are formulated. We introduce
ComQA, a large dataset of real user questions that exhibit different
challenging aspects such as compositionality, temporal reasoning, and
comparisons. ComQA questions come from the WikiAnswers community QA platform,
which typically contains questions that are not satisfactorily answerable by
existing search engine technology. Through a large crowdsourcing effort, we
clean the question dataset, group questions into paraphrase clusters, and
annotate clusters with their answers. ComQA contains 11,214 questions grouped
into 4,834 paraphrase clusters. We detail the process of constructing ComQA,
including the measures taken to ensure its high quality while making effective
use of crowdsourcing. We also present an extensive analysis of the dataset and
the results achieved by state-of-the-art systems on ComQA, demonstrating that
our dataset can be a driver of future research on QA.
Comment: 11 pages, NAACL 201
Lessons from Learning the Craft of Theory-Driven Research
This article presents a case study of the structure and logic of the author's dissertation, with a focus on theoretical content. Designed for use in proposal-writing seminars or research methods courses, the article stresses the value of identifying the originating, specifying, and subsidiary research questions; clarifying the subject and object of the research; situating the research within a particular research tradition; and using a competing-theories approach. It also stresses the need to identify conceptual and empirical problems and their associated conceptual and operational definitions. The primary theoretical perspective is drawn from an emerging sociology of externalities rooted in ecological theory, within the institutionalist tradition.
Meta-evaluation of the impacts and legacy of the London 2012 Olympic Games and Paralympic Games : Developing methods paper
This report brings together the interim findings from the Developing Meta-Evaluation Methods study, which is being undertaken in conjunction with the Meta-Evaluation of the Impacts and Legacy of the London 2012 Olympic Games and Paralympic Games.
The work on methods is funded by the Economic and Social Research Council (ESRC). The aim of this paper is to review the existing evidence on conducting meta-evaluation and to provide guidance appropriate to the Meta-Evaluation of the Games as well as to other meta-evaluation studies.
Cultures in Community Question Answering
CQA services are collaborative platforms where users ask and answer
questions. We investigate the influence of national culture on people's online
questioning and answering behavior. For this, we analyzed a sample of 200,000
users in Yahoo Answers from 67 countries. We empirically measure a set of
cultural metrics based on Geert Hofstede's cultural dimensions and Robert
Levine's Pace of Life, and show that behavioral cultural differences exist in
community question answering platforms. We find that national cultures differ
in Yahoo Answers along a number of dimensions such as temporal predictability
of activities, contribution-related behavioral patterns, privacy concerns, and
power inequality.
Comment: Published in the proceedings of the 26th ACM Conference on Hypertext and Social Media (HT '15)