
    Studying the Artifacts of Q&A Platforms: The Central Role of the Crowd

    Question and Answer (Q&A) websites serve as platforms that bring together individuals posting questions with those who can provide possible answers. Since Q&A platforms are human-made information technology (IT) artifacts, this study seeks to better understand how the designed interactive components of a platform, particularly those reflective of the crowd, affect communication between help seekers (those who post questions) and solution providers (those who provide answers). The study sheds light on the composite role that the formation of questions and answers, along with feedback from the crowd, plays in arriving at a validated solution (i.e., an accepted answer) for a posed question. Using empirical data from one of the largest Q&A platforms, and applying the novel analytical technique of composite modeling, this study finds that the crowd is central to understanding how answers are perceived on the platform and how a validated solution crystallizes from the set of answers provided.

    Social Roles, Interactions and Community Sustainability in Social Q&A Sites: A Resource-based Perspective

    Online tech support communities have become valuable channels for users to seek and provide solutions to specific problems. From a resource exchange perspective, the sustainability of a social system is contingent upon the size of its membership as well as its members' communication activities. To further extend the resource-based model, the current research identifies a variety of social roles in a large tech support Q&A forum and examines longitudinal changes in the community's structure based on this identification. Moreover, this study investigates the relationship between the community's functionality and its traffic. Results suggest that the proportion of unsolved questions negatively impacts the number of future incoming questions, and that the outcome of a given question depends not only on users' interactions within the discussion but also on the community activities preceding the question. These observations can help community managers improve system design and task allocation.
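
    To make the traffic finding concrete, here is a minimal sketch of the kind of analysis the abstract describes: regressing next month's incoming questions on the current share of unsolved questions. The monthly granularity, column names, and toy numbers are all illustrative assumptions, not the paper's actual data or model.

        import pandas as pd
        import statsmodels.formula.api as smf

        # Toy monthly panel: incoming questions and how many remained unsolved.
        df = pd.DataFrame({
            "n_questions": [120, 150, 140, 160, 130, 110, 100, 90],
            "n_unsolved":  [ 20,  30,  35,  50,  45,  40,  42, 40],
        })
        df["unsolved_share"] = df["n_unsolved"] / df["n_questions"]
        df["future_questions"] = df["n_questions"].shift(-1)  # next month's traffic

        model = smf.ols("future_questions ~ unsolved_share + n_questions",
                        data=df.dropna()).fit()
        print(model.params)  # a negative unsolved_share coefficient matches the finding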

    Assessing Order Effects in Online Community-based Health Forums

    Measuring the quality of health content in online health forums is a challenging task. Most existing measures are based on evaluations by forum users and may not be reliable. We employed machine learning techniques, text mining methods, and Big Data platforms to construct four measures of textual quality that automatically determine the similarity of a given answer to professional answers. We then used them to assess the quality of 66,888 answers posted in the Yahoo! Answers Health section. All four measures of textual quality revealed higher quality for asker-selected best answers, indicating that askers, to some extent, exercise sound judgment in selecting best answers. We also studied the presence of order effects in online health forums. Our results suggest that the textual quality of the first answer positively influences the mean textual quality of subsequent answers and negatively influences the quantity of subsequent answers.
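
    The study's four measures are not spelled out in the abstract, so the following is only an illustrative stand-in in the same spirit: scoring a forum answer by its cosine similarity to professional reference answers under a TF-IDF representation. The example texts are invented.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        # Hypothetical reference answers written by professionals.
        professional = [
            "Type 2 diabetes is managed with diet, exercise, and metformin.",
            "Hypertension treatment includes lifestyle changes and ACE inhibitors.",
        ]
        # Hypothetical forum answers to be scored.
        forum = [
            "Watch your diet, exercise more, and your doctor may add metformin.",
            "Just drink herbal tea, it cures everything.",
        ]

        vectorizer = TfidfVectorizer(stop_words="english")
        prof_vecs = vectorizer.fit_transform(professional)
        forum_vecs = vectorizer.transform(forum)

        # Score each forum answer by its best match against any professional answer.
        scores = cosine_similarity(forum_vecs, prof_vecs).max(axis=1)
        print(scores)  # the first answer should score higher than the second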

    The impact of surface features on choice of (in)secure answers by Stackoverflow readers

    Existing research has shown that developers use Stack Overflow to answer programming questions: but what draws them to one particular answer over any other? The answer they select can mean the difference between a secure application and an insecure one, as the quality of supposedly secure answers varies. Prior work has studied people posting on Stack Overflow—a two-way communication between the original poster and the Stack Overflow community. Instead, we study one-way communication, where people only read a Stack Overflow thread without being actively involved in it, sometimes long after the thread has closed. We report on a mixed-method study, including a controlled between-groups experiment and a qualitative analysis of participants' rationale (N=1188), investigating whether explanation detail, answer scoring, accepted-answer marks, and the security of the code snippet itself affect which answers participants accept. Our findings indicate that explanation detail affects which answers participants reading a thread select (p < 0.05)—the inverse of what research has shown for those asking and answering questions. The qualitative analysis of participants' rationale further explains how several cognitive biases underpin these findings. Correspondence bias, in particular, plays an important role in instilling readers with a false sense of confidence in an answer through the way it looks, regardless of whether it works, is secure, or whether the community agrees with it. As a result, we argue that Stack Overflow's use as a knowledge base by people not actively involved in threads—when there is only one-way communication—may inadvertently contribute to the spread of insecure code, as the community's voting mechanisms hold little power to deter readers from such answers.
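
    A between-groups comparison like the one described is often summarized with a simple contingency-table test. The sketch below shows that shape of analysis with invented counts; it is not the authors' actual data or statistical procedure.

        from scipy.stats import chi2_contingency

        # Toy counts: rows are experimental conditions (detailed vs. terse
        # explanation), columns are outcomes (chose secure vs. insecure answer).
        table = [[48, 22],   # detailed explanation
                 [29, 41]]   # terse explanation
        chi2, p, dof, expected = chi2_contingency(table)
        print(f"chi2={chi2:.2f}, p={p:.4f}")  # p < 0.05 would mirror the reported effect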

    Identifying reputation collectors in community question answering (CQA) sites: Exploring the dark side of social media

    This research aims to identify users who post, as well as encourage others to post, low-quality and duplicate content on community question answering sites. The good guys, called Caretakers, and the bad guys, called Reputation Collectors, are characterised by their behaviour, answering patterns, and reputation points. The proposed system is developed and analysed over the publicly available Stack Exchange data dump. A graph-based methodology is employed to derive the characteristics of Reputation Collectors and Caretakers. Results reveal that Reputation Collectors are the primary sources of low-quality answers as well as answers to duplicate questions posted on the site. Caretakers answer a limited number of questions of a challenging nature and fetch maximum reputation from those questions, whereas Reputation Collectors answer many low-quality and duplicate questions to gain reputation points. We have developed algorithms to identify the Caretakers and Reputation Collectors of a site. Our analysis finds that 1.05% of Reputation Collectors post 18.88% of low-quality answers. This study extends previous research by identifying Reputation Collectors and how they collect their reputation points.
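
    The abstract does not give the algorithm itself, so here is only a toy sketch of the graph-based idea: build a user-question answering graph and flag users whose answers are predominantly low-quality or posted on duplicate questions. The edge attributes, node names, and 0.5 threshold are assumptions for illustration.

        import networkx as nx

        # Toy user-question graph; edge attributes mark what the data dump
        # would reveal about each answer.
        G = nx.Graph()
        G.add_edges_from([
            ("u1", "q1", {"low_quality": True,  "duplicate": False}),
            ("u1", "q2", {"low_quality": True,  "duplicate": True}),
            ("u2", "q3", {"low_quality": False, "duplicate": False}),
            ("u2", "q4", {"low_quality": False, "duplicate": False}),
        ])

        def flag_reputation_collectors(graph, users, threshold=0.5):
            """Flag users whose answers are mostly low-quality or on duplicates."""
            flagged = []
            for u in users:
                edges = list(graph.edges(u, data=True))
                bad = sum(d["low_quality"] or d["duplicate"] for _, _, d in edges)
                if edges and bad / len(edges) > threshold:
                    flagged.append(u)
            return flagged

        print(flag_reputation_collectors(G, ["u1", "u2"]))  # -> ['u1']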

    Factors Influencing Users' Continuance Intention on Paid Question and Answer Service: A Study on Weibo in China

    This thesis addresses the research question "Why do users continue to use paid Q&A in China?" as follows. First, it introduces the research background of paid Q&A in China, raises the corresponding research question, and highlights the significance of the topic. Second, the author reviews previous research on paid Q&A with respect to Q&A systems, paid subscription, and the sharing economy, and finds that most prior research focuses on the influence of usefulness, but not enjoyment, on users' willingness to continue using a paid Q&A system. Third, the thesis introduces the value-based adoption model (VAM) and builds a modified model based on it; this modified model highlights the importance of pleasure in users' continuance intention toward paid Q&A. Finally, an empirical study combining an Exploratory Factor Analysis and a Confirmatory Factor Analysis shows that, after integrating factors extracted from previous research into the proposed model, the model has explanatory power and most of its hypotheses are supported. In conclusion, this study investigates, from a hedonic perspective, the constructs and related theories that influence users' continuance intention to use paid Q&A. VAM is used as the prototype of the proposed research model, which reveals factors affecting users' continuance intention to use a Chinese paid Q&A product named Weibo Paid Q&A (WPQA). The model predicts that the constructs perceived fee and community atmosphere, along with perceived enjoyment, have a critical effect on users' continuance intention toward WPQA. Using PLS-SEM on data collected from WPQA users, the empirical study verifies that continuance intention indeed depends on perceived fee and community atmosphere along with perceived enjoyment. The study also reveals that answerer quality and answer quality positively and significantly influence perceived enjoyment.
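
    As a rough stand-in for the structural paths tested in the thesis (which uses PLS-SEM, not reproduced here), the sketch below reduces the model to a single regression of continuance intention on perceived fee, community atmosphere, and perceived enjoyment, with simulated data wired to match the direction of the reported effects.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(1)
        n = 300
        df = pd.DataFrame({
            "perceived_fee": rng.normal(size=n),
            "community_atmosphere": rng.normal(size=n),
            "perceived_enjoyment": rng.normal(size=n),
        })
        # Simulated outcome; coefficients are illustrative, not the thesis's estimates.
        df["continuance_intention"] = (0.3 * df["perceived_enjoyment"]
                                       - 0.2 * df["perceived_fee"]
                                       + 0.25 * df["community_atmosphere"]
                                       + rng.normal(scale=0.5, size=n))

        paths = smf.ols("continuance_intention ~ perceived_fee"
                        " + community_atmosphere + perceived_enjoyment",
                        data=df).fit()
        print(paths.params)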

    Towards Supporting Visual Question and Answering Applications

    Visual Question Answering (VQA) is a new research area involving technologies ranging from computer vision and natural language processing to other sub-fields of artificial intelligence such as knowledge representation. The fundamental task is to take as input one image and one question (in text) related to that image, and to generate a textual answer to the question. There are two key research problems in VQA: image understanding and question answering. My research focuses on developing solutions to both. In image understanding, one important research area is semantic segmentation, which takes images as input and outputs a label for each pixel. Because much manual work is needed to label a useful training set, typical training sets for such supervised approaches are small. There are also approaches with relaxed labeling requirements, called weakly supervised semantic segmentation, where only image-level labels are needed. With the development of social media, more and more user-uploaded images are available online. Such user-generated content often comes with labels such as tags and may be coarsely labelled by various tools. To use this information for computer vision tasks, I propose a new graphical model that considers neighborhood information and interactions to obtain pixel-level labels of images from incomplete image-level labels. The method was evaluated on both synthetic and real images. In question answering, my research centers on best answer prediction, addressing two main topics: feature design and model construction. In feature design, most existing work discusses how to design effective features for answer quality or best answer prediction, but little work considers designing features that capture the relationships among the answers to a given question. To fill this research gap, I designed new features that improve prediction performance. In modeling, to exploit the structure of the feature space, I propose a learning-to-rank model based on the hierarchical lasso. Experiments comparing against the state of the art in best answer prediction confirm that the proposed methods are effective and suitable for the task.
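
    The sketch below illustrates the general shape of a pairwise learning-to-rank formulation for best answer prediction with an L1 (lasso) penalty. It is a flat stand-in: the dissertation's hierarchical lasso imposes group structure on the features that is not reproduced here, and all data are synthetic.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        X = rng.normal(size=(12, 5))                             # toy features for 12 answers
        best = np.array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0])   # one best answer per question
        qid = np.repeat([0, 1, 2, 3], 3)                         # 4 questions, 3 answers each

        # Build pairwise difference vectors: best answer minus each rival,
        # plus the reversed pair, labelled 1 and 0 respectively.
        pairs, labels = [], []
        for q in np.unique(qid):
            idx = np.where(qid == q)[0]
            b = idx[best[idx] == 1][0]
            for r in idx[best[idx] == 0]:
                pairs.append(X[b] - X[r]); labels.append(1)
                pairs.append(X[r] - X[b]); labels.append(0)

        # The L1 penalty yields sparse weights over answer features.
        ranker = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
        ranker.fit(np.array(pairs), np.array(labels))
        print(ranker.coef_)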

    Investigating the Quality Aspects of Crowd-Sourced Developer Forum: A Case Study of Stack Overflow

    Technical question and answer (Q&A) websites have changed how developers seek information on the web and have become popular due to shortcomings in official documentation and alternative knowledge-sharing resources. Stack Overflow (SO) is one of the largest and most popular online Q&A websites for developers, where they can share knowledge by answering questions and learn new skills by asking questions. Unfortunately, a large number of questions (up to 29%) are never answered, which might hurt the quality and purpose of this community-oriented knowledge base. In this thesis, we first attempt to detect potentially unanswered questions at submission time using machine learning models. We compare unanswered and answered questions quantitatively and qualitatively. The quantitative analysis suggests that the topics discussed in a question, the experience of the question submitter, and the readability of the question text can often determine whether a question will be answered. Our qualitative study also reveals why questions remain unanswered, which can guide novice users in improving their questions. While analyzing SO questions, we observe that many remain unanswered and unresolved because they contain code segments with potential programming issues (e.g., errors, unexpected behavior) that other users cannot always reproduce. This irreproducibility might prevent questions from receiving answers, or appropriate answers. In our second study, we therefore conduct an exploratory study of the reproducibility of the issues discussed in questions, and of the correlation between a question's issue-reproducibility status and answer metadata such as the presence of an accepted answer. According to our analysis, a question with reproducible issues has at least a three times higher chance of receiving an accepted answer than a question with irreproducible issues. Users can improve the quality of questions and answers by editing; unfortunately, such edits may be rejected (i.e., rolled back) due to undesired modifications and ambiguities. We thus offer a comprehensive overview of the reasons for and ambiguities in SO rollback edits: we identify 14 reasons for rollback edits and eight ambiguities often present in those edits, and we develop algorithms to detect the ambiguities automatically. During the above studies, we find that about half of the questions that received working solutions have negative scores, and about 18% of accepted answers do not score the maximum votes. Furthermore, many users complain about the downvotes cast on their questions and answers. These findings cast serious doubt on the reliability of SO's evaluation mechanism. We therefore examine SO's assessment mechanism with the goal of an unbiased, reliable quality assessment. This study compares the subjective assessment of questions with their objective assessment using 2.5 million questions and ten text analysis metrics. We also develop machine learning models to classify promoted and discouraged questions and to predict them at submission time. We believe the findings from our studies and the proposed techniques can (1) help users ask better questions with appropriate code examples, and (2) improve the editing and assessment mechanisms of SO to promote better content quality.
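
    A minimal sketch of the first study's prediction task follows, assuming two of the feature families the abstract names: submitter experience and text readability (here proxied by the third-party textstat package's Flesch reading-ease score). The column names, toy rows, and model choice are illustrative assumptions, not the thesis's actual pipeline.

        import pandas as pd
        import textstat  # third-party readability package
        from sklearn.ensemble import RandomForestClassifier

        # Toy questions labelled by whether they were answered.
        questions = pd.DataFrame({
            "body": ["How do I parse JSON safely in Python?",
                     "help plz my code broken thx",
                     "Why does my SQL join return duplicate rows?",
                     "urgent!!! fix my program now"],
            "asker_reputation": [150, 1, 890, 3],
            "answered": [1, 0, 1, 0],
        })
        questions["readability"] = questions["body"].apply(textstat.flesch_reading_ease)

        X = questions[["asker_reputation", "readability"]]
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(X, questions["answered"])
        print(clf.predict(X))  # in practice: held-out evaluation and richer features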