1,696 research outputs found

    Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search

    Full text link
    Retrieval pipelines commonly rely on a term-based search to obtain candidate records, which are subsequently re-ranked. Some candidates are missed by this approach, e.g., due to a vocabulary mismatch. We address this issue by replacing the term-based search with a generic k-NN retrieval algorithm, where a similarity function can take into account subtle term associations. While an exact brute-force k-NN search using this similarity function is slow, we demonstrate that an approximate algorithm can be nearly two orders of magnitude faster at the expense of only a small loss in accuracy. A retrieval pipeline using an approximate k-NN search can be more effective and efficient than the term-based pipeline. This opens up new possibilities for designing effective retrieval pipelines. Our software (including data-generating code) and derivative data based on the Stack Overflow collection is available online

    Supporting Source Code Search with Context-Aware and Semantics-Driven Query Reformulation

    Get PDF
    Software bugs and failures cost trillions of dollars every year, and could even lead to deadly accidents (e.g., Therac-25 accident). During maintenance, software developers fix numerous bugs and implement hundreds of new features by making necessary changes to the existing software code. Once an issue report (e.g., bug report, change request) is assigned to a developer, she chooses a few important keywords from the report as a search query, and then attempts to find out the exact locations in the software code that need to be either repaired or enhanced. As a part of this maintenance, developers also often select ad hoc queries on the fly, and attempt to locate the reusable code from the Internet that could assist them either in bug fixing or in feature implementation. Unfortunately, even the experienced developers often fail to construct the right search queries. Even if the developers come up with a few ad hoc queries, most of them require frequent modifications which cost significant development time and efforts. Thus, construction of an appropriate query for localizing the software bugs, programming concepts or even the reusable code is a major challenge. In this thesis, we overcome this query construction challenge with six studies, and develop a novel, effective code search solution (BugDoctor) that assists the developers in localizing the software code of interest (e.g., bugs, concepts and reusable code) during software maintenance. In particular, we reformulate a given search query (1) by designing novel keyword selection algorithms (e.g., CodeRank) that outperform the traditional alternatives (e.g., TF-IDF), (2) by leveraging the bug report quality paradigm and source document structures which were previously overlooked and (3) by exploiting the crowd knowledge and word semantics derived from Stack Overflow Q&A site, which were previously untapped. Our experiment using 5000+ search queries (bug reports, change requests, and ad hoc queries) suggests that our proposed approach can improve the given queries significantly through automated query reformulations. Comparison with 10+ existing studies on bug localization, concept location and Internet-scale code search suggests that our approach can outperform the state-of-the-art approaches with a significant margin

    Mining Knowledge Bases for Question & Answers Websites

    Get PDF
    We studied the problem of searching answers for questions on a Question-and-Answer Website from knowledge bases. A number of research efforts had been developed using Stack Overflow data, which is available for the public. Surprisingly, only a few papers tried to improve the search for better answers. Furthermore, current approaches for searching a Question-and-Answer Website are usually limited to the question database, which is usually the website own content. We showed it is feasible to use knowledge bases as sources for answers. We implemented both vector-space and topic-space representations for our datasets and compared these distinct techniques. Finally, we proposed a hybrid ranking approach that took advantage of a machine-learned classifier to incorporate the tag information into the ranking and showed that it was able to improve the retrieval performance

    "Always Nice and Confident, Sometimes wrong": Developer's Experiences Engaging Generative AI Chatbots Versus Human-Powered Q&A Platforms

    Full text link
    Software engineers have historically relied on human-powered Q&A platforms, like Stack Overflow (SO), as coding aids. With the rise of generative AI, developers have adopted AI chatbots, such as ChatGPT, in their software development process. Recognizing the potential parallels between human-powered Q&A platforms and AI-powered question-based chatbots, we investigate and compare how developers integrate this assistance into their real-world coding experiences by conducting thematic analysis of Reddit posts. Through a comparative study of SO and ChatGPT, we identified each platform's strengths, use cases, and barriers. Our findings suggest that ChatGPT offers fast, clear, comprehensive responses and fosters a more respectful environment than SO. However, concerns about ChatGPT's reliability stem from its overly confident tone and the absence of validation mechanisms like SO's voting system. Based on these findings, we recommend leveraging each platform's unique features to improve developer experiences in the future

    Investigating the Quality Aspects of Crowd-Sourced Developer Forum: A Case Study of Stack Overflow

    Get PDF
    Technical question and answer (Q&A) websites have changed how developers seek information on the web and become more popular due to the shortcomings in official documentation and alternative knowledge sharing resources. Stack Overflow (SO) is one of the largest and most popular online Q&A websites for developers where they can share knowledge by answering questions and learn new skills by asking questions. Unfortunately, a large number of questions (up to 29%) are not answered at all, which might hurt the quality or purpose of this community-oriented knowledge base. In this thesis, we first attempt to detect the potentially unanswered questions during their submission using machine learning models. We compare unanswered and answered questions quantitatively and qualitatively. The quantitative analysis suggests that topics discussed in the question, the experience of the question submitter, and readability of question texts could often determine whether a question would be answered or not. Our qualitative study also reveals why the questions remain unanswered that could guide novice users to improve their questions. During analyzing the questions of SO, we see that many of them remain unanswered and unresolved because they contain such code segments that could potentially have programming issues (e.g., error, unexpected behavior); unfortunately, the issues could always not be reproduced by other users. This irreproducibility of issues might prevent questions of SO from getting answers or appropriate answers. In our second study, we thus conduct an exploratory study on the reproducibility of the issues discussed in questions and the correlation between issue reproducibility status (of questions) and corresponding answer meta-data such as the presence of an accepted answer. According to our analysis, a question with reproducible issues has at least three times higher chance of receiving an accepted answer than the question with irreproducible issues. However, users can improve the quality of questions and answers by editing. Unfortunately, such edits may be rejected (i.e., rollback) due to undesired modifications and ambiguities. We thus offer a comprehensive overview of reasons and ambiguities in the SO rollback edits. We identify 14 reasons for rollback edits and eight ambiguities that are often present in those edits. We also develop algorithms to detect ambiguities automatically. During the above studies, we find that about half of the questions that received working solutions have negative scores. About 18\% of the accepted answers also do not score the maximum votes. Furthermore, many users are complaining against the downvotes that are cast to their questions and answers. All these findings cast serious doubts on the reliability of the evaluation mechanism employed at SO. We thus concentrate on the assessment mechanism of SO to ensure a non-biased, reliable quality assessment mechanism of SO. This study compares the subjective assessment of questions with their objective assessment using 2.5 million questions and ten text analysis metrics. We also develop machine learning models to classify the promoted and discouraged questions and predict them during their submission time. We believe that the findings from our studies and proposed techniques have the potential to (1) help the users to ask better questions with appropriate code examples, and (2) improve the editing and assessment mechanism of SO to promote better content quality
    corecore