On The Relationship Between The Vocabulary Of Bug Reports And Source Code
The use of text retrieval techniques on concept location and bug localization yields remarkable benefits. The artifacts found in source code and bug reports contain important information related to the bug localization process. When locating bugs, it is the programmer's task to formulate effective queries such that most of the predicted terms in the query appear in the relevant defective code, but not in most of the non-relevant source files. These queries are built from the textual content of the bug reports, especially the bug title and the description. A large body of research uses bug descriptions to evaluate bug localization techniques based on text retrieval. All these studies are conducted under the implicit assumption that the bug description and the relevant source code files share important terms. This paper presents an empirical study that explores this conjecture. We found that bug reports share more terms with the patched classes than with the other classes in the software system. Moreover, the study revealed that class names are more likely to share terms with the bug descriptions than other code locations. We also found that more verbose parts of the source code, such as comments, share more words. Furthermore, we discovered that the shared terms may be better predictors for bug localization than some other text retrieval techniques, such as LSI.
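The notion of shared terms in the abstract above can be illustrated with a minimal sketch. This is not the study's actual procedure: the function names (`split_terms`, `shared_terms`, `overlap_ratio`) and the tokenization rules are assumptions made for illustration, and no stop-word removal or stemming is applied.

```python
import re

def split_terms(text):
    """Return the set of lower-cased terms in text.

    Identifiers are split on non-letters and on camelCase boundaries
    (an assumed tokenization, not the one used in the study).
    """
    terms = set()
    for word in re.findall(r"[A-Za-z]+", text):
        # "FileParser" -> "File", "Parser"; "XMLParser" -> "XML", "Parser"
        for part in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", word):
            terms.add(part.lower())
    return terms

def shared_terms(bug_report, source_text):
    """Terms that occur in both the bug report and the source text."""
    return split_terms(bug_report) & split_terms(source_text)

def overlap_ratio(bug_report, source_text):
    """Fraction of the bug report's distinct terms found in the source text."""
    report_terms = split_terms(bug_report)
    if not report_terms:
        return 0.0
    return len(report_terms & split_terms(source_text)) / len(report_terms)
```

Comparing `overlap_ratio` for a patched class against the ratio for the remaining classes gives a rough sense of the comparison the abstract describes.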
Exploiting natural language structures in software informal documentation
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Communication means, such as issue trackers, mailing lists, Q&A forums, and app reviews, are premier means of collaboration among developers, and between developers and end-users. Analyzing such sources of information is crucial to build recommenders for developers, for example suggesting experts, re-documenting source code, or transforming user feedback into maintenance and evolution strategies for developers. To ease this analysis, in previous work we proposed DECA (Development Emails Content Analyzer), a tool based on Natural Language Parsing that classifies, with high precision, fragments of development emails according to their purpose. However, DECA has to be trained through manual tagging of relevant patterns, which is often effort-intensive and error-prone, and requires specific expertise in natural language parsing. In this paper, we first show, with a study involving Master's and Ph.D. students, the extent to which producing rules for identifying such patterns requires effort, depending on the nature and complexity of the patterns. Then, we propose an approach, named NEON (Nlp-based softwarE dOcumentation aNalyzer), that automatically mines such rules, minimizing the manual effort. We assess the performance of NEON in the analysis and classification of mobile app reviews, developer discussions, and issues. NEON simplifies the pattern identification and rule definition processes, saving more than 70% of the time otherwise spent on performing such activities manually. Results also show that NEON-generated rules are close to the manually identified ones, achieving comparable recall.
Assessing the Quality of the Steps to Reproduce in Bug Reports
A major problem with user-written bug reports, indicated by developers and
documented by researchers, is the (lack of high) quality of the reported steps
to reproduce the bugs. Low-quality steps to reproduce lead to excessive manual
effort spent on bug triage and resolution. This paper proposes Euler, an
approach that automatically identifies and assesses the quality of the steps to
reproduce in a bug report, providing feedback to the reporters, which they can
use to improve the bug report. The feedback provided by Euler was assessed by
external evaluators and the results indicate that Euler correctly identified
98% of the existing steps to reproduce and 58% of the missing ones, while 73%
of its quality annotations are correct.
Comment: In Proceedings of the 27th ACM Joint European Software Engineering
Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE
'19), August 26-30, 2019, Tallinn, Estonia
The Localisation of Video Games
The present thesis is a study of the translation of video games with a particular emphasis on
the Spanish-English language pair, although other languages are brought into play when they offer a
clearer illustration of a particular point in the discussion. On the one hand, it offers a
descriptive analysis of the video game industry understood as a global phenomenon in entertainment,
with the aim of understanding the norms governing present game development and publishing
practices. On the other hand, it discusses particular translation issues that seem to be unique to
these entertainment products due to their multichannel and polysemiotic nature, in which verbal and
nonverbal signs are intimately interconnected in search of maximum game interactivity.
Although this research positions itself within the theoretical framework of Descriptive Translation
Studies, it actually goes beyond the mere accounting of current processes to propose changes
whenever professional practice seems to be unable to rid itself of old unsatisfactory habits. Of a
multidisciplinary nature, the present thesis is greatly informed by various areas of knowledge such
as audiovisual translation, software localisation, computer assisted translation and translation
memory tools, comparative literature, and video game production and marketing, amongst others.
The conclusions are an initial breakthrough in terms of research into this new area, challenging
some of the basic tenets current in translation studies thanks to its multidisciplinary approach,
and its solid grounding on current game localisation industry practice. The results can be useful
in order to boost professional quality and to promote the
training of translators in video game localisation in higher education centres.
Supporting Source Code Search with Context-Aware and Semantics-Driven Query Reformulation
Software bugs and failures cost trillions of dollars every year, and could even lead to deadly accidents (e.g., Therac-25 accident). During maintenance, software developers fix numerous bugs and implement hundreds of new features by making necessary changes to the existing software code. Once an issue report (e.g., bug report, change request) is assigned to a developer, she chooses a few important keywords from the report as a search query, and then attempts to find out the exact locations in the software code that need to be either repaired or enhanced. As a part of this maintenance, developers also often select ad hoc queries on the fly, and attempt to locate the reusable code from the Internet that could assist them either in bug fixing or in feature implementation. Unfortunately, even the experienced developers often fail to construct the right search queries. Even if the developers come up with a few ad hoc queries, most of them require frequent modifications which cost significant development time and efforts. Thus, construction of an appropriate query for localizing the software bugs, programming concepts or even the reusable code is a major challenge. In this thesis, we overcome this query construction challenge with six studies, and develop a novel, effective code search solution (BugDoctor) that assists the developers in localizing the software code of interest (e.g., bugs, concepts and reusable code) during software maintenance. In particular, we reformulate a given search query (1) by designing novel keyword selection algorithms (e.g., CodeRank) that outperform the traditional alternatives (e.g., TF-IDF), (2) by leveraging the bug report quality paradigm and source document structures which were previously overlooked and (3) by exploiting the crowd knowledge and word semantics derived from Stack Overflow Q&A site, which were previously untapped. 
Our experiment using 5000+ search queries (bug reports, change requests, and ad hoc queries) suggests that our proposed approach can improve the given queries significantly through automated query reformulations. Comparison with 10+ existing studies on bug localization, concept location and Internet-scale code search suggests that our approach can outperform the state-of-the-art approaches by a significant margin.
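The abstract above names TF-IDF as a traditional keyword selection baseline. A minimal sketch of that baseline follows; it is not CodeRank or BugDoctor. The function names (`tokenize`, `tfidf_query`), the smoothed IDF formula, and the alphabetical tie-breaking are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_query(report, corpus, k=3):
    """Return the k terms of the report with the highest TF-IDF score.

    corpus is a list of document strings (for example, source files).
    Ties are broken alphabetically so the result is deterministic.
    """
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(doc_sets)
    counts = Counter(tokenize(report))
    scores = {}
    for term, tf in counts.items():
        df = sum(1 for terms in doc_sets if term in terms)
        # Smoothed IDF (an assumed variant): never divides by zero.
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = tf * idf
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]
```

Terms that occur in every document of the corpus receive the lowest IDF, so the selected query leans toward words that are frequent in the report but rare elsewhere.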
Facilitating software evolution through natural language comments and dialogue
Software projects are continually evolving, as developers incorporate changes to refactor code, support new functionality, and fix bugs. To uphold software quality amidst constant changes and also facilitate prompt implementation of critical changes, it is desirable to have automated tools for supporting and driving software evolution. In this thesis, we explore tasks and data and design machine learning approaches which leverage natural language to serve this purpose.
When developers make code changes, they sometimes fail to update the accompanying natural language comments documenting various aspects of the code, which can lead to confusion and vulnerability to bugs. We present our work on alerting developers of inconsistent comments upon code changes and suggesting updates by learning to correlate comments and code.
When a bug is reported, developers engage in a dialogue to collaboratively understand it and ultimately resolve it. While the solution is likely formulated within the discussion, it is often buried in a large amount of text, making it difficult to comprehend, which delays its implementation through the necessary repository changes. To guide developers in more easily absorbing information relevant to making these changes and consequently expedite bug resolution, we investigate generating a concise natural language description of the solution by synthesizing relevant content as it emerges in the discussion. We benchmark models for generating solution descriptions and design a classifier for determining when sufficient context for generating an informative description becomes available. We investigate approaches for real-time generation, entailing separately trained and jointly trained classification and generation models. Furthermore, we study techniques for deriving natural language context from bug report discussions and generated solution descriptions to guide models in generating suggested bug-resolving code changes.
Courier Volume III, Number 1, Whole Number 17, March 1963
The Iron Monster, the Crackling Insects of Onondaga County, and Stephen Crane by Lester G. Wells; Horatio Alger and Ralph D. Gardner; Park Benjamin, Lillian B. Gilkes, and the Lanier Library; I Knew Stephen Crane at Syracuse; Chiromancers off the Scent; How about it, Tar Heels?; The Value of a Box of Old Letters; The Sir John Simeon Collection of Victorian Correspondence; Adult Education Materials; Dawson's of Los Angeles; Page Proofs of Elizabeth Barrett Browning's Aurora Leigh; Gypsies at Leeds; The Passing of a Great Lady; Leary's Emporium Librorum; Back Numbers of the Courier; James Gibbons Huneker and Dr. Arnold T. Schwab; The Courier's Hall of Donors Fame; Faults Escaped; Memorials