1,674 research outputs found
Grand Challenges of Traceability: The Next Ten Years
In 2007, the software and systems traceability community met at the first
Natural Bridge symposium on the Grand Challenges of Traceability to establish
and address research goals for achieving effective, trustworthy, and ubiquitous
traceability. Ten years later, in 2017, the community came together to evaluate
a decade of progress towards achieving these goals. These proceedings document
some of that progress. They include a series of short position papers,
representing current work in the community organized across four process axes
of traceability practice. The sessions covered topics from Trace Strategizing,
Trace Link Creation and Evolution, Trace Link Usage, real-world applications of
Traceability, and Traceability Datasets and benchmarks. Two breakout groups
focused on the importance of creating and sharing traceability datasets within
the research community, and discussed challenges related to the adoption of
tracing techniques in industrial practice. Members of the research community
are engaged in many active, ongoing, and impactful research projects. Our hope
is that ten years from now we will be able to look back at a productive decade
of research and claim that we have achieved the overarching Grand Challenge of
Traceability, which seeks for traceability to be always present, built into the
engineering process, and for it to have "effectively disappeared without a
trace". We hope that others will see the potential that traceability has for
empowering software and systems engineers to develop higher-quality products at
increasing levels of complexity and scale, and that they will join the active
community of Software and Systems traceability researchers as we move forward
into the next decade of research
Grand Challenges of Traceability: The Next Ten Years
In 2007, the software and systems traceability community met at the first
Natural Bridge symposium on the Grand Challenges of Traceability to establish
and address research goals for achieving effective, trustworthy, and ubiquitous
traceability. Ten years later, in 2017, the community came together to evaluate
a decade of progress towards achieving these goals. These proceedings document
some of that progress. They include a series of short position papers,
representing current work in the community organized across four process axes
of traceability practice. The sessions covered topics from Trace Strategizing,
Trace Link Creation and Evolution, Trace Link Usage, real-world applications of
Traceability, and Traceability Datasets and benchmarks. Two breakout groups
focused on the importance of creating and sharing traceability datasets within
the research community, and discussed challenges related to the adoption of
tracing techniques in industrial practice. Members of the research community
are engaged in many active, ongoing, and impactful research projects. Our hope
is that ten years from now we will be able to look back at a productive decade
of research and claim that we have achieved the overarching Grand Challenge of
Traceability, which seeks for traceability to be always present, built into the
engineering process, and for it to have "effectively disappeared without a
trace". We hope that others will see the potential that traceability has for
empowering software and systems engineers to develop higher-quality products at
increasing levels of complexity and scale, and that they will join the active
community of Software and Systems traceability researchers as we move forward
into the next decade of research
Supporting Source Code Search with Context-Aware and Semantics-Driven Query Reformulation
Software bugs and failures cost trillions of dollars every year, and could even lead to deadly accidents (e.g., Therac-25 accident). During maintenance, software developers fix numerous bugs and implement hundreds of new features by making necessary changes to the existing software code. Once an issue report (e.g., bug report, change request) is assigned to a developer, she chooses a few important keywords from the report as a search query, and then attempts to find out the exact locations in the software code that need to be either repaired or enhanced. As a part of this maintenance, developers also often select ad hoc queries on the fly, and attempt to locate the reusable code from the Internet that could assist them either in bug fixing or in feature implementation. Unfortunately, even the experienced developers often fail to construct the right search queries. Even if the developers come up with a few ad hoc queries, most of them require frequent modifications which cost significant development time and efforts. Thus, construction of an appropriate query for localizing the software bugs, programming concepts or even the reusable code is a major challenge. In this thesis, we overcome this query construction challenge with six studies, and develop a novel, effective code search solution (BugDoctor) that assists the developers in localizing the software code of interest (e.g., bugs, concepts and reusable code) during software maintenance. In particular, we reformulate a given search query (1) by designing novel keyword selection algorithms (e.g., CodeRank) that outperform the traditional alternatives (e.g., TF-IDF), (2) by leveraging the bug report quality paradigm and source document structures which were previously overlooked and (3) by exploiting the crowd knowledge and word semantics derived from Stack Overflow Q&A site, which were previously untapped. Our experiment using 5000+ search queries (bug reports, change requests, and ad hoc queries) suggests that our proposed approach can improve the given queries significantly through automated query reformulations. Comparison with 10+ existing studies on bug localization, concept location and Internet-scale code search suggests that our approach can outperform the state-of-the-art approaches with a significant margin
A Systematic Review of Automated Query Reformulations in Source Code Search
Fixing software bugs and adding new features are two of the major maintenance
tasks. Software bugs and features are reported as change requests. Developers
consult these requests and often choose a few keywords from them as an ad hoc
query. Then they execute the query with a search engine to find the exact
locations within software code that need to be changed. Unfortunately, even
experienced developers often fail to choose appropriate queries, which leads to
costly trials and errors during a code search. Over the years, many studies
attempt to reformulate the ad hoc queries from developers to support them. In
this systematic literature review, we carefully select 70 primary studies on
query reformulations from 2,970 candidate studies, perform an in-depth
qualitative analysis (e.g., Grounded Theory), and then answer seven research
questions with major findings. First, to date, eight major methodologies (e.g.,
term weighting, term co-occurrence analysis, thesaurus lookup) have been
adopted to reformulate queries. Second, the existing studies suffer from
several major limitations (e.g., lack of generalizability, vocabulary mismatch
problem, subjective bias) that might prevent their wide adoption. Finally, we
discuss the best practices and future opportunities to advance the state of
research in search query reformulations.Comment: 81 pages, accepted at TOSE
Improving information retrieval-based concept location using contextual relationships
For software engineers to find all the relevant program elements implementing a business concept, existing techniques based on information retrieval (IR) fall short in providing adequate solutions. Such techniques usually only consider the conceptual relations based on lexical similarities during concept mapping. However, it is also fundamental to consider the contextual relationships existing within an application’s business domain to aid in concept location. As an example, this paper proposes to use domain specific ontological relations during concept mapping and location activities when implementing business requirements
Working Notes from the 1992 AAAI Workshop on Automating Software Design. Theme: Domain Specific Software Design
The goal of this workshop is to identify different architectural approaches to building domain-specific software design systems and to explore issues unique to domain-specific (vs. general-purpose) software design. Some general issues that cut across the particular software design domain include: (1) knowledge representation, acquisition, and maintenance; (2) specialized software design techniques; and (3) user interaction and user interface
Extracting Tasks from Customize Portal using Natural Language Processing
In software documentation, product knowledge and software requirement are very important to improve product quality. Within maintenance stage, reading of whole documentation of large corpus won’t be possible by developers. They need to receive software documentation i.e. (development, designing and testing etc.) in a short period of time. Important documents are able to record in software documentation. There live a space between information which developer wants and software documentation. To solve this problem, an approach for extracting relevant task that is based on heuristically matching the structure of the documentation under three phases of software documentation (i.e. documentation, development and testing) is described. Our main idea is that task is extracted automatically from the software documentation, freeing the developer easily get the required data from software documentation with customize portal using WordNet library and machine learning technique. And then the category of task can be generated easily from existing applications using natural language processing. Our approach use WordNet library to identify relevant tasks for calculating frequency of each word which allows developers in a piece of software to discover the word usage
Github application programme interface and wordnet for code reuse
It is clear that code reuse is important task in software development and maintenance. As a lot of software application and source code have been used as libraries in version control systems, such that Git, SVN, LibreSource and related web sites, such that GitHub.com, sourceforge.net, projectsgeek.com, Googlecode.com, more and more companies, especially Small and Medium Enterprises (SMEs), are reusing open source code to develop their own software. The problem in code reuse is, after download all relevant code, we need to identify most relevant code among pool of code. In this paper we use keyword search with n-gram NLP technique using GitHub Application Program Interface (API). Before search the source code, we retrieve all Repository name in GitHub belongs to particular programing language (JAVA, C++, etc.), as well as we retrieve all .java file name if we search java libraries using GitHub API. Then compare our keyword with this list, if the keyword extracted from Software architecture is connected word, then we will split using Apache Camel Splitter. If the particular keyword related to any project, we download the project. Otherwise using WordNet, get some synonym and do the above process again. For further relevancy, we will use a speech recognition technique (Dynamic Time Warping (DTW)) and a NLP technique (Part of Speech Tagging (POS)). Because of this is a part of the whole research, in this paper we will consider only GitHub API
- …