A Case Study in Matching Service Descriptions to Implementations in an Existing System
A number of companies are trying to migrate large monolithic software systems
to Service Oriented Architectures. A common approach is to first
identify and describe desired services (i.e., create a model), and then to
locate portions of code within the existing system that implement the described
services. In this paper we describe a detailed case study we undertook to match
a model to an open-source business application. We describe the systematic
methodology we used, the results of the exercise, as well as several
observations that throw light on the nature of this problem. We also suggest
and validate heuristics that are likely to be useful in partially automating
the process of matching service descriptions to implementations.
Comment: 20 pages, 19 PDF figures
On the Concept of Variable Roles and its Use in Software Analysis
Human-written source code in imperative programming languages exhibits
typical patterns of variable use, such as flags, loop iterators, counters,
indices, bit vectors, etc. Although it is widely understood by practitioners that
these variable roles are important for automated software analysis tools, they
are not systematically studied by the formal methods community, and not well
documented in the research literature. In this paper, we study the notion of
variable roles using the example of the basic C types (int, float, char). We
propose a classification of the variables in a program by variable roles, and
demonstrate that classical data flow analysis lends itself naturally both as a
specification formalism and an analysis paradigm for this classification
problem. We demonstrate the practical applicability of our method by predicting
membership of source files in the different categories of the software
verification competition SV-COMP 2013.
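As a rough illustration of what such a role classification might look like, consider the toy classifier below; the role names and rules are invented for this sketch, whereas the paper specifies roles via classical data flow analysis:

```python
# Toy sketch: assign a role to a variable from an abstracted list of the
# operations observed on it. Operation tags and rules are illustrative
# assumptions, not the paper's data-flow-based definitions.

def classify_role(ops):
    """ops: list of operation tags observed for one variable."""
    if not ops:
        return "unknown"
    if all(op in {"init0", "inc"} for op in ops) and "inc" in ops:
        return "counter"      # only initialized to 0 and incremented
    if set(ops) <= {"assign_bool", "branch_test"}:
        return "flag"         # only written with booleans and tested
    if "array_index" in ops and "inc" in ops:
        return "index"        # incremented and used to index arrays
    if "bit_op" in ops:
        return "bitvector"    # participates in bitwise operations
    return "unknown"
```

A real analysis would derive such facts from the data flow of the program rather than from hand-supplied tags, but the shape of the problem (map usage patterns to a small set of roles) is the same.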
Improving information retrieval-based concept location using contextual relationships
For software engineers to find all the relevant program elements implementing a business concept, existing techniques based on information retrieval (IR) fall short of providing adequate solutions. Such techniques usually consider only the conceptual relations based on lexical similarities during concept mapping. However, it is also fundamental to consider the contextual relationships existing within an application's business domain to aid in concept location. As an example, this paper proposes using domain-specific ontological relations during concept mapping and location activities when implementing business requirements.
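A minimal sketch of the proposed idea, with an invented toy ontology and a simple overlap score standing in for a real domain ontology and IR model:

```python
# Illustrative only: expand a concept query with related terms from a
# (hypothetical) domain ontology before lexical matching. The ontology
# entries and the scoring are invented for demonstration.

DOMAIN_ONTOLOGY = {
    "invoice": {"billing", "payment"},   # assumed domain relations
    "payment": {"transaction"},
}

def expand_query(terms, ontology):
    """Add ontologically related terms to the query terms."""
    expanded = set(terms)
    for t in terms:
        expanded |= ontology.get(t, set())
    return expanded

def score(element_terms, query_terms):
    """Overlap between a program element's identifier terms and the query."""
    return len(set(element_terms) & query_terms)
```

With the expanded query, an element whose identifiers mention only `billing` still matches a query for `invoice`, which pure lexical similarity would miss.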
On the Generation, Structure, and Semantics of Grammar Patterns in Source Code Identifiers
Identifier names are the atoms of program comprehension. Weak identifier names decrease developer productivity and degrade the performance of automated approaches that leverage identifier names in source code analysis, threatening many of the advantages that stand to be gained from advances in artificial intelligence and machine learning. Therefore, it is vital to support developers in naming and renaming identifiers. In this paper, we extend our prior work, which studies the primary method through which names evolve: rename refactorings. In our prior work, we contextualized rename changes by examining commit messages and other refactorings. In this extension, we further consider data type changes that co-occur with these renames, with the goal of understanding how data type changes influence the structure and semantics of renames. In the long term, the outcomes of this study will be used to support research into: (1) recommending when a rename should be applied, (2) recommending how to rename an identifier, and (3) developing a model that describes how developers mentally synthesize names using domain and project knowledge. We provide insights into how our data can support rename recommendation and analysis in the future, and reflect on the significant challenges, highlighted by our study, for future research in recommending renames.
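The structural half of such an analysis can be illustrated with a toy identifier splitter and pattern tagger; the tiny lexicon and tag set below are assumptions made for this sketch, not the tagger used in the paper:

```python
import re

# Toy sketch: derive a "grammar pattern" for an identifier by splitting
# it into words and tagging each word with a crude part of speech from a
# tiny hand-made lexicon (a stand-in for a real POS tagger).

LEXICON = {"get": "V", "set": "V", "is": "V",
           "user": "N", "name": "N", "count": "N",
           "valid": "ADJ"}  # assumed tags for illustration

def split_identifier(name):
    """Split camelCase and snake_case into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name).replace("_", " ")
    return spaced.lower().split()

def grammar_pattern(name):
    """Tag each word; unknown words default to noun ('N')."""
    return " ".join(LEXICON.get(w, "N") for w in split_identifier(name))
```

For example, `getUserName` yields the pattern `V N N`, while `is_valid` yields `V ADJ`; studying which patterns occur, and how they change under renames and type changes, is the kind of analysis the abstract describes.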
Conclave: Writing programs to understand programs
Software maintainers are often challenged with making source code changes to improve software systems, or to eliminate defects, in unfamiliar programs. To undertake these tasks, a sufficient understanding of the system, or at least of a small part of it, is required. One of the most time-consuming tasks in this process is locating which parts of the code are responsible for some key functionality or feature. This paper introduces Conclave, an environment for software analysis that enhances program comprehension activities. Programmers use natural languages to describe and discuss the problem domain, programming languages to write source code, and markup languages to let programs talk with other programs, so the system has to cope with this heterogeneity of dialects and provide tools in all these areas to effectively contribute to the understanding process. The source code, the problem domain, and the side effects of running the program are represented in the system using ontologies. A combination of tools, each specialized in a different kind of language, creates mappings between the different domains. Conclave provides facilities for feature location, code search, and views of the software that ease the process of understanding the code and devising changes. The underlying feature location technique explores natural language terms used in programs (e.g., function and variable names); using textual analysis and a collection of Natural Language Processing techniques, it computes sets of synonymous terms. These sets are used to score relatedness between program elements and search queries or problem domain concepts, producing sorted ranks of program elements that address the search criteria or concepts, respectively. © Nuno Ramos Carvalho, José João Almeida, Maria João Varanda Pereira, and Pedro Rangel Henriques.
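The synonym-set scoring idea can be sketched as follows; the synsets, scoring function, and element names here are invented examples, not Conclave's computed sets:

```python
# Illustrative sketch: terms mined from program elements are grouped into
# synonym sets, and an element is ranked by how many query terms share a
# synset with its terms. The synsets below are invented for this example.

SYNSETS = [
    {"remove", "delete", "erase"},
    {"user", "account"},
]

def related(a, b):
    """Two terms are related if equal or members of a common synset."""
    return a == b or any(a in s and b in s for s in SYNSETS)

def relatedness(element_terms, query_terms):
    """Count query terms related to at least one element term."""
    return sum(1 for q in query_terms
               if any(related(q, t) for t in element_terms))

def rank(elements, query_terms):
    """elements: {name: [terms]}; returns names sorted best-first."""
    return sorted(elements,
                  key=lambda e: relatedness(elements[e], query_terms),
                  reverse=True)
```

A query for "delete user" would thus rank a (hypothetical) `eraseAccount` function highly even though it shares no literal token with the query, which is the point of the synonym-set step.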
Beyond Notations: Hygienic Macro Expansion for Theorem Proving Languages
In interactive theorem provers (ITPs), extensible syntax is not only crucial
to lower the cognitive burden of manipulating complex mathematical objects, but
plays a critical role in developing reusable abstractions in libraries. Most
ITPs support such extensions in the form of restrictive "syntax sugar"
substitutions and other ad hoc mechanisms, which are too rudimentary to support
many desirable abstractions. As a result, libraries are littered with
unnecessary redundancy. Tactic languages in these systems are plagued by a
seemingly unrelated issue: accidental name capture, which often produces
unexpected and counterintuitive behavior. We take ideas from the Scheme family
of programming languages and solve these two problems simultaneously by
proposing a novel hygienic macro system custom-built for ITPs. We further
describe how our approach can be extended to cover type-directed macro
expansion, resulting in a single, uniform system that offers multiple
abstraction levels, ranging from the simplest syntax sugars to the
elaboration of formerly baked-in syntax. We have implemented our new macro
system and
integrated it into the upcoming version (v4) of the Lean theorem prover.
Despite its expressivity, the macro system is simple enough that it can easily
be integrated into other systems.
Comment: accepted to IJCAR 202
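The name-capture problem, and the fresh-name ("gensym") remedy from the Scheme tradition, can be illustrated with a toy textual macro; this is a generic illustration of hygiene, not Lean's actual mechanism:

```python
import itertools

# Toy illustration of accidental name capture: a naive macro that
# introduces a temporary variable literally named "tmp" breaks when the
# call site already uses "tmp"; renaming macro-introduced binders with
# fresh names avoids the capture.

_counter = itertools.count()

def gensym(base):
    """Return a fresh, never-before-used name."""
    return f"{base}_{next(_counter)}"

def swap_macro_naive(a, b):
    # expansion uses a binder literally named "tmp" -- capture-prone
    return f"tmp = {a}; {a} = {b}; {b} = tmp"

def swap_macro_hygienic(a, b):
    tmp = gensym("tmp")  # fresh name cannot collide with call-site names
    return f"{tmp} = {a}; {a} = {b}; {b} = {tmp}"
```

Expanding the naive macro with arguments `x` and `tmp` produces `tmp = x; x = tmp; tmp = tmp`, silently destroying the caller's value; the hygienic version binds its temporary under a fresh name and the swap stays correct.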
FOCE Intermodule Communications System
Done in conjunction with Chad Kecy of the Monterey Bay Aquarium Research Institute (MBARI), this project is an expansion of the Free Ocean CO2 Enrichment (FOCE) project. This iteration created a communications protocol between the different modules in an expandable array of PIC24-based boards.
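As a hypothetical sketch of what intermodule framing on such a board array might look like (the frame layout, sync byte, and checksum below are invented here, not the actual FOCE protocol):

```python
import struct

# Invented example frame: sync byte, destination address, source address,
# payload length, payload bytes, and an 8-bit additive checksum.

SYNC = 0x7E

def checksum(data):
    """8-bit additive checksum over the given bytes."""
    return sum(data) & 0xFF

def encode(dst, src, payload):
    header = struct.pack("BBBB", SYNC, dst, src, len(payload))
    body = header + payload
    return body + bytes([checksum(body)])

def decode(frame):
    sync, dst, src, length = struct.unpack("BBBB", frame[:4])
    if sync != SYNC or checksum(frame[:-1]) != frame[-1]:
        raise ValueError("bad frame")
    return dst, src, frame[4:4 + length]
```

Addressed frames with a sync byte and checksum are a common pattern for multidrop microcontroller buses, since each board can cheaply skip frames not addressed to it and detect line corruption.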
Supporting the Maintenance of Identifier Names: A Holistic Approach to High-Quality Automated Identifier Naming
A considerable part of the source code consists of identifier names: unique lexical tokens that provide information about entities, and entity interactions, within the code. Identifier names provide human-readable descriptions of classes, functions, variables, etc. Poor or ambiguous identifier names (i.e., names that do not correctly describe the code behavior they are associated with) lead developers to spend more time working towards understanding the code's behavior. Bad naming can also have detrimental effects on tools that rely on natural language clues, degrading the quality of their output and making them unreliable. Additionally, misinterpretations of the code caused by poor names can result in the injection of quality issues into the system under maintenance. Thus, improved identifier naming leads to more effective developers, higher-quality software, and higher-quality software analysis tools. In this dissertation, I establish several novel concepts that help measure and improve the quality of identifiers. The output of this dissertation work is a set of identifier name appraisal and quality tools that integrate into the developer workflow. Through a sequence of empirical studies, I have formulated a series of heuristics and linguistic patterns to evaluate the quality of identifier names in the code and provide naming structure recommendations. I envision, and am working towards, developers integrating the contributions discussed in this dissertation into their development workflow to significantly improve the process of crafting and maintaining high-quality identifier names in source code.
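A small sketch of what heuristic name appraisal can look like; the specific rules and the tiny lexicon below are invented for illustration and are not the dissertation's actual heuristics:

```python
import re

# Illustrative name-quality checks: flag names that are too short, mix
# naming conventions, or contain no recognizable words. The word list is
# a tiny assumed stand-in for a real dictionary.

WORDS = {"count", "total", "user", "name", "index", "buffer"}

def name_issues(name):
    """Return a list of (illustrative) quality issues for one identifier."""
    issues = []
    if len(name) <= 2 and name not in {"i", "j", "k"}:  # loop vars tolerated
        issues.append("too short")
    if "_" in name and re.search(r"[a-z][A-Z]", name):
        issues.append("mixed snake_case and camelCase")
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name).replace("_", " ")
    parts = spaced.lower().split()
    if parts and not any(p in WORDS for p in parts):
        issues.append("no recognizable words")
    return issues
```

Such checks are the simplest end of the spectrum; the appraisal tools described above combine linguistic patterns with empirical evidence about how developers actually name and rename things.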