Relating Developers’ Concepts and Artefact Vocabulary in a Financial Software Module

Author: Dilshener Tezcan
Wermelinger Michel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2011

Developers working on unfamiliar systems are challenged to accurately identify where and how high-level concepts are implemented in the source code. Without additional help, concept location can become a tedious, time-consuming and error-prone task. In this paper we study an industrial financial application for which we had access to the user guide, the source code, and some change requests. We compared the relative importance of the domain concepts, as understood by developers, in the user manual and in the source code. We also searched the code for the concepts occurring in change requests, to see if they could point developers to the code to be modified. We varied the searches (using exact and stem matching, discarding stop-words, etc.) and present the resulting precision and recall. We discuss the implications of our results for maintenance.
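The search variations mentioned in this abstract (exact versus stem matching, stop-word removal, precision and recall) can be illustrated with a minimal sketch. It is not the authors' tooling: the stop-word list, the `crude_stem` suffix stripper, and the file contents are all hypothetical and chosen only to show how the two matching modes can change the scores.

```python
import re

# Illustrative stop-word list; a real study would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "for", "and", "is"}


def tokenize(text):
    """Split text into lowercase words, breaking camelCase identifiers."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", text)
    return [p.lower() for p in parts if p.lower() not in STOP_WORDS]


def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def search(concept, files, stem=False):
    """Return the names of files whose terms contain every concept term."""
    norm = crude_stem if stem else (lambda w: w)
    query = {norm(t) for t in tokenize(concept)}
    hits = set()
    for name, source in files.items():
        terms = {norm(t) for t in tokenize(source)}
        if query and query <= terms:
            hits.add(name)
    return hits


def precision_recall(retrieved, relevant):
    """Precision = TP / retrieved, recall = TP / relevant."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical code base and ground truth for the concept "payment".
files = {
    "Ledger.java": "class Ledger { void postPayments() {} }",
    "Rate.java": "class Rate { double payment; }",
    "Util.java": "class Util {}",
}
relevant = {"Ledger.java", "Rate.java"}

exact = precision_recall(search("payment", files), relevant)
stemmed = precision_recall(search("payment", files, stem=True), relevant)
```

In this toy data, exact matching misses `postPayments` because of the plural, while stem matching finds it, so recall rises without hurting precision. On a real system, stemming can also pull in unrelated files and lower precision.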

Crossref

Open Research Online (The Open University)

A Case Study in Matching Service Descriptions to Implementations in an Existing System

Author: D'Souza Deepak
Gupta Hari S.
Komondoor Raghavan
Rama Girish M.
Publication date: 01/01/2010

A number of companies are trying to migrate large monolithic software systems to Service Oriented Architectures. A common approach is to first identify and describe the desired services (i.e., create a model), and then to locate the portions of code within the existing system that implement the described services. In this paper we describe a detailed case study we undertook to match a model to an open-source business application. We describe the systematic methodology we used, the results of the exercise, and several observations that shed light on the nature of this problem. We also suggest and validate heuristics that are likely to be useful in partially automating the process of matching service descriptions to implementations. Comment: 20 pages, 19 PDF figures

arXiv.org e-Print Archive

Open Access Repository of IISc Research Publications

Identifying class name inconsistency in hierarchy: a first simple heuristic

Author: Alidra Abdelghani
Anquetil Nicolas
Ducasse Stéphane
Saker Moussa
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 04/09/2017

Giving good class names is an important task. Good programmers often report that it takes them several attempts to find an adequate one. Programmers often do not name classes consistently within a package, project or hierarchy. This is a problem because it hampers understanding of the system. In this article we present a simple heuristic (a distribution) to characterise class naming. We combine this heuristic with structural information to identify inconsistent class names. In addition, we use the heuristic to give packages a shape. We applied it to 285 packages in Pharo to identify misnamed classes. Some of these misnamed classes are reported and discussed here.

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Instrumentación de Programas Escritos en Java para Interconectar los Dominios del Problema y del Programa (Instrumenting Programs Written in Java to Interconnect the Problem and Program Domains)

Author: Bernardis Hernán
Publication date: 19/09/2022

Program Comprehension (PC) is a Software Engineering discipline whose goal is to make systems easier to understand by developing methods, techniques, strategies, and tools that help reveal the functionality of the system under study. One of the main challenges in PC is establishing a relation between the Problem Domain and the Program Domain. The former concerns the behavior of the system under study, while the latter focuses on the program components that produce that behavior. One way to build this relation is to construct a representation of each domain and then establish a procedure for linking the two representations. This task involves extracting information from both domains, for which multiple techniques exist. This article describes a scheme for extracting dynamic information from the program domain, which is very useful for implementing comprehension strategies. (Sociedad Argentina de Informática e Investigación Operativa)

Servicio de Difusión de la Creación Intelectual

Efficient Information Retrieval for Software Bug Localization

Author: Khatiwada Saket
Publication venue: LSU Digital Commons
Publication date: 12/03/2022

Software systems are often shipped with defects. When a bug is reported, developers use the information available in the associated report to locate source code fragments that need to be modified to fix the bug. However, as software systems evolve in size and complexity, bug localization can become a tedious and time-consuming process. Contemporary bug localization tools utilize Information Retrieval (IR) methods for automated support to minimize the manual effort. IR methods exploit the textual content of bug reports to capture and rank relevant buggy source files. However, for an IR-based bug localization tool to be useful, it must achieve adequate retrieval accuracy. Lower precision and recall can leave developers with large amounts of incorrect information to wade through. Motivated by these observations, in this dissertation, we propose a new paradigm of information-theoretic IR methods to support bug localization tasks in software systems. These methods exploit the co-occurrence patterns of code terms in software systems to reveal latent semantic information that other methods often fail to capture. We further investigate the impact of combining various IR methods on the retrieval accuracy of bug localization engines. The main assumption is that different IR methods, targeting different dimensions of similarity between software artifacts, can enhance the confidence in each other's results. Furthermore, we propose a novel approach for enhancing the performance of IR-enabled bug localization methods in the context of Open-Source Software (OSS). The proposed approach exploits knowledge from previously resolved bugs to help localize new bugs. Our analysis uses multiple datasets generated for multiple open-source and closed-source projects.
Our results show that a) information-theoretic IR methods can significantly outperform classical IR methods in bug localization tasks, b) optimized IR-hybrids can significantly outperform individual IR methods, and near-optimal global configurations can be determined for different combinations of IR methods, and c) information extracted from previously resolved bug reports can significantly enhance the accuracy of IR-enabled bug localization methods in OSS.
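To make the idea of ranking source files against a bug report concrete, here is a minimal sketch of a classical vector-space baseline (TF-IDF weights with cosine similarity). It is not the information-theoretic method proposed in the dissertation; the function names, the smoothed IDF formula, and the sample files are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase the text and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def rank_files(bug_report, files):
    """Rank files by TF-IDF cosine similarity to the bug report text."""
    docs = {name: Counter(tokens(src)) for name, src in files.items()}
    n_docs = len(docs)

    # Document frequency: in how many files each term appears.
    df = Counter(term for counts in docs.values() for term in counts)
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}

    def vectorize(counts):
        # Terms unseen in the corpus carry no weight.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = vectorize(Counter(tokens(bug_report)))
    scores = {name: cosine(query, vectorize(c)) for name, c in docs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical corpus: file name mapped to its (already extracted) text.
files = {
    "Parser.java": "parse token parser error",
    "Cache.java": "cache evict entry",
}
ranking = rank_files("parser error on bad token", files)
```

Here `ranking` lists `Parser.java` first because it shares terms with the report, while `Cache.java` shares none and scores zero. A hybrid approach of the kind the abstract describes would combine scores like these with scores from other retrieval methods.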

Louisiana State University

A three-layer model of source code comprehension

Author: Agrawal Ashish
Belmonte Javier
Dugerdil Philippe
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

In this paper we first propose a source code comprehension model built as a hierarchy of three abstraction levels, from the source code to the purpose (goal) of the program. The elements belonging to each layer have been precisely defined, as have their links to the elements in the adjacent layers. Consequently, this model makes it possible to bridge the semantic gap between the purpose of the program, defined in business terms, and the code that implements it. The model leverages two ontologies: an action ontology, which is specific to our approach, and a domain concept ontology. Next, this model has been implemented as a tool under Eclipse, and two experiments have been performed to assess the relevance of our approach in the maintenance of a large-scale program. The results of these experiments are very encouraging. The contribution of the paper is the presentation of our program comprehension model built on a novel approach based on an action ontology, the description of the tool we developed to assess the relevance of the model, and the testing of the latter with two controlled experiments.

Crossref

Hes-so: ArODES Open Archive (University of Applied Sciences and Arts Western Switzerland / Haute école spécialisée de Suisse occidentale / FH Westschweiz)

RERO DOC Digital Library

Restructuring source code identifiers

Author: Eshkevari Laleh Mousavi
Publication date: 01/09/2010

In software engineering, maintenance accounts for 60% of the overall project lifecycle costs of any software product. Program comprehension is a substantial part of maintenance and evolution cost and, thus, any advancement in maintenance, evolution, and program understanding will potentially greatly reduce the total cost of ownership of any software product. Identifiers are an important source of information during program understanding and maintenance. Programmers often use identifiers to build their mental models of the software artifacts. Thus, poorly chosen identifiers have been reported in the literature as misleading and as increasing the program comprehension effort. Identifiers are composed of terms, which can be dictionary words, acronyms, contractions, or simple strings. We conjecture that the use of identical terms in different contexts may increase the risk of faults, and hence maintenance effort. We investigate our conjecture using a measure combining term entropy and term context-coverage to study whether certain terms increase the odds ratios of methods to be fault-prone. We compute term entropy and context-coverage of terms extracted from identifiers in Rhino 1.4R3 and ArgoUML 0.16. We show statistically that methods containing terms with high entropy and context-coverage are more fault-prone than others, and that the new measure is only partially correlated with size. We will build on this study, and will apply summarization techniques for extracting linguistic information from methods and classes. Using this information, we will extract domain concepts from source code, and propose linguistic-based refactoring.
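As a rough illustration of the term entropy mentioned in this abstract, the sketch below computes the Shannon entropy of a term's occurrences across methods. This is an assumed, simplified reading: the thesis may define entropy and context-coverage differently, and the method names and counts here are invented.

```python
import math


def term_entropy(occurrences):
    """Shannon entropy (in bits) of a term's spread across methods.

    `occurrences` maps a method name to how often the term appears in
    that method's identifiers. A term concentrated in one method has
    entropy 0; a term spread evenly over many methods has high entropy.
    """
    total = sum(occurrences.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in occurrences.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical counts for two terms across the methods of a small class.
focused = term_entropy({"parseNode": 4})
scattered = term_entropy({"parseNode": 2, "emitCode": 2})
```

Under this reading, `focused` is 0 bits and `scattered` is 1 bit, which matches the intuition that a term reused across different methods carries more potential for confusion.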

PolyPublie