Search CORE

62 research outputs found

Empirical analysis of the relationship between CC and SLOC in a large corpus of Java methods and C functions

Author: Abran
Baggen
Basili
Breusch
Carr
Chambers
Chidamber
Coleman
Conte
Edgell
El Emam
Feng
Fenton
Gill
Gorla
Graves
Henry
Herraiz
Jay
Jbara
Joanes
Kemerer
Khomh
Kitchenham
Kvålseth
Li
Lind
Linstead
Loehle
Ma
Malhotra
Manning
Mayrhauser
McCabe
Moores
Mordal
Mota Silveira Neto
Munson
Myers
Paige
Pearson
Saltelli
Schneidewind
Shepperd
Sheskin
Shull
Spearman
Succi
Tashtoush
Troster
Woodward
Yu
Publication venue: 'Wiley'
Publication date: 01/01/2016
Field of study

Measuring the internal quality of source code is one of the traditional goals of making software development into an engineering discipline. Cyclomatic Complexity (CC) is an often used source code quality metric, next to Source Lines of Code (SLOC). However, the use of the CC metric is challenged by the repeated claim that CC is redundant with respect to SLOC due to strong linear correlation. We conducted an extensive literature study of the CC/SLOC correlation results. Next, we tested correlation on large Java (17.6 M methods) and C (6.3 M functions) corpora. Our results show that linear correlation between SLOC and CC is only moderate as caused by increasingly high variance. We further observe that aggregating CC and SLOC as well as performing a power transform improves the correlation. Our conclusion is that the observed linear correlation between CC and SLOC of Java methods or C functions is not strong enough to conclude that CC is redundant with SLOC. This conclusion contradicts earlier claims from literature, but concurs with the widely accepted practice of measuring of CC next to SLOC

Repository TU/e

Reverse engineering source code:Empirical studies of limitations and opportunities

Author: Landman D.
Publication venue
Publication date: 01/01/2017
Field of study

How Scale Affects Structure in Java Programs

Author: Lopes Cristina V.
Ossher Joel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/08/2015
Field of study

Many internal software metrics and external quality attributes of Java programs correlate strongly with program size. This knowledge has been used pervasively in quantitative studies of software through practices such as normalization on size metrics. This paper reports size-related super- and sublinear effects that have not been known before. Findings obtained on a very large collection of Java programs -- 30,911 projects hosted at Google Code as of Summer 2011 -- unveils how certain characteristics of programs vary disproportionately with program size, sometimes even non-monotonically. Many of the specific parameters of nonlinear relations are reported. This result gives further insights for the differences of "programming in the small" vs. "programming in the large." The reported findings carry important consequences for OO software metrics, and software research in general: metrics that have been known to correlate with size can now be properly normalized so that all the information that is left in them is size-independent.Comment: ACM Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA), October 2015. (Preprint

arXiv.org e-Print Archive

Decision Modules in Models and Implementations

Author: Roubtsov Serguei
Roubtsova E.E.
Publication venue
Publication date: 01/01/2014
Field of study

Open University of the Netherlands Research Portal

Repository TU/e

Comparing the performance of string operations across programming languages

Author: Pelkonen N. (Niko)
Publication venue: University of Oulu
Publication date: 16/01/2020
Field of study

Abstract. In this thesis, the performance of string operations are compared across programming languages. Handling strings effectively is important especially when performance is a crucial factor and large string sizes may emerge. Common examples where large string sizes emerge are during digitalization of a product, reading string data from a database, reading and handling large CSV-files and Excel-files, converting file format to another file format (e.g. CSV to Excel and vice versa), and reading and handling a DOM-tree of a website. There has been a lot of corresponding research where programming languages are benchmarked, but none of them focus directly on string operations. The main goal of this thesis is to fill this gap in literature and try to find out which programming languages have the best results on string operations in terms of execution time and memory (maximum RSS) usage. The test environment was formed by creating randomly generated string files with sizes varying from ten thousand characters to 100 million characters. The generated characters were ‘a’, ‘b’, and ‘ ‘ (whitespace character). The programming languages selected for this thesis were Python, C, C++, Java, Perl, Ruby, Go, and Swift. Go seemed to be the most effective language in execution times, although it was not the fastest in many operations. C used very little memory, but only five operations were implemented in it. Every operation was implemented in Python, and it used additional memory to loading the string file in only one operation, which was sorting a string. Swift had quite bad results, and this could be caused by the Linux version of Swift that was used. In regular expressions, Perl and C++ were overwhelmingly effective. Java used the most memory in every operation

Understanding the impact of introducing Lambda expressions in Java Programs

Author: Bonifácio Rodrigo
Canedo Edna
Fortes José
Lima Fernanda
Lopes Francisco
Lucas Walter
Marcílio Diego
Saraiva João
Publication venue: 'Sociedade Brasileira de Computacao - SB'
Publication date: 01/01/2020
Field of study

Background: The Java programming language version eight introduced several features that encourage the func tional style of programming, including the support for lambda expressions and the Stream API. Currently, there is a common wisdom that refactoring legacy code to introduce lambda expressions, besides other potential benefits, simplifies the code and improves program comprehension. Aims: The purpose of this work is to investigate this belief, conducting an indepth study to evaluate the effect of introducing lambda expressions on program comprehension. Method: We conducted this research using a mixedmethod approach. For the quantitative method, we quantitatively analyzed 158 pairs of code snippets extracted directly either from GitHub or from recommendations from three tools (RJTL, NetBeans, and IntelliJ). We also surveyed practitioners to collect their perceptions about the benefits on program comprehension when introducing lambda expressions. We asked practitioners to evaluate and rate sets of pairs of code snippets. Results: We found contradictory results in our research. Based on the quantitative assessment, we could not find evidence that the introduction of lambda expressions improves software readability— one of the components of program comprehension. Our results suggest that the transformations recommended by the aforementioned tools decrease program comprehension when assessed by two stateoftheart models to estimate readability. Differently, our findings of the qualitative assessment suggest that the introduction of lambda expression improves program comprehension in three scenarios when: we convert anonymous inner classes to a lambda expression, use structural loops with inner conditional to an anyMatch operator, and apply structural loops to filter operator combined with a collect method. Implications: We argue in this paper that one can improve program comprehension when he/she applies particular transformations to introduce lambda expressions (e.g., replacing anonymous inner classes with lambda expressions). Also, the opinion of the participants highlights which kind of transformation for introducing lambda might be advantageous. This might support the implementation of effective tools for automatic program transformations

Recommended from our members

Beyond Similar Code: Leveraging Social Coding Websites

Author: Yang Di
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

Programmers often write code with similarity to existing code written somewhere. Code search tools can help developers find similar solutions and identify possible improvements. For code search tools, good search results rely on valid data collection. Social coding websites, such as Question & Answer forum Stack Overflow (SO) and project repository GitHub, are popular destinations when programmers look for how to achieve certain programming tasks. Over the years, SO and GitHub have accumulated an enormous knowledge base of, and around, code. Since these software artifacts are publicly available, it is possible to leverage them in code search tools. This dissertation explores the opportunities of leveraging software artifacts from the social coding websites in searching for not just similar, but related, code. Programmers query SO and GitHub extensively to search for suitable code for reuse, however, not much is known about the usability or quality of the available code from each website. This dissertation first investigates under what circumstances the software artifacts found in social coding websites can be leveraged for purposes other than their immediate use by developers. It points out a number of problems that need to be addressed before those artifacts can be leveraged for code search and development tools. Specifically, triviality, fragility, and duplication, dominate these artifacts. However, when these problems are addressed, there is still a considerable amount of good quality artifacts that can be leveraged.SO and GitHub are not only two separate data resources, moreover, they together, belong to a larger system of software development process: the same users that rely on facilities of GitHub often seeks support on SO for their problems, and return to GitHub to apply the knowledge acquired. This dissertation further studies the crossover of software artifacts between SO and GitHub, and categorizes the adaptations from a SO code snippet to its GitHub counterparts. Existing search tools only recommend other code locations that are syntactically or semantically similar to the given code but do not reason about other kinds of relevant code that a developer should also pay attention to, e.g., auxiliary code to accomplish a complete task. With the good quality software artifacts and crossover between the two systems available, this dissertation presents two approaches that leverage these artifacts in searching for related code. Aroma indexes GitHub projects, takes a partial code snippet as input, searches the corpus for methods containing the partial code snippet, and clusters and intersects the results of the search to recommend. Aroma is evaluated on randomly selected queries created from the GitHub corpus, as well as queries derived from SO code snippets. It recommends related code for error checking and handling, objects configuring, etc. Furthermore, a user study is conducted where industrial developers are asked to complete programming tasks using Aroma and provide feedback. The results indicate that Aroma is capable of retrieving and recommending relevant code snippets efficiently. CodeAid reuses the crossover between SO and GitHub and recommends related code outside of a method body. For each SO snippet as a query, CodeAid retrieves the co-occurring code fragments for its GitHub counterparts and clusters them to recommend common ones. 74% of the common co-occurring code fragments represent related functionality that should be included in code search results. Three major types of relevancy--complementary, supplementary, and alternative methods, are identified

eScholarship - University of California

Recommended from our members

A Large-Scale Study of Modern Code Review and Security in Open Source Projects.

Author: Thompson Christopher
Wagner David A
Publication venue: eScholarship, University of California
Publication date: 01/01/2017
Field of study

eScholarship - University of California