Code Clones: Reconsidering Terminology
This report discusses terminology choices and considerations relating to
copied or redundant code within software systems, i.e., relating to "code
clones." Inadequacies of existing terminology are raised and alternative
terms are discussed.
06301 Abstracts Collection -- Duplication, Redundancy, and Similarity in Software
From 23.07.06 to 26.07.06, the Dagstuhl Seminar 06301 "Duplication, Redundancy, and Similarity in Software" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Clone Detector Use Questions: A List of Desirable Empirical Studies
Code "clones" are similar segments of code that are frequently introduced
by "scavenging" existing code, that is, reusing code by copying it and
adapting it for a new use. In order to scavenge the code, the developer
must be aware of it already, or must find it. Little is known about how
tools - particularly search tools - impact the clone construction process,
nor how developers use them for this purpose. This paper lists five
outstanding research questions in this area and proposes sketches of
designs for five empirical studies that might be conducted to help shed
light on those questions.
An Efficient Resilient MPC Scheme via Constraint Tightening against Cyberattacks: Application to Vehicle Cruise Control
We propose a novel framework for designing a resilient Model Predictive
Control (MPC) targeting uncertain linear systems under cyber attack. Assuming a
periodic attack scenario, we model the system under Denial of Service (DoS)
attack, also with measurement noise, as an uncertain linear system with
parametric and additive uncertainty. To detect anomalies, we employ a Kalman
filter-based approach. Then, through our observations of the intensity of the
launched attack, we determine a range of possible values for the system
matrices, as well as establish bounds of the additive uncertainty for the
equivalent uncertain system. Leveraging a recent constraint tightening robust
MPC method, we present an optimization-based resilient algorithm. Accordingly,
we compute the uncertainty bounds and corresponding constraints offline for
various attack magnitudes. Then, this data can be used efficiently in the MPC
computations online. We demonstrate the effectiveness of the developed
framework on the Adaptive Cruise Control (ACC) problem.
Comment: To Appear in ICINCO 202
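As a rough illustration of what a Kalman filter-based anomaly check can look like, the sketch below runs a scalar Kalman filter and flags a measurement when its normalized squared innovation exceeds a threshold. This is a generic textbook-style test, not the method of the paper above: the random-walk state model, the function name `detect_anomalies`, and all parameter values (`q`, `r`, `threshold`, `x0`, `p0`) are assumptions chosen for the example.

```python
from typing import List, Sequence


def detect_anomalies(
    measurements: Sequence[float],
    q: float = 1e-4,         # assumed process noise variance
    r: float = 0.01,         # assumed measurement noise variance
    threshold: float = 9.0,  # assumed gate on innovation^2 / innovation variance
    x0: float = 0.0,         # assumed initial state estimate
    p0: float = 1.0,         # assumed initial estimate variance
) -> List[int]:
    """Return indices of measurements flagged as anomalous.

    Hypothetical scalar model: x[k+1] = x[k] + w, z[k] = x[k] + v.
    """
    x, p = x0, p0
    flagged: List[int] = []
    for k, z in enumerate(measurements):
        p_pred = p + q          # predict step (state itself is unchanged)
        innovation = z - x      # measurement residual
        s = p_pred + r          # innovation variance
        if innovation ** 2 / s > threshold:
            flagged.append(k)   # reject the measurement, keep the prediction
            p = p_pred
            continue
        gain = p_pred / s       # Kalman gain
        x = x + gain * innovation
        p = (1.0 - gain) * p_pred
    return flagged


if __name__ == "__main__":
    readings = [1.0, 1.0, 1.0, 1.0, 10.0, 1.0, 1.0]
    print(detect_anomalies(readings))  # [4]
```

Skipping the update for a flagged sample keeps a single outlier from dragging the estimate away from the nominal value; a real design would also need to handle sustained attacks, which this sketch does not attempt.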
In situ reuse of logically extracted functional components
Programmers often identify functionality within a compiled program that they wish they could reuse in a manner other than that intended by the program's original authors. The traditional approach to reusing pre-existing functionality contained within a binary executable is that of physical extraction; that is, the recreation of the desired functionality in some executable module separate from the program in which it was originally found. Towards overcoming the inherent limitations of physical extraction, we propose in situ reuse of logically extracted functional components. Logical extraction consists of identifying and retaining information about the locations of the elements comprising the functional component within its original program, and in situ reuse is the process of driving the original program to execute the logically extracted functional component in whatever manner the new programmer sees fit.
Similarity in Programs
An overview of the concept of program similarity is presented. It divides
similarity into two types - syntactic and semantic - and provides a review
of eight categories of methods that may be used to measure program
similarity. A summary of some applications of these methods is included.
The paper is intended to be a starting point for a more comprehensive
analysis of the subject of similarity in programs, which is critical to
understand if progress is to be made in fields such as clone detection.
funcGNN: A Graph Neural Network Approach to Program Similarity
Program similarity is a fundamental concept, central to the solution of
software engineering tasks such as software plagiarism, clone identification,
code refactoring and code search. Accurate similarity estimation between
programs requires an in-depth understanding of their structure, semantics and
flow. A control flow graph (CFG) is a graphical representation of a program
which captures its logical control flow and hence its semantics. A common
approach is to estimate program similarity by analysing CFGs using graph
similarity measures, e.g. graph edit distance (GED). However, graph edit
distance is an NP-hard problem and computationally expensive, making the
application of graph similarity techniques to complex software programs
impractical. This study intends to examine the effectiveness of graph neural
networks to estimate program similarity, by analysing the associated control
flow graphs. We introduce funcGNN, which is a graph neural network trained on
labeled CFG pairs to predict the GED between unseen program pairs by utilizing
an effective embedding vector. To our knowledge, this is the first time graph
neural networks have been applied on labeled CFGs for estimating the similarity
between high-level language programs. Results: We demonstrate the effectiveness
of funcGNN in estimating the GED between programs, and our experimental analysis
shows that it achieves a lower error rate (0.00194), runs faster (23
times faster than the quickest traditional GED approximation method), and scales
better than state-of-the-art methods. funcGNN possesses the
inductive learning ability to infer program structure and generalise to unseen
programs. The graph embedding of a program proposed by our methodology could be
applied to several related software engineering problems (such as code
plagiarism and clone identification), thus opening multiple research directions.
Comment: 11 pages, 8 figures, 3 tables
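To make the cost of exact graph edit distance concrete, here is a minimal brute-force sketch for very small control flow graphs. It is not funcGNN and not any published approximation method. It assumes unlabeled nodes, directed edges without duplicates, and unit cost for every node or edge insertion and deletion; the function name `graph_edit_distance` and the two toy CFGs are hypothetical.

```python
from itertools import permutations
from typing import Hashable, Iterable, Optional, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def graph_edit_distance(
    nodes_a: Sequence[Hashable],
    edges_a: Iterable[Edge],
    nodes_b: Sequence[Hashable],
    edges_b: Iterable[Edge],
) -> int:
    """Exact GED for tiny unlabeled directed graphs with unit edit costs.

    Tries every node mapping, so the running time grows factorially.
    Node names must not be None (None marks a padding slot).
    """
    edges_a = list(edges_a)
    edge_set_b = set(edges_b)
    n = max(len(nodes_a), len(nodes_b))
    padded_a = list(nodes_a) + [None] * (n - len(nodes_a))
    padded_b = list(nodes_b) + [None] * (n - len(nodes_b))
    # With unlabeled nodes, only the size difference costs node edits.
    node_cost = abs(len(nodes_a) - len(nodes_b))
    best: Optional[int] = None
    for perm in permutations(padded_b):
        mapping = {a: b for a, b in zip(padded_a, perm) if a is not None}
        images = [(mapping[u], mapping[v]) for u, v in edges_a]
        deleted = sum(1 for e in images if e not in edge_set_b)
        inserted = len(edge_set_b - set(images))
        cost = node_cost + deleted + inserted
        if best is None or cost < best:
            best = cost
    return best if best is not None else 0


if __name__ == "__main__":
    # Straight-line code: entry -> body -> exit
    straight_nodes = ["entry", "body", "exit"]
    straight_edges = [("entry", "body"), ("body", "exit")]
    # Code with one if-statement
    branch_nodes = ["entry", "cond", "then", "exit"]
    branch_edges = [("entry", "cond"), ("cond", "then"),
                    ("cond", "exit"), ("then", "exit")]
    print(graph_edit_distance(straight_nodes, straight_edges,
                              branch_nodes, branch_edges))  # 3
```

The loop over all permutations is what makes this approach impractical beyond a handful of nodes, which is the motivation the abstract gives for predicting GED with a learned model instead.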
SLACC: Simion-based Language Agnostic Code Clones
Successful cross-language clone detection could enable researchers and
developers to create robust language migration tools, facilitate learning
additional programming languages once one is mastered, and promote reuse of
code snippets over a broader codebase. However, identifying cross-language
clones presents special challenges to the clone detection problem. A lack of
common underlying representation between arbitrary languages means detecting
clones requires one of the following solutions: 1) a static analysis framework
replicated across each targeted language with annotations matching language
features across all languages, or 2) a dynamic analysis framework that detects
clones based on runtime behavior.
In this work, we demonstrate the feasibility of the latter solution, a
dynamic analysis approach called SLACC for cross-language clone detection. Like
prior clone detection techniques, we use input/output behavior to match clones,
though we overcome limitations of prior work by amplifying the number of inputs
and covering more data types; and as a result, achieve better clusters than
prior attempts. Since clusters are generated based on input/output behavior,
SLACC supports cross-language clone detection. As an added challenge, we target
a statically typed language, Java, and a dynamically typed language, Python. Compared
to HitoshiIO, a recent clone detection tool for Java, SLACC retrieves 6 times
as many clusters and has higher precision (86.7% vs. 30.7%).
This is the first work to perform clone detection for dynamically typed languages
(precision = 87.3%) and the first to perform clone detection across languages
that lack a common underlying representation (precision = 94.1%). It provides a
first step towards the larger goal of scalable language migration tools.
Comment: 11 pages, 3 figures, accepted at ICSE 2020 technical track
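The idea of matching clones by input/output behavior can be sketched in a few lines. The example below is only an illustration of the general principle, not the SLACC implementation: it groups Python callables that produce identical results on a shared set of inputs. The function name `cluster_by_io`, the candidate functions, and the input list are all hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Sequence


def cluster_by_io(
    functions: Dict[str, Callable[[Any], Any]],
    inputs: Sequence[Any],
) -> List[List[str]]:
    """Group function names whose observed behavior agrees on every input."""
    clusters: Dict[tuple, List[str]] = defaultdict(list)
    for name, fn in functions.items():
        signature = []
        for value in inputs:
            try:
                signature.append(("ok", repr(fn(value))))
            except Exception as exc:  # a raised exception is behavior too
                signature.append(("err", type(exc).__name__))
        clusters[tuple(signature)].append(name)
    return [sorted(names) for names in clusters.values()]


if __name__ == "__main__":
    candidates = {
        "double_add": lambda x: x + x,
        "double_mul": lambda x: 2 * x,
        "square": lambda x: x * x,
    }
    print(cluster_by_io(candidates, [0, 1, 2, 3, -4]))
    # [['double_add', 'double_mul'], ['square']]
```

Note that `square` agrees with the doubling functions on the inputs 0 and 2, so a test set containing only those values would wrongly merge all three. This is one reason why using more inputs, as the abstract describes, can yield better clusters.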