Detecting and Characterizing Semantic Inconsistencies in Ported Code
Adding similar features and bug fixes often requires porting program patches from reference implementations and adapting them to target implementations. Porting errors may result from faulty adaptations or inconsistent updates. This paper investigates (1) the types of porting errors found in practice, and (2) how to detect and characterize potential porting errors. Analyzing version histories, we define five categories of porting errors, including incorrect control- and data-flow, code redundancy, and inconsistent identifier renamings. Leveraging this categorization, we design a static control- and data-dependence analysis technique, SPA, to detect and characterize porting inconsistencies. Our evaluation on code from four open-source projects shows that SPA can detect porting inconsistencies with 65% to 73% precision and 90% recall, and identify inconsistency types with 58% to 63% precision and 92% to 100% recall. In a comparison with two existing error detection tools, SPA improves precision by 14 to 17 percentage points.
The Impact of Systematic Edits in History Slicing
While extracting a subset of a commit history, specifying the necessary
portion is a time-consuming task for developers. Several commit-based history
slicing techniques have been proposed to identify dependencies between commits
and to extract a related set of commits using a specific commit as a slicing
criterion. However, the resulting subset of commits becomes large when the history
contains commits for systematic edits whose changes do not depend on each other. We
empirically investigated the impact of systematic edits on history slicing. In
this study, commits in which systematic edits were detected are split per file
so that unnecessary dependencies between commits are eliminated. In
several histories of open source systems, the size of history slices was
reduced by 13.3-57.2% on average after splitting the commits for systematic
edits. Comment: 5 pages, MSR 201
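The effect of splitting systematic-edit commits can be illustrated with a minimal sketch. It is not the authors' tool: the `Commit` type, the `systematic` flag, and the dependency rule (a commit depends on every earlier commit that touches one of the same files) are simplifying assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    cid: str                  # hypothetical commit identifier
    files: frozenset          # files touched by the commit
    systematic: bool = False  # assumed flag: detected as a systematic edit


def split_systematic(history):
    """Split each commit flagged as a systematic edit into one commit per file."""
    out = []
    for c in history:
        if c.systematic and len(c.files) > 1:
            for f in sorted(c.files):
                out.append(Commit(f"{c.cid}:{f}", frozenset([f])))
        else:
            out.append(c)
    return out


def history_slice(history, criterion):
    """Return the ids of the commits the criterion transitively depends on,
    oldest first, with the criterion itself last.

    Assumed dependency rule: a commit depends on every earlier commit
    that touches at least one of the same files.
    """
    idx = next(i for i, c in enumerate(history) if c.cid == criterion)
    needed_files = set(history[idx].files)
    result = [history[idx].cid]
    # Walk backwards; any earlier commit sharing a file with the slice joins it.
    for c in reversed(history[:idx]):
        if c.files & needed_files:
            needed_files |= c.files
            result.append(c.cid)
    return list(reversed(result))


history = [
    Commit("c1", frozenset({"a.c"})),
    Commit("c2", frozenset({"b.c"})),
    Commit("c3", frozenset({"a.c", "b.c", "c.c"}), systematic=True),
    Commit("c4", frozenset({"c.c"})),
]

before = history_slice(history, "c4")
after = history_slice(split_systematic(history), "c4")
print(len(before), len(after))
```

In this toy history the systematic commit `c3` pulls `c1` and `c2` into the slice for `c4`; once `c3` is split per file, only the part touching `c.c` remains in the slice.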
A case study of cross-branch porting in Linux Kernel
To meet different requirements for different stakeholders, branches are widely used to maintain multiple product variants simultaneously. For example, Linux Kernel has a main development branch, known as the mainline; 35 branches to maintain older product versions, which are called stable branches; and hundreds of branches for experimental features. To maintain multiple branch-based product variants in parallel, developers often port new features or bug-fixes from one branch to another. In particular, the process of propagating bug-fixes or feature additions to an older version is commonly called backporting. Prior to our study, backporting practices in large-scale projects had not been systematically studied. This lack of empirical knowledge makes it difficult to improve the current backporting process in the industry. We hypothesized that cross-branch porting practice is frequent, repetitive, and error-prone. It required significant effort for developers to select patches that need to be backported and then apply them to the target implementation. We carried out two complementary studies to examine this hypothesis. To investigate the extent and effort of porting practice, this thesis first conducted a quantitative study of backporting activities in Linux Kernel covering a total of 8 years of version history, using the data of the main branch and the 35 stable branches. Our study showed that backporting happened at a rate of 149 changes per month, and it took 51 days to propagate patches on average. 40% of changes in the stable branches were ported from the mainline, and 64% of ported patches propagated to more than one branch. Out of all backporting changes from the mainline to stable branches, 97.5% were applied without any manual modifications.
To understand how Linux Kernel developers keep up to date with development activities across different branches, we carried out an online survey with engineers who may have ported code from the mainline to stable branches, based on our prior analysis of Linux Kernel version history. We received 14 complete responses. The participants have 12.6 years of Linux development experience on average and are either maintainers or experts of Linux Kernel. The survey showed that most backporting work was done by the maintainers who knew the program quite well. Those experienced maintainers could easily identify the edits that need to be ported and propagate them with all relevant changes to ensure consistency in multiple branches. Inexperienced developers were seldom given an opportunity to backport features or bug-fixes to stable branches. In summary, based on the version history study and the online survey, we concluded that cross-branch porting is frequent, periodic, and repetitive. It required manual effort to selectively identify the changes that need to be ported, to analyze the dependencies of the selected changes, and to apply all required changes to ensure consistency. To avoid human errors of omission, most backporting work was done only by experienced maintainers who could identify all relevant changes along with the change that needed to be backported. Currently, inexperienced developers are excluded from cross-branch porting activities from the mainline to stable branches in Linux Kernel. Our results call for an automated approach to identify the patches that need to be ported, to collect context information to help developers become aware of relevant changes, and to notify pertinent developers who may be responsible for the corresponding porting events. Electrical and Computer Engineerin
Checking smart contracts with structural code embedding
Ministry of Education, Singapore under its Academic Research Funding Tier
Modeling and Simulating a Software Architecture Design Space
Frequently, a similar type of software system is used in the implementation of many different software applications. Databases are an example. Two software development approaches are commonly used to fill the need for instances from a class of similar systems: (1) repeated custom development of similar instances, one for each different application, or (2) development of one or more general purpose off-the-shelf systems that are used many times in the different applications. Each approach has advantages and disadvantages. Custom development can closely match the requirements of an application, but has an associated high development cost. General purpose systems may have a lower cost when amortized across multiple applications, but may not closely match the requirements of all the different applications. It can be difficult for application developers to determine which approach is best for their application. Do any of the existing off-the-shelf systems sufficiently satisfy the application requirements? If so, which ones provide the best match? Would a custom implementation be sufficiently better to justify the cost difference relative to an off-the-shelf solution? These difficult buy-versus-build decisions are extremely important in today's fast-paced, competitive, unforgiving software application market. In this thesis we propose and study a software engineering approach for evaluating how well off-the-shelf and custom software architectures within the design space of a class of OODB systems satisfy the requirements for different applications. The approach is based on the ability to explicitly enumerate and represent the key dimensions of commonality and variability in the space of OODB designs. We demonstrate that modeling and simulation of OODB software architectures can be used to help software developers rapidly converge on OODB requirements for an application and identify OODB software architectures that satisfy those requirements.
The technical focus of this work is on the circular relationships between requirements, software architectures, and system properties such as OODB functionality, size, and performance. We capture these relationships in a parametrized OODB architectural model, together with an OODB simulation and modeling tool that allows software developers to refine application requirements on an OODB, identify corresponding custom and off-the-shelf OODB software architectures, evaluate how well the software architecture properties satisfy the application requirements, and identify potential refinements to requirements.
PROGRAM INSPECTION AND TESTING TECHNIQUES FOR CODE CLONES AND REFACTORINGS IN EVOLVING SOFTWARE
Developers often perform copy-and-paste activities. This practice causes similar code fragments (also known as code clones) to be scattered throughout a code base. Refactoring for clone removal is beneficial, preventing clones from having negative effects on software quality, such as hidden bug propagation and unintentional inconsistent changes. However, recent research has provided evidence that factoring out clones does not always reduce the risk of introducing defects, and it is often difficult or impossible to remove clones using standard refactoring techniques. To investigate which clones can be refactored and how, developers typically spend a significant amount of their time managing individual clone instances or clone groups scattered across a large code base.
To address the problem, this research proposes two techniques to inspect and validate refactoring changes. First, we propose a technique for managing clone refactorings, Pattern-based clone Refactoring Inspection (PRI), using refactoring pattern templates. By matching the refactoring pattern templates against a code base, it summarizes refactoring changes of clones and detects the clone instances not consistently factored out as potential anomalies. Second, we propose a Refactoring Investigation and Testing technique, called RIT. RIT improves the testing efficiency for validating refactoring changes. RIT uses PRI to identify refactorings by analyzing original and edited versions of a program. It then uses the semantic impact of a set of identified refactoring changes to detect tests whose behavior may have been affected and modified by refactoring edits. For each failed assertion, RIT helps developers focus their attention on logically related program statements by applying program slicing to minimize each test. For debugging purposes, RIT determines specific failure-inducing refactoring edits, separating them from other changes that only affect other asserts or tests.
Eight years of rider measurement in the Android malware ecosystem: evolution and lessons learned
Despite the growing threat posed by Android malware,
the research community is still lacking a comprehensive
view of common behaviors and trends exposed by malware families
active on the platform. Without such a view, researchers
incur the risk of developing systems that only detect outdated
threats, missing the most recent ones. In this paper, we conduct
the largest measurement of Android malware behavior to date,
analyzing over 1.2 million malware samples that belong to 1.2K
families over a period of eight years (from 2010 to 2017). We
aim at understanding how the behavior of Android malware
has evolved over time, focusing on repackaging malware. In
this type of threat, different innocuous apps are piggybacked
with a malicious payload (rider), allowing inexpensive malware
manufacturing.
One of the main challenges posed when studying repackaged
malware is slicing the app to split benign components apart from
the malicious ones. To address this problem, we use differential
analysis to isolate software components that are irrelevant to the
campaign and study the behavior of malicious riders alone. Our
analysis framework relies on collective repositories and recent
advances in the systematization of intelligence extracted from
multiple anti-virus vendors. We find that since its infancy in
2010, the Android malware ecosystem has changed significantly,
both in the type of malicious activity performed by the malicious
samples and in the level of obfuscation used by malware to avoid
detection. We then show that our framework can aid analysts
who attempt to study unknown malware families. Finally, we
discuss what our findings mean for Android malware detection
research, highlighting areas that need further attention by the
research community. Accepted manuscrip
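The abstract does not describe how the differential analysis is implemented. As a rough illustration of the general idea only, the sketch below assumes each sample is reduced to a set of component identifiers and treats components shared by every sample of a family as the rider, with the remainder of each sample as its carrier app. The function names, the `min_share` parameter, and the component names are all hypothetical.

```python
from collections import Counter


def isolate_rider(samples, min_share=1.0):
    """Return the components that appear in at least `min_share` of the samples.

    `samples` maps a sample name to the set of its component identifiers.
    Assumption: samples of one family repackage different benign apps, so
    only the malicious rider recurs across them.
    """
    counts = Counter()
    for components in samples.values():
        counts.update(set(components))
    threshold = min_share * len(samples)
    return {c for c, n in counts.items() if n >= threshold}


def carriers(samples, rider):
    """Remove the rider from each sample, leaving the presumed benign carrier."""
    return {name: set(components) - rider for name, components in samples.items()}


# Hypothetical family of three repackaged apps sharing one payload.
family = {
    "app1": {"game.Main", "game.Render", "ads.Payload", "ads.C2"},
    "app2": {"notes.Main", "ads.Payload", "ads.C2"},
    "app3": {"flash.Light", "ads.Payload", "ads.C2"},
}

rider = isolate_rider(family)
benign = carriers(family, rider)
print(sorted(rider))
```

Lowering `min_share` below 1.0 would tolerate samples that carry only part of the payload, at the cost of possibly mislabeling libraries that happen to be common across carrier apps.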