Model-Driven Engineering in the Large: Refactoring Techniques for Models and Model Transformation Systems
Model-Driven Engineering (MDE) is a software engineering paradigm that
aims to increase the productivity of developers by raising the
abstraction level of software development. It envisions the use of
models as key artifacts during design, implementation and deployment.
From the recent arrival of MDE in large-scale industrial software
development – a trend we refer to as MDE in the large – a set of
challenges emerges: First, models are now developed at distributed
locations, by teams of teams. In such highly collaborative settings, the
presence of large monolithic models gives rise to certain issues, such
as their proneness to editing conflicts. Second, in large-scale system
development, models are created using various domain-specific modeling
languages. Combining these models in a disciplined manner calls for
adequate modularization mechanisms. Third, the development of models is
handled systematically by expressing the involved operations using model
transformation rules. Such rules are often created by cloning, a
practice related to performance and maintainability issues.
In this thesis, we contribute three refactoring techniques, each aiming
to tackle one of these challenges. First, we propose a technique to
split a large monolithic model into a set of sub-models. The aim of this
technique is to enable a separation of concerns within models, promoting
a concern-based collaboration style: Collaborators operate on the
sub-models relevant for their task at hand. Second, we suggest a
technique to encapsulate model components by introducing modular
interfaces in a set of related models. The goal of this technique is to
establish modularity in these models. Third, we introduce a refactoring
to merge a set of model transformation rules exhibiting a high degree of
similarity. The aim of this technique is to improve maintainability and
performance by eliminating the drawbacks associated with cloning. The
refactoring creates variability-based rules, a novel type of rule that
captures variability by using annotations.
The refactoring techniques contributed in this work substantially reduce
the manual effort required to refactor models and transformation rules.
As a series of realistic case studies indicates, the output produced by
the techniques is comparable to the result of manual refactoring or, in
the case of transformation rules, partly even preferable to it, yielding
a promising outlook on the applicability in real-world settings.
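The idea behind the third refactoring, merging similar transformation rules into a single variability-based rule, can be illustrated with a small sketch. The representation below (rules as sets of action strings, variant tags as annotations) is purely illustrative and not the thesis's actual rule metamodel:

```python
# Hypothetical sketch of merging two clone-like transformation rules into
# one variability-based rule. The action strings and annotation scheme are
# invented for illustration only.

def merge_rules(rule_a, rule_b):
    """Merge two rules (sets of actions) into one annotated rule.

    Actions shared by both rules form the common core (annotation None);
    actions unique to one rule are annotated with that variant's tag.
    """
    common = rule_a & rule_b
    merged = {action: None for action in common}
    merged.update({a: "variantA" for a in rule_a - common})
    merged.update({b: "variantB" for b in rule_b - common})
    return merged

def configure(merged, variant):
    """Project the concrete rule for one variant out of the merged rule."""
    return {a for a, tag in merged.items() if tag is None or tag == variant}

rule_a = {"match Class", "create Table", "create PrimaryKey"}
rule_b = {"match Class", "create Table", "create View"}

merged = merge_rules(rule_a, rule_b)
assert configure(merged, "variantA") == rule_a
assert configure(merged, "variantB") == rule_b
```

The point of the merge is that the common core is stored (and matched) only once, which is where the maintainability and performance benefits over cloned rules come from.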
Set-Based Pre-Processing for Points-To Analysis
We present set-based pre-analysis: a virtually universal optimization
technique for flow-insensitive points-to analysis. Points-to analysis
computes a static abstraction of how object values flow through a
program's variables. Set-based pre-analysis relies on the observation
that much of this reasoning can take place at the set level rather than
the value level. Computing constraints at the set level results in
significant optimization opportunities: we can rewrite the input program
into a simplified form with the same essential points-to properties.
This rewrite results in removing both local variables and instructions,
thus simplifying the subsequent value-based points-to computation.
Effectively, set-based pre-analysis puts the program in a normal form
optimized for points-to analysis.
Compared to other techniques for off-line optimization of points-to
analyses in the literature, the new elements of our approach are the
ability to eliminate statements, and not just variables, as well as its
modularity: set-based pre-analysis can be performed on the input just
once, e.g., allowing the pre-optimization of libraries that are
subsequently reused many times and for different analyses. In
experiments with Java programs, set-based pre-analysis eliminates 30% of
the program's local variables and 30% or more of computed
context-sensitive points-to facts, over a wide set of benchmarks and
analyses, resulting in a 20% average speedup (max: 110%, median: 18%).
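One flavor of such a program rewrite can be sketched in a few lines. The sketch below is not the paper's algorithm; it only illustrates the general idea that a local variable defined by a single copy has the same points-to set as its source, so both the variable and the copy instruction can be removed before the value-level analysis runs (assuming acyclic copy chains):

```python
# Illustrative sketch of set-level program simplification for points-to
# analysis: eliminate copy-only local variables and their copy statements.
# The statement encoding is invented for this example.
from collections import Counter

def simplify(stmts):
    """stmts: list of ('alloc', dst, site) or ('copy', dst, src)."""
    defcount = Counter(dst for _, dst, _ in stmts)
    # A variable defined exactly once, by a copy, points to exactly what
    # its source points to, so it can be replaced by that source.
    alias = {dst: src for kind, dst, src in stmts
             if kind == "copy" and defcount[dst] == 1}

    def resolve(v):
        while v in alias:          # follow copy chains (assumed acyclic)
            v = alias[v]
        return v

    out = []
    for kind, dst, src in stmts:
        if kind == "copy" and dst in alias:
            continue               # variable and instruction both disappear
        out.append((kind, dst, resolve(src) if kind == "copy" else src))
    return out

prog = [("alloc", "a", "site1"),   # a = new Object()
        ("copy", "b", "a"),        # b = a
        ("copy", "c", "b")]        # c = b
assert simplify(prog) == [("alloc", "a", "site1")]
```

Because the rewrite preserves the essential points-to facts, any subsequent value-based analysis sees a strictly smaller program, which is the source of the reported speedups.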
A heuristic-based approach to code-smell detection
Encapsulation and data hiding are central tenets of the object-oriented paradigm. Deciding what data and behaviour to form into a class, and where to draw the line between its public and private details, can make the difference between a class that is an understandable, flexible and reusable abstraction and one which is not. This decision is a difficult one and may easily result in poor encapsulation, which can then have serious implications for a number of system qualities. It is often hard to identify such encapsulation problems within large software systems until they cause a maintenance problem (which is usually too late), and attempting to perform such analysis manually can be tedious and error-prone. Two of the common encapsulation problems that arise as a consequence of this decomposition process are data classes and god classes. Typically, these two problems occur together – data classes lack functionality that has typically been sucked into an over-complicated and domineering god class. This paper describes the architecture of a tool, developed as a plug-in for the Eclipse IDE, which automatically detects data and god classes. The technique has been evaluated in a controlled study on two large open source systems which compares the tool's results to similar work by Marinescu, who employs a metrics-based approach to detecting such features. The study provides some valuable insights into the strengths and weaknesses of the two approaches.
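In the spirit of such heuristic detection, a minimal sketch might flag a data class (many accessors, little behaviour) or a god class (many methods, high coupling) from a few per-class metrics. The metrics and thresholds below are invented for illustration; the paper's actual heuristics (and Marinescu's metrics) differ:

```python
# Toy heuristic classifier for data classes and god classes. The metric
# names and threshold values are assumptions made for this sketch, not
# the tool's real detection rules.

def classify(cls):
    """cls: dict with 'methods' (total count), 'accessors' (getter/setter
    count), and 'couplings' (number of other classes referenced)."""
    accessor_ratio = cls["accessors"] / max(cls["methods"], 1)
    if cls["methods"] >= 20 and cls["couplings"] >= 10:
        return "god class"       # large, domineering, highly coupled
    if accessor_ratio >= 0.8 and cls["methods"] <= 10:
        return "data class"      # mostly getters/setters, little behaviour
    return "ok"

assert classify({"methods": 6, "accessors": 6, "couplings": 1}) == "data class"
assert classify({"methods": 40, "accessors": 5, "couplings": 15}) == "god class"
assert classify({"methods": 12, "accessors": 3, "couplings": 4}) == "ok"
```

The appeal of this style of rule is that it runs cheaply over a whole codebase inside an IDE plug-in; its weakness, as the paper's controlled study probes, is the sensitivity of the results to the chosen thresholds.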
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of IDEs (accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers.
Comment: master thesis
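The top-down, ordered-choice style of parsing described above can be sketched with a few parser combinators. This sketch is illustrative only and is unrelated to amethyst's actual implementation or its structured-grammar machinery:

```python
# Tiny top-down (PEG-style) parser combinators. Each parser takes the
# input string and a position and returns the new position on success,
# or None on failure. Names and grammar are invented for this sketch.

def lit(c):
    """Match a single literal character."""
    return lambda s, i: i + 1 if i < len(s) and s[i] == c else None

def seq(*ps):
    """Match all sub-parsers in sequence."""
    def parse(s, i):
        for p in ps:
            i = p(s, i)
            if i is None:
                return None
        return i
    return parse

def alt(*ps):
    """Ordered choice: try each sub-parser, keep the first success."""
    def parse(s, i):
        for p in ps:
            j = p(s, i)
            if j is not None:
                return j
        return None
    return parse

# Grammar: balanced ::= '(' balanced ')' balanced | ''
def balanced(s, i=0):
    return alt(seq(lit('('), balanced, lit(')'), balanced),
               lambda s, i: i)(s, i)

assert balanced("(()())") == 6   # whole string is balanced
assert balanced(")") == 0        # only the empty prefix matches
```

Naive top-down parsers like this can backtrack exponentially; the linear-time guarantee claimed above is exactly what devices such as structured grammars (or, elsewhere, packrat memoization) are meant to provide.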
Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs
Binary code analysis allows analyzing binary code without having access to
the corresponding source code. A binary, after disassembly, is expressed in an
assembly language. This inspires us to approach binary analysis by leveraging
ideas and techniques from Natural Language Processing (NLP), a rich area
focused on processing text of various natural languages. We notice that binary
code analysis and NLP share a lot of analogical topics, such as semantics
extraction, summarization, and classification. This work utilizes these ideas
to address two important code similarity comparison problems. (I) Given a pair
of basic blocks for different instruction set architectures (ISAs), determining
whether their semantics is similar or not; and (II) given a piece of code of
interest, determining if it is contained in another piece of assembly code for
a different ISA. The solutions to these two problems have many applications,
such as cross-architecture vulnerability discovery and code plagiarism
detection. We implement a prototype system INNEREYE and perform a comprehensive
evaluation. A comparison between our approach and existing approaches to
Problem I shows that our system outperforms them in terms of accuracy,
efficiency, and scalability, and the case studies utilizing the system
demonstrate that our solution to Problem II is effective. Moreover, this
research showcases how to apply ideas and techniques from NLP to large-scale
binary code analysis.
Comment: Accepted by the Network and Distributed Systems Security (NDSS) Symposium 201
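The core comparison step for Problem I can be gestured at with a toy sketch: embed each instruction as a vector, represent a basic block by the average of its instruction embeddings, and compare blocks across ISAs by cosine similarity. This is not INNEREYE's actual model (which learns instruction embeddings and uses a neural network over instruction sequences); the embeddings below are hand-made for illustration:

```python
# Toy cross-ISA basic-block similarity via averaged instruction embeddings
# and cosine similarity. All vectors here are invented; real systems learn
# them (e.g. word2vec-style training over disassembled instructions).
import math

def block_vector(block, embeddings):
    """Average the embedding vectors of a block's instructions."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for ins in block:
        for k, v in enumerate(embeddings[ins]):
            total[k] += v
    return [t / len(block) for t in total]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hand-made toy embeddings for two ISAs (ARM-like and x86-like syntax).
emb = {"mov r0, r1": [1.0, 0.0], "add r0, r1": [0.0, 1.0],
       "mov eax, ebx": [0.9, 0.1], "add eax, ebx": [0.1, 0.9]}

arm = ["mov r0, r1", "add r0, r1"]
x86 = ["mov eax, ebx", "add eax, ebx"]
sim = cosine(block_vector(arm, emb), block_vector(x86, emb))
assert sim > 0.95   # semantically similar blocks map to nearby vectors
```

Averaging discards instruction order, which matters for real semantics; that is one reason sequence models are used in practice rather than a plain bag-of-instructions comparison like this.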