12 research outputs found
Using BERT for the Detection of Architectural Tactics in Code
Quality-driven design decisions are often addressed by using architectural tactics, which are reusable solution options for certain quality concerns. However, it is not sufficient to make good design decisions; the realization of those decisions in code must also be reviewed. As manual creation of traceability links from design decisions to code is costly, some approaches perform structural analyses to recover traceability links. However, architectural tactics are high-level solutions described in terms of roles and interactions, and there is a wide range of possibilities to implement each. Therefore, structural analyses yield only limited results. Transfer-learning approaches using language models like BERT are a recent trend in the field of natural language processing. These approaches yield state-of-the-art results for tasks like text classification. We intend to experiment with BERT and present an approach to detect architectural tactics in code by fine-tuning BERT. A 10-fold cross-validation shows promising results with an average F1-score of 90%, which is on a par with state-of-the-art approaches. We additionally apply our approach to a case study, where its results show promising potential but fall behind the state of the art. Therefore, we discuss our approach and look at potential reasons and downsides as well as potential improvements.
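As a reading aid for the reported metric, the following minimal sketch shows one common way an average F1-score over cross-validation folds can be computed. It assumes a binary decision per fold summarized as (true positives, false positives, false negatives); the abstract does not state how the authors aggregated their scores, and the function names and example counts are hypothetical.

```python
from typing import List, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2PR / (P + R), written directly in terms of confusion counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_f1(folds: List[Tuple[int, int, int]]) -> float:
    """Average the per-fold F1 scores; each fold is a (tp, fp, fn) triple."""
    return sum(f1_score(*counts) for counts in folds) / len(folds)


# Hypothetical counts for three folds, for illustration only.
example_folds = [(9, 1, 1), (8, 2, 0), (10, 0, 2)]
print(round(mean_f1(example_folds), 3))
```

Averaging per-fold scores is only one option; pooling the counts of all folds before computing F1 is another, and the two can give slightly different values.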
JAVA DESIGN PATTERN OBFUSCATION
Software Reverse Engineering (SRE) consists of analyzing the design and implementation of software. Typically, we assume that the executable file is available, but not the source code. SRE has many legitimate uses, including analysis of software when no source code is available, porting old software to a modern programming language, and analyzing code for security vulnerabilities. Attackers also use SRE to probe for weaknesses in closed-source software, to hack software activation mechanisms (or otherwise change the intended function of software), to cheat at games, etc. There are many tools available to aid the aspiring reverse engineer. For example, there are several tools that recover design patterns from Java byte code or source code. In this project, we develop and analyze a technique to obfuscate design patterns. We show that our technique can defeat design pattern detection tools, thereby making reverse engineering attacks more difficult.
Formalization, Selection and Detection of Security Patterns
Generally, software requirement analysis and design methodologies based on different UML (Unified Modelling Language) diagrams need to be strengthened by the use of a number of security patterns. Security patterns give software developers a way to communicate about security more comprehensively. Over the last few years, the number of security patterns has gradually increased, and it is still increasing. This large number of security patterns has given rise to the critical problem of selecting the appropriate security pattern to solve the problem at hand. In this study, an attempt has been made at automated verification of security patterns, and an approach is proposed for selecting appropriate security patterns that fulfil the security requirements. To demonstrate this approach, four security patterns have been selected: Single Access Point, CheckPoint, Role and Session. A grammar has been developed for the verification of the selected security patterns. Goal-Oriented Requirement Language (GRL) has been used to create the repository of formalized security patterns; this GRL model is used to extract facts, which are then represented as relational instances. Queries are made against the instances to find the appropriate security pattern that fulfils the security requirements. This approach clearly identifies the contribution and consequences of a security pattern with respect to the security-related Non-Functional Requirements (NFRs). It also checks the relationships and dependencies among the security patterns, which helps in finding the prerequisite patterns for the selected security patterns. Finally, a method for detecting security patterns using a similarity score is presented.
Formal Verification, Quantitative Analysis and Automated Detection of Design Patterns
Present-day software engineering emphasizes developing software based on design patterns. A design pattern is a generic solution to a recurring design problem. Software requirement analysis and design methodologies based on different Unified Modelling Language (UML) diagrams need to be strengthened by the use of a number of design patterns. In this study, an attempt has been made at automated verification of design patterns. A grammar has been developed for verification and recognition of selected design patterns. The ANTLR (ANother Tool for Language Recognition) tool has been used to verify the developed grammar. After proper verification and validation of design patterns, there is a need to quantitatively determine their quality. Hence, we provide a methodology to compare the quality attributes of a system using a design pattern solution with those of a system using a non-pattern solution, where both systems are intended to provide the same functionality. Using the Quality Model for Object-Oriented Design (QMOOD) approach, cut-off points are calculated to give the exact size of the system, in terms of the number of classes, for which the design pattern solution provides higher quality. Design Pattern Detection (DPD) is also considered an emerging field of Software Reverse Engineering. An attempt has been made to present a novel approach for design pattern detection using Graph Isomorphism and Normalized Cross Correlation (NCC) techniques. The Eclipse plugin ObjectAid is used to extract UML class diagrams as well as eXtensible Markup Language (XML) files from the software system and the design pattern. An algorithm is proposed to extract relevant information from the XML files, and the Graph Isomorphism technique is used to find the pattern subgraph. NCC then gives the percentage to which the pattern exists in the system.
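To make the NCC step more concrete, here is a minimal sketch of a normalized cross-correlation between two relationship matrices, one for the pattern and one for a candidate subgraph of the system. The abstract does not give the exact formula or matrix encoding, so the non-mean-centred form used here, the 0/1 adjacency encoding, and all names are assumptions.

```python
import math
from typing import List

Matrix = List[List[int]]


def ncc(pattern: Matrix, candidate: Matrix) -> float:
    """Normalized cross-correlation of two matrices with the same number of entries.

    For non-negative entries the result lies in [0, 1]; 1.0 means the
    candidate matches the pattern up to a positive scale factor.
    """
    a = [x for row in pattern for x in row]
    b = [x for row in candidate for x in row]
    if len(a) != len(b):
        raise ValueError("matrices must have the same number of entries")
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


# Hypothetical 3-class pattern: class 0 is related to classes 1 and 2.
pattern_matrix = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
# Hypothetical candidate that realizes only one of the two relationships.
candidate_matrix = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
print(f"{100 * ncc(pattern_matrix, candidate_matrix):.1f}% match")
```

Multiplying the score by 100 gives a percentage in the spirit of the "percentage existence" mentioned in the abstract, although the authors' own scaling may differ.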
Change impact analysis of multi-language and heterogeneously-licensed software
Today, software systems are built with heterogeneous languages such as Java, C, C++,
XML, Perl or Python, just to name a few. This introduces new challenges both for the
software analysis domain and for program evolution, as programmers must cope with a
variety of programming paradigms and languages. We believe that there is a need for
global views that help developers cope effectively with complexity and that facilitate
program comprehension and analysis of such heterogeneous systems. Furthermore, the
heterogeneity of the systems is not limited to the language but also affects component
licensing. In fact, licensing is another type of heterogeneity introduced by the large-scale
reuse of open source code. The heterogeneity of licenses also introduces challenges such
as how to legally combine components with different programming languages and licenses in
the same system and how a change to the software can create a license violation. In
this context, we would like to develop a re-engineering tool for analysing the change impact
of heterogeneously licensed systems in a multi-language environment. First, we
want to study change impact analysis in multi-language systems in general and then extend
it to address licensing issues.
Identification of behavioral and creational design patterns through dynamic analysis
Thesis digitized by the Records Management and Archives Division (Division de la gestion de documents et des archives) of the Université de Montréal.
Customizable Feature based Design Pattern Recognition Integrating Multiple Techniques
Recovering design information from legacy applications is a
complex, expensive, challenging, and time-consuming task due to the
ever-increasing complexity of software and the advent of modern technology.
Maintenance of legacy systems that can cope with the latest technologies and
new business requirements is in growing demand, and the reuse of artifacts
from existing legacy applications for new developments is becoming
vital for the software industry. Due to constant evolution in the
architecture of legacy systems, they often have incomplete, inconsistent
and obsolete documents which do not provide enough information about the
structure of these systems. Mostly, source code is the only reliable source
of information for recovering artifacts from legacy systems. Extraction of
design artifacts from the source code of existing legacy systems supports
program comprehension, maintenance, code refactoring, reverse engineering,
redocumentation and reengineering methodologies. The objective of the approach
used in this thesis is to recover design information from legacy code with
particular focus on the recovery of design patterns. Design patterns are
key artifacts for recovering design decisions from the legacy source code.
Patterns have been extensively tested in different applications, and reusing
them yields quality software at reduced cost and in less time. Different
techniques, methodologies and tools have been used to recover patterns from
legacy applications in the past. Each technique recovers patterns with
different precision and recall rates due to different specifications and
implementations of the same pattern. The approach used in this thesis is based
on customizable and reusable feature types which use static and dynamic
parameters to define variant pattern definitions. Each feature type lets the
user select among multiple searching techniques (SQL queries,
regular expressions and source code parsers), which are used to match
features of patterns with source code artifacts. The technique focuses on
detecting variants of different design patterns by using static, dynamic
and semantic analysis techniques. The integrated use of SQL queries, source
code parsers, regular expressions and annotations improves the precision and
recall of pattern extraction from different legacy systems. The approach
introduces new annotation semantics for use in the source code
of legacy applications, which reduce the search space and time for detecting
patterns. The prototypical implementation of the approach, called UDDPRT, is
used to recognize different design patterns in the source code of
multiple languages (Java, C/C++, C#). The prototype is flexible and
customizable enough that a novice user can change the SQL queries and regular
expressions for detecting implementation variants of design patterns. The
approach significantly improved the precision and recall of pattern
extraction in experiments on a number of open source systems taken
as baselines for comparison.
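The idea of feature types backed by regular expressions can be illustrated with a small sketch. It is not taken from UDDPRT: the feature names, the regular expressions, and the choice of a Singleton-like pattern in Java source are all hypothetical, and a real tool would combine such textual checks with parsing and dynamic analysis.

```python
import re

# Hypothetical feature types for a Singleton-like class, each backed by a
# regular expression that the user could edit to cover other variants.
SINGLETON_FEATURES = {
    # A constructor has no return type, so "private Name(" follows directly.
    "private_constructor": re.compile(r"private\s+\w+\s*\("),
    # A private static field, e.g. "private static Config instance;".
    "static_self_field": re.compile(
        r"private\s+static\s+(?:final\s+)?\w+\s+\w+\s*[;=]"
    ),
    # A public static method, e.g. "public static Config getInstance(".
    "static_accessor": re.compile(
        r"public\s+static\s+(?:synchronized\s+)?\w+\s+\w+\s*\("
    ),
}


def match_features(java_source: str) -> dict:
    """Report, per feature type, whether its expression occurs in the source."""
    return {name: bool(rx.search(java_source))
            for name, rx in SINGLETON_FEATURES.items()}


def looks_like_singleton(java_source: str) -> bool:
    """Flag a candidate only when every feature type is matched."""
    return all(match_features(java_source).values())
```

Because each feature is just a named expression in a dictionary, adapting the detector to an implementation variant means editing or adding one entry, which mirrors the customizability described in the abstract. Purely textual matching like this can produce false positives, for example on any class that has a private constructor plus unrelated static members.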
Détection, Explications et Restructuration de défauts de conception : les patrons abîmés (Detection, Explanation, and Restructuring of Design Defects: Spoiled Patterns)
Model-driven engineering treats models as first-class entities in software development. Model-driven processes must be able to take into account the know-how of experts, generally expressed in terms of analysis, architectural, or design patterns. Choosing the right pattern and ensuring its correct integration within a model are obstacles to the systematic use of good design practices. To ease these tasks, we propose an approach based on the automatic inspection of models. Just as code review activities aim to check for the absence of bad coding practices in a program, we have tooled a design review activity that identifies, explains, and corrects bad design practices in a model. A spoiled pattern is comparable to a design pattern: its instantiations solve the same types of problems, but with a different and certainly improvable architecture. Experiments were carried out to collect spoiled patterns, allowing us to propose a catalog of bad practices complementary to the GoF catalog. Detecting the instantiations of spoiled patterns in a UML model is related to an extended graph homomorphism. Because UML graphs have typed vertices, detection relies on local and global structural properties that allow this NP-complete problem to be solved by successive filtering. The algorithm is thus able to detect all possible instantiations of a spoiled pattern while also handling prohibited and optional edges. The semantics of a model fragment is given by its intent, which is validated by the designer. The intent of the detected fragments and the benefit of replacing them with the adequate pattern are deduced by queries on an ontology designed for this purpose.
The transformation of the fragments into instantiations of design patterns is carried out through model refactorings automatically deduced from the structural differences between a spoiled pattern and a design pattern.