Information Flow Control with System Dependence Graphs - Improving Modularity, Scalability and Precision for Object Oriented Languages
This thesis is concerned with the field of static program analysis; in particular, we consider analyses whose goal is to guarantee certain security properties, such as integrity and confidentiality, for programs. To this end we use so-called dependence graphs, which capture the potential behavior of a program as well as the information flow between individual program points. With this technique we can ensure, for example, that a program does not leak any information about a secret password. The focus of this thesis lies specifically on techniques that improve the construction of the dependence graph, since it forms the basis for many subsequent security analyses. The presented algorithms and improvements have been integrated into our analysis tool Joana and made publicly available as open source. Numerous collaborations and publications demonstrate that the improvements to Joana are relevant to research practice as well.
This thesis essentially consists of three parts. Part 1 deals with improvements to the computation of the dependence graph, Part 2 presents a new approach for the analysis of incomplete programs, and Part 3 demonstrates current applications of Joana on concrete examples.
In the first part we discuss in detail the algorithms for constructing a dependence graph, paying particular attention to the problems and challenges in the analysis of object-oriented languages such as Java. For example, we present an analysis that can precisely handle the control flow triggered by exceptions. Our main focus is the modeling of side effects that can arise through communication across method boundaries. In dependence graphs, side effects, i.e. memory locations that a method reads or modifies, are represented as additional nodes. We show that the form of this representation, the so-called parameter model, has an enormous influence on both the precision and the runtime of the entire analysis. We explain the weaknesses of the old parameter model, which is based on object trees, and present our improvements in the form of a new model based on object graphs. By deliberately merging redundant information we can significantly reduce the number of computed parameter nodes and speed up their computation, without degrading the precision of the resulting dependence graph. Already for smaller programs in the range of a few thousand lines of code we achieve, on average, an 8-fold improvement in runtime, while the precision of the result is usually even improved. For larger programs the difference is even more pronounced, to the extent that some of our test cases, and all programs we tested above a size of 20,000 lines of code, can only be computed with object graphs at all. Thanks to these improvements, Joana can be applied with increased precision and to substantially larger programs.
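The core security check on a dependence graph reduces to graph reachability: if no path leads from a secret source to a public sink, no information flow between them is possible. The following is a minimal sketch of that idea; the node names and edge set are invented for illustration, and this is not Joana's actual data model or API.

```python
from collections import defaultdict

# Hypothetical miniature dependence graph: nodes are program points,
# edges are data/control dependences (invented example, not Joana's model).
edges = defaultdict(set)

def add_dep(src, dst):
    edges[src].add(dst)

# A secret password flows into a local variable, which is then printed.
add_dep("read_password", "local_pw")
add_dep("local_pw", "print_stmt")     # illegal flow: secret reaches output
add_dep("read_config", "local_cfg")   # harmless flow

def leaks(source, sink):
    """True if `sink` is reachable from `source` in the dependence graph."""
    seen, stack = set(), [source]
    while stack:
        node = stack.pop()
        if node == sink:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(edges[node])
    return False

print(leaks("read_password", "print_stmt"))  # True: the password may leak
print(leaks("read_config", "print_stmt"))    # False: no path exists
```

On real programs the graph is built from the program text and the precision of its edges, in particular the parameter model described above, determines how many of the reported flows are genuine.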
In the second part we address the problem that previous security analyses based on dependence graphs could only analyze complete programs. For example, it was impossible to inspect or preprocess library code without knowing all of its usage sites. We discovered a monotonicity property of the existing analysis that allows us to transfer analysis results of program parts to arbitrary usage sites. This makes it possible both to preprocess program parts and to make general statements about the security properties of program parts without knowing their concrete usage sites. We define the monotonicity property in detail and sketch a proof of its correctness. Building on this, we develop a method for preprocessing program parts that enables us to construct modular dependence graphs. These graphs can later be adapted to the respective usage site. Since the precise construction of a modular dependence graph can become very expensive, we develop an algorithm based on so-called access paths that improves scalability. Finally, we sketch a proof showing that this algorithm always computes a conservative approximation of the modular graph, so that the results of security analyses built on top of it remain valid.
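On a toy level, the role of such a precomputed summary can be sketched as follows: the summary records which formal parameters of a library method may flow to its results, and at each usage site those summary edges are instantiated with the actual arguments. The data layout and names below are invented for illustration; the actual access-path-based algorithm is far more involved.

```python
# Hypothetical summary format: (method, formal-in) -> set of formal-outs.
# Here, parameter 0 of lib_fn may flow to its return value.
summary = {("lib_fn", "in_0"): {("lib_fn", "out")}}

def instantiate(summary, actuals, result):
    """Turn the callee's summary edges into concrete dependence edges
    at one call site by substituting actuals for formals."""
    edges = set()
    for (_fn, formal), outs in summary.items():
        idx = int(formal.split("_")[1])  # "in_0" -> actual argument 0
        edges.update((actuals[idx], result) for _ in outs)
    return edges

# At a call site `x = lib_fn(secret)` the summary yields a secret -> x edge.
print(instantiate(summary, ["secret"], "x"))
```

Monotonicity is what makes this reuse sound: the summary over-approximates the callee's dependences in every calling context, so instantiating it can only add, never hide, flows.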
In the third part we present several successful applications of Joana that emerged from a collaboration with Ralf Küsters of the University of Trier. On the one hand, we explain how our security tool Joana can be used in general. On the other hand, we show how cryptographic security can be guaranteed for a program in combination with further tools and techniques, a task that was previously out of reach for analyses based on information flow. These applications demonstrate in particular how the simplified usage developed in the course of this thesis eases the application of Joana, and how our improvements in precision make the successful analysis possible in the first place.
Modular Collaborative Program Analysis
With our world increasingly relying on computers, it is important to ensure the quality, correctness, security, and performance of software systems. Static analysis, which computes properties of computer programs without executing them, has been an important method to achieve this for decades. However, static analysis faces major challenges from increasingly complex programming languages and software systems and from growing, sometimes conflicting demands for soundness, precision, and scalability. In order to cope with these challenges, it is necessary to build static analyses for complex problems from small, independent, yet collaborating modules that can be developed in isolation and combined in a plug-and-play manner.
So far, no generic architecture to implement and combine a broad range of dissimilar static analyses exists. The goal of this thesis is thus to design such an architecture and implement it as a generic framework for developing modular, collaborative static analyses. We use several diverse case-study analyses from which we systematically derive requirements to guide the design of the framework. Based on this, we propose a blackboard-architecture style of collaboration between analyses, which we implement in the OPAL framework. We also develop a formal model of our architecture's core concepts and show how it enables freely composing analyses while retaining their soundness guarantees.
We showcase and evaluate our architecture using the case-study analyses, each of which shows how important and complex problems of static analysis can be addressed using a modular, collaborative implementation style. In particular, we show how a modular architecture for the construction of call graphs ensures consistent soundness of different algorithms. We show how modular analyses for different aspects of immutability mutually benefit each other. Finally, we show how the analysis of method purity can benefit from the use of other complex analyses in a collaborative manner and from exchanging different analysis implementations that exhibit different characteristics. Each of these case studies improves over the respective state of the art in terms of soundness, precision, and/or scalability and shows how our architecture enables experimenting with and fine-tuning trade-offs between these qualities.
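The blackboard idea can be illustrated with a minimal sketch (plain Python, not OPAL's actual Scala API): independent analysis modules read partial results from a shared store and post refined ones, and a scheduler reruns them until nothing changes. The purity rule and the call graph below are invented examples.

```python
# Minimal blackboard-style sketch: analyses are independent modules that
# read partial results from a shared blackboard and post refined ones;
# a scheduler reruns them until a fixed point is reached.
blackboard = {}

def purity_analysis(bb):
    """Hypothetical module: a method is pure if all its callees are pure."""
    changed = False
    callees = {"f": ["g"], "g": []}  # invented call graph: f calls g
    for method, cs in callees.items():
        pure = all(bb.get(("pure", c), False) for c in cs)
        if bb.get(("pure", method)) != pure:
            bb[("pure", method)] = pure
            changed = True
    return changed

def run_to_fixpoint(modules, bb):
    # Rerun every module until no module reports a change.
    while any(module(bb) for module in modules):
        pass
    return bb

run_to_fixpoint([purity_analysis], blackboard)
print(blackboard)  # both f and g end up marked pure
```

The point of the style is that further modules (e.g. an immutability analysis the purity rule could consult) can be plugged in without touching this one; they merely agree on the keys of the shared store.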
Impact Analysis for AspectJ - A Critical Analysis with a Tool-Supported Approach
Aspect-Oriented Programming (AOP) has been promoted as a solution for modularization problems known in the literature as the tyranny of the dominant decomposition. However, when analyzing AOP languages it can be doubted that uncontrolled AOP is indeed a silver bullet. The contributions of the work presented in this thesis are twofold. First, we critically analyze AOP language constructs and their effects on program semantics to sensitize programmers and researchers to the resulting problems. We further demonstrate that AOP, as available in AspectJ and similar languages, can easily result in less understandable, less evolvable, and thus error-prone code, quite the opposite of its claims. Second, we examine how tools relying on both static and dynamic program analysis can help to detect problematic usage of aspect-oriented constructs. We propose to use change impact analysis techniques both to automatically determine the impact of aspects and to deal with AOP system evolution. We further introduce an analysis technique to detect potential semantic issues related to undefined advice precedence. The thesis concludes with an overview of available open source AspectJ systems and an assessment of aspect-oriented programming considering both the fundamentals of software engineering and the contents of this thesis.
Locating Potential Aspect Interference Using Clustering Analysis
Software design continues to evolve from the structured programming paradigm of the 1970s and 1980s and the object-oriented programming (OOP) paradigm of the 1980s and 1990s. The functional decomposition design methodology used in these paradigms reduced the prominence of non-functional requirements, which resulted in scattered and tangled code to address non-functional elements. Aspect-oriented programming (AOP) allowed the removal of crosscutting concerns scattered throughout class code into single modules known as aspects. Aspectization resulted in increased modularity in class code, but introduced new types of problems that did not exist in OOP. One such problem was aspect interference, in which aspects meddled with the data flow or control flow of a program. Research has developed various solutions for detecting and addressing aspect interference using formal design and specification methods, and by programming techniques that specify aspect precedence. Such explicit specifications required practitioners to have a complete understanding of possible aspect interference in an AOP system under development. However, as system size increased, understanding of possible aspect interference could decrease. Therefore, practitioners needed a way to increase their understanding of possible aspect interference within a program. This study used clustering analysis to locate potential aspect interference within an aspect-oriented program under development, using k-means partitional clustering. Vector space models, using two newly defined metrics, interference potential (IP) and interference causality potential (ICP), and an existing metric, coupling on advice execution (CAE), provided input to the clustering algorithms. Resulting clusters were analyzed via an internal strategy using the R-Squared, Dunn, Davies-Bouldin, and SD indexes. The process was evaluated on both a smaller scale AOP system (AspectTetris), and a larger scale AOP system (AJHotDraw). 
By seeding potential interference problems into these programs and comparing results using visualizations, this study found that clustering analysis provided a viable way of detecting interference problems in aspect-oriented software. The ICP model was best at detecting interference problems, while the IP model produced results that were more sporadic. The CAE clustering models were not effective in pinpointing potential aspect interference problems. This was the first known study to use clustering analysis techniques specifically for locating aspect interference.
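The clustering step itself can be sketched with a tiny self-contained k-means over invented 2-D metric vectors standing in for IP and ICP values; the study clustered such vector space models, but the concrete numbers and cluster count here are illustrative only.

```python
import random

# Toy sketch of the clustering step: each advice/joinpoint pair gets a
# metric vector (invented 2-D values standing in for IP and ICP), and
# k-means groups them; small, extreme clusters hint at interference.
random.seed(0)

def kmeans(points, k, iters=20):
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest center (squared distance)
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        # recompute centers; keep the old center if a cluster went empty
        centers = [tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return clusters

points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),   # low interference potential
          (0.9, 0.8), (0.85, 0.95)]               # suspicious pairs
clusters = kmeans(points, 2)
print(clusters)
```

In the study, the resulting partitions were then scored with internal validity indexes (R-Squared, Dunn, Davies-Bouldin, SD) rather than inspected by hand as here.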
Collective program analysis
Encouraged by the success of data-driven software engineering (SE) techniques that have found numerous applications, e.g., in defect prediction and specification inference, the demand for mining and analyzing source code repositories at scale has significantly increased. However, analyzing source code at scale remains expensive to the extent that data-driven solutions to certain SE problems are beyond our reach today. Extant techniques have focused on leveraging distributed computing to solve this problem, but with a concomitant increase in computational resource needs. In this thesis, we propose collective program analysis (CPA), a technique to accelerate ultra-large-scale source code mining without demanding more computational resources, by utilizing the similarity between millions of source code artifacts.
First, we describe the general concept of collective program analysis. Given a mining task that is required to be run on thousands of artifacts, the artifacts with similar interactions are clustered together, such that the mining task is required to be run on only one candidate from each cluster to produce the mining result and the results for other candidates in the same cluster can be produced using extrapolation. The two technical innovations of collective program analysis are: mining task specific similarity and interaction pattern graph. Mining task specific similarity is about whether two or more artifacts can be considered similar for a given mining task. An interaction pattern graph represents the interaction between the mining task and the artifact when the mining task is run on the artifact. An interaction pattern graph is used to determine mining task specific similarity between artifacts.
Given a mining task and an artifact, producing an interaction pattern graph soundly and efficiently can be very challenging. We propose a pre-analysis and program compaction technique to achieve this. Given a source code mining task and thousands of input programs on which the mining task needs to be run, our technique first extracts the information about which parts of an input program are relevant for the mining task and then removes the irrelevant parts from the input programs, prior to running the mining task on them. Our key technical contributions are a static analysis that extracts information about the parts of a program that are relevant for a mining task, and a sound program compaction technique that produces a reduced program on which the mining task has output similar to that on the original program.
Once interaction pattern graphs have been produced for thousands of artifacts, they have to be clustered and the mining task results have to be reused between similar artifacts to achieve acceleration. In the final part of this thesis, we fully describe collective program analysis and illustrate mining millions of control flow graphs (CFGs) by clustering similar CFGs.
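The cluster-then-reuse loop can be sketched as follows. All names and the toy mining task are invented, and raw edge sets stand in for the interaction pattern graphs that real CPA uses as fingerprints.

```python
from collections import defaultdict

# Minimal sketch of collective program analysis: artifacts are grouped by
# a mining-task-specific fingerprint; the task runs once per group and the
# result is extrapolated to every member of the group.
def fingerprint(cfg):
    # stand-in for an interaction pattern graph: just the edge set itself
    return frozenset(cfg)

def mine(cfg):
    # hypothetical mining task: count branch nodes (>1 successor) in a CFG
    succs = defaultdict(int)
    for src, _dst in cfg:
        succs[src] += 1
    return sum(1 for n in succs.values() if n > 1)

def collective_analyze(cfgs):
    groups = defaultdict(list)
    for name, cfg in cfgs.items():
        groups[fingerprint(cfg)].append(name)
    results, runs = {}, 0
    for fp, members in groups.items():
        answer = mine(set(fp))   # run the task on one representative only
        runs += 1
        for m in members:
            results[m] = answer  # reuse the result for the whole cluster
    return results, runs

cfgs = {
    "f": {("entry", "a"), ("a", "b"), ("a", "c")},
    "g": {("entry", "a"), ("a", "b"), ("a", "c")},  # identical in shape to f
    "h": {("entry", "a"), ("a", "b")},
}
results, runs = collective_analyze(cfgs)
print(results, runs)  # f and g share one result; only 2 task runs for 3 CFGs
```

The saving grows with the redundancy in the corpus: if millions of CFGs fall into a few thousand interaction-equivalent clusters, the task runs only a few thousand times.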
Analysis of Code Blocks for Concern Detection in MATLAB Systems
It is known that the support provided by MATLAB for module decomposition is limited. Such limitations give rise to code symptoms, which can be exploited for the development of techniques for the detection of concerns, namely unmodularised concerns.
Recent work in the area of concern detection in MATLAB systems identified several
recurring code patterns that can be associated to the presence of specific concerns. Some
of the concerns detected proved to be unmodularised: they cut across the MATLAB
system’s modular decomposition.
The techniques already developed for detecting unmodularised concerns in MATLAB
systems still lack precision and accuracy. As proposed in previous work, the techniques
and tools for pinpointing and representing concern-detection patterns need maturing.
This thesis contributes a more accurate structure for representing MATLAB code bases in an intelligent repository for MATLAB code, developed prior to this work. It refines the structure representing MATLAB code on which the repository is based by sharpening the notion of code block, and collects code patterns found in previous publications, aggregating them into a catalogue. Subsequently, a preliminary study is made on the application of code blocks to the detection of concerns, validating previously identified concern-related patterns and evaluating the existence of new ones.
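As a rough illustration of pattern-based concern detection, concern patterns can be modelled as regular expressions matched against each code block. The two patterns below are invented examples, not the catalogue compiled in this thesis.

```python
import re

# Hypothetical sketch: each pattern is a regex plus the concern it signals;
# matches are reported per MATLAB code block. Patterns are invented examples.
PATTERNS = {
    "I/O concern": re.compile(r"\b(fopen|fprintf|fclose)\s*\("),
    "Visualization concern": re.compile(r"\b(plot|figure|xlabel)\s*\("),
}

def detect_concerns(code_block):
    """Return the set of concerns whose pattern occurs in the block."""
    found = set()
    for concern, pattern in PATTERNS.items():
        if pattern.search(code_block):
            found.add(concern)
    return found

block = """
fid = fopen('out.txt', 'w');
fprintf(fid, '%d\\n', x);
plot(x, y);
"""
print(detect_concerns(block))  # both concerns occur in this block
```

A concern flagged in many blocks scattered across the system is a candidate unmodularised concern, which is where the refined code-block structure pays off.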
Development of an e-portfolio social network using emerging web technologies
Master's dissertation in Informatics Engineering.
Digital portfolios (also known as e-Portfolios) can be described as digital collections of artifacts, being both a
product (a digital collection of artifacts) and a process (reflecting on those artifacts and what they represent). It
is an extension of the traditional Curriculum Vitae, which tells the educational and professional milestones of
someone, while the portfolio proves and qualifies them (e.g.: annually thousands of students finish a Master
degree on Informatics, but only one has built Vue, Twitter or Facebook – the Portfolio goes beyond the CV
milestones by specifying the person’s output throughout life and distinguishing them). e-Portfolios augment this
by introducing new digital representations and workflows, exposed to a community, being both a product and
a process. This approach can be useful for individual self-reflection, education or even job markets, where
companies seek talented individuals, because it expands the traditional CV concept and empowers individual
merit. There have been many studies, theories, and methodologies related to e-Portfolios, but transpositions to web applications have been unsuccessful, unintuitive, and too complex (in contrast to the CV format, which has been successful in various applications, for example LinkedIn).
This project aims to study new approaches and develop an exploratory web/mobile application of this methodology, exploring the potential of social networks to promote e-Portfolios, augmented by emerging web technologies.
Its main output is the prototype of a new product (a social network of e-Portfolios) and its design decisions, with
new theoretical approaches applied to web development. By the end of this project, we will have devised a web infrastructure for interacting with networks of users, their skills, and the communities seeking them.
The approach to the development of this platform will be to integrate emerging technologies like WebAssembly
and Rust in its development cycle and document our findings. At the end of this project, in addition to the
prototype of a new product, we hope to have contributed to the State of the Art of Web Engineering and to be
able to answer questions regarding new emerging web development ecosystems.