    Analysis of Software Binaries for Reengineering-Driven Product Line Architecture – An Industrial Case Study

    This paper describes a method for recovering software architectures from a set of similar (but unrelated) software products in binary form. One intention is to drive refactoring into software product lines, combining architecture recovery with run-time binary analysis and existing clustering methods. Using our run-time binary analysis, we create graphs that capture the dependencies between different software parts. These are clustered into smaller component graphs that group software parts with high interaction into larger entities. The component graphs serve as a basis for further software product line work. In this paper, we concentrate on the analysis part of the method and on the graph clustering. We apply the graph clustering method to a real application in the context of automation/robot configuration software tools. (Comment: In Proceedings FMSPLE 2015, arXiv:1504.0301)
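
    The clustering step can be illustrated with a minimal sketch (this is not the authors' implementation): given a dependency graph extracted from runtime traces, community detection groups strongly interacting parts into candidate component graphs. The module names and edge weights below are hypothetical.

        # Sketch: group software parts with high interaction into component graphs.
        # Hypothetical dependency data; the real graphs come from runtime binary analysis.
        import networkx as nx
        from networkx.algorithms.community import greedy_modularity_communities

        # Weighted, directed dependency graph: edge weight = observed interaction count.
        deps = [
            ("ui.dialog", "config.model", 42),
            ("config.model", "config.store", 17),
            ("robot.driver", "robot.protocol", 58),
            ("robot.protocol", "io.serial", 23),
            ("ui.dialog", "robot.driver", 3),
        ]
        G = nx.DiGraph()
        G.add_weighted_edges_from(deps)

        # Community detection on the undirected projection approximates the clustering step.
        communities = greedy_modularity_communities(G.to_undirected(), weight="weight")
        for i, parts in enumerate(communities):
            component = G.subgraph(parts)  # candidate component graph
            print(f"component {i}: {sorted(parts)} ({component.number_of_edges()} internal edges)")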

    Lightweight Multilingual Software Analysis

    Developer preferences, language capabilities and the persistence of older languages contribute to the trend that large software codebases are often multilingual, that is, written in more than one computer language. While developers can leverage monolingual software development tools to build software components, companies are faced with the problem of managing the resulting large, multilingual codebases to address issues with security, efficiency, and quality metrics. The key challenge is the opaque nature of the language interoperability interface: one language calling procedures in a second (which may call a third, or even call back to the first), resulting in a potentially tangled, inefficient and insecure codebase. An architecture is proposed for lightweight static analysis of large multilingual codebases: the MLSA architecture. Its modular and table-oriented structure addresses the open-ended nature of multiple languages and language interoperability APIs. As an application, we focus here on the construction of call-graphs that capture both inter-language and intra-language calls. The algorithms for extracting multilingual call-graphs from codebases are presented, and several examples of multilingual software engineering analysis are discussed. The state of the implementation and testing of MLSA is presented, and the implications for future work are discussed. (Comment: 15 pages)
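
    MLSA's actual table schema is not reproduced here; the sketch below only illustrates the table-oriented idea under assumed formats: per-language front ends emit call tuples, and a combiner joins them into a single call graph spanning intra- and inter-language edges. All names and fields are hypothetical.

        # Sketch of a table-oriented multilingual call-graph combiner (hypothetical row
        # format, not MLSA's actual schema): each per-language front end emits call facts.
        from collections import defaultdict

        # (caller, callee, language_of_call_site) rows, as a lightweight front end might emit them.
        python_calls = [("app.main", "lib.parse", "python"),
                        ("lib.parse", "native.tokenize", "python")]   # Python -> C via FFI
        c_calls      = [("native.tokenize", "native.lookup", "c")]

        def build_call_graph(*tables):
            graph = defaultdict(set)
            for table in tables:
                for caller, callee, _lang in table:
                    graph[caller].add(callee)
            return graph

        graph = build_call_graph(python_calls, c_calls)
        for caller, callees in graph.items():
            for callee in sorted(callees):
                print(f"{caller} -> {callee}")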

    Efficient Exact and Approximate Algorithms for Computing Betweenness Centrality in Directed Graphs

    Graphs are an important tool to model data in different domains, including social networks, bioinformatics and the world wide web. Most of the networks formed in these domains are directed graphs, where all the edges have a direction and are not symmetric. Betweenness centrality is an important index widely used to analyze networks. In this paper, given a directed network $G$ and a vertex $r \in V(G)$, we first propose a new exact algorithm to compute the betweenness score of $r$. Our algorithm pre-computes a set $\mathcal{RV}(r)$, which is used to prune a huge amount of computations that do not contribute to the betweenness score of $r$. The time complexity of our exact algorithm depends on $|\mathcal{RV}(r)|$: it is $\Theta(|\mathcal{RV}(r)| \cdot |E(G)|)$ for unweighted graphs and $\Theta(|\mathcal{RV}(r)| \cdot |E(G)| + |\mathcal{RV}(r)| \cdot |V(G)| \log |V(G)|)$ for weighted graphs with positive weights. $|\mathcal{RV}(r)|$ is bounded from above by $|V(G)|-1$ and in most cases it is a small constant. Then, for the cases where $\mathcal{RV}(r)$ is large, we present a simple randomized algorithm that samples from $\mathcal{RV}(r)$ and performs computations for only the sampled elements. We show that this algorithm provides an $(\epsilon,\delta)$-approximation of the betweenness score of $r$. Finally, we perform extensive experiments over several real-world datasets from different domains, for several randomly chosen vertices as well as for the vertices with the highest betweenness scores. Our experiments reveal that in most cases our algorithm significantly outperforms the most efficient existing randomized algorithms, in terms of both running time and accuracy. Our experiments also show that our proposed algorithm computes the betweenness scores of all vertices in sets of size 5, 10 and 15 much faster and more accurately than the most efficient existing algorithms. (Comment: arXiv admin note: text overlap with arXiv:1704.0735)
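
    The $\mathcal{RV}(r)$-based pruning is specific to this paper and is not reproduced here. As a point of reference only, the sketch below shows the standard exact computation (Brandes' algorithm) and the generic source-sampling approximation available in networkx, which is what such a new algorithm would typically be compared against.

        # Reference sketch: exact vs. sampled betweenness with networkx (Brandes' algorithm
        # and source sampling). This is NOT the RV(r)-pruning algorithm from the paper.
        import networkx as nx

        G = nx.gnp_random_graph(200, 0.05, seed=1, directed=True)

        exact = nx.betweenness_centrality(G, normalized=False)           # Theta(|V||E|) for unweighted graphs
        approx = nx.betweenness_centrality(G, k=50, normalized=False,    # sample 50 source vertices
                                           seed=7)

        r = max(exact, key=exact.get)                                    # vertex with highest exact score
        print(f"vertex {r}: exact={exact[r]:.1f}, sampled estimate={approx[r]:.1f}")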

    Reverse-engineering of polynomial dynamical systems

    Multivariate polynomial dynamical systems over finite fields have been studied in several contexts, including engineering and mathematical biology. An important problem is to construct models of such systems from a partial specification of dynamic properties, e.g., from a collection of state transition measurements. Here, we consider static models, which are directed graphs that represent the causal relationships between system variables, so-called wiring diagrams. This paper presents an algorithm that computes all possible minimal wiring diagrams for a given set of state transition measurements, together with several statistical measures for model selection. The algorithm uses primary decomposition of monomial ideals as its principal tool. An application to the reverse-engineering of a gene regulatory network is included. The algorithm and the statistical measures are implemented in Macaulay2 and are available from the authors.
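
    The primary-decomposition machinery itself lives in Macaulay2 and is not reproduced here. The brute-force sketch below only illustrates what a minimal wiring diagram means: the minimal sets of variables on which one coordinate function can depend while remaining consistent with the observed state transitions. The toy transition data are made up.

        # Brute-force sketch of minimal "wiring diagram" candidates for one coordinate x_0:
        # find minimal variable sets S such that inputs agreeing on S never disagree on the output.
        # (The paper does this via primary decomposition of monomial ideals, not by enumeration.)
        from itertools import combinations

        # Observed transitions over F_2 for 3 variables: (state, next value of x_0). Toy data.
        transitions = [((0, 0, 0), 0), ((1, 0, 0), 1), ((0, 1, 0), 0), ((1, 1, 1), 1)]

        def consistent(subset):
            seen = {}
            for state, out in transitions:
                key = tuple(state[j] for j in subset)
                if seen.setdefault(key, out) != out:
                    return False  # two states agree on `subset` but disagree on the output
            return True

        n = 3
        consistent_sets = [set(s) for k in range(n + 1) for s in combinations(range(n), k)
                           if consistent(s)]
        minimal = [s for s in consistent_sets
                   if not any(t < s for t in consistent_sets)]
        print("minimal wiring-diagram candidates for x_0:", minimal)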

    Performance Analysis of Legacy Perl Software via Batch and Interactive Trace Visualization

    Performing an analysis of established software is usually challenging. Based on reverse engineering through dynamic analysis, it is possible to perform a software performance analysis in order to detect performance bottlenecks or other issues. This process is often divided into two consecutive tasks: the first concerns the monitoring of the legacy software, and the second covers analysing and visualizing the results. Dynamic analysis is usually addressed via trace visualization, but finding an appropriate representation for a specific issue remains a great challenge. In this paper we report on our performance analysis of the Perl-based open repository software EPrints, which has been under continuous development for more than fifteen years. We analyse and evaluate the software using the Kieker monitoring framework, and apply and combine two visualization tools, namely Graphviz and Gephi. More precisely, we employ Kieker to reconstruct architectural models from recorded monitoring data, based on dynamic analysis, and use Graphviz and Gephi for further analysis and visualization of our monitoring results. The instrumentation and analysis via Kieker, combined with the visualization in these two tools, allowed us, in collaboration with the EPrints development team, to reverse engineer EPrints, gain new and unexpected insights, and detect potential bottlenecks.
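
    To give a flavour of the second task, the sketch below turns a recorded call trace into a DOT graph that Graphviz can render (and whose edges can be imported into Gephi). The trace tuples, module names and the edge-weight convention are hypothetical and are not Kieker's actual record format.

        # Sketch: turn a recorded call trace into a DOT graph for Graphviz / Gephi.
        # The (caller, callee) tuples and module names below are hypothetical.
        from collections import Counter

        trace = [("EPrints::Apache::Rewrite", "EPrints::Repository"),
                 ("EPrints::Repository", "EPrints::Database"),
                 ("EPrints::Repository", "EPrints::Database"),
                 ("EPrints::Database", "DBI")]

        calls = Counter(trace)
        lines = ["digraph calls {"]
        for (caller, callee), count in calls.items():
            # Edge label = call frequency, a simple proxy for where time may be spent.
            lines.append(f'  "{caller}" -> "{callee}" [label="{count}"];')
        lines.append("}")
        print("\n".join(lines))  # feed to `dot -Tsvg`, or import the edge list into Gephi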

    Some issues in the 'archaeology' of software evolution

    During a software project's lifetime, the software goes through many changes as components are added, removed and modified to fix bugs and add new features. This paper is intended as a lightweight introduction to some of the issues arising from an 'archaeological' investigation of software evolution. We use our own work to look at some of the challenges faced, techniques used, findings obtained, and lessons learnt when measuring and visualising the historical changes that happen during the evolution of software.
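
    As one concrete flavour of such measurement, per-file change counts can be mined from a version-control history. The sketch below uses git purely as an illustration; the paper itself does not prescribe a specific tool.

        # Sketch: count how often each file changed across a repository's history,
        # one simple "archaeological" measurement of software evolution (git assumed).
        import subprocess
        from collections import Counter

        log = subprocess.run(
            ["git", "log", "--name-only", "--pretty=format:"],  # one changed path per line
            capture_output=True, text=True, check=True,
        ).stdout

        changes = Counter(path for path in log.splitlines() if path)
        for path, count in changes.most_common(10):
            print(f"{count:5d}  {path}")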