47 research outputs found

    Reverse Engineering from Assembler to Formal Specifications via Program Transformations

    Full text link
    The FermaT transformation system, based on research carried out over the last sixteen years at Durham University, De Montfort University and Software Migrations Ltd., is an industrial-strength formal transformation engine with many applications in program comprehension and language migration. This paper is a case study which uses automated plus manually-directed transformations and abstractions to convert an IBM 370 Assembler code program into a very high-level abstract specification.Comment: 10 page

    Assembler to C migration using the FermaT transformation system

    Get PDF
    The FermaT transformation system, based on research carried out over the last twelve years at Durham University and Software Migrations Ltd., is an industrial-strength formal transformation engine with many applications in program comprehension and language migration. This paper describes one application of the system: the migration of IBM 370 Assembler code to equivalent, maintainable C code. We present an example of using the tool to migrate a small, but complex, assembler module to C with no manual intervention required. We briefly discuss a mass migration exercise where 1,925 assembler modules were successfully migrated to C code

    A parallel transformations framework for cluster environments.

    Get PDF
    In recent years program transformation technology has matured into a practical solution for many software reengineering and migration tasks. FermaT, an industrial strength program transformation system, has demonstrated that legacy systems can be successfully transformed into efficient and maintainable structured C or COBOL code. Its core, a transformation engine, is based on mathematically proven program transformations and ensures that transformed programs are semantically equivalent to its original state. Its engine facilitates a Wide Spectrum Language (WSL), with low-level as well as high-level constructs, to capture as much information as possible during transformation steps. FermaT’s methodology and technique lack in provision of concurrent migration and analysis. This provision is crucial if the transformation process is to be further automated. As the constraint based program migration theory has demonstrated, it is inefficient and time consuming, trying to satisfy the enormous computation of the generated transformation sequence search-space and its constraints. With the objective to solve the above problems and to extend the operating range of the FermaT transformation system, this thesis proposes a Parallel Transformations Framework which makes parallel transformations processing within the FermaT environment not only possible but also beneficial for its migration process. During a migration process, many thousands of program transformations have to be applied. For example a 1 million line of assembler to C migration takes over 21 hours to be processed on a single PC. Various approaches of search, prediction techniques and a constraint-based approach to address the presented issues already exist but they solve them unsatisfactorily. To remedy this situation, this dissertation proposes a framework to extend transformation processing systems with parallel processing capabilities. The parallel system can analyse specified parallel transformation tasks and produce appropriate parallel transformations processing outlines. To underpin an automated objective, a formal language is introduced. This language can be utilised to describe and outline parallel transformation tasks whereas parallel processing constraints underpin the parallel objective. This thesis addresses and explains how transformation processing steps can be automatically parallelised within a reengineering domain. It presents search and prediction tactics within this field. The decomposition and parallelisation of transformation sequence search-spaces is outlined. At the end, the presented work is evaluated on practical case studies, to demonstrate different parallel transformations processing techniques and conclusions are drawn

    Search-based amorphous slicing

    Get PDF
    Amorphous slicing is an automated source code extraction technique with applications in many areas of software engineering, including comprehension, reuse, testing and reverse engineering. Algorithms for syntax-preserving slicing are well established, but amorphous slicing is harder because it requires arbitrary transformation; finding good general purpose amorphous slicing algorithms therefore remains as hard as general program transformation. In this paper we show how amorphous slices can be computed using search techniques. The paper presents results from a set of experiments designed to explore the application of genetic algorithms, hill climbing, random search and systematic search to a set of six subject programs. As a benchmark, the results are compared to those from an existing analytical algorithm for amorphous slicing, which was written specifically to perform well with the sorts of program under consideration. The results, while tentative at this stage, do give grounds for optimism. The search techniques proved able to reduce the size of the programs under consideration in all cases, sometimes equaling the performance of the specifically-tailored analytic algorithm. In one case, the search techniques performed better, highlighting a fault in the existing algorith

    Constraint based program transformation theory

    Get PDF
    Software Migration Lt

    Efficient long division via Montgomery multiply

    Full text link
    We present a novel right-to-left long division algorithm based on the Montgomery modular multiply, consisting of separate highly efficient loops with simply carry structure for computing first the remainder (x mod q) and then the quotient floor(x/q). These loops are ideally suited for the case where x occupies many more machine words than the divide modulus q, and are strictly linear time in the "bitsize ratio" lg(x)/lg(q). For the paradigmatic performance test of multiword dividend and single 64-bit-word divisor, exploitation of the inherent data-parallelism of the algorithm effectively mitigates the long latency of hardware integer MUL operations, as a result of which we are able to achieve respective costs for remainder-only and full-DIV (remainder and quotient) of 6 and 12.5 cycles per dividend word on the Intel Core 2 implementation of the x86_64 architecture, in single-threaded execution mode. We further describe a simple "bit-doubling modular inversion" scheme, which allows the entire iterative computation of the mod-inverse required by the Montgomery multiply at arbitrarily large precision to be performed with cost less than that of a single Newtonian iteration performed at the full precision of the final result. We also show how the Montgomery-multiply-based powering can be efficiently used in Mersenne and Fermat-number trial factorization via direct computation of a modular inverse power of 2, without any need for explicit radix-mod scalings.Comment: 23 pages; 8 tables v2: Tweak formatting, pagecount -= 2. v3: Fix incorrect powers of R in formulae [7] and [11] v4: Add Eldridge & Walter ref. v5: Clarify relation between Algos A/A',D and Hensel-div; clarify true-quotient mechanics; Add Haswell timings, refs to Agner Fog timings pdf and GMP asm-timings ref-page. v6: Remove stray +bw in MULL line of Algo D listing; add note re byte-LUT for qinv_

    Formal methods for legacy systems.

    Get PDF
    Paper dated January 6, 1995A method is described for obtaining useful information from legacy code. The approach uses formal proven program transformations, which preserve for refine the semantics of a construct while changing its form. The applicability of a transformation in a particular syntactic context is checked before application. By using an appropriate sequence of transformations, the extracted representation is guaranteed to be equivalent to the code. In this paper, we focus on the results of using this approach in the reverse engineering of medium scale, industrial software, written mostly in languages such as assembler and JOVIAL. Results from both benchmark algorithms and heavily modified, geriatric software are summarised. It is concluded that the approach is viable, for self-contained code, and that useful design information may be extracted from legacy systems at economic cost. We conclude that formal methods have an important practical role in the reverse engineering process.Partly funded bu Alvey project SE-088, partly through a DTI/SERC and IBM UK Ltd. funded IEATP grant "From assembler to Z using formal transformations" and partly by SERC (Science and Engineering Research Council) project "A proof theory for program refinement and equivalence: extensions"

    Prevođenje i transformisanje programa niskog nivoa

    Get PDF
    This thesis presents an approach for working with low level source code that enables automatic restructuring and raising the abstraction level of the programs. This makes it easier to understand the logic of the program, which in turn reduces the development time.The process in this thesis was designed to be flexible and consists of several independent tools. This makes the process easy to adapt as needed, while at the same time the developed tools can be used for other processes. There are usually two basic steps. First is the translation to WSL language, which has a great number of semantic preserving program transformations. The second step are the transformations of the translated WSL. Two tools were developed for translation: one that works with a subset of x86 assembly, and another that works with MicroJava bytecode. The result of the translation is a low level program in WSL.The primary goal of this thesis was to fully automate the selection of the transformations. This enables users with no domain  knowledge to efficiently use this process as needed. At the same time, the flexibility of the process enables experienced users to adapt it as needed or integrate it into other processes. The automation was achieved with a hill climbing algorithm.Experiments that were run on several types of input programs showed that the results can be excellent. The fitness function used was a built-in metric that gives the “weight” of structures in a program. On input samples that had original high level source codes, the end result metrics of the translated and transformed programs were comparable. On some samples the result was even better than the originals, on some others they were somewhat more complex. When comparing with low level original source code, the end results was always significantly improved.U okviru ove teze se predstavlja pristup radu sa programima niskog nivoa koji omogućava automatsko restrukturiranje i podizanje na više nivoe. Samim tim postaje mnogo lakše razumeti logiku programa što smanjuje vreme razvoja.Proces je dizajniran tako da bude fleksibilan i sastoji se od više nezavisnih alata. Samim tim je lako menjati proces po potrebi, ali i upotrebiti razvijene alate u drugim procesima. Tipično se mogu razlikovati dva glavna koraka. Prvi je prevođenje u jezik WSL,za koji postoji veliki broj transformacija programa koje očuvavaju semantiku. Drugi su transformacije u samom WSL-u. Za potrebe prevođenja su razvijena dva alata, jedan koji radi sa podskupom x86 asemblera i drugi koji radi sa MikroJava bajtkôdom. Rezultat prevođenja je program niskog nivoa u WSL jeziku.Primarni cilj ovog istraživanja je bila potpuna automatizacija odabira transformacija, tako da i korisnici bez iskustva u radu sa sistemom mogu efikasno da primene ovaj proces za svoje potrebe. Sa druge strane zbog fleksibilnosti procesa, iskusni korisnici mogu lakoda ga prošire ili da ga integrišu u neki drugi već postojeći   proces.Automatizacija je  postignuta pretraživanjem usponom (eng. hill climbing).Eksperimenti vršeni na nekoliko tipova ulaznih programa niskog nivoa su pokazali da rezultati mogu biti  izuzetni. Za funkciju pogodnosti je korišćena ugrađena metrika koja daje “težinu” struktura u programu. Kod ulaza za koje je originalni izvorni kôd bio dostupan, krajnje metrike najboljih varijanti prevedenih i transformisanih programa su bile na sličnom nivou. Neki primeri su bolji od originala, dok su drugi bili nešto kompleksniji. Rezultati su uvek pokazivali značajna unapređenja u odnosu na originalni kôd niskog nivoa

    Data re-engineering using formal transformations

    Get PDF
    This thesis presents and analyses a solution to the problem of formally re- engineering program data structures, allowing new representations of a program to be developed. The work is based around Ward's theory of program transformations which uses a Wide Spectrum Language, WSL, whose semantics were specially developed for use in proof of program transformations. The re-engineered code exhibits equivalent functionality to the original but differs in the degree of data abstraction and representation. Previous transformational re-engineering work has concentrated upon control flow restructuring, which has highlighted a lack of support for data restructuring in the maintainer's tool-set. Problems have been encountered during program transformation due to the lack of support for data re-engineering. A lack of strict data semantics and manipulation capabilities has left the maintainer unable to produce optimally re-engineered solutions. It has also hindered the migration of programs into other languages because it has not been possible to convert data structures into an appropriate form in the target language. The main contribution of the thesis is the Data Re-Engineering and Abstraction Mechanism (DREAM) which allows theories about type equivalence to be represented and used in a re-engineering environment. DREAM is based around the technique of "ghosting", a way of introducing different representations of data, which provides the theoretical underpinning of the changes applied to the program. A second major contribution is the introduction of data typing into the WSL language. This allows DREAM to be integrated into the existing transformation theories within WSL. These theoretical extensions of the original work have been shown to be practically viable by implementation within a prototype transformation tool, the Maintainer's Assistant. The extended tool has been used to re-engineer heavily modified, commercial legacy code. The results of this have shown that useful re-engineering work can be performed and that DREAM integrates well with existing control flow transformations

    A wide spectrum type system for transformation theory

    Get PDF
    One of the most difficult tasks a programmer can be confronted with is the migration of a legacy system. Usually, these systems are unstructured, poorly documented and contain complex program logic. The reason for this, in most cases, is an emphasis on raw performance rather than on clean and structured code as well as a long period of applying quick fixes and enhancements rather than doing a proper software reengineering process including a full redesign during major enhancements. Nowadays, the old programming paradigms are becoming an increasingly serious problem. It has been identified that 90% of the costs of a typical software system arise in the maintenance phase. Many companies are simply too afraid of changing their software infrastructure and prefer to continue with principles like "never touch a running system". These companies experience growing pressure to migrate their legacy systems onto newer platforms because the maintenance of such systems is expensive and dangerous as the risk of losing vital parts of sources code or its documentation increases drastically over time. The FermaT transformation system has shown the ability to automatically or semi-automatically restructure and abstract legacy code within a special intermediate language called WSL (Wide Spectrum Language). Unfortunately, the current transformation process only supports the migration of assembler as WSL lacks the ability to handle data types properly. The data structures in assembler are currently directly translated into C data types which involves many assumptional “hard coded” conversions. The absence of an adequate type system for WSL caused several flaws for the whole transformation process and limits its abilities significantly. The main aim of the presented research is to tackle these problems by investigating and formulating how a type system can contribute to a safe and reliable migration of legacy systems. The described research includes the definition of key aspects of type related problems in the FermaT migration process and how to solve them with a suitable type system approach. Since software migration often includes a change in programming language the type system for WSL has to be able to support various type system approaches including the representation of all relevant details to avoid assumptions. This is especially difficult as most programming languages are designed for a special purpose which means that their possible programming constructs and data types differ significantly. This ranges from languages with simple type systems whose program sare prone to unintended side-effects, to languages with strict type systems which are constrained n their flexibility. It is important to include as many type related details as necessary to avoid making assumptions during language to language translation. The result of the investigation is a novel multi layered type system specifically designed to satisfy the needs of WSL for a sophisticated solution without imposing too many limitations on its abilities. The type system has an adjustable expressiveness, able to represent a wide spectrum of typing approaches ranging from weak typing which allows direct memory access and down casting, via very strict typing with a high diversity of data types to object oriented typing which supports encapsulation and data hiding. Looking at the majority of commercial relevant statically typed programming languages, two fundamental properties of type strictness and safety can be identified. A type system can be either weakly or strongly typed and may or may not allow unsafe features such as direct memory access. Each layer of the Wide Spectrum Type System has a different combination of these properties. The approach also includes special Type System Transformations which can be used to move a given WSL program among these layers. Other emphasised key features are explicit typing and scalability. The whole approach is based on a sound mathematical foundation which assures correctness and integrates seamlessly into the present mathematical definition of WSL. The type system is formally introduced to WSL by constructing an attribute grammar for the language. Type checking and type inference are used to annotate the Abstract Syntax Tree of a given WSL program with type derivations which can be used to reveal and indicate possible typing errors or to infer types if the program did not feature explicit type declarations in the first place. Notable in this approach is also the fact that object orientation is introduced to a procedural programming language without the introduction of new semantics. It is shown that object orientation can be introduced just by adjusting type checking rules and adding some syntactical notations. The approach was implemented and tested on two case studies. The thesis describes and discusses both cases in detail and shows how a migration which ignores type systems could accidentally introduce errors due to assumptions during translation. Both case studies use all important aspects of the approach, Including type transformations and object identification. The thesis finalises by summarising the whole work, identifying limitations, presenting future perspectives and drawing conclusion
    corecore