4,561 research outputs found
Metamorphic Code Generation from LLVM IR Bytecode
Metamorphic software changes its internal structure across generations with its functionality remaining unchanged. Metamorphism has been employed by malware writers as a means of evading signature detection and other advanced detection strate- gies. However, code morphing also has potential security benefits, since it increases the “genetic diversity” of software. In this research, we have created a metamorphic code generator within the LLVM compiler framework. LLVM is a three-phase compiler that supports multiple source languages and target architectures. It uses a common intermediate representation (IR) bytecode in its optimizer. Consequently, any supported high-level programming language can be transformed to this IR bytecode as part of the LLVM compila- tion process. Our metamorphic generator functions at the IR bytecode level, which provides many advantages over previously developed metamorphic generators. The morphing techniques that we employ include dead code insertion—where the dead code is actually executed within the morphed code—and subroutine permutation. We have tested the effectiveness of our code morphing using hidden Markov model analysis
Amortising the Cost of Mutation Based Fault Localisation using Statistical Inference
Mutation analysis can effectively capture the dependency between source code
and test results. This has been exploited by Mutation Based Fault Localisation
(MBFL) techniques. However, MBFL techniques suffer from the need to expend the
high cost of mutation analysis after the observation of failures, which may
present a challenge for its practical adoption. We introduce SIMFL (Statistical
Inference for Mutation-based Fault Localisation), an MBFL technique that allows
users to perform the mutation analysis in advance against an earlier version of
the system. SIMFL uses mutants as artificial faults and aims to learn the
failure patterns among test cases against different locations of mutations.
Once a failure is observed, SIMFL requires either almost no or very small
additional cost for analysis, depending on the used inference model. An
empirical evaluation of SIMFL using 355 faults in Defects4J shows that SIMFL
can successfully localise up to 103 faults at the top, and 152 faults within
the top five, on par with state-of-the-art alternatives. The cost of mutation
analysis can be further reduced by mutation sampling: SIMFL retains over 80% of
its localisation accuracy at the top rank when using only 10% of generated
mutants, compared to results obtained without sampling
Experience in using a typed functional language for the development of a security application
In this paper we present our experience in developing a security application
using a typed functional language. We describe how the formal grounding of its
semantic and compiler have allowed for a trustworthy development and have
facilitated the fulfillment of the security specification.Comment: In Proceedings F-IDE 2014, arXiv:1404.578
Linear Haskell: practical linearity in a higher-order polymorphic language
Linear type systems have a long and storied history, but not a clear path
forward to integrate with existing languages such as OCaml or Haskell. In this
paper, we study a linear type system designed with two crucial properties in
mind: backwards-compatibility and code reuse across linear and non-linear users
of a library. Only then can the benefits of linear types permeate conventional
functional programming. Rather than bifurcate types into linear and non-linear
counterparts, we instead attach linearity to function arrows. Linear functions
can receive inputs from linearly-bound values, but can also operate over
unrestricted, regular values.
To demonstrate the efficacy of our linear type system - both how easy it can
be integrated in an existing language implementation and how streamlined it
makes it to write programs with linear types - we implemented our type system
in GHC, the leading Haskell compiler, and demonstrate two kinds of applications
of linear types: mutable data with pure interfaces; and enforcing protocols in
I/O-performing functions
Near-optimal loop tiling by means of cache miss equations and genetic algorithms
The effectiveness of the memory hierarchy is critical for the performance of current processors. The performance of the memory hierarchy can be improved by means of program transformations such as loop tiling, which is a code transformation targeted to reduce capacity misses. This paper presents a novel systematic approach to perform near-optimal loop tiling based on an accurate data locality analysis (cache miss equations) and a powerful technique to search the solution space that is based on a genetic algorithm. The results show that this approach can remove practically all capacity misses for all considered benchmarks. The reduction of replacement misses results in a decrease of the miss ratio that can be as significant as a factor of 7 for the matrix multiply kernel.Peer ReviewedPostprint (published version
- …