thesis

Detection and Exploitation of Information Flow Leaks

Abstract

This thesis contributes to the field of language-based information flow analysis with a focus on detection and exploitation of information flow leaks in programs. To achieve this goal, this thesis presents a number of precise semi-automatic approaches that allow one to detect, exploit and judge the severity of information flow leaks in programs. The first part of the thesis develops an approach to detect and demonstrate information flow leaks in a program. This approach analyses a given program statically using symbolic execution and self-composition with the aim to generate so-called insecurity formulas whose satisfying models (obtained by SMT solvers) give rise to pairs of initial states that demonstrate insecure information flows. Based on these models, small unit test cases, so-called leak demonstrators, are created that check for the detected information flow leaks and fail if these exist. The developed approach is able to deal with unbounded loops and recursive method invocation by using program specifications like loop invariants or method contracts. This allows the approach to be fully precise (if needed) but also to abstract and allow for false positives in exchange for a higher degree of automation and simpler specifications. The approach supports several information flow security policies, namely, noninterference, delimited information release, and information erasure. The second part of the thesis builds upon the previous approach that allows the user to judge the severity of an information flow leak by exploiting the detected leaks in order to infer the secret information. This is achieved by utilizing a hybrid analysis which conducts an adaptive attack by performing a series of experiments. An experiment constitutes a concrete program run which serves to accumulate the knowledge about the secret. Each experiment is carried out with optimal low inputs deduced from the prior distribution and the knowledge of secret so that the potential leakage is maximized. We propose a novel approach to quantify information leakages as explicit functions of low inputs using symbolic execution and parametric model counting. Depending on the chosen security metric, general nonlinear optimization tools or Max-SMT solvers are used to find optimal low inputs, i.e., inputs that cause the program to leak a maximum of information. For the purpose of evaluation, both approaches have been fully implemented in the tool KEG, which is based on the state-of-the-art program verification system KeY. KEG supports a rich subset of sequential Java programs and generates executable JUnit tests as leak demonstrators. For the secret inference, KEG produces executable Java programs and runs them to perform the adaptive attack. The thesis discusses the planning, execution, and results of the evaluation. The evaluation has been performed on a collection of micro-benchmarks as well as two case studies, which are taken from the literature. The evaluation using the micro-benchmarks shows that KEG detects successfully all information flow leaks and is able to generate correct demonstrators in case the supplied specifications are correct and strong enough. With respect to secret inference, it shows that the approach presented in this thesis (which computes optimal low inputs) helps an attacker to learn the secret much more efficiently compared to approaches using arbitrary low inputs. KEG has also been evaluated in two case studies. The first case study is performed on an e-voting software which has been extracted in a simplified form from a real-world e-voting system. This case study focuses on the leak detection and demonstrator generation approach. The e-voting case study shows that KEG is able to deal with relatively complicated programs that include unbounded loops, objects, and arrays. Moreover, the case study demonstrates that KEG can be integrated with a specification generation tool to obtain both precision and full automation. The second case study is conducted on a PIN integrity checking program, adapted from a real-world ATM PIN verifying system. This case study mainly demonstrates the secret inference feature of KEG. It shows that KEG can help an attacker to learn the secret more efficiently given a good enough assumption about the prior distribution of secret

    Similar works