1,305 research outputs found

Enhancing Symbolic Execution of Heap-based Programs with Separation Logic for Test Input Generation

Author: B Hillery
C Calcagno
CS Păsăreanu
GT Leavens
Herbert B. Enderton
J Berdine
J Geldenhuys
JC King
Long H. Pham
M Tatsuta
N Rosner
P Müller
QL Le
S Khurshid
WN Chin
Publication date: 16/09/2019

Symbolic execution is a well-established method for test input generation. Despite having achieved tremendous success over numerical domains, existing symbolic execution techniques for heap-based programs are limited due to the lack of a succinct and precise description for symbolic values over unbounded heaps. In this work, we present a new symbolic execution method for heap-based programs based on separation logic. The essence of our proposal is context-sensitive lazy initialization, a novel approach for efficient test input generation. Our approach differs from existing approaches in two ways. Firstly, our approach is based on separation logic, which allows us to precisely capture preconditions of heap-based programs so that we avoid generating invalid test inputs. Secondly, we generate only fully initialized test inputs, which are more useful in practice compared to those partially initialized test inputs generated by the state-of-the-art tools. We have implemented our approach as a tool, called Java StarFinder, and evaluated it on a set of programs with complex heap inputs. The results show that our approach significantly reduces the number of invalid test inputs and improves the test coverage.

arXiv.org e-Print Archive

Crossref

Institutional Knowledge at Singapore Management University

Teeside University's Research Repository

A Survey of Symbolic Execution Techniques

Author: Baldoni Roberto
Coppa Emilio
D'Elia Daniele Cono
Demetrescu Camil
Finocchi Irene
Publication date: 01/01/2018

Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience. The present survey has been accepted for publication at ACM Computing Surveys. If you are considering citing this survey, we would appreciate it if you could use the following BibTeX entry: http://goo.gl/Hf5Fvc Comment: This is the authors' pre-print copy.

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma

Archivio della ricerca- Università di Roma La Sapienza

Concolic Testing Heap-Manipulating Programs

Author: Le Quang Loc
Pham Long H.
Phan Quoc-Sang
Sun Jun
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 12/06/2019

Concolic testing is a test generation technique which works effectively by integrating random test generation and symbolic execution. Existing concolic testing engines focus on numeric programs. Heap-manipulating programs make extensive use of complex heap objects like trees and lists. Testing such programs is challenging for multiple reasons. Firstly, test inputs for such programs are required to satisfy non-trivial constraints which must be specified precisely. Secondly, precisely encoding and solving path conditions in such programs is challenging and often expensive. In this work, we propose the first concolic testing engine, called CSF, for heap-manipulating programs based on separation logic. CSF effectively combines specification-based testing and concolic execution for test input generation. It is evaluated on a set of challenging heap-manipulating programs. The results show that CSF generates valid test inputs with high coverage efficiently. Furthermore, we show that CSF can potentially be used in combination with precondition inference tools to reduce the user effort.

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

Teeside University's Research Repository

S2TD: a Separation Logic Verifier that Supports Reasoning of the Absence and Presence of Bugs

Author: Le Quang Loc
Pham Long H.
Qin Shengchao
Sun Jun
Publication date: 19/09/2022

Heap-manipulating programs are known to be challenging to reason about. We present a novel verifier for heap-manipulating programs called S2TD, which encodes programs systematically in the form of Constrained Horn Clauses (CHC) using a novel extension of separation logic (SL) with recursive predicates and dangling predicates. S2TD actively explores cyclic proofs to address the path explosion problem. S2TD differentiates itself from existing CHC-based verifiers by focusing on heap-manipulating programs and employing cyclic proofs to efficiently verify or falsify them with counterexamples. Compared with existing SL-based verifiers, S2TD precisely specifies the heaps of de-allocated pointers to avoid false positives in reasoning about the presence of bugs. S2TD has been evaluated using a comprehensive set of benchmark programs from the SV-COMP repository. The results show that S2TD is more effective than state-of-the-art program verifiers and is more efficient than most of them. Comment: 24 pages

arXiv.org e-Print Archive

Automatic Data Structure Repair using Separation Logic

Author: Le Quang Loc
Nguyen ThanhVu
Phan Quoc-Sang
Zheng Guolong
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 04/10/2018

Teeside University's Research Repository

Verification of Pointer-Based Programs with Partial Information

Author: LUO CHENGUANG
Publication date: 01/01/2011

The proliferation of software across all aspects of people's lives means that software failure can bring catastrophic results. It is therefore highly desirable to be able to develop software that is verified to meet its expected specification. This has also been identified as a key objective in one of the UK Grand Challenges (GC6) (Jones et al., 2006; Woodcock, 2006). However, many difficult problems still remain in achieving this objective, partially due to the wide use of (recursive) shared mutable data structures which are hard to keep track of statically in a precise and concise way. This thesis aims at building a verification system for both memory safety and functional correctness of programs manipulating pointer-based data structures, which can deal with two scenarios where only partial information about the program is available. For instance, the verifier may be supplied with only a partial program specification, or with the full specification but only part of the program code. For the first scenario, previous state-of-the-art works (Nguyen et al., 2007; Chin et al., 2007; Nguyen and Chin, 2008; Chin et al., 2010) generally require users to provide full specifications for each method of the program to be verified. Their approach demands much intellectual effort from users, and users are liable to make mistakes in writing such specifications. This thesis proposes a new approach to program verification that allows users to provide only partial specifications for methods. Our approach will then refine the given annotation into a more complete specification by discovering missing constraints. The discovered constraints may involve both numerical and multiset properties that could be later confirmed or revised by users. Meanwhile, we further augment our approach by requiring only partial specifications to be given for primary methods of a program.
Specifications for loops and auxiliary methods can then be systematically discovered by our augmented mechanism, with the help of information propagated from the primary methods. This work is aimed at verifying beyond shape properties, with the eventual goal of analysing both memory safety and functional properties for pointer-based data structures. Initial experiments have confirmed that we can automatically refine partial specifications with non-trivial constraints, thus making it easier for users to handle specifications with richer properties. For the second scenario, many programs contain invocations to unknown components and hence only part of the program code is available to the verifier. As previous works generally require the whole program code to be present, we target the verification of memory safety and functional correctness of programs manipulating pointer-based data structures, where the program code is only partially available due to invocations to unknown components. Provided with a Hoare-style specification ({Pre} prog {Post}) where program (prog) contains calls to some unknown procedure (unknown), we infer a specification (mspecu) for the unknown part (unknown) from the calling contexts, such that the problem of verifying program (prog) can be safely reduced to the problem of proving that the unknown procedure (unknown) (once its code is available) meets the derived specification (mspecu). The expected specification (mspecu) is automatically calculated using an abduction-based shape analysis specifically designed for a combined abstract domain. We have implemented a system to validate the viability of our approach, with encouraging experimental results.

Durham e-Theses

OpenGrey Repository

Towards General Loop Invariant Generation via Coordinating Symbolic Execution and Large Language Models

Author: Cao Qinxiang
Feng Yuan
Liu Chang
Wu Xiwei
Yan Junchi
Publication date: 17/11/2023

Loop invariants, essential for program verification, are challenging to auto-generate, especially for programs incorporating complex memory manipulations. Existing approaches for generating loop invariants rely on fixed sets or templates, hampering adaptability to real-world programs. Recent efforts have explored machine learning for loop invariant generation, but the lack of labeled data and the need for efficient generation are still troublesome. We consider that the advent of the large language model (LLM) presents a promising solution, since an LLM can analyze the separation logic assertions after symbolic execution to infer loop invariants. To overcome the data scarcity issue, we propose a self-supervised learning paradigm to fine-tune the LLM, using the split-and-reassembly of predicates to create an auxiliary task and generate rich synthetic data for offline training. Meanwhile, the proposed interactive system between the LLM and traditional verification tools provides an efficient online querying process for unseen programs. Our framework can readily extend to new data structures or multi-loop programs, since it only needs the definitions of different separation logic predicates, aiming to bridge the gap between existing capabilities and requirements of loop invariant generation in practical scenarios. Experiments across diverse memory-manipulating programs have demonstrated the performance of our proposed method compared to the baselines with respect to efficiency and effectiveness. Comment: Preprint, under review

arXiv.org e-Print Archive

Compositional Verification of Heap-Manipulating Programs through Property-Guided Learning

Author: Le Quang Loc
Pham Long H.
Sun Jun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/08/2019

Analyzing and verifying heap-manipulating programs automatically is challenging. A key to fighting the complexity is to develop compositional methods. For instance, many existing verifiers for heap-manipulating programs require a user-provided specification for each function in the program in order to decompose the verification problem. The requirement, however, often hinders users from applying such tools. To overcome the issue, we propose to automatically learn heap-related program invariants in a property-guided way for each function call. The invariants are learned based on the memory graphs observed during test execution and improved through memory graph mutation. We implemented a prototype of our approach and integrated it with two existing program verifiers. The experimental results show that our approach enhances existing verifiers effectively in automatically verifying complex heap-manipulating programs with multiple function calls.

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

Teeside University's Research Repository

Loop invariant synthesis in a combined abstract domain

Author: Berdine
Berdine
Calcagno
Chenguang Luo
Chin
Chin
Cousot
Cousot
Distefano
Furia
Guanhua He
Gulwani
Hackett
Ishtiaq
Kovacs
Kuncak
Leino
Leino
Magill
Nguyen
Nguyen
Parkinson
Popeea
Pugh
Sagiv
Shengchao Qin
Venet
Wei-Ngan Chin
Xin Chen
Yang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

Automated verification of memory safety and functional correctness for heap-manipulating programs has been a challenging task, especially when dealing with complex data structures with strong invariants involving both shape and numerical properties. Existing verification systems usually rely on users to supply annotations to guide the verification, which can be cumbersome and error-prone to write by hand and can significantly restrict the usability of the verification system. In this paper, we reduce the need for some user annotations by automatically inferring loop invariants over an abstract domain with both shape and numerical information. Our loop invariant synthesis is conducted automatically by a fixed-point iteration process, equipped with a newly designed abstraction mechanism, together with join and widening operators over the combined domain. We have also proven the soundness and termination of our approach. Initial experiments confirm that we can synthesise loop invariants with non-trivial constraints.

CiteSeerX

Crossref

Teeside University's Research Repository

ScholarBank@NUS

Program Analysis in A Combined Abstract Domain

Author: HE GUANHUA
Publication date: 01/01/2011

Automated verification of heap-manipulating programs is a challenging task due to the complexity of aliasing and mutability of data structures used in these programs. The properties of a number of important data structures relate not to one domain only but to multiple combined domains, as in sorted lists, priority queues, height-balanced trees and so on. The safety and sometimes the efficiency of programs rely on the properties of those data structures. This thesis focuses on developing a verification system for both functional correctness and memory safety of such programs which involve heap-based data structures. Two automated inference mechanisms are presented for heap-manipulating programs in this thesis. Firstly, an abstract-interpretation-based approach is proposed to synthesise program invariants in a combined pure and shape domain. Newly designed abstraction, join and widening operators have been defined for the combined domain. Furthermore, a compositional analysis approach is described to discover both pre- and post-conditions of programs with a bi-abduction technique in the combined domain. Both inference approaches have been implemented, and the obtained results validate the feasibility and precision of the proposed approaches. The outcomes of the thesis confirm that it is possible and practical to analyse heap-manipulating programs automatically and precisely by using abstract interpretation in a sophisticated combined domain.

Durham e-Theses