44 research outputs found

    Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript

    In static analysis of modern JavaScript libraries, relational analysis at key locations is critical for sound and useful results. Prior work addresses this challenge with various forms of trace partitioning and syntactic patterns, which are fragile and do not scale well, or with complex backwards analysis. In this paper, we propose a new lightweight variant of trace partitioning named value partitioning that refines individual abstract values instead of entire abstract states. We describe how this approach can effectively capture important relational properties involving dynamic property accesses, functions with free variables, and predicate functions. Furthermore, we extend an existing JavaScript analyzer with value partitioning and demonstrate experimentally that it is a simple, precise, and efficient alternative to the existing approaches for analyzing widely used JavaScript libraries.
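
    As a rough illustration of the idea (not the paper's implementation), the following Python sketch contrasts a partitioned abstract value with its joined, unpartitioned counterpart for a dynamic property access obj[p]; the PartitionedValue class and the toy domain of concrete-value sets are assumptions made for this example.

        # Value partitioning in miniature: rather than splitting the whole
        # abstract state per trace, one abstract value is split into
        # partitions keyed by a related value, here the property name in a
        # dynamic property access obj[p].

        class PartitionedValue:
            """Hypothetical abstract value refined per partition key."""
            def __init__(self):
                self.partitions = {}          # key -> set of possible values

            def add(self, key, value):
                self.partitions.setdefault(key, set()).add(value)

            def refine(self, key):
                # Once the analysis learns p == key, only that partition
                # survives, capturing the relation between p and obj[p].
                return self.partitions.get(key, set())

            def join(self):
                # Without partitioning, all partitions collapse together.
                return set().union(*self.partitions.values())

        # Model of v = obj[p] for obj = {get: f, set: g} and p unknown.
        v = PartitionedValue()
        v.add("get", "f")
        v.add("set", "g")
        print(v.refine("get"))   # {'f'}: relational, p == "get" implies v is f
        print(v.join())          # {'f', 'g'}: the imprecise unpartitioned view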

    Probabilistic Naming of Functions in Stripped Binaries

    Debugging symbols in binary executables carry the names of functions and global variables. When present, they greatly simplify the process of reverse engineering, but they are almost always removed (stripped) for deployment. We present the design and implementation of punstrip, a tool that combines a probabilistic fingerprint of binary code based on high-level features with a probabilistic graphical model to learn the relationship between function names and program structure. As there are many naming conventions and developer styles, functions from different applications do not necessarily have the exact same name, even if they implement the exact same functionality. We therefore evaluate punstrip across three levels of name matching: exact; an approach based on natural language processing of name components; and using Symbol2Vec, a new embedding of function names based on random walks of function call graphs. We show that our approach is able to recognize functions compiled across different compilers and optimization levels and then demonstrate that punstrip can predict semantically similar function names based on code structure. We evaluate our approach over open source C binaries from the Debian Linux distribution and compare against the state of the art.
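
    A small sketch of the Symbol2Vec idea under stated assumptions: the call graph, function names, and walk parameters below are invented, and a real pipeline would feed the walks to a skip-gram embedding model and combine the result with punstrip's probabilistic fingerprints, which this fragment does not model.

        import random

        # Random walks over a function call graph produce "sentences" of
        # function names; names appearing in similar calling contexts would
        # receive nearby embedding vectors.
        call_graph = {
            "main": ["parse_args", "run"],
            "run": ["read_input", "process", "write_output"],
            "process": ["checksum", "transform"],
            # leaves point back to main so walks never get stuck
            "parse_args": ["main"], "read_input": ["main"],
            "write_output": ["main"], "checksum": ["main"], "transform": ["main"],
        }

        def random_walk(graph, start, length, rng):
            node, walk = start, [start]
            for _ in range(length - 1):
                node = rng.choice(graph[node])
                walk.append(node)
            return walk

        rng = random.Random(0)
        corpus = [random_walk(call_graph, f, 8, rng)
                  for f in call_graph for _ in range(10)]
        print(corpus[0])   # one training "sentence" for the embedding model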

    Cautiously Optimistic Program Analyses for Secure and Reliable Software

    Modern computer systems still have various security and reliability vulnerabilities. Well-known dynamic analysis solutions can mitigate them using runtime monitors that serve as lifeguards. But the additional work of enforcing these security and safety properties incurs exorbitant performance costs, and such tools are rarely used in practice. Our work addresses this problem with a novel technique: Cautiously Optimistic Program Analysis (COPA). COPA is optimistic: it infers likely program invariants from dynamic observations and assumes them in its static reasoning to precisely identify and elide wasteful runtime monitors. The resulting system is fast, but also ensures soundness by recovering to a conservatively optimized analysis on the rare occasions when a likely invariant fails at runtime. COPA is also cautious: by carefully restricting optimizations to only safe elisions, the recovery is greatly simplified. It avoids unbounded rollbacks upon recovery, thereby enabling analysis for live production software. We demonstrate the effectiveness of Cautiously Optimistic Program Analyses in three areas.
    Information-Flow Tracking (IFT) can help prevent security breaches and information leaks, but it is rarely used in practice due to its high performance overhead (>500% for web/email servers). COPA dramatically reduces this cost by eliding wasteful IFT monitors, making IFT practical (9% overhead, 4x speedup).
    Automatic Garbage Collection (GC) in managed languages (e.g. Java) simplifies programming tasks while ensuring memory safety. However, there is no correct GC for weakly-typed languages (e.g. C/C++), and manual memory management is prone to errors that have been exploited in high-profile attacks. We develop the first sound GC for C/C++ and use COPA to optimize its performance (16% overhead).
    Sequential Consistency (SC) provides intuitive semantics to concurrent programs, which simplifies reasoning about their correctness. However, ensuring SC behavior on commodity hardware remains expensive. We use COPA to ensure SC for Java at the language level efficiently, significantly reducing its cost (from 24% down to 5% on x86).
    COPA provides a way to realize strong software security, reliability, and semantic guarantees at practical costs.
    PhD dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/170027/1/subarno_1.pd
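
    The core optimistic pattern can be sketched in a few lines; everything below (the monitor, the invariant, the function names) is a made-up stand-in rather than COPA's actual machinery, and in a real system the recovery happens inside the analysis runtime rather than via an explicit branch.

        # Cautiously optimistic monitoring: a profile-derived likely
        # invariant is checked cheaply on the fast path; when it holds, the
        # expensive monitor has been statically shown redundant and is
        # elided, and when it fails we recover to the conservative,
        # fully-monitored path with no rollback.

        def expensive_monitor(data):
            # stand-in for, e.g., full information-flow tracking
            assert all(isinstance(x, int) for x in data)

        def process_monitored(data):
            expensive_monitor(data)          # conservative path
            return sum(data)

        def process_copa(data):
            if len(data) < 1024:             # cheap likely-invariant guard
                return sum(data)             # monitor elided (fast path)
            return process_monitored(data)   # recovery: conservative path

        print(process_copa([1, 2, 3]))       # fast path taken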

    Scalability-First Pointer Analysis with Self-Tuning Context-Sensitivity

    Context-sensitivity is important in pointer analysis to ensure high precision, but existing techniques suffer from unpredictable scalability. Many variants of context-sensitivity exist, and it is difficult to choose one that leads to reasonable analysis time and obtains high precision, without running the analysis multiple times. We present the Scaler framework that addresses this problem. Scaler efficiently estimates the amount of points-to information that would be needed to analyze each method with different variants of context-sensitivity. It then selects an appropriate variant for each method so that the total amount of points-to information is bounded, while utilizing the available space to maximize precision. Our experimental results demonstrate that Scaler achieves predictable scalability for all the evaluated programs (e.g., speedups can reach 10x for 2-object-sensitivity), while providing a precision that matches or even exceeds that of the best alternative techniques.
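
    A toy rendition of the selection step, with invented methods, variants, and size estimates: each method starts at the cheapest variant and is greedily upgraded to the most precise variant that still fits the global points-to budget. Scaler's real cost estimation is far more involved than this table lookup.

        # Variants ordered from most to least precise.
        VARIANTS = ["2obj", "2type", "1type", "ci"]

        estimates = {   # method -> variant -> estimated points-to facts
            "m1": {"2obj": 9000, "2type": 2000, "1type": 800, "ci": 100},
            "m2": {"2obj": 500, "2type": 300, "1type": 200, "ci": 50},
            "m3": {"2obj": 80000, "2type": 10000, "1type": 1500, "ci": 200},
        }

        def choose(estimates, budget):
            choice = {m: "ci" for m in estimates}        # cheapest baseline
            total = sum(estimates[m]["ci"] for m in estimates)
            for m in sorted(estimates, key=lambda m: estimates[m]["2obj"]):
                for v in VARIANTS:                       # most precise first
                    delta = estimates[m][v] - estimates[m][choice[m]]
                    if total + delta <= budget:
                        total += delta
                        choice[m] = v
                        break
            return choice, total

        # With a budget of 15000 facts, m1 and m2 get 2-object-sensitivity
        # while the heavyweight m3 is held at 1-type-sensitivity.
        print(choose(estimates, budget=15000))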

    Doctor of Philosophy

    Aggressive random testing tools, or fuzzers, are impressively effective at finding bugs in compilers and programming language runtimes. For example, a single test-case generator has resulted in more than 460 bugs reported for a number of production-quality C compilers. However, fuzzers can be hard to use. The first problem is that failures triggered by random test cases can be difficult to debug because these tests are often large. To report a compiler bug, one must often construct a small test case that triggers the bug. The existing automated test-case reduction technique, delta debugging, is not sufficient to produce small, reportable test cases. A second problem is that fuzzers are indiscriminate: they repeatedly find bugs that may not be severe enough to fix right away. Third, fuzzers tend to generate a large number of test cases that only trigger a few bugs. Some bugs are triggered much more frequently than others, creating needle-in-the-haystack problems. Currently, users rule out undesirable test cases using ad hoc methods such as disallowing problematic features in tests and filtering test results.
    This dissertation investigates approaches to improving the utility of compiler fuzzers. Two components, an aggressive test-case reducer and a tamer, are added to the fuzzing workflow to make the fuzzer more user friendly. We introduce C-Reduce, an aggressive test-case reducer for C/C++ programs, which exploits rich domain-specific knowledge to output test cases nearly as good as those produced by skilled humans. This reducer produces outputs that are, on average, more than 30 times smaller than those produced by the existing reducer most commonly used by compiler engineers. Second, this dissertation formulates and addresses the fuzzer taming problem: given a potentially large number of random test cases that trigger failures, order them such that diverse, interesting test cases are highly ranked. Bug triage can be effectively automated, relying on techniques from machine learning to suppress duplicate bug-triggering test cases and test cases triggering known bugs. An evaluation shows the ability of this tool to solve the fuzzer taming problem for 3,799 test cases triggering 46 bugs in a C compiler.
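
    The reduction loop at the heart of delta debugging, which C-Reduce extends with rich C/C++-specific transformations, fits in a few lines; the chunk schedule and the toy "bug" predicate below are illustrative choices, not C-Reduce's actual algorithm.

        # Greedily delete chunks of the test case, keeping any deletion
        # that still triggers the failure ("interesting"), and halve the
        # chunk size until single elements have been tried.
        def reduce_test(test, interesting):
            chunk = len(test) // 2
            while chunk >= 1:
                i = 0
                while i < len(test):
                    candidate = test[:i] + test[i + chunk:]
                    if candidate and interesting(candidate):
                        test = candidate      # smaller test still fails
                    else:
                        i += chunk            # chunk was needed; move on
                chunk //= 2
            return test

        # Toy bug: triggered whenever both 'a' and 'b' survive.
        bug = lambda t: "a" in t and "b" in t
        print(reduce_test(list("xxaxxyybzz"),
                          lambda t: bug("".join(t))))   # ['a', 'b']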

    UniASM: Binary Code Similarity Detection without Fine-tuning

    Binary code similarity detection (BCSD) is widely used in various binary analysis tasks such as vulnerability search, malware detection, clone detection, and patch analysis. Recent studies have shown that learning-based binary code embedding models perform better than traditional feature-based approaches. In this paper, we propose a novel transformer-based binary code embedding model, named UniASM, to learn representations of binary functions. We design two new training tasks to make the spatial distribution of the generated vectors more uniform, so they can be used directly in BCSD without any fine-tuning. In addition, we propose a new tokenization approach for binary functions that increases the tokens' semantic information while mitigating the out-of-vocabulary (OOV) problem. The experimental results show that UniASM outperforms state-of-the-art (SOTA) approaches on the evaluation dataset: the average recall@1 scores on cross-compiler, cross-optimization-level, and cross-obfuscation tasks are 0.72, 0.63, and 0.77, respectively, higher than existing SOTA baselines. In a real-world task of known-vulnerability searching, UniASM outperforms all the current baselines.
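
    To make the recall@1 numbers concrete, here is how that metric is typically computed for BCSD, with tiny made-up 2-dimensional embeddings standing in for what UniASM's transformer would actually produce for each function.

        import math

        def cosine(u, v):
            dot = sum(a * b for a, b in zip(u, v))
            nu = math.sqrt(sum(a * a for a in u))
            nv = math.sqrt(sum(b * b for b in v))
            return dot / (nu * nv)

        # The same three functions compiled two ways (e.g., -O0 vs. -O2).
        pool_a = {"memcpy": [0.9, 0.1], "sha256": [0.1, 0.95], "qsort": [0.7, 0.7]}
        pool_b = {"memcpy": [0.85, 0.2], "sha256": [0.2, 0.9], "qsort": [0.6, 0.75]}

        # A query scores a hit when its nearest neighbour in the other
        # pool is its own counterpart.
        hits = sum(
            max(pool_b, key=lambda n: cosine(q, pool_b[n])) == name
            for name, q in pool_a.items()
        )
        print("recall@1 =", hits / len(pool_a))   # 1.0 on this toy data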

    Dynamic analysis for concurrent modern C/C++ applications

    Concurrent programs are executed by multiple threads that run simultaneously. While this allows programs to run more efficiently by utilising multiple processors, it brings with it numerous complications. For example, a program may behave unpredictably or erroneously when multiple threads modify the same memory location in an uncoordinated manner. Issues such as this are difficult to avoid, and when introduced, can break the program in unpredictable ways. Programmers therefore often turn towards automated tools to aid in the detection of concurrency bugs. The work presented in this thesis aims to provide methods to aid in the creation of tools for finding and explaining concurrency bugs. In particular, the following studies have been conducted:
    Dynamic race detection for C/C++11. With the introduction of a weak memory model in C++, many tools that provide dynamic race detection have become outdated and are unable to adequately identify data races. This work updates an existing data race detection algorithm so that it can identify data races according to this new definition. A method for allowing programs to explore many of the weak behaviours that this new memory model permits is also provided.
    Record and replay. Much work has gone into record and replay; however, most of it is focussed on whole-system replay, whereby a tool aims to record as much of the program execution as possible. Contrasting this, the work presented here aims to record as little as possible. This sparse approach has many interesting implications: some programs that were previously out of reach for record and replay become tractable, and vice versa. To back this up, controlled scheduling is introduced that is capable of applying different scheduling strategies, which combined with record and replay is beneficial for helping to root out bugs.
    Tool support. Both of the above techniques have been implemented in a tool, tsan11rec, which builds on the tsan dynamic race detection tool. A large experimental evaluation is presented, investigating the effectiveness of the enhanced data race detection algorithm when applied to the Firefox and Chromium web browsers, and of the novel approach to record and replay when applied to a diverse set of concurrent applications.
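
    A minimal flavour of the underlying race check, assuming plain happens-before vector clocks; the C/C++11 work in the thesis additionally models release/acquire synchronisation and weak behaviours, which this fragment deliberately omits, and the access representation here is invented.

        # Two accesses to the same location race if neither happens
        # before the other and at least one of them is a write.
        def happens_before(vc_a, vc_b):
            return all(a <= b for a, b in zip(vc_a, vc_b)) and vc_a != vc_b

        def races(access_a, access_b):
            (vc_a, write_a), (vc_b, write_b) = access_a, access_b
            ordered = happens_before(vc_a, vc_b) or happens_before(vc_b, vc_a)
            return not ordered and (write_a or write_b)

        # Unsynchronised writes by two threads: a data race.
        print(races(([1, 0], True), ([0, 1], True)))    # True
        # A read that has synchronised with the write: no race.
        print(races(([1, 0], True), ([2, 1], False)))   # False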

    Mutation Testing Advances: An Analysis and Survey
