13 research outputs found
Using a decompiler for real-world source recovery
Despite their 40 year history, native executable decompilers have found very limited practical application in commercial projects. The success of Java decompilers is well known, and a few decompilers perform well by recognising patterns from specific compilers. This paper describes the experience gained from applying a native executable decompiler, assisted by a commercial disassembler and hand editing, to a real-world Windows-based application. The clients had source code for a prototype version of the program, and an executable that performed better, for which the source code was not available. The project was to recover the algorithm at the core of the program, and if time permitted, the recovery of other pieces of source code. Despite the difficulties, the core algorithm was successfully decompiled, and a portion of the rest of the program as well. There were surprises, including the ability to recover almost all original class names, and the complete class hierarchy
Precise static analysis of untrusted driver binaries
Most closed source drivers installed on desktop systems today have never been exposed to formal analysis. Without vendor support, the only way to make these often hastily written, yet critical programs accessible to static analysis is to directly work at the binary level. In this paper, we describe a full architecture to perform static analysis on binaries that does not rely on unsound external components such as disassemblers. To precisely calculate data and function pointers without any type information, we introduce Bounded Address Tracking, an abstract domain that is tailored towards machine code and is path sensitive up to a tunable bound assuring termination. We implemented Bounded Address Tracking in our binary analysis platform Jakstab and used it to verify API specifications on several Windows device drivers. Even without assumptions about executable layout and procedures as made by state of the art approaches, we achieve more precise results on a set of drivers from the Windows DDK. Since our technique does not require us to compile drivers ourselves, we also present results from analyzing over 300 closed source drivers
Rapid Parallelization by Collaboration
The widespread adoption of Chip Multiprocessors has renewed the emphasis on the use of parallelism to improve performance. The present and growing diversity in hardware architectures and software environments, however, continues to pose difficulties in the effective use of parallelism thus delaying a quick and smooth transition to the concurrency era. In this document, we describe the research being conducted at the Computer Science Department at Columbia University on a system called COMPASS that aims to simplify this transition by providing advice to programmers considering parallelizing their code. The advice proffered to the programmer is based on the wisdom collected from programmers who have already parallelized some code. The utility of COMPASS rests, not only on its ability to collect the wisdom unintrusively but also on its ability to automatically seek, find and synthesize this wisdom into advice that is tailored to the code the user is considering parallelizing and to the environment in which the optimized program will execute in. COMPASS provides a platform and an extensible framework for sharing human expertise about code parallelization -- widely and on diverse hardware and software. By leveraging the "Wisdom of Crowds" model which has been conjunctured to scale exponentially and which has successfully worked for Wikis, COMPASS aims to enable rapid parallelization of code and thus continue to extend the benefits for Moore's law scaling to science and society
An Abstract Interpretation-Based Framework for Control Flow Reconstruction from Binaries
Due to indirect branch instructions, analyses on executables commonly suffer from the problem that a complete control flow graph of the program is not available. Data flow analysis has been proposed before to statically determine branch targets in many cases, yet a generic strategy without assumptions on compiler idioms or debug information is lacking. We have devised an abstract interpretation-based framework for generic low level programs with indirect jumps which safely combines a pluggable abstract domain with the notion of partial control flow graphs. Using our framework, we are able to show that the control flow reconstruction algorithm of our disassembly tool Jakstab produces the most precise overapproximation of the control flow graph with respect to the used abstract domain
Identifying Authorship Style in Malicious Binaries: Techniques, Challenges & Datasets
Attributing a piece of malware to its creator typically requires threat intelligence. Binary attribution increases the level of difficulty as it mostly relies upon the ability to disassemble binaries to identify authorship style. Our survey explores malicious author style and the adversarial techniques used by them to remain anonymous. We examine the adversarial impact on the state-of-the-art methods. We identify key findings and explore the open research challenges. To mitigate the lack of ground truth datasets in this domain, we publish alongside this survey the largest and most diverse meta-information dataset of 15,660 malware labeled to 164 threat actor groups
Legal Aspects of Software Interoperability
Práce se zabĂ˝vá právnĂmi aspekty interoperability software. Interoperability software je dosahováno pomocĂ rozhranĂ, kterĂ© je součástĂ poÄŤĂtaÄŤovĂ©ho programu. PoÄŤĂtaÄŤovĂ˝ program spadá pod ochranu autorskĂ©ho práva. Tato práce se zabĂ˝vá rozsahem autorskoprávnĂ ochrany rozhranĂ a moĹľnostmi pĹ™Ăstupu k nechránÄ›nĂ˝m rozhranĂm. V práci jsou poměřovány zájmy autora na ochranu dĂla proti zájmĹŻm veĹ™ejnĂ˝m a navrhnuta Ĺ™ešenĂ, která by vedla k zajištÄ›nĂ vÄ›tšà mĂry interoperability, respektive sdĂlenĂ informacĂ nezbytnĂ˝ch pro jejĂ zajištÄ›nĂ.This work addresses the legal aspects of software interoperability. Interoperability of software is achieved through an interface that is part of a computer program. The computer program is protected by copyright. The work deals with the scope of interface protection and the possibilities of access to unprotected interfaces. It measures the interests of the author in protecting the work against the public interests and proposes solutions that would be able to provide a greater degree of interoperability or sharing the information necessary to ensure it
Trusted execution: applications and verification
Useful security properties arise from sealing data to specific units of code. Modern processors featuring Intel’s TXT and AMD’s SVM achieve this by a process of measured and trusted execution. Only code which has the correct measurement can access the data, and this code runs in an environment trusted from observation and interference.
We discuss the history of attempts to provide security for hardware platforms, and review the literature in the field. We propose some applications which would benefit from use of trusted execution, and discuss functionality enabled by trusted execution. We present in more detail a novel variation on Diffie-Hellman key exchange which removes some reliance on random number generation.
We present a modelling language with primitives for trusted execution, along with its semantics. We characterise an attacker who has access to all the capabilities of the hardware. In order to achieve automatic analysis of systems using trusted execution without attempting to search a potentially infinite state space, we define transformations that reduce the number of times the attacker needs to use trusted execution to a pre-determined bound. Given reasonable assumptions we prove the soundness of the transformation: no secrecy attacks are lost by applying it. We then describe using the StatVerif extensions to ProVerif to model the bounded invocations of trusted execution. We show the analysis of realistic systems, for which we provide case studies