2,415 research outputs found
Recommended from our members
Software lock elision for x86 machine code
More than a decade after becoming a topic of intense research there is no
transactional memory hardware nor any examples of software transactional memory
use outside the research community. Using software transactional memory in large
pieces of software needs copious source code annotations and often means
that standard compilers and debuggers can no longer be used. At the same time,
overheads associated with software transactional memory fail to motivate
programmers to expend the needed effort to use software transactional
memory. The only way around the overheads in the case of general unmanaged code
is the anticipated availability of hardware support. On the other hand, architects
are unwilling to devote power and area budgets in mainstream microprocessors to
hardware transactional memory, pointing to transactional memory being a
"niche" programming construct. A deadlock has thus ensued that is blocking
transactional memory use and experimentation in the mainstream.
This dissertation covers the design and construction of a software transactional
memory runtime system called SLE_x86 that can potentially break this
deadlock by decoupling transactional memory from programs using it. Unlike most
other STM designs, the core design principle is transparency rather than
performance. SLE_x86 operates at the level of x86 machine code, thereby
becoming immediately applicable to binaries for the popular x86
architecture. The only requirement is that the binary synchronise using known
locking constructs or calls such as those in Pthreads or OpenMP
libraries. SLE_x86 provides speculative lock elision (SLE) entirely in
software, executing critical sections in the binary using transactional
memory. Optionally, the critical sections can also be executed without using
transactions by acquiring the protecting lock.
The dissertation makes a careful analysis of the impact on performance due to
the demands of the x86 memory consistency model and the need to transparently
instrument x86 machine code. It shows that both of these problems can be
overcome to reach a reasonable level of performance, where transparent
software transactional memory can perform better than a lock. SLE_x86 can
ensure that programs are ready for transactional memory in any form, without
being explicitly written for it
C-FLAT: Control-FLow ATtestation for Embedded Systems Software
Remote attestation is a crucial security service particularly relevant to
increasingly popular IoT (and other embedded) devices. It allows a trusted
party (verifier) to learn the state of a remote, and potentially
malware-infected, device (prover). Most existing approaches are static in
nature and only check whether benign software is initially loaded on the
prover. However, they are vulnerable to run-time attacks that hijack the
application's control or data flow, e.g., via return-oriented programming or
data-oriented exploits. As a concrete step towards more comprehensive run-time
remote attestation, we present the design and implementation of Control- FLow
ATtestation (C-FLAT) that enables remote attestation of an application's
control-flow path, without requiring the source code. We describe a full
prototype implementation of C-FLAT on Raspberry Pi using its ARM TrustZone
hardware security extensions. We evaluate C-FLAT's performance using a
real-world embedded (cyber-physical) application, and demonstrate its efficacy
against control-flow hijacking attacks.Comment: Extended version of article to appear in CCS '16 Proceedings of the
23rd ACM Conference on Computer and Communications Securit
Link-time smart card code hardening
This paper presents a feasibility study to protect smart card software against fault-injection attacks by means of link-time code rewriting. This approach avoids the drawbacks of source code hardening, avoids the need for manual assembly writing, and is applicable in conjunction with closed third-party compilers. We implemented a range of cookbook code hardening recipes in a prototype link-time rewriter and evaluate their coverage and associated overhead to conclude that this approach is promising. We demonstrate that the overhead of using an automated link-time approach is not significantly higher than what can be obtained with compile-time hardening or with manual hardening of compiler-generated assembly code
Aikido: Accelerating shared data dynamic analyses
Despite a burgeoning demand for parallel programs, the tools available to developers working on shared-memory multicore processors have lagged behind. One reason for this is the lack of hardware support for inspecting the complex behavior of these parallel programs. Inter-thread communication, which must be instrumented for many types of analyses, may occur with any memory operation. To detect such thread communication in software, many existing tools require the instrumentation of all memory operations, which leads to significant performance overheads. To reduce this overhead, some existing tools resort to random sampling of memory operations, which introduces false negatives. Unfortunately, neither of these approaches provide the speed and accuracy programmers have traditionally expected from their tools. In this work, we present Aikido, a new system and framework that enables the development of efficient and transparent analyses that operate on shared data. Aikido uses a hybrid of existing hardware features and dynamic binary rewriting to detect thread communication with low overhead. Aikido runs a custom hypervisor below the operating system, which exposes per-thread hardware protection mechanisms not available in any widely used operating system. This hybrid approach allows us to benefit from the low cost of detecting memory accesses with hardware, while maintaining the word-level accuracy of a software-only approach. To evaluate our framework, we have implemented an Aikido-enabled vector clock race detector. Our results show that the Aikido enabled race-detector outperforms existing techniques that provide similar accuracy by up to 6.0x, and 76% on average, on the PARSEC benchmark suite.National Science Foundation (U.S.) (NSF grant CCF-0832997)National Science Foundation (U.S.) (DOE SC0005288)United States. Defense Advanced Research Projects Agency (DARPA HR0011-10- 9-0009
Automatic binary patching for flaws repairing using static rewriting and reverse dataflow analysis
Tese de Mestrado, Segurança Informática, 2022, Universidade de Lisboa, Faculdade de CiênciasThe C programming language is widely used in embedded systems, kernel and hardware programming, making it one of the most commonly used programming languages. However, C lacks of boundary verification of variables, making it one of the most vulnerable languages. Because of this and associated with its high usability, it is also the language with most reported vulnerabilities in the past ten years, being the memory corruption the most common type of vulnerabilities, specifically buffer overflows. These vulnerabilities when exploited can produce critical consequences, being thus extremely important not only to correctly identify these vulnerabilities but also to properly fix them. This work aims to study buffer overflow vulnerabilities in C binary programs by identifying possible malicious inputs that can trigger such vulnerabilities and finding their root
cause in order to mitigate the vulnerabilities by rewriting the binary assembly code and thus generating a new binary without the original flaw. The main focus of this thesis is the use of binary patching to automatically fix stack overflow vulnerabilities and validate its effectiveness while ensuring that these do not add new
vulnerabilities. Working with the binary code of applications and without accessing their source code is a challenge because any required change to its binary code (i.e, assembly) needs to take into consideration that new instructions must be allocated, and this typically means that existing instructions will need to be moved to create room for new ones and recover the control flow information, otherwise the application would be compromised. The approach we propose to address this problem was successfully implemented in a tool
and evaluated with a set of test cases and real applications. The evaluation results showed that the tool was effective in finding vulnerabilities, as well as in patching them
Synthesizing Short-Circuiting Validation of Data Structure Invariants
This paper presents incremental verification-validation, a novel approach for
checking rich data structure invariants expressed as separation logic
assertions. Incremental verification-validation combines static verification of
separation properties with efficient, short-circuiting dynamic validation of
arbitrarily rich data constraints. A data structure invariant checker is an
inductive predicate in separation logic with an executable interpretation; a
short-circuiting checker is an invariant checker that stops checking whenever
it detects at run time that an assertion for some sub-structure has been fully
proven statically. At a high level, our approach does two things: it statically
proves the separation properties of data structure invariants using a static
shape analysis in a standard way but then leverages this proof in a novel
manner to synthesize short-circuiting dynamic validation of the data
properties. As a consequence, we enable dynamic validation to make up for
imprecision in sound static analysis while simultaneously leveraging the static
verification to make the remaining dynamic validation efficient. We show
empirically that short-circuiting can yield asymptotic improvements in dynamic
validation, with low overhead over no validation, even in cases where static
verification is incomplete
How to Do a Million Watchpoints: Efficient Debugging Using Dynamic Instrumentation
Application debugging is a tedious but inevitable chore in any software development project. An effective debugger can make programmers more productive by allowing them to pause execution and inspect the state of the process, or monitor writes to memory to detect data corruption. The latter is a notoriously difficult category of bugs to diagnose and repair especially in pointer-heavy applications. The debugging challenges will increase with the arrival of multicore processors which require explicit parallelization of the user code to get any performance gains. Parallelization in turn can lead to more data debugging issues such as the detection of data races between threads. This paper leverages the increasing efficiency of runtime binary interpreters to provide a new concept of Efficient Debugging using Dynamic Instrumentation, or EDDI. The paper demonstrates for the first time the feasibility of using dynamic instrumentation on demand to accelerate software debuggers, especially when the available hardware support is lacking or inadequate. As an example, EDDI can simultaneously monitor millions of memory locations, without crippling the host processing platform. It does this in software and hence provides a portable debugging environment. It is also well suited for interactive debugging because of the low associated overheads. EDDI provides a scalable and extensible debugging framework that can substantially increase the feature set of standard off the shelf debuggers.Singapore-MIT Alliance (SMA
- …