11 research outputs found
CARVE: Practical Security-Focused Software Debloating Using Simple Feature Set Mappings
Software debloating is an emerging field of study aimed at improving the
security and performance of software by removing excess library code and
features that are not needed by the end user (called bloat). Software bloat is
pervasive, and several debloating techniques have been proposed to address this
problem. While these techniques are effective at reducing bloat, they are not
practical for the average user, risk creating unsound programs and introducing
vulnerabilities, and are not well suited for debloating complex software such
as network protocol implementations. In this paper, we propose CARVE, a simple
yet effective security-focused debloating technique that overcomes these
limitations. CARVE employs static source code annotation to map software
features to source code, eliminating the need for advanced software analysis
during debloating and reducing the overall level of technical sophistication
required by the user. CARVE surpasses existing techniques by introducing
debloating with replacement, a technique capable of preserving software
interoperability and mitigating the risk of creating an unsound program or
introducing a vulnerability. We evaluate CARVE in 12 debloating scenarios and
demonstrate security and performance improvements that meet or exceed those of
existing techniques. Comment: 8 pages, 4 figures, 2 tables, 1 appendi
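The idea of mapping features to annotated source regions, and of swapping a removed region for a safe substitute, can be illustrated with a minimal sketch. The marker syntax (`FEATURE_BEGIN`, `FEATURE_END`, `REPLACE`) and the function name `debloat` are hypothetical and chosen only for this illustration; the abstract does not specify CARVE's actual annotation format.

```python
import re

# Hypothetical annotation markers (assumption): the real CARVE syntax may differ.
BEGIN = re.compile(r"^(\s*)//\s*FEATURE_BEGIN\s+(\w+)(?:\s+REPLACE\s+(.*?))?\s*$")
END = re.compile(r"^\s*//\s*FEATURE_END\s+(\w+)\s*$")


def debloat(source: str, remove: set) -> str:
    """Drop annotated regions whose feature name is in `remove`.

    If the opening marker carries replacement text, that text is emitted
    in place of the removed region (a stand-in for "debloating with
    replacement"). Regions of features that are kept pass through unchanged.
    """
    out = []
    skipping = None  # name of the feature region currently being removed
    for line in source.splitlines():
        if skipping is not None:
            end = END.match(line)
            if end and end.group(1) == skipping:
                skipping = None
            continue  # drop every line inside the removed region
        begin = BEGIN.match(line)
        if begin and begin.group(2) in remove:
            skipping = begin.group(2)
            if begin.group(3):
                # Keep the marker's indentation for the replacement line.
                out.append(begin.group(1) + begin.group(3))
            continue
        out.append(line)
    if skipping is not None:
        raise ValueError(f"unterminated feature region: {skipping}")
    return "\n".join(out)


SAMPLE = """int handle(int op) {
    // FEATURE_BEGIN legacy_auth REPLACE return -1; /* feature removed */
    if (op == 1) return legacy_auth();
    // FEATURE_END legacy_auth
    return dispatch(op);
}"""

if __name__ == "__main__":
    print(debloat(SAMPLE, {"legacy_auth"}))
```

Because the mapping lives in the annotations, removing a feature in this sketch needs only a line-by-line pass and no program analysis, which is the property the abstract emphasizes.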
SYSFLOW: Efficient Execution Platform for IoT Devices
Traditional executable delivery models pose challenges for IoT devices with
limited storage, necessitating the download of complete executables and
dependencies. Network solutions like NFS, designed for data files, encounter
high IO overhead for irregular access patterns. This paper introduces SYSFLOW,
a lightweight network-based executable delivery system for IoT. SYSFLOW
delivers executables on demand, redirecting local disk IO to the server through optimized
network IO. To optimize cache hit rates, SYSFLOW employs server-side
action-based prefetching, reducing latency by 45.1% to 75.8% compared to native
Linux filesystems on SD cards. In wired environments, SYSFLOW's latency is up
to 67.7% lower than NFS. In wireless scenarios, SYSFLOW performs 22.9% worse
than Linux, which remains comparable, and outperforms NFS by up to 60.7%. While
SYSFLOW's power consumption may be 6.7% higher than NFS, it offers energy
savings due to lower processing time.
Automating Seccomp Filter Generation for Linux Applications
Software vulnerabilities undermine the security of
applications. By blocking unused functionality, the impact of potential
exploits can be reduced. While seccomp provides a solution for filtering
syscalls, it requires manual implementation of filter rules for each individual
application. Recent work has investigated automated approaches for detecting
and installing the necessary filter rules. However, as we show, these
approaches make assumptions that are not necessary or require overly
time-consuming analysis.
In this paper, we propose Chestnut, an automated approach for generating
strict syscall filters for Linux userspace applications with lower requirements
and limitations. Chestnut comprises two phases, with the first phase consisting
of two static components, i.e., a compiler and a binary analyzer, that extract
the used syscalls during compilation or in an analysis of the binary. The
compiler-based approach of Chestnut is up to 73 times faster than previous
approaches without affecting the accuracy adversely. On the binary analysis
level, we demonstrate that the requirement of position-independent binaries of
related work is not needed, enlarging the set of applications for which
Chestnut is usable. In an optional second phase, Chestnut provides a dynamic
refinement tool that allows restricting the set of allowed syscalls further. We
demonstrate that Chestnut on average blocks 302 syscalls (86.5%) via the
compiler and 288 (82.5%) using the binary-level analysis on a set of 18 widely
used applications. We found that Chestnut blocks the dangerous exec syscall in
50% and 77.7% of the tested applications using the compiler- and binary-based
approach, respectively. For the tested applications, Chestnut prevents
exploitation of more than 62% of the 175 CVEs that target the kernel via
syscalls. Finally, we perform a 6-month long-term study of a sandboxed Nginx
server.
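Once the set of used syscalls is known, turning it into a strict allowlist is mechanical. The sketch below assumes the extraction step (compiler pass or binary analysis) has already produced a list of names; the function `build_profile`, the toy syscall list, and the JSON layout (modeled on the style of container seccomp profiles) are assumptions for illustration, not Chestnut's actual output format.

```python
import json

# Toy stand-in (assumption) for the full syscall table of a real kernel.
ALL_SYSCALLS = [
    "read", "write", "openat", "close",
    "execve", "ptrace", "mount", "exit_group",
]


def build_profile(used_syscalls, all_syscalls):
    """Return (profile, blocked) for a deny-by-default allowlist.

    profile: dict that allows only the used syscalls
    blocked: sorted list of syscalls the profile would reject
    """
    used = set(used_syscalls)
    unknown = used - set(all_syscalls)
    if unknown:
        raise ValueError(f"unknown syscalls: {sorted(unknown)}")
    blocked = sorted(set(all_syscalls) - used)
    profile = {
        "defaultAction": "SCMP_ACT_ERRNO",  # reject anything not listed
        "syscalls": [{"names": sorted(used), "action": "SCMP_ACT_ALLOW"}],
    }
    return profile, blocked


if __name__ == "__main__":
    profile, blocked = build_profile(["read", "write", "exit_group"], ALL_SYSCALLS)
    print(json.dumps(profile, indent=2))
    share = 100 * len(blocked) / len(ALL_SYSCALLS)
    print(f"blocked {len(blocked)} of {len(ALL_SYSCALLS)} syscalls ({share:.1f}%)")
```

The blocked share computed at the end is the same kind of metric the abstract reports, here over a toy table rather than a real kernel's syscall list.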
Coverage-Based Debloating for Java Bytecode
Software bloat is code that is packaged in an application but is actually not
necessary to run the application. The presence of software bloat is an issue
for security, for performance, and for maintenance. In this paper, we introduce
a novel technique for debloating Java bytecode, which we call coverage-based
debloating. We leverage a combination of state-of-the-art Java bytecode
coverage tools to precisely capture what parts of a project and its
dependencies are used at runtime. Then, we automatically remove the parts that
are not covered to generate a debloated version of the compiled project. We
successfully generate debloated versions of 220 open-source Java libraries,
which are syntactically correct and preserve their original behavior according
to the workload. Our results indicate that 68.3% of the libraries' bytecode and
20.5% of their total dependencies can be removed through coverage-based
debloating. Meanwhile, we present the first experiment that assesses the
utility of debloated libraries with respect to client applications that reuse
them. We show that 80.9% of the clients with at least one test that uses the
library successfully compile and pass their test suite when the original
library is replaced by its debloated version.
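The core step, keeping what the workload covers and removing the rest, can be sketched on a simplified model. Here a project is a plain dictionary of class names to method names and coverage is a set of (class, method) pairs; both representations and the function names are assumptions for illustration, since the actual technique operates on Java bytecode with dedicated coverage tools.

```python
def debloat_by_coverage(project, covered):
    """Keep only covered methods and drop classes with no covered method.

    project: dict mapping class name -> list of method names
    covered: set of (class name, method name) pairs observed at runtime
    """
    kept = {}
    for cls, methods in project.items():
        live = [m for m in methods if (cls, m) in covered]
        if live:
            kept[cls] = live
    return kept


def removed_ratio(project, kept):
    """Fraction of methods removed by debloating (0.0 for an empty project)."""
    total = sum(len(methods) for methods in project.values())
    remaining = sum(len(methods) for methods in kept.values())
    return (total - remaining) / total if total else 0.0


if __name__ == "__main__":
    project = {"Parser": ["parse", "dump"], "Legacy": ["old"], "Util": ["trim"]}
    covered = {("Parser", "parse"), ("Util", "trim")}
    kept = debloat_by_coverage(project, covered)
    print(kept, removed_ratio(project, kept))
```

As in the abstract, the result is only as trustworthy as the workload: anything the workload never exercises is treated as bloat, which is why the authors check client test suites against the debloated libraries.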
Neural-guidance for symbolic reasoning
Symbolic reasoning begot Artificial Intelligence (AI). With the recent advances in Deep Learning, many traditional AI areas such as Computer Vision and Natural Language Processing have moved to probabilistic approaches. However, in applications where there is little to no room for uncertainty, such as compiler or software verification, symbolic reasoning is still the go-to option. In this thesis, we bring the advantage of data-driven learnable models into the precise world of symbolic reasoning. In particular, we choose to tackle two specific problems: Model Checking, in the context of Inductive Generalization, and Compiler Optimization, in the context of Software Debloating. We implemented our approach in two tools, named Dopey and DeepOccam, respectively. Both use traces generated from running a task to learn a better heuristic, and use that heuristic to improve subsequent runs of the same or similar tasks. Our results show that both neural-based heuristics outperform handcrafted heuristics.