Convicted by memory: Automatically recovering spatial-temporal evidence from memory images
Memory forensics can reveal “up to the minute” evidence of a device’s usage, often without requiring a suspect’s password to unlock the device, and it is oblivious to any persistent storage encryption schemes, e.g., whole disk encryption. Prior to my work, researchers and investigators alike considered data-structure recovery the ultimate goal of memory image forensics. This, however, was far from sufficient, as investigators were still largely unable to understand the content of the recovered evidence, and hence efficiently locating and accurately analyzing such evidence locked in memory images remained an open research challenge.
In this dissertation, I propose breaking from traditional data-recovery-oriented forensics, and instead I present a memory forensics framework which leverages program analysis to automatically recover spatial-temporal evidence from memory images by understanding the programs that generated it. This framework consists of four techniques, each building upon the discoveries of the previous, which together represent this new paradigm of program-analysis-driven memory forensics. First, I present DSCRETE, a technique which reuses a program’s own interpretation and rendering logic to recover and present in-memory data structure contents. Following that, VCR developed vendor-generic data structure identification for the recovery of in-memory photographic evidence produced by an Android device’s cameras. GUITAR then realized an app-independent technique which automatically reassembles and redraws an app’s GUI from the multitude of GUI data elements found in a smartphone’s memory image. Finally, departing from any traditional memory forensics technique, RetroScope introduced the vision of spatial-temporal memory forensics by retargeting an Android app’s execution to recover sequences of previous GUI screens, in their original temporal order, from a memory image. This framework, and the new program analysis techniques which enable it, have introduced encryption-oblivious forensics capabilities far exceeding traditional data-structure recovery.
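To make the contrast concrete, the following is a toy Python sketch of the "traditional data-structure recovery" baseline the abstract refers to: scanning a raw memory image for byte offsets whose fields satisfy simple value constraints. The 16-byte record layout, magic number, and bounds are hypothetical illustrations; techniques such as DSCRETE, VCR, GUITAR, and RetroScope go far beyond this kind of signature matching.

```python
# Toy illustration of signature-based data-structure recovery from a raw
# memory image. The record layout (magic, length, 8-byte printable tag) is
# hypothetical; real forensic tools use far richer program-derived knowledge.
import struct

RECORD_SIZE = 16
MAGIC = 0xCAFED00D

def plausible(chunk: bytes) -> bool:
    magic, length, tag = struct.unpack("<II8s", chunk)
    return magic == MAGIC and 0 < length <= 4096 and all(32 <= b < 127 for b in tag)

def scan(image: bytes):
    """Yield offsets of candidate record instances in a raw memory image."""
    for off in range(0, len(image) - RECORD_SIZE + 1, 4):   # 4-byte aligned scan
        if plausible(image[off:off + RECORD_SIZE]):
            yield off

if __name__ == "__main__":
    sample = bytes(64) + struct.pack("<II8s", MAGIC, 128, b"photo_01") + bytes(32)
    print(list(scan(sample)))   # -> [64]
```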
Making Data Storage Efficient in the Era of Cloud Computing
We have entered the era of cloud computing over the last decade, and many paradigm shifts are under way in how people write and deploy applications. Despite the advancement of cloud computing, data storage abstractions have not evolved much, causing inefficiencies in performance, cost, and security.
This dissertation proposes a novel approach to make data storage efficient in the era of cloud computing by building new storage abstractions and systems that bridge the gap between cloud computing and data storage and simplify development. We build four systems to address four data inefficiencies in cloud computing.
The first system, Grandet, solves the data storage inefficiency caused by the paradigm shift from upfront provisioning to a variety of pay-as-you-go cloud services. Grandet is an extensible storage system that significantly reduces the storage cost of web applications deployed in the cloud. Under the hood, it supports multiple heterogeneous stores and unifies them by placing each data object at the store deemed most economical. Our results show that Grandet reduces storage costs by an average of 42.4%, and that it is fast, scalable, and easy to use.
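As an illustration of the cost-based placement idea, here is a minimal Python sketch that picks the most economical backing store for an object given its size and expected access pattern. The store names, price figures, and the cost model are hypothetical placeholders, not Grandet's actual policy or API.

```python
# Minimal sketch of cost-based object placement across heterogeneous stores.
# Store names and prices are hypothetical, not Grandet's real cost model.
from dataclasses import dataclass

@dataclass
class Store:
    name: str
    storage_per_gb_month: float   # $/GB-month
    price_per_get: float          # $/GET request

STORES = [
    Store("object-store",  0.023, 0.0000004),
    Store("kv-store",      0.25,  0.0000002),
    Store("archive-store", 0.004, 0.00001),
]

def estimate_monthly_cost(store: Store, size_gb: float, gets_per_month: int) -> float:
    """Estimated monthly cost of keeping one object in a given store."""
    return size_gb * store.storage_per_gb_month + gets_per_month * store.price_per_get

def place(size_gb: float, gets_per_month: int) -> Store:
    """Pick the store deemed most economical for this object."""
    return min(STORES, key=lambda s: estimate_monthly_cost(s, size_gb, gets_per_month))

if __name__ == "__main__":
    # A 2 GB object read roughly 100,000 times per month.
    print("place object in:", place(2.0, 100_000).name)
```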
The second system, Unic, solves the data inefficiency caused by the paradigm shift from single-tenancy to multi-tenancy. Unic securely deduplicates general computations. It exports a cache service that allows cloud applications running on behalf of mutually distrusting users to memoize and reuse computation results, thereby improving performance. Unic achieves both integrity and secrecy through a novel use of code attestation, and it provides a simple yet expressive API that enables applications to deduplicate their own rich computations. Our results show that Unic is easy to use, speeds up applications by an average of 7.58x, and incurs little storage overhead.
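To make the computation-deduplication idea concrete, below is a hedged Python sketch of memoization keyed by the code's identity plus its inputs. It captures only the cache-service concept at a toy level: the shared dictionary, the source-hash stand-in for code identity, and the decorator are illustrative, and the sketch omits Unic's attestation-based integrity and secrecy guarantees, which are only noted in comments.

```python
# Toy sketch of result deduplication keyed by (code identity, inputs).
# In Unic, code identity is established via code attestation so mutually
# distrusting tenants can safely share results; here we simply hash the
# function's source text as a stand-in.
import hashlib
import inspect
import pickle

_CACHE: dict[str, bytes] = {}   # stands in for the shared cache service

def memoized(fn):
    source = inspect.getsource(fn).encode()
    def wrapper(*args, **kwargs):
        key = hashlib.sha256(source + pickle.dumps((args, kwargs))).hexdigest()
        if key in _CACHE:
            return pickle.loads(_CACHE[key])      # reuse a prior result
        result = fn(*args, **kwargs)
        _CACHE[key] = pickle.dumps(result)
        return result
    return wrapper

@memoized
def expensive_transform(data: list[int]) -> int:
    return sum(x * x for x in data)

print(expensive_transform([1, 2, 3]))  # computed
print(expensive_transform([1, 2, 3]))  # served from the cache
```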
The third system, Lambdata, solves the data inefficiency caused by the paradigm shift to serverless computing, where developers write only core business logic and cloud service providers maintain all the infrastructure. Lambdata is a novel serverless computing system that enables developers to declare a cloud function's data intents, i.e., both the data it reads and the data it writes. Once data intents are made explicit, Lambdata performs a variety of optimizations to improve speed, including caching data locally and scheduling functions based on code and data locality. Our results show that Lambdata achieves an average speedup of 1.51x on the turnaround time of practical workloads and reduces monetary cost by 16.5%.
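The following Python sketch illustrates what declaring data intents can look like and why it helps: once the read/write sets are explicit, a scheduler can prefer a worker that already caches the inputs. The decorator name, its parameters, and the toy scheduler are hypothetical, not Lambdata's API.

```python
# Hypothetical illustration of declaring a cloud function's data intents and
# using them for locality-aware scheduling; not Lambdata's actual interface.
INTENTS = {}  # function name -> declared data intents

def data_intents(reads=(), writes=()):
    def register(fn):
        INTENTS[fn.__name__] = {"reads": set(reads), "writes": set(writes)}
        return fn
    return register

@data_intents(reads=["bucket/raw/video.mp4"], writes=["bucket/thumbs/video.jpg"])
def make_thumbnail(event):
    ...  # core business logic only

def schedule(fn_name, workers):
    """Toy scheduler: prefer the worker that already caches the declared inputs."""
    reads = INTENTS[fn_name]["reads"]
    return max(workers, key=lambda w: len(reads & w["cached"]))

workers = [
    {"id": "w1", "cached": {"bucket/raw/video.mp4"}},
    {"id": "w2", "cached": set()},
]
print(schedule("make_thumbnail", workers)["id"])  # -> w1
```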
The fourth system, CleanOS, solves the data inefficiency caused by the paradigm shift from desktop computers to smartphones that are always connected to the cloud. CleanOS is a new Android-based operating system that manages sensitive data rigorously and maintains a clean environment at all times. It identifies and tracks sensitive data, encrypts it with a key, and evicts that key to the cloud when the data is not in active use on the device. Our results show that CleanOS limits sensitive-data exposure drastically while incurring acceptable overheads on mobile networks.
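Below is a toy Python sketch of the encrypt-and-evict-key pattern the abstract describes: the sensitive object stays encrypted, the key is held only while the data is in active use, and it is escrowed to a (here simulated) cloud service when the data becomes idle. It requires the third-party `cryptography` package, and the class and method names are illustrative, not CleanOS's actual interfaces.

```python
# Toy sketch of CleanOS's idle-data eviction idea: keep data encrypted, evict
# the key to the cloud when idle, fetch it back on the next access.
# Requires the `cryptography` package; names below are illustrative only.
from cryptography.fernet import Fernet

class CloudKeyService:
    """Stand-in for the cloud side that escrows evicted keys."""
    def __init__(self):
        self._keys = {}
    def escrow(self, obj_id, key):
        self._keys[obj_id] = key
    def retrieve(self, obj_id):
        return self._keys[obj_id]

class SensitiveObject:
    def __init__(self, obj_id, plaintext: bytes, cloud: CloudKeyService):
        self.obj_id, self.cloud = obj_id, cloud
        self._key = Fernet.generate_key()
        self._ciphertext = Fernet(self._key).encrypt(plaintext)

    def evict(self):
        """Data is idle: ship the key to the cloud and forget it locally."""
        self.cloud.escrow(self.obj_id, self._key)
        self._key = None

    def access(self) -> bytes:
        """Data is used again: fetch the key back, then decrypt."""
        if self._key is None:
            self._key = self.cloud.retrieve(self.obj_id)
        return Fernet(self._key).decrypt(self._ciphertext)

cloud = CloudKeyService()
token = SensitiveObject("oauth-token", b"secret-session-token", cloud)
token.evict()            # the device no longer holds the key
print(token.access())    # the key is fetched back on next use
```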
A Software Vulnerabilities Odysseus: Analysis, Detection, and Mitigation
Programming has become central to the development of human activities, yet it is not immune to defects, or bugs. Developers have devised specific methods and test suites to prevent these bugs from reaching released software. Nonetheless, not every case can be anticipated beforehand, automation has limits the community is still working to overcome, and, as a consequence, not all bugs can be caught.
These defects become a particular concern when they can be exploited to breach a program’s security policy. They are then called vulnerabilities, and they grant specific actors undesired access to the resources a program manages. This damages trust in the program and in its developers, and may eventually hinder the program’s adoption. Paying specific attention to vulnerabilities is therefore a natural outcome. In this regard, this PhD work targets the following three challenges:
(1) The research community references these vulnerabilities, categorises them, and reports and ranks their impact. As a result, analysts can learn from past vulnerabilities in specific programs and devise new ideas to counter them. Nonetheless, the quality of the lessons learned and the usefulness of the ensuing solutions depend on the quality and consistency of the information provided in the reports.
(2) New methods to detect vulnerabilities can emerge from the lessons this monitoring provides. Combined with responsible reporting, these detection methods can harden the programs we rely on. Additionally, as computing performance keeps increasing, machine learning algorithms are increasingly adopted and offer engaging promises.
(3) While some of these promises can be fulfilled, not all of them are reachable today. A complementary strategy therefore needs to be adopted as long as vulnerabilities evade detection up to public releases. Instead of preventing their introduction, programs can be hardened to reduce their exploitability. Raising the complexity of exploitation or lowering the impact below specific thresholds makes the presence of vulnerabilities an affordable risk for the feature provided. The history of software development includes the experimentation with, and adoption of, so-called defence mechanisms. Their goals and performance vary, but their implementation in widely adopted programs and systems (such as the Android Open Source Project) attests to their pivotal position.
To face these challenges, we provide the following contributions:
• We provide a manual categorisation of the vulnerabilities of the widely adopted Android Open Source Project up to June 2020. Making the adopted vulnerability analysis explicit provides consistency in the resulting data set, facilitates the explainability of the analyses, and makes the resulting set of vulnerabilities easier to keep up to date. Based on this analysis, we study the evolution of AOSP’s vulnerabilities: we explore how the vulnerabilities affecting the system evolve over time in terms of severity and vulnerability type, with a particular focus on memory corruption-related vulnerabilities.
• We undertake the replication of a machine-learning-based detection algorithm that, although part of the state of the art and referenced by ensuing works, was not available. Named VCCFinder, this algorithm trains a Support Vector Machine on Vulnerability-Contributing Commits and the related patches for C and C++ code. Unable to reach performance comparable to the original article, we explore alternative parameters and algorithms and attempt to overcome the challenge posed by the over-population of unlabeled entries in the data set (a minimal training sketch follows this list). We provide the community with our code and results as a replicable baseline for further improvement.
• Finally, we list the defence mechanisms that the Android Open Source Project has incrementally implemented, and we discuss how they sometimes answer comments the community addressed to the project’s developers. We further verify the extent to which specific memory corruption defence mechanisms were implemented in the binaries of different versions of Android (from API level 10 to 28). We then confront the evolution of memory corruption-related vulnerabilities with the implementation timeline of the related defence mechanisms.
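The sketch referenced in the second contribution is below: a generic scikit-learn pipeline that trains a linear Support Vector Machine on commit texts labeled as vulnerability-contributing or not, with class weighting as one common way to cope with imbalance. The toy commits and the bag-of-words features are placeholders; this is not the thesis's released replication code.

```python
# Generic sketch of a VCCFinder-style setup: an SVM over commits, with class
# weighting to counter the imbalance between the few vulnerability-contributing
# commits (VCCs) and the mass of other commits. Toy data, not the replication code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

commits = [
    "fix buffer overflow in packet parser",    # vulnerability-contributing
    "add bounds check before memcpy",          # other
    "refactor logging module",                 # other
    "introduce strcpy into config loader",     # vulnerability-contributing
]
labels = [1, 0, 0, 1]   # 1 = VCC, 0 = other commit

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LinearSVC(class_weight="balanced"),  # reweights classes against imbalance
)
model.fit(commits, labels)
print(model.predict(["unchecked sprintf added to request handler"]))
```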
Managing Smartphone Testbeds with SmartLab
The explosive number of smartphones with ever-growing sensing and computing capabilities has brought a paradigm shift to many traditional domains of the computing field. Re-programming smartphones and instrumenting them for application testing and data gathering at scale is currently a tedious and time-consuming process that poses significant logistical challenges. In this paper, we make three major contributions. First, we propose a comprehensive architecture, coined SmartLab, for managing a cluster of both real and virtual smartphones that are either wired to a private cloud or connected over a wireless link. Second, we propose and describe a number of Android management optimizations (e.g., command pipelining, screen capturing, file management), which can be useful to the community for building similar functionality into their systems. Third, we conduct extensive experiments and microbenchmarks to support our design choices, providing qualitative evidence on the expected performance of each module comprising our architecture. This paper also overviews our experiences of using SmartLab in a research-oriented setting, as well as ongoing and future development efforts.
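As a small illustration of the kind of per-device management operation mentioned above (screen capturing and file management), the following Python sketch drives connected devices through the standard `adb` command-line tool. It is not SmartLab's implementation; it assumes `adb` is installed, on the PATH, and that the attached devices are authorized.

```python
# Illustration of device management over adb (screen capture + file pull),
# the kind of operation a testbed like SmartLab batches across many devices.
# Assumes adb is installed and devices are attached; not SmartLab's own code.
import subprocess

def adb(serial, *args):
    """Run an adb command against a specific device and return its output."""
    return subprocess.run(["adb", "-s", serial, *args],
                          check=True, capture_output=True, text=True).stdout

def capture_screen(serial, local_path):
    adb(serial, "shell", "screencap", "-p", "/sdcard/screen.png")
    adb(serial, "pull", "/sdcard/screen.png", local_path)
    adb(serial, "shell", "rm", "/sdcard/screen.png")

if __name__ == "__main__":
    lines = subprocess.run(["adb", "devices"], capture_output=True,
                           text=True).stdout.splitlines()[1:]
    for line in filter(None, lines):
        serial = line.split()[0]
        capture_screen(serial, f"{serial}.png")
```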
The Multimodal Tutor: Adaptive Feedback from Multimodal Experiences
This doctoral thesis describes the journey of ideation, prototyping, and empirical testing of the Multimodal Tutor, a system designed to provide digital feedback that supports psychomotor skill acquisition using multimodal data captured during learning. The feedback is given in real time with machine-driven assessment of the learner's task execution. The predictions are made by supervised machine learning models trained with human-annotated samples. The main contributions of this thesis are: a literature survey on multimodal data for learning, a conceptual model (the Multimodal Learning Analytics Model), a technological framework (the Multimodal Pipeline), a data annotation tool (the Visual Inspection Tool), and a case study in Cardiopulmonary Resuscitation training (the CPR Tutor). The CPR Tutor generates real-time, adaptive feedback using kinematic and myographic data and neural networks.
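The Python sketch below shows, at a toy level, the real-time feedback loop the abstract describes: slide a window over incoming kinematic samples, extract simple features, and let a supervised model trained on annotated windows classify the execution so feedback can be emitted immediately. The feature choice, labels, synthetic data, and model are illustrative assumptions, not the CPR Tutor's actual design.

```python
# Toy sketch of windowed, model-driven feedback on a kinematic signal.
# Synthetic data and features stand in for the CPR Tutor's real pipeline.
import numpy as np
from sklearn.neural_network import MLPClassifier

def features(window: np.ndarray) -> np.ndarray:
    """Per-window summary of a 1-D kinematic signal (e.g., chest displacement)."""
    return np.array([window.mean(), window.std(), np.ptp(window)])

# Offline training on annotated windows (toy data: 1 = good, 0 = poor execution).
rng = np.random.default_rng(0)
good = rng.normal(5.0, 0.5, (50, 25))    # deeper, regular compressions
poor = rng.normal(2.0, 0.5, (50, 25))    # shallow compressions
X = np.array([features(w) for w in np.vstack([good, poor])])
y = np.array([1] * 50 + [0] * 50)
model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(X, y)

def feedback(window: np.ndarray) -> str:
    """Classify the latest window and turn the prediction into feedback."""
    return "good pace/depth" if model.predict([features(window)])[0] == 1 else "press deeper"

print(feedback(rng.normal(4.8, 0.5, 25)))
```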
Computer Aided Verification
This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decision procedures, concurrency, and CPS, hardware, and industrial applications.