81 research outputs found
FORMALLY ANALYZING AND VERIFYING SECURE SYSTEM DESIGN AND IMPLEMENTATION
Ph.D. (Doctor of Philosophy)
Towards using concurrent Java API correctly
Concurrent programs are hard to analyze and debug due to complex program logic and an unpredictable execution environment. In practice, ordinary programmers often adopt existing well-designed concurrency-related APIs (e.g., those in java.util.concurrent) to avoid dealing with these issues. These APIs can, however, be used incorrectly, resulting in hard-to-debug concurrency bugs. In this work, we propose an approach for enforcing the correct usage of concurrency-related Java APIs. Our idea is to annotate concurrency-related Java classes with annotations describing how these APIs can be misused, and to develop a lightweight type checker that detects concurrent API misuse based on the annotations. To automate this process, we need to solve two problems: (1) how do we obtain annotations for the relevant APIs; and (2) how do we systematically detect concurrent API misuse based on the annotations? We solve the first problem by extracting annotations from the API documentation using natural language processing techniques. We solve the second problem by implementing our type checkers in the Checker Framework. We apply our approach to extract annotations for all classes in the Java standard library and use them to detect concurrent API misuse in open-source projects on GitHub. We confirm that concurrent API misuse is common and often results in bugs or inefficiency.
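As a concrete illustration (not taken from the paper), the sketch below shows the kind of misuse such a checker is meant to flag: a check-then-act sequence on ConcurrentHashMap whose individual calls are thread-safe but whose composition races. The class and method names are invented for this example.

```java
import java.util.concurrent.ConcurrentHashMap;

public class CheckThenActMisuse {
    private final ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();

    // Misuse: containsKey + put is not atomic. Two threads can both pass the
    // check and both put, so one update is silently lost. Each call is
    // individually thread-safe, but the compound action races.
    void recordFirstSeenRacy(String key, int value) {
        if (!hits.containsKey(key)) {
            hits.put(key, value);
        }
    }

    // Correct usage: the API already offers an atomic compound operation.
    void recordFirstSeen(String key, int value) {
        hits.putIfAbsent(key, value);
    }
}
```

A documentation-derived annotation marking such containsKey/put compositions as non-atomic could let a Checker Framework checker reject recordFirstSeenRacy at compile time; the abstract does not spell out the annotation vocabulary, so this is only the intended shape of the check.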
Paoding: Supervised Robustness-preserving Data-free Neural Network Pruning
When deploying pre-trained neural network models in real-world applications, model consumers often encounter resource-constrained platforms such as mobile and smart devices. They typically use pruning to reduce the size and complexity of the model, generating a lighter one with lower resource consumption. Nonetheless, most existing pruning methods assume that the pruned model can be fine-tuned or even retrained on the original training data. This may be unrealistic in practice, as data controllers are often reluctant to provide their model consumers with the original data. In this work, we study neural network pruning in the data-free context, aiming to yield lightweight models that are not only accurate in prediction but also robust against undesired inputs in open-world deployments. Since fine-tuning and retraining are unavailable to repair mis-pruned units, we replace the traditional aggressive one-shot strategy with a conservative one that treats pruning as a progressive process. We propose a pruning method based on stochastic optimization that uses robustness-related metrics to guide the pruning process. Our method is implemented as a Python package named Paoding and evaluated in a series of experiments on diverse neural network models. The experimental results show that it significantly outperforms existing one-shot data-free pruning approaches in terms of robustness preservation and accuracy.
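The abstract does not give algorithmic details, so the following is only a minimal sketch of the progressive idea it describes, assuming a layer is a plain weight matrix and using an L1-norm saliency as a stand-in for the paper's robustness-related metrics; all names are hypothetical.

```java
import java.util.Arrays;

/**
 * Minimal sketch of progressive (iterative) pruning, as opposed to one-shot
 * pruning. The saliency metric here (L1 norm of a unit's outgoing weights)
 * is a simple stand-in for robustness-related metrics.
 */
public class ProgressivePruningSketch {

    /** Zero out the single least-salient, not-yet-pruned unit (row). */
    static void pruneOneUnit(double[][] weights) {
        int victim = -1;
        double best = Double.MAX_VALUE;
        for (int i = 0; i < weights.length; i++) {
            double saliency = Arrays.stream(weights[i]).map(Math::abs).sum();
            if (saliency > 0 && saliency < best) { // skip already-pruned rows
                best = saliency;
                victim = i;
            }
        }
        if (victim >= 0) Arrays.fill(weights[victim], 0.0);
    }

    public static void main(String[] args) {
        double[][] layer = {
            {0.9, -1.2, 0.3},
            {0.01, 0.02, -0.01}, // low-saliency unit: pruned first
            {1.5, 0.4, -0.8},
        };
        // Progressive: remove units one at a time, re-evaluating in between,
        // instead of removing a fixed fraction of the network in one shot.
        // In the real pipeline, a robustness check could run after each step
        // and stop (or roll back) if the guiding metric degrades.
        for (int step = 0; step < 2; step++) {
            pruneOneUnit(layer);
        }
        System.out.println(Arrays.deepToString(layer));
    }
}
```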
MalModel: Hiding Malicious Payload in Mobile Deep Learning Models with Black-box Backdoor Attack
Mobile malware has become one of the most critical security threats in the era of ubiquitous mobile computing. Despite intensive efforts from security experts to counteract it, recent years have still witnessed rapid growth in the number of identified malware samples. This can be partly attributed to newly emerging technologies that constantly open up under-studied attack surfaces for adversaries. One typical example is the recently developed mobile machine learning (ML) frameworks that enable storing and running deep learning (DL) models on mobile devices. Despite obvious advantages, this new feature also inadvertently introduces potential vulnerabilities (e.g., on-device models may be modified for malicious purposes). In this work, we propose a method to generate or transform mobile malware by hiding the malicious payload inside the parameters of deep learning models, based on a strategy that considers four factors (layer type, layer number, layer coverage, and the number of bytes to replace). Utilizing the proposed method, we can run malware in DL mobile applications covertly, with little impact on model performance (i.e., as little as a 0.4% drop in accuracy and at most 39 ms of latency overhead).
characters", the abstract here is shorter than that in the PDF fil
Break the dead end of dynamic slicing: localizing data and control omission bug
Dynamic slicing is a common way of identifying the root cause when a program fault is revealed. With dynamic slicing, programmers can follow data and control flow along the program execution trace back to the root cause. However, the technique usually fails to work on omission bugs, i.e., faults caused by failing to execute some code. In many cases, dynamic slicing over-skips the root cause when an omission bug happens, leading the debugging process to a dead end. In this work, we conduct an empirical study of the omission bugs in the Defects4J bug repository. Our study shows that (1) omission bugs are prevalent (46.4%) among the studied bugs; (2) there are recurring patterns in the causes and fixes of omission bugs; and (3) these fix patterns serve as a strong hint for breaking the slicing dead end. Based on our findings, we train a neural network model on the omission bugs in the Defects4J repository to recommend where to look when slicing can no longer make progress. We evaluate our approach on 3193 mutated omission bugs that slicing fails to locate. The results show that our approach outperforms a random baseline at breaking the dead end and localizing the mutated omission bugs (63.8% vs. 2.8%).
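To make the dead end concrete, here is a hypothetical omission bug (invented for illustration, not taken from Defects4J):

```java
public class OmissionBugExample {

    /**
     * Omission bug: the method should skip negative quantities,
     * but the guard was never written.
     */
    static int totalQuantity(int[] quantities) {
        int total = 0;
        for (int q : quantities) {
            // Missing fix (the omitted code):
            //   if (q < 0) continue;
            total += q;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalQuantity(new int[]{3, -1})); // prints 2, expected 3
    }
}
```

A backward dynamic slice from the wrong value 2 contains only the executed additions and the loop header; the missing guard never executed, so no data or control dependence leads to it. That is exactly the dead end the paper's recommendation model is meant to break.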
- …