Adaptive and Effective Fuzzing: a Data-Driven Approach
Security vulnerabilities have a large real-world impact, from ransomware attacks costing billions of dollars every year to sensitive data breaches in government, military and industry. Fuzzing is a popular technique for discovering these vulnerabilities automatically. Companies have invested heavily in large-scale fuzzing infrastructure (e.g., Google’s ClusterFuzz and Microsoft’s OneFuzz) to test their products and make them more secure. Despite the wide application of fuzzing in industry, many issues still constrain its performance. One fundamental limitation is the rule-based design of fuzzers. Rule-based fuzzers rely heavily on a set of static rules or heuristics. These fixed rules are distilled from human experience and therefore fail to generalize across a diverse set of programs.
In this dissertation, we present an adaptive and effective fuzzing framework that takes a data-driven approach. A data-driven fuzzer makes decisions by analyzing and reasoning about data rather than by following static rules. Hence it is more adaptive, effective, and flexible than a typical rule-based fuzzer. More interestingly, the data-driven approach connects fuzzing to various data-centric domains (e.g., machine learning, optimization, and social networks), enabling sophisticated designs in the fuzzing framework.
A general fuzzing framework consists of two major components: seed scheduling and seed mutation. The seed scheduling module selects a seed from a seed corpus that contains multiple test cases. The seed mutation module then perturbs the selected seed to generate a new test case. First, we present Neuzz, the first machine learning (ML) based general-purpose fuzzer, which applies ML to seed mutation and greatly improves fuzzing performance. Then we present MTFuzz, a follow-up to Neuzz that incorporates diverse data into the ML model to generate effective seed mutations.
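The two components named above, seed scheduling and seed mutation, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_target` is a hypothetical stand-in for an instrumented program, scheduling is uniform random choice, and mutation overwrites a single byte. It is not the scheduling or mutation strategy of Neuzz, MTFuzz, or any real fuzzer.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical program under test: returns the ids of the branches it covered."""
    covered = {0}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            covered.add(2)
    return covered


def schedule_seed(corpus, rng):
    """Seed scheduling: pick one test case from the corpus (uniformly, as a placeholder)."""
    return rng.choice(corpus)


def mutate_seed(seed: bytes, rng) -> bytes:
    """Seed mutation: overwrite one random byte (assumes a non-empty seed)."""
    buf = bytearray(seed)
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)


def fuzz(initial_corpus, iterations=2000, rng_seed=0):
    """Run the schedule -> mutate -> execute loop and keep inputs that add coverage."""
    rng = random.Random(rng_seed)
    corpus = list(initial_corpus)
    seen = set()
    for seed in corpus:
        seen |= toy_target(seed)
    for _ in range(iterations):
        seed = schedule_seed(corpus, rng)
        candidate = mutate_seed(seed, rng)
        new_coverage = toy_target(candidate) - seen
        if new_coverage:  # keep only test cases that reach new branches
            corpus.append(candidate)
            seen |= new_coverage
    return corpus, seen


if __name__ == "__main__":
    final_corpus, covered = fuzz([b"AAAA"])
    print(len(final_corpus), sorted(covered))
```

A data-driven fuzzer in the sense of this abstract would replace `schedule_seed` and `mutate_seed` with decisions derived from collected data, while the surrounding loop stays the same.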
Finally, we present K-Scheduler, a fuzzer-agnostic, data-driven seed scheduling algorithm. K-Scheduler uses graph data (i.e., the inter-procedural control flow graph) and dynamic coverage data (i.e., the code coverage bitmap) to construct a dynamic graph, and schedules seeds by their graph centrality scores on that graph. It significantly improves fuzzing performance over state-of-the-art seed schedulers on various fuzzers widely used in industry.
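To make the idea of scheduling seeds by centrality on a coverage-dependent graph concrete, here is a rough sketch. The abstract does not say which centrality measure is used or how the dynamic graph is built, so the choices below are assumptions: a Katz-style score that counts attenuated downstream paths, a graph restricted to not-yet-covered nodes, and a seed score summed over the uncovered successors of the nodes a seed reaches. The names `downstream_centrality` and `schedule_by_centrality` are hypothetical.

```python
def downstream_centrality(succ, alpha=0.5, iterations=50):
    """Katz-style score: each node counts the nodes reachable from it, attenuated by alpha per hop."""
    score = {v: 0.0 for v in succ}
    for _ in range(iterations):
        score = {v: sum(alpha * (1.0 + score[s]) for s in succ[v]) for v in succ}
    return score


def schedule_by_centrality(cfg, seed_coverage):
    """Pick the seed whose uncovered frontier leads to the most uncovered code."""
    visited = set().union(*seed_coverage.values())
    # Dynamic graph (assumption): drop every node that some seed already covers.
    unvisited = {
        v: [s for s in succs if s not in visited]
        for v, succs in cfg.items()
        if v not in visited
    }
    centrality = downstream_centrality(unvisited)
    scores = {}
    for seed, covered in seed_coverage.items():
        # Frontier: uncovered nodes one edge away from what this seed reaches.
        frontier = {s for v in covered for s in cfg[v] if s not in visited}
        scores[seed] = sum(1.0 + centrality[s] for s in frontier)
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Toy control flow graph: node -> list of successor nodes.
    cfg = {"entry": ["a", "b"], "a": ["c"], "b": [], "c": ["d", "e"], "d": [], "e": []}
    coverage = {"s1": {"entry", "a"}, "s2": {"entry", "b"}}
    print(schedule_by_centrality(cfg, coverage))
```

In the toy graph, seed `s1` is preferred because it sits next to the uncovered node `c`, which in turn leads to two more uncovered nodes, while `s2` ends in a dead end.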
Fine Grained Dataflow Tracking with Proximal Gradients
Dataflow tracking with Dynamic Taint Analysis (DTA) is an important method in
systems security with many applications, including exploit analysis, guided
fuzzing, and side-channel information leak detection. However, DTA is
fundamentally limited by the Boolean nature of taint labels, which provide no
information about the significance of detected dataflows and lead to false
positives/negatives on complex real world programs.
We introduce proximal gradient analysis (PGA), a novel, theoretically
grounded approach that can track more accurate and fine-grained dataflow
information. PGA uses proximal gradients, a generalization of gradients for
non-differentiable functions, to precisely compose gradients over
non-differentiable operations in programs. Composing gradients over programs
eliminates many of the dataflow propagation errors that occur in DTA and
provides richer information about how each measured dataflow affects a program.
We compare our prototype PGA implementation to three state-of-the-art DTA
implementations on 7 real-world programs. Our results show that PGA can improve
the F1 accuracy of data flow tracking by up to 33% over taint tracking (20% on
average) without introducing any significant overhead (<5% on average). We
further demonstrate the effectiveness of PGA by discovering 22 bugs (20
confirmed by developers) and 2 side-channel leaks, and identifying exploitable
dataflows in 19 existing CVEs in the tested programs.
Comment: To appear in USENIX Security 202
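As background for the proximal gradients mentioned in this abstract, the sketch below shows the standard proximal operator from convex optimization on the simplest non-differentiable function, f(y) = |y|. The operator is defined as prox(x) = argmin over y of f(y) + (y - x)^2 / (2 * lam), and for |y| it has the well-known closed form called soft-thresholding. How PGA defines and composes proximal gradients over program operations is not stated in the abstract, and this example does not attempt to reproduce it. The helper `prox_abs_bruteforce` is a hypothetical check added only to confirm the closed form numerically.

```python
def prox_abs(x, lam):
    """Closed-form proximal operator of f(y) = |y| with step size lam (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def prox_abs_bruteforce(x, lam, lo=-5.0, hi=5.0, steps=10001):
    """Numerically minimize |y| + (y - x)**2 / (2 * lam) over a uniform grid of y values."""
    best_y, best_val = lo, float("inf")
    for i in range(steps):
        y = lo + (hi - lo) * i / (steps - 1)
        val = abs(y) + (y - x) ** 2 / (2.0 * lam)
        if val < best_val:
            best_y, best_val = y, val
    return best_y


if __name__ == "__main__":
    for x in (3.0, 0.5, -3.0):
        print(x, prox_abs(x, 1.0), prox_abs_bruteforce(x, 1.0))
```

The point of the example is that the operator returns a well-defined value even at x = 0, where |y| has no ordinary derivative.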
MTFuzz: Fuzzing with a Multi-Task Neural Network
Fuzzing is a widely used technique for detecting software bugs and
vulnerabilities. Most popular fuzzers generate new inputs using an evolutionary
search to maximize code coverage. Essentially, these fuzzers start with a set
of seed inputs, mutate them to generate new inputs, and identify the promising
inputs using an evolutionary fitness function for further mutation. Despite
their success, evolutionary fuzzers tend to get stuck in long sequences of
unproductive mutations. In recent years, machine learning (ML) based mutation
strategies have reported promising results. However, the existing ML-based
fuzzers are limited by the lack of quality and diversity of the training data.
As the input space of the target programs is high dimensional and sparse, it is
prohibitively expensive to collect many diverse samples demonstrating
successful and unsuccessful mutations to train the model. In this paper, we
address these issues by using a Multi-Task Neural Network that can learn a
compact embedding of the input space based on diverse training samples for
multiple related tasks (i.e., predicting different types of coverage). The
compact embedding can guide the mutation process by focusing most of the
mutations on the parts of the embedding where the gradient is high. MTFuzz
uncovers previously unseen bugs and achieves, on average, more edge
coverage than 5 state-of-the-art fuzzers on 10 real-world programs.
Comment: ACM Joint European Software Engineering Conference and Symposium on
the Foundations of Software Engineering (ESEC/FSE) 202
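The idea of focusing mutations where the gradient is high can be sketched with a deliberately tiny stand-in model. The assumptions are significant: instead of a multi-task neural network with a learned embedding, the sketch uses a single logistic unit with hand-picked, hypothetical weights, and takes the gradient with respect to the raw input bytes. It only illustrates gradient-guided selection of mutation positions and is not the MTFuzz model.

```python
import math


def surrogate(weights, data):
    """Hypothetical surrogate: predicted probability that the input covers some edge."""
    z = sum(w * (b / 255.0) for w, b in zip(weights, data))
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(weights, data):
    """Gradient of the surrogate output with respect to each input byte."""
    p = surrogate(weights, data)
    return [p * (1.0 - p) * w / 255.0 for w in weights]


def hot_positions(weights, data, k):
    """Indices of the k bytes with the largest gradient magnitude."""
    grad = input_gradient(weights, data)
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    return order[:k]


def mutate_hot_bytes(weights, data, k=2):
    """Push the k most influential bytes in the direction that raises the prediction."""
    grad = input_gradient(weights, data)
    buf = bytearray(data)
    for i in hot_positions(weights, data, k):
        buf[i] = 255 if grad[i] > 0 else 0
    return bytes(buf)


if __name__ == "__main__":
    weights = [0.1, -2.0, 0.0, 1.5]  # hypothetical "trained" weights
    seed = bytes([0x10, 0x80, 0x40, 0x20])
    mutated = mutate_hot_bytes(weights, seed)
    print(hot_positions(weights, seed, 2), mutated.hex())
    print(surrogate(weights, seed), surrogate(weights, mutated))
```

Bytes with near-zero gradient are left untouched, which is what concentrates the mutation budget on the input positions the model considers influential.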
Enhanced Upconversion Photoluminescence of LiYF4: Yb3+/Ho3+ Crystals by Introducing Mg2+ Ions for Anti-Counterfeiting Recognition
By doping with appropriate lanthanide ions, LiYF4 as a host luminescent material can simultaneously exhibit bright visible-light emission. A series of LiYF4:Yb3+/Ho3+ microparticles with different Mg2+ doping concentrations were synthesized and investigated. The crystal structure of the synthesized microparticles was characterized by X-ray diffraction (XRD). Notably, introducing Mg2+ ions significantly increased the upconversion photoluminescence intensity of the upconversion microparticles (UCMPs) under 980 nm laser excitation, with the intensity reaching a maximum at an Mg2+ concentration of 8 mol%. Additionally, the practicality of the resultant UCMPs as a raw material for anti-counterfeiting ink was systematically investigated. These results show that Mg2+-doped LiYF4:Yb3+/Ho3+ microparticles are very promising as screen-printing materials for anti-counterfeiting recognition labels.
Predicted Infiltration for Sodic/Saline Soils from Reclaimed Coastal Areas: Sensitivity to Model Parameters
This study was conducted to assess the influences of soil surface conditions and initial soil water content on water movement in unsaturated sodic soils of reclaimed coastal areas. Data were collected from column experiments using two soils from a Chinese coastal area, one reclaimed in 2007 (Soil A, saline) and one in 1960 (Soil B, nonsaline), with bulk densities of 1.4 or 1.5 g/cm³. A 1D infiltration model was created using a finite difference method, and its sensitivity to hydraulic-related parameters was tested. The model simulated the measured data well. The results revealed that soil compaction notably affected the water retention of both soils. Model simulations showed that increasing the ponded water depth had little effect on the infiltration process, since the increases in cumulative infiltration and wetting front advancement rate were small. However, when the initial soil water content (θ0) was increased, the wetting front advancement rate increased and the cumulative infiltration decreased to a greater extent. Soil physical quality was described better by the S parameter than by the saturated hydraulic conductivity, since the latter was also affected by physicochemical effects on clay swelling that occur in the presence of different levels of electrolytes in the soil solutions of the two soils.