3 research outputs found

    Advances in SCA and RF-DNA Fingerprinting Through Enhanced Linear Regression Attacks and Application of Random Forest Classifiers

    Radio Frequency (RF) emissions from electronic devices expose security vulnerabilities that an attacker can use to extract otherwise unobtainable information. Two areas of study were investigated here: the exploitation of 1) unintentional RF emissions in the field of Side Channel Analysis (SCA), and 2) intentional RF emissions from physical devices in the field of RF-Distinct Native Attribute (RF-DNA) fingerprinting. Statistical analysis of the linear model fit to measured SCA data in Linear Regression Attacks (LRA) improved performance, achieving a 98% success rate for AES key-byte identification from unintentional emissions. However, the presence of non-Gaussian noise required the use of a non-parametric classifier to further improve key guessing attacks. Random Forest (RndF) based profiling attacks were successful on very high-dimensional data sets, correctly guessing all 16 bytes of the AES key with a 50,000-variable data set. With variable reduction, RndF still outperformed the Template Attack for this data set, requiring fewer traces and achieving higher success rates with a lower misclassification rate. Finally, the use of an RndF classifier is examined for intentional RF emissions from ZigBee devices to enhance security using RF-DNA fingerprinting. RndF outperformed the parametric MDA/ML and non-parametric GRLVQI classifiers, providing up to GS = 18.0 dB improvement (reduction in required SNR). Network penetration tests, measured using rogue ZigBee devices, show that the RndF method also improved rogue rejection in noisier environments, with gains of up to GS = 18.0 dB realized over previous methods.
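
To give a sense of scale for the reported gain, the sketch below converts a gain in decibels to a linear power ratio. It assumes GS is expressed as a power-ratio gain in dB (the usual convention for SNR); the function name `db_to_power_ratio` is illustrative and does not come from the thesis.

```python
def db_to_power_ratio(gain_db: float) -> float:
    """Convert a gain in decibels to a linear power ratio (10^(dB/10))."""
    return 10.0 ** (gain_db / 10.0)


# Assumption: the 18.0 dB figure quoted in the abstract is a power-ratio gain.
ratio = db_to_power_ratio(18.0)
print(f"18.0 dB corresponds to a power ratio of about {ratio:.1f}")
```

Under that assumption, an 18.0 dB reduction in required SNR means the classifier reaches the same benchmark at roughly 63 times lower signal-to-noise power ratio.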

    The impact of training data characteristics on ensemble classification of land cover

    Supervised classification of remote sensing imagery has long been recognised as an essential technology for large area land cover mapping. Remote sensing derived land cover and forest classification maps are important sources of information for understanding environmental processes and informing natural resource management decision making. In recent years, the supervised transformation of remote sensing data into thematic products has been advanced through the introduction and development of machine learning classification techniques. Applied to a variety of science and engineering problems over the past twenty years (Lary et al., 2016), machine learning provides greater accuracy and efficiency than traditional parametric classifiers and is capable of dealing with large data volumes across complex measurement spaces. The Random Forest (RF) classifier, in particular, has become popular in the remote sensing community, with a range of commonly cited advantages, including its low parameterisation requirements, excellent classification results, and ability to handle noisy observation data and outliers in a complex measurement space and small training data sets relative to the study area size. In the context of large area land cover classification for forest cover, using multisource remote sensing and geospatial data, this research sets out to examine two proposed advantages of the RF classifier: insensitivity to training data noise (mislabelling) and the handling of training data class imbalance. Through margin theory, the research also investigates the utility of ensemble learning (in which multiple base classifiers are combined to reduce generalisation error in classification) as a means of designing more efficient classifiers, improving classification performance, and reducing reference (training and test) data redundancy.
The first part of the thesis (chapters 2 and 3) introduces the experimental setting and data used in the research, including a description (in chapter 2) of the sampling framework for the reference data used in the classification experiments that follow. Chapter 3 evaluates the performance of the RF classifier applied across a 7.2 million hectare public land study area in Victoria, Australia. This chapter describes an open-source framework for deploying the RF classifier over large areas and processing significant volumes of multi-source remote sensing and ancillary spatial data. The second part of the thesis (research chapters 4 through 6) examines the effect of training data characteristics (class imbalance and mislabelling) on the performance of RF, and explores the application of the ensemble margin as a means of both examining RF classification performance and informing training data sampling to improve classification accuracy. Results of the binary and multiclass experiments described in chapter 4 provide insights into the behaviour of RF when training data are not evenly distributed among classes and contain systematically mislabelled instances. Results show that while the error rate of the RF classifier is relatively insensitive to mislabelled training data (in the multiclass experiment, overall Kappa falls from 78.3% with no mislabelled instances to 70.1% with 25% mislabelling in each class), the level of associated confidence falls at a faster rate than overall accuracy with increasing rates of mislabelled training data. This chapter also demonstrates that imbalanced training data can be introduced to reduce error in the classes that are most difficult to classify. The relationship between per-class and overall classification performance and the diversity of members in an RF ensemble classifier is explored through experiments presented in chapter 5.
This research examines ways of targeting particular training data samples to induce RF ensemble diversity and improve per-class and overall classification performance and efficiency. Through use of the ensemble margin, this study offers insights into the trade-off between ensemble classification accuracy and diversity. The research shows that boosting diversity among RF ensemble members, by emphasising the contribution of lower margin training instances used in the learning process, is an effective means of improving classification performance, particularly for more difficult or rarer classes, and is a way of reducing information redundancy and improving the efficiency of classification. Research chapter 6 looks at the application of the RF classifier for calculating Landscape Pattern Indices (LPIs) from classification prediction maps, and examines the sensitivity of these indices to training data characteristics and to sampling based on the ensemble margin. This research reveals that a range of commonly used LPIs are significantly sensitive to training data mislabelling in RF classification, as well as to margin-based training data sampling. In conclusion, this thesis examines two proposed advantages of the popular machine learning classifier, Random Forests: its relative insensitivity to training data noise (mislabelling) and its ability to handle class imbalance. This research also explores the utility of the ensemble margin for designing more efficient classifiers, measuring and improving classification performance, and designing ensemble classification systems which use reference data more efficiently and effectively, with less data redundancy. These findings have practical applications and implications for large area land cover classification, for which the generation of high quality reference data is often a time-consuming, subjective and expensive exercise.
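
The ensemble margin referred to in this abstract can be illustrated with a short sketch. It assumes one common supervised definition (votes for the true class minus the largest vote count for any other class, divided by the number of trees); the abstract does not state which variant the thesis uses, and the function name, class labels and vote counts below are hypothetical.

```python
from collections import Counter


def ensemble_margin(votes, true_label):
    """Supervised ensemble margin of one sample, in the range [-1, 1].

    votes: class labels predicted by each base classifier (tree).
    true_label: the reference label of the sample.
    Assumed definition: (votes for true class - most votes for any
    other class) / total number of votes.
    """
    counts = Counter(votes)
    v_true = counts.get(true_label, 0)
    v_other = max(
        (n for label, n in counts.items() if label != true_label),
        default=0,
    )
    return (v_true - v_other) / len(votes)


# Hypothetical votes from a 10-tree ensemble for a single pixel.
votes = ["forest"] * 7 + ["grass"] * 2 + ["water"]
margin_if_forest = ensemble_margin(votes, "forest")  # (7 - 2) / 10
margin_if_water = ensemble_margin(votes, "water")    # (1 - 7) / 10
print(margin_if_forest, margin_if_water)
```

Under this definition, a margin near 1 means the ensemble agrees strongly on the correct class, while low or negative margins flag the difficult instances that margin-based sampling would emphasise.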