5,494 research outputs found

    Security Evaluation of Support Vector Machines in Adversarial Environments

    Support Vector Machines (SVMs) are among the most popular classification techniques adopted in security applications such as malware detection, intrusion detection, and spam filtering. However, if SVMs are to be incorporated into real-world security systems, they must be able to cope with attack patterns that mislead the learning algorithm (poisoning), evade detection (evasion), or extract information about their internal parameters (privacy breaches). The main contributions of this chapter are twofold. First, we introduce a formal general framework for the empirical evaluation of the security of machine-learning systems. Second, using our framework, we demonstrate the feasibility of evasion, poisoning, and privacy attacks against SVMs in real-world security problems. For each attack technique, we evaluate its impact and discuss whether (and how) it can be countered through an adversary-aware design of SVMs. Our experiments are easily reproducible thanks to open-source code that we have made available, together with all the employed datasets, in a public repository.
    Comment: 47 pages, 9 figures; chapter accepted into the book 'Support Vector Machine Applications'.
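As a concrete illustration of the evasion attacks the chapter evaluates, the sketch below drives a sample across a linear SVM's decision boundary by gradient descent on the decision function. It is a minimal reconstruction in scikit-learn, not the authors' released code; the dataset, step size, and stopping rule are our assumptions.

```python
# Minimal sketch of a gradient-based evasion attack on a linear SVM.
# Not the authors' code; dataset and attack parameters are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
clf = SVC(kernel="linear").fit(X, y)

w = clf.coef_.ravel()        # gradient of the decision function f(x) = w.x + b
x = X[y == 1][0].copy()      # a "malicious" sample currently detected (f(x) > 0)

step, max_iter = 0.05, 200
for _ in range(max_iter):
    if clf.decision_function([x])[0] < 0:  # crossed the boundary: evasion succeeded
        break
    x -= step * w / np.linalg.norm(w)      # descend the decision function

print("decision value after attack:", clf.decision_function([x])[0])
```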

    Automating Cyber Analytics

    Model-based security metrics are a growing area of cyber security research concerned with measuring the risk exposure of an information system. These metrics are typically studied in isolation, with the formulation of the test itself being the primary finding in publications. As a result, there is a flood of metric specifications available in the literature but a corresponding dearth of analyses verifying results for a given metric calculation under different conditions or comparing the efficacy of one measurement technique over another. The motivation of this thesis is to create a systematic methodology for model-based security metric development, analysis, integration, and validation. In doing so we hope to fill a critical gap in the way we view and improve a system's security.

    In order to understand the security posture of a system before it is rolled out and as it evolves, we present in this dissertation an end-to-end solution for the automated measurement of the security metrics needed to identify risk early and accurately. To our knowledge this is a novel capability in design-time security analysis which provides the foundation for ongoing research into predictive cyber security analytics. Modern development environments contain a wealth of information in infrastructure-as-code repositories, continuous build systems, and container descriptions that could inform security models, but risk evaluation based on these sources is ad hoc at best, and often simply left until deployment. Our goal in this work is to lay the groundwork for security measurement to be a practical part of the system design, development, and integration lifecycle.

    In this thesis we provide a framework for the systematic validation of the existing security metrics body of knowledge. In doing so we endeavour not only to survey the current state of the art, but to create a common platform for future research in the area to be conducted. We then demonstrate the utility of our framework through the evaluation of leading security metrics against a reference set of system models we have created. We investigate how to calibrate security metrics for different use cases and establish a new methodology for security metric benchmarking.

    We further explore the research avenues unlocked by automation through our concept of an API-driven S-MaaS (Security Metrics-as-a-Service) offering. We review our design considerations in packaging security metrics for programmatic access, and discuss how various client access patterns are anticipated in our implementation strategy. Using existing metric processing pipelines as a reference, we show how the simple, modular interfaces in S-MaaS support dynamic composition and orchestration.

    Next we review aspects of our framework which can benefit from optimization and further automation through machine learning. First we create a dataset of network models labeled with the corresponding security metrics. By training classifiers to predict security values based only on network inputs, we can avoid the computationally expensive attack-graph generation steps. We use our findings from this simple experiment to motivate our current lines of research into supervised and unsupervised techniques such as network embeddings, interaction rule synthesis, and reinforcement learning environments.

    Finally, we examine the results of our case studies. We summarize our security analysis of a large-scale network migration, and list the friction points along the way which are remediated by this work. We relate how our research for a large-scale performance benchmarking project has influenced our vision for the future of security metrics collection and analysis through dev-ops automation. We then describe how we applied our framework to measure the incremental security impact of running a distributed stream processing system inside a hardware trusted execution environment.
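The metric-prediction experiment the abstract describes lends itself to a short sketch: train a regressor on network-level features labeled with a security metric, so the expensive attack-graph generation can be skipped at query time. The features and model below are placeholders of our choosing, not the thesis's actual dataset or pipeline.

```python
# Hedged sketch: predicting a security metric directly from network features,
# sidestepping attack-graph generation. Features and model are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# toy stand-in for "network models labeled with security metrics":
# columns: [num_hosts, num_vulns, avg_degree, num_exposed_services]
X = rng.random((500, 4)) * [100, 50, 10, 20]
y = 0.4 * X[:, 1] + 0.2 * X[:, 3] + rng.normal(0, 1, 500)  # synthetic metric

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out network models:", model.score(X_te, y_te))
```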

    U and Th content in the Central Apennines continental crust: a contribution to the determination of the geo-neutrinos flux at LNGS

    The regional contribution to the geo-neutrino signal at the Gran Sasso National Laboratory (LNGS) was determined based on a detailed geological, geochemical and geophysical study of the region. U and Th abundances of more than 50 samples representative of the main lithotypes belonging to the Mesozoic and Cenozoic sedimentary cover were analyzed. Sedimentary rocks were grouped into four main "Reservoirs" based on similar paleogeographic conditions and mineralogy. Basement rocks do not outcrop in the area; thus U and Th in the Upper and Lower Crust of the Valsugana and Ivrea-Verbano areas were analyzed. Based on geological and geophysical properties, relative abundances of the various reservoirs were calculated and used to obtain the weighted U and Th abundances for each of the three geological layers (Sedimentary Cover, Upper Crust and Lower Crust). Using the available seismic profiles as well as the stratigraphic records from a number of exploration wells, a 3D model was developed over an area of 2° × 2° down to the Moho depth, for a total volume of about 1.2 × 10⁶ km³. This model allowed us to determine the volume of the various geological layers and eventually integrate the Th and U contents of the whole crust beneath LNGS. On this basis the local contribution to the geo-neutrino flux (S) was calculated and added to the contribution from the rest of the world, yielding a Refined Reference Model prediction for the geo-neutrino signal in the Borexino detector at LNGS: S(U) = (28.7 ± 3.9) TNU and S(Th) = (7.5 ± 1.0) TNU. This total is about 4 TNU below that previously obtained by Mantovani et al. (2004), who calculated, based on general worldwide assumptions, a signal of 40.5 TNU. The considerable thickness of the sedimentary rocks, predominantly represented by U- and Th-poor carbonate rocks in the area near LNGS, is responsible for this difference.
    Comment: 45 pages, 5 figures, 12 tables; accepted for publication in GC.
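A toy version of the bookkeeping behind the Refined Reference Model can make the numbers concrete: weight reservoir abundances by volume fraction, then compare the refined total against the earlier worldwide estimate. Only the signal figures (28.7 + 7.5 vs. 40.5 TNU) come from the abstract; the reservoir fractions and abundances below are invented placeholders.

```python
# Hedged sketch of the weighted-abundance step; fractions and ppm values
# are illustrative placeholders, not the paper's measured data.
volume_fraction = {"res1": 0.4, "res2": 0.3, "res3": 0.2, "res4": 0.1}
u_ppm = {"res1": 0.8, "res2": 1.5, "res3": 2.1, "res4": 0.5}  # hypothetical

weighted_u = sum(volume_fraction[r] * u_ppm[r] for r in volume_fraction)
print(f"weighted U abundance: {weighted_u:.2f} ppm")

refined_total = 28.7 + 7.5  # S(U) + S(Th) from the Refined Reference Model
print(f"refined prediction: {refined_total:.1f} TNU; "
      f"Mantovani et al. (2004): 40.5 TNU; "
      f"difference: {40.5 - refined_total:.1f} TNU")
```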

    An entropy-based analysis of GPR data for the assessment of railway ballast conditions

    The effective monitoring of ballasted railway track beds is fundamental for maintaining safe operational conditions of railways and lowering maintenance costs. Railway ballast can be damaged over time by the breakdown of aggregates or by the upward migration of fine clay particles from the foundation, along with capillary water. This may cause critical track settlements. To that end, early-stage detection of fouling is of paramount importance. Within this context, ground penetrating radar (GPR) is a rapid nondestructive testing technique which is being increasingly used for the assessment and health monitoring of railway track substructures. In this paper, we propose a novel and efficient signal processing approach based on entropy analysis, applied to GPR data for the assessment of railway ballast conditions and the detection of fouling. In order to recreate a real-life scenario within the context of railway structures, four different ballast/pollutant mixes were introduced, ranging from clean to highly fouled ballast. GPR systems equipped with two different antenna types, ground-coupled (600 and 1600 MHz) and air-coupled (1000 and 2000 MHz), were used for testing purposes. The proposed methodology aims at rapidly identifying distinctive areas of interest related to fouling, thereby significantly lowering the amount of data to be processed and the time required for specialist data processing. We also identify the most suitable investigation frequencies from the tested set, and report the corresponding probabilities of detection and false alarm.
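The core of the entropy analysis can be sketched as a sliding-window Shannon entropy over a GPR trace, with fouling expected to raise the amplitude disorder within a window. The window length, bin count, and synthetic traces below are our assumptions, not the paper's acquisition settings.

```python
# Hedged sketch: sliding-window Shannon entropy over a GPR A-scan.
# Window size, bin count, and the synthetic signals are assumptions.
import numpy as np

def window_entropy(signal, win=64, bins=16):
    """Shannon entropy of amplitude histograms over sliding windows."""
    ent = []
    for start in range(0, len(signal) - win + 1, win // 2):  # 50% overlap
        hist, _ = np.histogram(signal[start:start + win], bins=bins)
        p = hist[hist > 0]
        p = p / p.sum()
        ent.append(-np.sum(p * np.log2(p)))
    return np.array(ent)

rng = np.random.default_rng(1)
clean = np.sin(np.linspace(0, 40, 1024))        # stand-in clean-ballast trace
fouled = clean + rng.normal(0, 0.5, 1024)       # extra scattering from fouling
print("mean entropy, clean :", window_entropy(clean).mean())
print("mean entropy, fouled:", window_entropy(fouled).mean())
```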

    Enhanced Prediction of Network Attacks Using Incomplete Data

    For years, intrusion detection has been considered a key component of many organizations’ network defense capabilities. Although a number of approaches to intrusion detection have been tried, few have been capable of providing security personnel responsible for the protection of a network with sufficient information to make adjustments and respond to attacks in real time. Because intrusion detection systems rarely have complete information, false negatives and false positives are extremely common, and thus valuable resources are wasted responding to irrelevant events. In order to provide better actionable information for security personnel, a mechanism for quantifying the confidence level in predictions is needed. This work presents an approach which seeks to combine a primary prediction model with a novel secondary confidence-level model which provides a measurement of the confidence in a given attack prediction. The ability to accurately identify an attack and quantify the confidence level in the prediction could serve as the basis for a new generation of intrusion detection devices: devices that provide earlier and better alerts for administrators and allow more proactive response to events as they occur.
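The two-model idea reduces to a primary attack predictor paired with a secondary model trained to estimate when the primary is right. The sketch below is one plausible realization on synthetic unbalanced data; the model choices and features are ours, not the thesis's.

```python
# Hedged sketch of a primary attack predictor plus a secondary confidence model.
# Dataset, features, and model choices are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.9], flip_y=0.1,
                           random_state=0)
X_tr, X_hold, y_tr, y_hold = train_test_split(X, y, test_size=0.5,
                                              random_state=0)

primary = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Secondary model: learn, from the same features, when the primary is right.
correct = (primary.predict(X_hold) == y_hold).astype(int)
confidence = LogisticRegression(max_iter=1000).fit(X_hold, correct)

x_new = X_hold[:5]
preds = primary.predict(x_new)
conf = confidence.predict_proba(x_new)[:, 1]  # P(primary prediction is correct)
for p, c in zip(preds, conf):
    print(f"attack={bool(p)}  confidence={c:.2f}")
```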

    Agent Organization and Request Propagation in the Knowledge Plane

    In designing and building a network like the Internet, we continue to face the problems of scale and distribution. In particular, network management has become an increasingly difficult task, and network applications often need to maintain efficient connectivity graphs for various purposes. The knowledge plane was proposed as a new construct to improve network management and applications. In this proposal, I present an application-independent mechanism to support the construction of application-specific connectivity graphs. Specifically, I propose to build a network knowledge plane and multiple sub-planes for different areas of network services. The network knowledge plane provides valuable knowledge about the Internet to the sub-planes, and each sub-plane constructs its own connectivity graph using network knowledge and knowledge in its own specific area. I focus on two key design issues: (1) a region-based architecture for agent organization; (2) knowledge dissemination and request propagation. Network management and applications benefit from the underlying network knowledge plane and sub-planes. To demonstrate the effectiveness of this mechanism, I conduct case studies in network management and security.
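A minimal sketch of region-based request propagation: agents are grouped into regions, a request is answered locally when possible and otherwise forwarded breadth-first to neighboring regions. The topology and knowledge map below are illustrative stand-ins, not the proposal's actual design.

```python
# Hedged sketch of region-based request propagation; topology and
# knowledge map are illustrative stand-ins.
from collections import deque

regions = {                  # region -> neighboring regions
    "A": ["B"], "B": ["A", "C"], "C": ["B"],
}
knowledge = {                # region -> keys its agents can answer
    "A": {"latency.eu"}, "B": {"topology.us"}, "C": {"spam.blocklist"},
}

def propagate(start, key):
    """Breadth-first propagation across regions until the key is found."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        region, hops = queue.popleft()
        if key in knowledge[region]:
            return region, hops
        for nbr in regions[region]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, hops + 1))
    return None, -1

print(propagate("A", "spam.blocklist"))  # ('C', 2): resolved two regions away
```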

    The Unbalanced Classification Problem: Detecting Breaches in Security

    This research proposes several methods designed to improve solutions for security classification problems. The security classification problem involves unbalanced, high-dimensional, binary classification problems that are prevalent today. The imbalance within this data involves a significant majority of the negative class and a minority positive class. Any system that needs protection from malicious activity, intruders, theft, or other types of breaches in security must address this problem. These breaches in security are considered instances of the positive class. Given numerical data that represent observations or instances which require classification, state-of-the-art machine learning algorithms can be applied. However, the unbalanced and high-dimensional structure of the data must be considered prior to applying these learning methods. High-dimensional data poses a “curse of dimensionality” which can be overcome through the analysis of subspaces. Exploration of intelligent subspace modeling and the fusion of subspace models is proposed. A detailed analysis of the one-class support vector machine is included, along with its weaknesses and proposals to overcome these shortcomings. A fundamental method for evaluating a binary classification model is the receiver operating characteristic (ROC) curve and the area under the curve (AUC). This work details the underlying statistics involved with ROC curves, contributing a comprehensive review of ROC curve construction and analysis techniques, including a novel graphic for illustrating the connection between ROC curves and classifier decision values. The major innovations of this work include synergistic classifier fusion through the analysis of ROC curves and rankings, insight into the statistical behavior of the Gaussian kernel, and novel methods for applying machine learning techniques to defend against computer intrusion. The primary empirical vehicle for this research is computer intrusion detection data, and both host-based intrusion detection systems (HIDS) and network-based intrusion detection systems (NIDS) are addressed. Empirical studies also include military tactical scenarios.
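The evaluation pipeline at the heart of this work, a one-class support vector machine scored with ROC AUC on unbalanced data, can be sketched directly in scikit-learn. The synthetic traffic and kernel settings below are assumptions, not the dissertation's experimental setup.

```python
# Hedged sketch: one-class SVM trained on the majority (negative) class,
# scored with ROC AUC. Data and kernel settings are assumptions.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, (950, 20))   # majority: benign observations
attacks = rng.normal(3, 1, (50, 20))   # minority: security breaches

ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(normal)

X = np.vstack([normal, attacks])
y = np.r_[np.zeros(950), np.ones(50)]  # 1 = breach (positive class)
scores = -ocsvm.decision_function(X)   # higher score = more anomalous
print("AUC:", roc_auc_score(y, scores))
```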

    Collaborative Edge Computing in Mobile Internet of Things

    The proliferation of Internet-of-Things (IoT) devices has opened a plethora of opportunities for smart networking, connected applications and data-driven intelligence. The large distribution of IoT devices within a finite geographical area and the pervasiveness of wireless networking present an opportunity for such devices to collaborate. Centralized decision systems have so far dominated the field, but they are starting to lose relevance in the wake of the heterogeneity of the device pool. This thesis is driven by three key hypotheses: (i) in solving complex problems, it is possible to harness unused compute capabilities of the device pool instead of always relying on centralized infrastructures; (ii) when possible, collaborating with neighbors to identify security threats scales well in large environments; (iii) given the abundance of data from a large pool of devices with possible privacy constraints, collaborative learning drives scalable intelligence. This dissertation defines three frameworks for these hypotheses: collaborative computing, collaborative security and collaborative privacy intelligence. The first framework, Opportunistic collaboration among IoT devices for workload execution, profiles applications and matches resource grants to requests using blockchain to put excess capacity at the edge to good use. The evaluation results show app execution latency comparable to the centralized edge and outstanding resource utilization at the edge. The second framework, Integrity Threat Identification for Distributed IoT, uses a new spatio-temporal algorithm based on the Local Outlier Factor (LOF), which uniquely uses mean and variance collaboratively across spatial and temporal dimensions to identify potential threats. Evaluation results on a real-world underground sensor dataset (Thoreau) show good accuracy and efficiency. The third framework, Collaborative Privacy Intelligence, aims to understand privacy invasion by reverse engineering a user’s privacy model using sensor data, and scores the level of intrusion for various dimensions of privacy. By having sensors track activities, and learning rule books from the collective insights, we are able to predict one’s privacy attributes and states with reasonable accuracy. As the Edge gains more prominence with computation moving closer to the data source, the above frameworks will drive key solutions and research in areas of Edge federation and collaboration.
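The second framework's detector can be approximated with Local Outlier Factor over per-sensor mean and variance features, echoing its combined use of spatial and temporal dimensions; the exact spatio-temporal algorithm is the dissertation's own. The sensor layout, windowing, and injected drift below are our inventions.

```python
# Hedged sketch: LOF over per-sensor mean/variance features to flag a
# tampered sensor. Sensor layout, window, and drift are assumptions.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
readings = rng.normal(20, 0.5, (50, 100))  # 50 sensors x 100 time steps
readings[7] += np.linspace(0, 5, 100)      # one tampered sensor drifts upward

# temporal features per sensor: mean and variance of the most recent window
feats = np.c_[readings[:, -30:].mean(axis=1), readings[:, -30:].var(axis=1)]

lof = LocalOutlierFactor(n_neighbors=10)
labels = lof.fit_predict(feats)            # -1 marks spatial outliers
print("suspect sensors:", np.where(labels == -1)[0])
```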

    Assessing Feature Representations for Instance-Based Cross-Domain Anomaly Detection in Cloud Services Univariate Time Series Data

    In this paper, we compare and assess the efficacy of a number of time-series instance feature representations for anomaly detection. To assess whether there are statistically significant differences between different feature representations for anomaly detection in a time series, we calculate and compare confidence intervals on the average performance of different feature sets across a number of different model types and cross-domain time-series datasets. Our results indicate that the catch22 time-series feature set augmented with features based on rolling mean and variance performs best on average, and that the difference in performance between this feature set and the next best feature set is statistically significant. Furthermore, our analysis of the features used by the most successful model indicates that features related to mean and variance are the most informative for anomaly detection. We also find that features based on model forecast errors are useful for anomaly detection for some but not all datasets.
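The rolling mean and variance augmentation the paper finds most informative is easy to sketch; the catch22 set itself is omitted here, and a simple z-score rule stands in for the paper's model types. Window size, threshold, and the injected anomaly are our assumptions.

```python
# Hedged sketch: rolling mean/variance features for a univariate series,
# with a toy z-score detector. Window, threshold, and anomaly are assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
series = pd.Series(np.sin(np.linspace(0, 60, 600)) + rng.normal(0, 0.1, 600))
series.iloc[400:410] += 5                 # injected anomaly

roll = series.rolling(window=30)
feats = pd.DataFrame({"roll_mean": roll.mean(), "roll_var": roll.var()})

# simple z-score rule on the rolling mean as a stand-in anomaly detector
z = (feats["roll_mean"] - feats["roll_mean"].mean()) / feats["roll_mean"].std()
print("flagged indices:", np.where(z.abs() > 3)[0][:10])
```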