Search CORE

11,915 research outputs found

Toward Non-security Failures as a Predictor of Security Faults and Failures

Author: A. Endres
B. Boehm
I. Witten
J. Saltzer
J. Viega
J. Zheng
O.H. Alhazmi
S.A.S. Institute Inc
T. Hastie
T.M. Khoshgoftaar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009
Field of study

Abstract. In the search for metrics that can predict the presence of vulnerabilities early in the software life cycle, there may be some benefit to choosing metrics from the non-security realm. We analyzed non-security and security failure data reported for the year 2007 of a Cisco software system. We used non-security failure reports as input variables into a classification and regression tree (CART) model to determine the probability that a component will have at least one vulnerability. Using CART, we ranked all of the system components in descending order of their probabilities and found that 57 % of the vulnerable components were in the top nine percent of the total component ranking, but with a 48 % false positive rate. The results indicate that non-security failures can be used as one of the input variables for security-related prediction models

CiteSeerX

Crossref

Measuring Membership Privacy on Aggregate Location Time-Series

Author: De Cristofaro Emiliano
Pyrgelis Apostolos
Troncoso Carmela
Publication venue
Publication date: 27/04/2020
Field of study

While location data is extremely valuable for various applications, disclosing it prompts serious threats to individuals' privacy. To limit such concerns, organizations often provide analysts with aggregate time-series that indicate, e.g., how many people are in a location at a time interval, rather than raw individual traces. In this paper, we perform a measurement study to understand Membership Inference Attacks (MIAs) on aggregate location time-series, where an adversary tries to infer whether a specific user contributed to the aggregates. We find that the volume of contributed data, as well as the regularity and particularity of users' mobility patterns, play a crucial role in the attack's success. We experiment with a wide range of defenses based on generalization, hiding, and perturbation, and evaluate their ability to thwart the attack vis-a-vis the utility loss they introduce for various mobility analytics tasks. Our results show that some defenses fail across the board, while others work for specific tasks on aggregate location time-series. For instance, suppressing small counts can be used for ranking hotspots, data generalization for forecasting traffic, hotspot discovery, and map inference, while sampling is effective for location labeling and anomaly detection when the dataset is sparse. Differentially private techniques provide reasonable accuracy only in very specific settings, e.g., discovering hotspots and forecasting their traffic, and more so when using weaker privacy notions like crowd-blending privacy. Overall, our measurements show that there does not exist a unique generic defense that can preserve the utility of the analytics for arbitrary applications, and provide useful insights regarding the disclosure of sanitized aggregate location time-series

arXiv.org e-Print Archive

Crossref

UCL Discovery

An Energy Aware and Secure MAC Protocol for Tackling Denial of Sleep Attacks in Wireless Sensor Networks

Author: Udoh E.
Udoh E.
Publication venue
Publication date: 01/01/2019
Field of study

Wireless sensor networks which form part of the core for the Internet of Things consist of resource constrained sensors that are usually powered by batteries. Therefore, careful energy awareness is essential when working with these devices. Indeed,the introduction of security techniques such as authentication and encryption, to ensure confidentiality and integrity of data, can place higher energy load on the sensors. However, the absence of security protection c ould give room for energy drain attacks such as denial of sleep attacks which have a higher negative impact on the life span ( of the sensors than the presence of security features. This thesis, therefore, focuses on tackling denial of sleep attacks from two perspectives A security perspective and an energy efficiency perspective. The security perspective involves evaluating and ranking a number of security based techniques to curbing denial of sleep attacks. The energy efficiency perspective, on the other hand, involves exploring duty cycling and simulating three Media Access Control ( protocols Sensor MAC, Timeout MAC andTunableMAC under different network sizes and measuring different parameters such as the Received Signal Strength RSSI) and Link Quality Indicator ( Transmit power, throughput and energy efficiency Duty cycling happens to be one of the major techniques for conserving energy in wireless sensor networks and this research aims to answer questions with regards to the effect of duty cycles on the energy efficiency as well as the throughput of three duty cycle protocols Sensor MAC ( Timeout MAC ( and TunableMAC in addition to creating a novel MAC protocol that is also more resilient to denial of sleep a ttacks than existing protocols. The main contributions to knowledge from this thesis are the developed framework used for evaluation of existing denial of sleep attack solutions and the algorithms which fuel the other contribution to knowledge a newly developed protocol tested on the Castalia Simulator on the OMNET++ platform. The new protocol has been compared with existing protocols and has been found to have significant improvement in energy efficiency and also better resilience to denial of sleep at tacks Part of this research has been published Two conference publications in IEEE Explore and one workshop paper

WestminsterResearch

"Influence Sketching": Finding Influential Samples In Large-Scale Regressions

Author: Crable Caleb
Cruz Ben
Luan Jay
Wallace Brian
Wojnowicz Mike
Wolff Matt
Zhao Xuan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/03/2017
Field of study

There is an especially strong need in modern large-scale data analysis to prioritize samples for manual inspection. For example, the inspection could target important mislabeled samples or key vulnerabilities exploitable by an adversarial attack. In order to solve the "needle in the haystack" problem of which samples to inspect, we develop a new scalable version of Cook's distance, a classical statistical technique for identifying samples which unusually strongly impact the fit of a regression model (and its downstream predictions). In order to scale this technique up to very large and high-dimensional datasets, we introduce a new algorithm which we call "influence sketching." Influence sketching embeds random projections within the influence computation; in particular, the influence score is calculated using the randomly projected pseudo-dataset from the post-convergence Generalized Linear Model (GLM). We validate that influence sketching can reliably and successfully discover influential samples by applying the technique to a malware detection dataset of over 2 million executable files, each represented with almost 100,000 features. For example, we find that randomly deleting approximately 10% of training samples reduces predictive accuracy only slightly from 99.47% to 99.45%, whereas deleting the same number of samples with high influence sketch scores reduces predictive accuracy all the way down to 90.24%. Moreover, we find that influential samples are especially likely to be mislabeled. In the case study, we manually inspect the most influential samples, and find that influence sketching pointed us to new, previously unidentified pieces of malware.Comment: fixed additional typo

arXiv.org e-Print Archive

Crossref

The Importance of Accounting for Real-World Labelling When Predicting Software Vulnerabilities

Author: Harman M
Jimenez M
Le Traon Y
Papadakis M
Rwemalika R
Sarro F
Publication venue: 27th ACM Joint Meeting on European Software Engineering Conference (ESEC) / Symposium on the Foundations of Software Engineering (FSE)
Publication date: 12/08/2019
Field of study

Previous work on vulnerability prediction assume that predictive models are trained with respect to perfect labelling information (includes labels from future, as yet undiscovered vulnerabilities). In this paper we present results from a comprehensive empirical study of 1,898 real-world vulnerabilities reported in 74 releases of three security-critical open source systems (Linux Kernel, OpenSSL and Wiresark). Our study investigates the effectiveness of three previously proposed vulnerability prediction approaches, in two settings: with and without the unrealistic labelling assumption. The results reveal that the unrealistic labelling assumption can profoundly mis- lead the scientific conclusions drawn; suggesting highly effective and deployable prediction results vanish when we fully account for realistically available labelling in the experimental methodology. More precisely, MCC mean values of predictive effectiveness drop from 0.77, 0.65 and 0.43 to 0.08, 0.22, 0.10 for Linux Kernel, OpenSSL and Wiresark, respectively. Similar results are also obtained for precision, recall and other assessments of predictive efficacy. The community therefore needs to upgrade experimental and empirical methodology for vulnerability prediction evaluation and development to ensure robust and actionable scientific findings

UCL Discovery

Trust based collaborative filtering

Author: Capra L.
Hailes S.
Lathia N.
Publication venue: Springer Verlag
Publication date: 01/01/2008
Field of study

k-nearest neighbour (kNN) collaborative filtering (CF), the widely successful algorithm supporting recommender systems, attempts to relieve the problem of information overload by generating predicted ratings for items users have not expressed their opinions about; to do so, each predicted rating is computed based on ratings given by like-minded individuals. Like-mindedness, or similarity-based recommendation, is the cause of a variety of problems that plague recommender systems. An alternative view of the problem, based on trust, offers the potential to address many of the previous limiations in CF. In this work we present a varation of kNN, the trusted k-nearest recommenders (or kNR) algorithm, which allows users to learn who and how much to trust one another by evaluating the utility of the rating information they receive. This method redefines the way CF is performed, and while avoiding some of the pitfalls that similarity-based CF is prone to, outperforms the basic similarity-based methods in terms of prediction accuracy

CiteSeerX

UCL Discovery

A log mining approach for process monitoring in SCADA

Author: Bolzoni Damiano
Hadziosmanovic Dina
Hartel Pieter
Publication venue: Springer
Publication date: 01/01/2012
Field of study

SCADA (Supervisory Control and Data Acquisition) systems are used for controlling and monitoring industrial processes. We propose a methodology to systematically identify potential process-related threats in SCADA. Process-related threats take place when an attacker gains user access rights and performs actions, which look legitimate, but which are intended to disrupt the SCADA process. To detect such threats, we propose a semi-automated approach of log processing. We conduct experiments on a real-life water treatment facility. A preliminary case study suggests that our approach is effective in detecting anomalous events that might alter the regular process workflow

Springer - Publisher Connector

University of Twente Research Information