11,915 research outputs found
Toward Non-security Failures as a Predictor of Security Faults and Failures
Abstract. In the search for metrics that can predict the presence of vulnerabilities early in the software life cycle, there may be some benefit to choosing metrics from the non-security realm. We analyzed non-security and security failure data reported for the year 2007 of a Cisco software system. We used non-security failure reports as input variables into a classification and regression tree (CART) model to determine the probability that a component will have at least one vulnerability. Using CART, we ranked all of the system components in descending order of their probabilities and found that 57% of the vulnerable components were in the top nine percent of the total component ranking, but with a 48% false positive rate. The results indicate that non-security failures can be used as one of the input variables for security-related prediction models.
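The ranking evaluation described in this abstract (the share of vulnerable components that land in the top slice of a probability ranking) can be sketched as follows. This is a minimal illustration, not the authors' CART pipeline: the function name, the component names, and the probabilities are all made up for the example.

```python
from math import ceil


def recall_in_top_fraction(scores, vulnerable, fraction):
    """Share of vulnerable components found in the top `fraction` of the ranking.

    scores:     dict mapping component -> predicted probability
                (hypothetical model output, e.g. from a classification tree)
    vulnerable: set of components known to have at least one vulnerability
    fraction:   size of the top slice, e.g. 0.09 for the top nine percent
    """
    if not vulnerable:
        raise ValueError("need at least one vulnerable component")
    # Rank components in descending order of predicted probability.
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = ceil(len(ranked) * fraction)
    top = set(ranked[:cutoff])
    return len(top & vulnerable) / len(vulnerable)


# Made-up probabilities for ten components; not data from the study.
example_scores = {"c0": 0.91, "c1": 0.85, "c2": 0.40, "c3": 0.33, "c4": 0.30,
                  "c5": 0.22, "c6": 0.15, "c7": 0.10, "c8": 0.05, "c9": 0.02}
example_vulnerable = {"c0", "c3", "c8"}

print(recall_in_top_fraction(example_scores, example_vulnerable, 0.2))
```

With real model output, the same function would be called with `fraction=0.09` to check a figure like the one reported above.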
Measuring Membership Privacy on Aggregate Location Time-Series
While location data is extremely valuable for various applications,
disclosing it prompts serious threats to individuals' privacy. To limit such
concerns, organizations often provide analysts with aggregate time-series that
indicate, e.g., how many people are in a location at a time interval, rather
than raw individual traces. In this paper, we perform a measurement study to
understand Membership Inference Attacks (MIAs) on aggregate location
time-series, where an adversary tries to infer whether a specific user
contributed to the aggregates.
We find that the volume of contributed data, as well as the regularity and
particularity of users' mobility patterns, play a crucial role in the attack's
success. We experiment with a wide range of defenses based on generalization,
hiding, and perturbation, and evaluate their ability to thwart the attack
vis-a-vis the utility loss they introduce for various mobility analytics tasks.
Our results show that some defenses fail across the board, while others work
for specific tasks on aggregate location time-series. For instance, suppressing
small counts can be used for ranking hotspots, data generalization for
forecasting traffic, hotspot discovery, and map inference, while sampling is
effective for location labeling and anomaly detection when the dataset is
sparse. Differentially private techniques provide reasonable accuracy only in
very specific settings, e.g., discovering hotspots and forecasting their
traffic, and more so when using weaker privacy notions like crowd-blending
privacy. Overall, our measurements show that there does not exist a unique
generic defense that can preserve the utility of the analytics for arbitrary
applications, and provide useful insights regarding the disclosure of sanitized
aggregate location time-series.
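One of the defenses named in this abstract, suppressing small counts, can be illustrated with a short sketch. The data layout (users mapped to `(location, slot)` visits), the function names, and the threshold value are assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter


def aggregate(traces):
    """Count distinct users per (location, time slot) cell.

    traces: dict mapping user -> iterable of (location, slot) visits.
    Each user contributes at most once to a given cell.
    """
    counts = Counter()
    for visits in traces.values():
        counts.update(set(visits))
    return counts


def suppress_small_counts(counts, threshold):
    """Drop every cell whose count is below `threshold` (a hypothetical parameter)."""
    return {cell: n for cell, n in counts.items() if n >= threshold}


# Toy traces, invented for illustration only.
example_traces = {
    "u1": [("A", 0), ("B", 1)],
    "u2": [("A", 0)],
    "u3": [("A", 0), ("C", 1)],
}

example_counts = aggregate(example_traces)
print(suppress_small_counts(example_counts, threshold=2))
```

Cells visited by a single user are the ones removed here, which is the intuition behind using suppression against membership inference; how well it works in practice is what the study measures.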
An Energy Aware and Secure MAC Protocol for Tackling Denial of Sleep Attacks in Wireless Sensor Networks
Wireless sensor networks, which form part of the core of the Internet of Things, consist of resource-constrained sensors that are usually powered by batteries. Therefore, careful energy awareness is essential when working with these devices.
Indeed, the introduction of security techniques such as authentication and encryption, to ensure confidentiality and integrity of data, can place a higher energy load on the sensors. However, the absence of security protection could give room for energy drain attacks such as denial of sleep attacks, which have a higher negative impact on the life span of the sensors than the presence of security features.
This thesis, therefore, focuses on tackling denial of sleep attacks from two perspectives: a security perspective and an energy efficiency perspective. The security perspective involves evaluating and ranking a number of security-based techniques for curbing denial of sleep attacks. The energy efficiency perspective, on the other hand, involves exploring duty cycling and simulating three Media Access Control (MAC) protocols, Sensor MAC, Timeout MAC and TunableMAC, under different network sizes and measuring different parameters such as the Received Signal Strength (RSSI) and Link Quality Indicator (LQI), transmit power, throughput and energy efficiency. Duty cycling happens to be one of the major techniques for conserving energy in wireless sensor networks, and this research aims to answer questions with regard to the effect of duty cycles on the energy efficiency as well as the throughput of three duty cycle protocols (Sensor MAC, Timeout MAC and TunableMAC), in addition to creating a novel MAC protocol that is also more resilient to denial of sleep attacks than existing protocols.
The main contributions to knowledge from this thesis are the developed framework used for evaluation of existing denial of sleep attack solutions and the algorithms which fuel the other contribution to knowledge: a newly developed protocol tested on the Castalia Simulator on the OMNET++ platform. The new protocol has been compared with existing protocols and has been found to have significant improvement in energy efficiency and also better resilience to denial of sleep attacks. Part of this research has been published: two conference publications in IEEE Xplore and one workshop paper.
"Influence Sketching": Finding Influential Samples In Large-Scale Regressions
There is an especially strong need in modern large-scale data analysis to
prioritize samples for manual inspection. For example, the inspection could
target important mislabeled samples or key vulnerabilities exploitable by an
adversarial attack. In order to solve the "needle in the haystack" problem of
which samples to inspect, we develop a new scalable version of Cook's distance,
a classical statistical technique for identifying samples which unusually
strongly impact the fit of a regression model (and its downstream predictions).
In order to scale this technique up to very large and high-dimensional
datasets, we introduce a new algorithm which we call "influence sketching."
Influence sketching embeds random projections within the influence computation;
in particular, the influence score is calculated using the randomly projected
pseudo-dataset from the post-convergence Generalized Linear Model (GLM). We
validate that influence sketching can reliably and successfully discover
influential samples by applying the technique to a malware detection dataset of
over 2 million executable files, each represented with almost 100,000 features.
For example, we find that randomly deleting approximately 10% of training
samples reduces predictive accuracy only slightly from 99.47% to 99.45%,
whereas deleting the same number of samples with high influence sketch scores
reduces predictive accuracy all the way down to 90.24%. Moreover, we find that
influential samples are especially likely to be mislabeled. In the case study,
we manually inspect the most influential samples, and find that influence
sketching pointed us to new, previously unidentified pieces of malware.
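For readers unfamiliar with Cook's distance, the classical quantity that this abstract scales up, here is a minimal sketch for ordinary least squares with one predictor and an intercept. It is not the influence sketching algorithm itself (no random projections, no GLM); the function name and the toy data are assumptions for illustration.

```python
def cooks_distances(xs, ys):
    """Classical Cook's distance for simple linear regression y = a + b*x.

    Uses D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2, where e_i is the
    residual, h_i the leverage, p = 2 fitted parameters, and s^2 the
    residual variance estimate SSE / (n - p).
    """
    n, p = len(xs), 2
    if n <= p:
        raise ValueError("need more observations than parameters")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in residuals) / (n - p)
    if s2 == 0:
        raise ValueError("perfect fit: Cook's distance is undefined")
    distances = []
    for x, e in zip(xs, residuals):
        leverage = 1 / n + (x - x_mean) ** 2 / sxx
        distances.append(e * e / (p * s2) * leverage / (1 - leverage) ** 2)
    return distances


# Toy data, invented for illustration: the last point is far off the trend.
example_xs = [1, 2, 3, 4, 5, 6]
example_ys = [1.1, 1.9, 3.2, 3.9, 5.1, 12.0]

example_d = cooks_distances(example_xs, example_ys)
most_influential = max(range(len(example_d)), key=example_d.__getitem__)
print(most_influential, example_d[most_influential])
```

Ranking samples by this score and inspecting the top ones mirrors the workflow the abstract describes, at a scale where the exact computation is still cheap.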
The Importance of Accounting for Real-World Labelling When Predicting Software Vulnerabilities
Previous work on vulnerability prediction assumes that predictive models are trained with respect to perfect labelling information (including labels from future, as yet undiscovered vulnerabilities). In this paper we present results from a comprehensive empirical study of 1,898 real-world vulnerabilities reported in 74 releases of three security-critical open source systems (Linux Kernel, OpenSSL and Wireshark). Our study investigates the effectiveness of three previously proposed vulnerability prediction approaches in two settings: with and without the unrealistic labelling assumption. The results reveal that the unrealistic labelling assumption can profoundly mislead the scientific conclusions drawn: seemingly highly effective and deployable prediction results vanish when we fully account for realistically available labelling in the experimental methodology. More precisely, MCC mean values of predictive effectiveness drop from 0.77, 0.65 and 0.43 to 0.08, 0.22 and 0.10 for Linux Kernel, OpenSSL and Wireshark, respectively. Similar results are also obtained for precision, recall and other assessments of predictive efficacy. The community therefore needs to upgrade experimental and empirical methodology for vulnerability prediction evaluation and development to ensure robust and actionable scientific findings.
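The MCC values quoted in this abstract refer to the Matthews correlation coefficient, computed from a binary confusion matrix. A minimal sketch of the standard formula follows; the function name and the example counts are made up, and returning 0.0 when the denominator is zero is a common convention assumed here rather than something the abstract states.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Invented counts for illustration; not taken from the study.
print(matthews_corrcoef(tp=6, tn=3, fp=1, fn=2))
```

Unlike accuracy, MCC stays near zero for a predictor that ignores the minority class, which is why it suits imbalanced problems such as vulnerability prediction.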
Trust based collaborative filtering
k-nearest neighbour (kNN) collaborative filtering (CF), the widely successful
algorithm supporting recommender systems, attempts to relieve the problem
of information overload by generating predicted ratings for items users have not
expressed their opinions about; to do so, each predicted rating is computed based
on ratings given by like-minded individuals. Like-mindedness, or similarity-based
recommendation, is the cause of a variety of problems that plague recommender
systems. An alternative view of the problem, based on trust, offers the potential to
address many of the previous limitations in CF. In this work we present a variation of
kNN, the trusted k-nearest recommenders (or kNR) algorithm, which allows users
to learn whom to trust, and how much, by evaluating the utility of the rating
information they receive. This method redefines the way CF is performed and,
while avoiding some of the pitfalls that similarity-based CF is prone to, outperforms
the basic similarity-based methods in terms of prediction accuracy.
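The baseline that this abstract starts from, kNN collaborative filtering, predicts a rating as a weighted average of neighbours' mean-centred ratings. The sketch below shows that generic scheme only; it is not the kNR algorithm, whose trust-learning step the abstract does not detail. The function name, the data layout, and the weights are assumptions, and the weights could hold either similarity or trust values.

```python
from statistics import mean


def predict_rating(user, item, ratings, weights, k):
    """Predict `user`'s rating of `item` from the k highest-weighted neighbours.

    ratings: dict mapping user -> dict of item -> rating
    weights: dict mapping neighbour -> weight towards `user`
             (similarity or trust; supplied by the caller in this sketch)
    """
    user_mean = mean(ratings[user].values())
    # Only neighbours who have actually rated the item can contribute.
    candidates = [(w, v) for v, w in weights.items()
                  if v != user and item in ratings.get(v, {})]
    neighbours = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    total_weight = sum(abs(w) for w, _ in neighbours)
    if total_weight == 0:
        return user_mean  # no usable neighbours: fall back to the user's mean
    weighted_dev = sum(w * (ratings[v][item] - mean(ratings[v].values()))
                       for w, v in neighbours)
    return user_mean + weighted_dev / total_weight


# Toy ratings and weights, invented for illustration only.
example_ratings = {
    "alice": {"a": 4, "b": 2},
    "bob": {"a": 5, "b": 3, "c": 4},
    "carol": {"a": 2, "b": 1, "c": 3},
    "dave": {"a": 1, "c": 1},
}
example_weights = {"bob": 0.9, "carol": 0.5, "dave": 0.1}

print(predict_rating("alice", "c", example_ratings, example_weights, k=2))
```

Keeping the weights as an input makes the similarity-versus-trust distinction explicit: the prediction step stays the same, and only the source of the weights changes.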
A log mining approach for process monitoring in SCADA
SCADA (Supervisory Control and Data Acquisition) systems are used for controlling and monitoring industrial processes. We propose a methodology to systematically identify potential process-related threats in SCADA. Process-related threats take place when an attacker gains user access rights and performs actions that look legitimate but are intended to disrupt the SCADA process. To detect such threats, we propose a semi-automated approach to log processing. We conduct experiments on a real-life water treatment facility. A preliminary case study suggests that our approach is effective in detecting anomalous events that might alter the regular process workflow.
- …