Mimic Learning to Generate a Shareable Network Intrusion Detection Model
Purveyors of malicious network attacks continue to increase the complexity
and the sophistication of their techniques, and their ability to evade
detection continues to improve as well. Hence, intrusion detection systems must
also evolve to meet these increasingly challenging threats. Machine learning is
often used to support this needed improvement. However, training a good
prediction model can require a large set of labelled training data. Such
datasets are difficult to obtain because privacy concerns prevent the majority
of intrusion detection agencies from sharing their sensitive data. In this
paper, we propose the use of mimic learning to enable the transfer of intrusion
detection knowledge through a teacher model trained on private data to a
student model. This student model provides a means of publicly sharing knowledge
extracted from private data without sharing the data itself. Our results
confirm that the proposed scheme can produce a student intrusion detection
model that mimics the teacher model without requiring access to the original
dataset.
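The teacher-to-student transfer described above can be sketched as follows. This is a minimal illustration, not the paper's method: the nearest-centroid classifier, the synthetic two-feature "flows", and every function name are assumptions chosen to keep the example self-contained. The only idea taken from the abstract is that the student is trained on unlabelled data annotated by the teacher and never sees the private records.

```python
import random

def centroid_fit(samples, labels):
    """Fit a nearest-centroid classifier: one mean vector per class label."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def centroid_predict(model, x):
    """Return the label whose centroid is closest to x."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: dist2(model[y]))

def make_flows(rng, n, attack):
    """Hypothetical synthetic data: benign flows near (0, 0), attacks near (3, 3)."""
    center = 3.0 if attack else 0.0
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]

rng = random.Random(0)

# 1. The teacher is trained on private, labelled data that is never shared.
private_x = make_flows(rng, 200, False) + make_flows(rng, 200, True)
private_y = [0] * 200 + [1] * 200
teacher = centroid_fit(private_x, private_y)

# 2. The teacher labels public, unlabelled data.
public_x = make_flows(rng, 200, False) + make_flows(rng, 200, True)
pseudo_y = [centroid_predict(teacher, x) for x in public_x]

# 3. The student is trained only on the public data and the teacher's labels.
student = centroid_fit(public_x, pseudo_y)

# 4. Measure how often the student reproduces the teacher's decisions.
test_x = make_flows(rng, 100, False) + make_flows(rng, 100, True)
agreement = sum(
    centroid_predict(teacher, x) == centroid_predict(student, x) for x in test_x
) / len(test_x)
print(f"teacher/student agreement: {agreement:.2f}")
```

Only `student` would be published in this sketch; `private_x` and `private_y` stay with the data owner.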
EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models
This paper describes EMBER: a labeled benchmark dataset for training machine
learning models to statically detect malicious Windows portable executable
files. The dataset includes features extracted from 1.1M binary files: 900K
training samples (300K malicious, 300K benign, 300K unlabeled) and 200K test
samples (100K malicious, 100K benign). To accompany the dataset, we also
release open source code for extracting features from additional binaries so
that additional sample features can be appended to the dataset. This dataset
fills a void in the information security machine learning community: a
benign/malicious dataset that is large, open and general enough to cover
several interesting use cases. We enumerate several use cases that we
considered when structuring the dataset. Additionally, we demonstrate one use
case wherein we compare a baseline gradient boosted decision tree model trained
using LightGBM with default settings to MalConv, a recently published
end-to-end (featureless) deep learning model for malware detection. Results
show that even without hyper-parameter optimization, the baseline EMBER model
outperforms MalConv. The authors hope that the dataset, code and baseline model
provided by EMBER will help invigorate machine learning research for malware
detection, in much the same way that benchmark datasets have advanced computer
vision research.
Multiclass Road Sign Detection using Multiplicative Kernel
We consider the problem of multiclass road sign detection using a
classification function with a multiplicative kernel composed of two kernels.
We show that problems of detection and within-foreground classification can be
jointly solved by using one kernel to measure object-background differences and
another one to account for within-class variations. The main idea behind this
approach is that road signs from different foreground variations can share
features that discriminate them from backgrounds. The classification function
training is accomplished using an SVM; thus, feature sharing is obtained through
support vector sharing. Training yields a family of linear detectors, where
each detector corresponds to a specific foreground training sample. The
redundancy among detectors is alleviated using k-medoids clustering. Finally,
we report detection and classification results on a set of road sign images
obtained from a camera on a moving vehicle.
Comment: Part of the Proceedings of the Croatian Computer Vision Workshop,
CCVW 2013, Year
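A small sketch may help show how a multiplicative kernel can yield a family of linear detectors. It assumes a linear kernel on the image features `x` and an RBF kernel on a hypothetical descriptor `theta` of the foreground variation; the abstract does not state which kernels are used, and the support-vector coefficients below are made up for illustration rather than trained by an SVM.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rbf(u, v, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def multiplicative_kernel(x, theta, x2, theta2, gamma_theta=1.0):
    """k((x, theta), (x2, theta2)) = k_x(x, x2) * k_theta(theta, theta2).

    k_x (linear here) compares appearance features; k_theta (RBF here)
    compares the within-class variation parameters.
    """
    return dot(x, x2) * rbf(theta, theta2, gamma_theta)

def detector_weights(theta, support, gamma_theta=1.0):
    """Collapse the kernel expansion into one linear detector for a fixed theta.

    support: list of (coef, x_i, theta_i), where coef stands for alpha_i * y_i.
    Because k_x is linear, w(theta) = sum_i coef_i * k_theta(theta_i, theta) * x_i.
    """
    w = [0.0] * len(support[0][1])
    for coef, x_i, theta_i in support:
        scale = coef * rbf(theta, theta_i, gamma_theta)
        for j, v in enumerate(x_i):
            w[j] += scale * v
    return w

# Hypothetical support vectors: (alpha_i * y_i, features, variation descriptor).
support = [
    (0.8, [1.0, 0.5, 0.0], [0.0]),
    (0.6, [0.9, 0.1, 0.4], [1.0]),
    (-1.1, [0.1, 0.2, 0.9], [0.5]),
]

x_query, theta_query = [0.7, 0.3, 0.2], [0.2]

# Score via the full kernel expansion ...
kernel_score = sum(
    coef * multiplicative_kernel(x_i, theta_i, x_query, theta_query)
    for coef, x_i, theta_i in support
)
# ... and via the linear detector specialised to theta_query.
linear_score = dot(detector_weights(theta_query, support), x_query)
print(kernel_score, linear_score)
```

Every support vector contributes to every detector in this sketch, weighted by how similar its `theta_i` is to the detector's `theta`, which is one way to read the abstract's "feature sharing through support vector sharing".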
Differentially Private Collaborative Intrusion Detection Systems For VANETs
A vehicular ad hoc network (VANET) is an enabling technology in modern
transportation systems for providing safety and valuable information, yet it is
vulnerable to a number of attacks, from passive eavesdropping to active
interference. Intrusion detection systems (IDSs) are important devices that can
mitigate the threats by detecting malicious behaviors. Furthermore, the
collaborations among vehicles in VANETs can improve the detection accuracy by
communicating their experiences between nodes. To this end, distributed machine
learning is a suitable framework for the design of scalable and implementable
collaborative detection algorithms over VANETs. One fundamental barrier to
collaborative learning is the privacy concern as nodes exchange data among
them. A malicious node can obtain sensitive information of other nodes by
inferring from the observed data. In this paper, we propose a
privacy-preserving machine-learning based collaborative IDS (PML-CIDS) for
VANETs. The proposed algorithm applies the alternating direction method of
multipliers (ADMM) to a class of empirical risk minimization (ERM) problems and
trains a classifier to detect intrusions in the VANETs. We use
differential privacy to capture the privacy notion of the PML-CIDS and
propose a method of dual variable perturbation to provide dynamic differential
privacy. We analyze theoretical performance and characterize the fundamental
tradeoff between the security and privacy of the PML-CIDS. We also conduct
numerical experiments using the NSL-KDD dataset to corroborate the results on
the detection accuracy, security-privacy tradeoffs, and design
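To make the idea of perturbing dual variables inside ADMM concrete, here is a heavily simplified sketch. It is not the PML-CIDS algorithm: it assumes a global-consensus form of ADMM with a central average, a scalar quadratic loss per node instead of a classifier, and a Laplace noise scale (`noise_scale`) that is a free parameter rather than one calibrated to a differential privacy budget. All names are hypothetical.

```python
import random

def laplace(rng, scale):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def consensus_admm(local_targets, rho=1.0, noise_scale=0.0, iters=60, seed=0):
    """Consensus ADMM for min sum_i 0.5 * (x - a_i)^2, with noisy duals.

    Each node i holds a private value a_i. In the primal step the node uses
    a perturbed copy of its dual variable, so the local estimate it shares
    does not depend deterministically on a_i.
    """
    rng = random.Random(seed)
    n = len(local_targets)
    z = 0.0               # consensus variable
    duals = [0.0] * n     # one dual variable per node
    for _ in range(iters):
        xs = []
        for a, y in zip(local_targets, duals):
            y_noisy = y + laplace(rng, noise_scale)  # dual variable perturbation
            # Closed-form minimiser of 0.5*(x-a)^2 + y_noisy*(x-z) + rho/2*(x-z)^2
            xs.append((a - y_noisy + rho * z) / (1.0 + rho))
        z = sum(x + y / rho for x, y in zip(xs, duals)) / n
        duals = [y + rho * (x - z) for x, y in zip(xs, duals)]
    return z

targets = [1.0, 3.0, 5.0]
exact = consensus_admm(targets)                     # no noise: converges to the mean
noisy = consensus_admm(targets, noise_scale=0.1)    # noise trades accuracy for privacy
print(exact, noisy)
```

With `noise_scale=0.0` the iteration recovers the plain average of the local values; increasing the scale makes the shared iterates less informative about each node's data and the consensus less accurate, which is the kind of tradeoff the abstract refers to.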
Constrained Generative Adversarial Network Ensembles for Sharable Synthetic Data Generation
The sharing of medical imaging datasets between institutions, and even inside
the same institution, is limited by various regulations/legal barriers.
Although these limitations are necessities for protecting patient privacy and
setting strict boundaries for data ownership, medical research projects that
require large datasets suffer considerably as a result. Machine learning has
been revolutionized with the emerging deep neural network approaches over
recent years, making the data-related limitations an even larger problem as
these novel techniques commonly require immense imaging datasets. This paper
introduces constrained Generative Adversarial Network ensembles (cGANe) to
address this problem by altering the representation of the imaging data,
while retaining the significant information, enabling the reproduction of
similar research results elsewhere with the sharable data. Accordingly, a
framework representing the generation of a cGANe is described, and the approach
is validated for the generation of synthetic 3D brain metastatic region data
from T1-weighted contrast-enhanced MRI studies. For 90% brain metastases (BM)
detection sensitivity, our previously reported detection algorithm produced on
average 9.12 false-positive BM detections per patient after training with the
original data, while producing 9.53 false positives after training with the
cGANe generated synthetic data. Although the applicability of the introduced
approach needs further validation studies with a range of medical imaging data
types, the results suggest that the BM-detection algorithm can achieve
comparable performance by using cGANe generated synthetic data. Hence, the
generalization of the proposed approach for various modalities may occur in the
near future.
Taxonomy driven indicator scoring in MISP threat intelligence platforms
The IT security community has recently been shifting from closed to
open working groups and from restrictive information to full information
disclosure and sharing. One major driver of this change is the number
of incidents and various indicators of compromise (IoCs) that appear on a daily
basis, which can only be faced and solved in a collaborative way. Sharing
information is key to staying on top of the threats.
To meet the need for a medium for information sharing, different
initiatives were taken, such as the Open Source Threat Intelligence and Sharing
Platform called MISP. In its current state, this sharing and collection platform
has become far more than a malware information sharing platform. It includes
all kinds of IoCs, malware and vulnerabilities, but also financial threat or
fraud information. Hence, the volume of information is increasing and evolving.
In this paper we present implemented distributed data interaction methods for
MISP followed by a generic scoring model for decaying information that is
shared within MISP communities. As the MISP community members do not have the
same objectives, use cases and implementations of the scoring model are
discussed. A commonly encountered use case in practice is the detection of
indicators of compromise in operational networks.
Comment: 10 pages, 13 figures. arXiv admin note: substantial text overlap with
arXiv:1803.1105
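As a rough illustration of what a scoring model for decaying information could look like, the sketch below lowers an indicator's score from a base value to zero over a fixed lifetime. The abstract does not give the formula, so the polynomial shape and the parameter names (`base_score`, `lifetime`, `decay_speed`) are assumptions made for this example only.

```python
def decayed_score(base_score, age, lifetime, decay_speed):
    """Score of an indicator `age` time units after it was last seen.

    Assumed polynomial form: base_score * (1 - (age / lifetime) ** (1 / decay_speed)).
    The score equals base_score at age 0 and reaches 0 once age >= lifetime.
    """
    if age <= 0:
        return base_score
    if age >= lifetime:
        return 0.0
    return base_score * (1.0 - (age / lifetime) ** (1.0 / decay_speed))

# Hypothetical indicator: base score 80, considered expired after 30 days.
for day in (0, 7.5, 15, 30):
    print(day, round(decayed_score(80.0, day, 30.0, 2.0), 2))
```

Because community members have different objectives, each one could choose its own `lifetime` and `decay_speed` for the same indicator type, which is consistent with the use-case-specific tuning the abstract mentions.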
A Survey on the Security of Pervasive Online Social Networks (POSNs)
Pervasive Online Social Networks (POSNs) are extensions of Online Social
Networks (OSNs) that facilitate connectivity irrespective of the domain and
properties of users. POSNs have emerged from the convergence of a
plethora of social networking platforms, with the motivation of bridging the
gaps between them. Over the last decade, OSNs have seen a
tremendous amount of advancement in terms of the number of users as well as
technology enablers. A single OSN is the property of an organization, which
ensures the smooth functioning of its services to provide a quality
experience to its users. However, with POSNs, multiple OSNs have coalesced
through communities, circles, or only properties, which makes
service provisioning tedious and arduous to sustain. In particular, challenges
become severe when the focus is on the security of cross-platform
OSNs, which are an integral part of POSNs. Thus, it is of the utmost importance to
highlight such a requirement and understand the current situation while
discussing the available state of the art. With the modernization of OSNs and
convergence towards POSNs, it is essential to understand the impact and reach
of current solutions for enhancing the security of users as well as associated
services. This survey addresses this need and focuses on different sets
of studies presented over the last few years, surveying them for their
applicability to POSNs...
Comment: 39 Pages, 10 Figures
Combating Fake News: A Survey on Identification and Mitigation Techniques
The proliferation of fake news on social media has opened up new directions
of research for timely identification and containment of fake news, and
mitigation of its widespread impact on public opinion. While much of the
earlier research was focused on identification of fake news based on its
contents or by exploiting users' engagements with the news on social media,
there has been a rising interest in proactive intervention strategies to
counter the spread of misinformation and its impact on society. In this survey,
we describe the modern-day problem of fake news and, in particular, highlight
the technical challenges associated with it. We discuss existing methods and
techniques applicable to both identification and mitigation, with a focus on
the significant advances in each method and their advantages and limitations.
In addition, research has often been limited by the quality of existing
datasets and their specific application contexts. To alleviate this problem, we
comprehensively compile and summarize characteristic features of available
datasets. Furthermore, we outline new directions of research to facilitate
future development of effective and interdisciplinary solutions.
A Survey on Malicious Domains Detection through DNS Data Analysis
Malicious domains are one of the major resources required for adversaries to
run attacks over the Internet. Due to the important role of the Domain Name
System (DNS), extensive research has been conducted to identify malicious
domains based on their unique behavior reflected in different phases of the
life cycle of DNS queries and responses. Existing approaches differ
significantly in terms of intuitions, data analysis methods as well as
evaluation methodologies. This warrants a thorough systematization of the
approaches and a careful review of the advantages and limitations of every
group.
In this paper, we perform such an analysis. In order to achieve this goal, we
present the necessary background knowledge on DNS and malicious activities
leveraging DNS. We describe a general framework of malicious domain detection
techniques using DNS data. Applying this framework, we categorize existing
approaches using several orthogonal viewpoints, namely (1) sources of DNS data
and their enrichment, (2) data analysis methods, and (3) evaluation strategies
and metrics. In each aspect, we discuss the important challenges that the
research community should address in order to fully realize the power of DNS
data analysis to fight against attacks leveraging malicious domains.
Comment: 35 pages, to appear in ACM CSU
ScreenAvoider: Protecting Computer Screens from Ubiquitous Cameras
We live and work in environments that are inundated with cameras embedded in
devices such as phones, tablets, laptops, and monitors. Newer wearable devices
like Google Glass, Narrative Clip, and Autographer offer the ability to quietly
log our lives with cameras from a 'first person' perspective. While capturing
several meaningful and interesting moments, a significant number of images
captured by these wearable cameras can contain computer screens. Given the
potentially sensitive information that is visible on our displays, there is a
need to guard computer screens from undesired photography. People need
protection against photography of their screens, whether by other people's
cameras or their own cameras.
We present ScreenAvoider, a framework that controls the collection and
disclosure of images with computer screens and their sensitive content.
ScreenAvoider can detect images with computer screens with high accuracy and
can even go so far as to discriminate amongst screen content. We also introduce
a ScreenTag system that aids in the identification of screen content, flagging
images with highly sensitive content such as messaging applications or email
webpages. We evaluate our concept on realistic lifelogging datasets, showing
that ScreenAvoider provides a practical and useful solution that can help users
manage their privacy.