MEADE: Towards a Malicious Email Attachment Detection Engine
Malicious email attachments are a growing delivery vector for malware. While
machine learning has been successfully applied to portable executable (PE)
malware detection, we ask, can we extend similar approaches to detect malware
across heterogeneous file types commonly found in email attachments? In this
paper, we explore the feasibility of applying machine learning as a static
countermeasure to detect several types of malicious email attachments including
Microsoft Office documents and Zip archives. To this end, we collected a
dataset of over 5 million malicious/benign Microsoft Office documents from
VirusTotal for evaluation as well as a dataset of benign Microsoft Office
documents from the Common Crawl corpus, which we use to provide more realistic
estimates of thresholds for false positive rates on in-the-wild data. We also
collected a dataset of approximately 500k malicious/benign Zip archives, which
we scraped using the VirusTotal service, on which we performed a separate
evaluation. We analyze predictive performance of several classifiers on each of
the VirusTotal datasets using a 70/30 train/test split on first seen time,
evaluating feature and classifier types that have been applied successfully in
commercial antimalware products and R&D contexts. Using deep neural networks
and gradient boosted decision trees, we are able to obtain ROC curves with >
0.99 AUC on both Microsoft Office document and Zip archive datasets. Discussion
of deployment viability in various antimalware contexts is provided.
Comment: Pre-print of a manuscript submitted to IEEE Symposium on Technologies
for Homeland Security (HST
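
As an illustration of the "70/30 train/test split on first seen time" mentioned in the abstract above, the following is a minimal sketch of one plausible reading: sort samples chronologically and train on the earliest 70%. The function name `temporal_split` and the `first_seen` field are hypothetical; the abstract does not describe the authors' actual implementation.

```python
from typing import Any, Dict, List, Tuple

Sample = Dict[str, Any]


def temporal_split(
    samples: List[Sample],
    train_fraction: float = 0.7,
    time_key: str = "first_seen",  # hypothetical field name
) -> Tuple[List[Sample], List[Sample]]:
    """Split samples so that every training sample was first seen
    no later than any test sample."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    # Sort chronologically; sorted() is stable, so ties keep input order.
    ordered = sorted(samples, key=lambda s: s[time_key])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


if __name__ == "__main__":
    # Toy data: timestamps are plain integers for illustration only.
    data = [{"id": i, "first_seen": t} for i, t in enumerate([5, 1, 9, 3, 7, 0, 8, 2, 6, 4])]
    train, test = temporal_split(data)
    print(len(train), len(test))
```

Splitting on time rather than at random avoids training on samples that appeared after some of the test samples, which would tend to give an optimistic estimate of performance on future data.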
Recent Advances in Open Set Recognition: A Survey
In real-world recognition/classification tasks, limited by various objective
factors, it is usually difficult to collect training samples to exhaust all
classes when training a recognizer or classifier. A more realistic scenario is
open set recognition (OSR), where incomplete knowledge of the world exists at
training time, and unknown classes can be submitted to an algorithm during
testing, requiring the classifiers to not only accurately classify the seen
classes, but also effectively deal with the unseen ones. This paper provides a
comprehensive survey of existing open set recognition techniques covering
various aspects ranging from related definitions, representations of models,
datasets, evaluation criteria, and algorithm comparisons. Furthermore, we
briefly analyze the relationships between OSR and its related tasks including
zero-shot, one-shot (few-shot) recognition/learning techniques, classification
with reject option, and so forth. We also give an overview of open world
recognition, which can be seen as a natural extension of OSR. Importantly, we
highlight the limitations of existing approaches and point out some promising
subsequent research directions in this field.
Comment: Accepted by IEEE TPAM
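
To make "classification with reject option" from the abstract above concrete, here is a minimal sketch of a simple confidence-threshold baseline: the classifier returns a known label only when its top softmax probability is high enough, and otherwise rejects the input as possibly unknown. This is an illustrative baseline under assumed names (`predict_with_reject`, `threshold`), not a method taken from the survey.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_reject(
    logits: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.9,  # assumed value; would be tuned on held-out data
) -> Optional[str]:
    """Return the most probable known label, or None to reject the input."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None  # low confidence: treat as a possible unseen class
    return labels[best]


if __name__ == "__main__":
    known = ["cat", "dog", "bird"]
    print(predict_with_reject([10.0, 0.0, 0.0], known))
    print(predict_with_reject([1.0, 1.1, 0.9], known))
```

A fixed softmax threshold is only a starting point, since a closed-set classifier can assign high confidence to inputs from classes it has never seen.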
A Review of Computer Vision Methods in Network Security
Network security has become more important than ever,
as highlighted by the eye-opening numbers of data breaches, attacks on critical
infrastructure, and malware/ransomware/cryptojacker attacks that are reported
almost every day. We increasingly rely on networked infrastructure, and
with the advent of IoT, billions of devices will be connected to the internet,
giving attackers more opportunities for exploitation. Traditional machine
learning methods have been frequently used in the context of network security.
However, such methods rely mainly on statistical features extracted from
sources such as binaries, emails, and packet flows.
On the other hand, recent years have witnessed phenomenal growth in computer
vision, mainly driven by advances in the area of convolutional neural
networks. At a glance, it is not trivial to see how computer vision methods are
related to network security. Nonetheless, there is a significant amount of work
that has highlighted how methods from computer vision can be applied in network
security for detecting attacks or building security solutions. In this paper,
we provide a comprehensive survey of such work under three topics: i) phishing
attempt detection, ii) malware detection, and iii) traffic anomaly detection.
Next, we review a set of commercial products for which public information
is available and explore how computer vision methods are effectively used in
those products. Finally, we discuss existing research gaps and future research
directions, especially focusing on how the network security research community
and industry can leverage the exponential growth of computer vision methods to
build more secure networked systems.
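
One commonly described way to bring computer vision methods to malware detection is to render a file's raw bytes as a grayscale image and pass that image to an image classifier. The sketch below shows only the byte-to-image step. It is an assumption for illustration: the abstract above does not specify this technique, and the function name and default width are hypothetical.

```python
from typing import List


def bytes_to_grayscale_rows(data: bytes, width: int = 16) -> List[List[int]]:
    """Arrange raw bytes into rows of `width` pixels (values 0-255).

    The final row is zero-padded so that every row has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows: List[List[int]] = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row.extend([0] * (width - len(row)))  # pad the last, shorter row
        rows.append(row)
    return rows


if __name__ == "__main__":
    image = bytes_to_grayscale_rows(bytes([1, 2, 3, 4, 5]), width=2)
    for row in image:
        print(row)
```

The resulting 2D grid can then be resized and fed to a convolutional network like any other single-channel image.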