4 research outputs found
Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection
In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology
and framework for efficient and effective real-time malware detection,
leveraging the best of conventional machine learning (ML) and deep learning
(DL) algorithms. In PROPEDEUTICA, all software processes in the system start
execution subjected to a conventional ML detector for fast classification. If a
piece of software receives a borderline classification, it is subjected to
further analysis via more performance expensive and more accurate DL methods,
via our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays
to the execution of software subjected to deep learning analysis as a way to
"buy time" for DL analysis and to rate-limit the impact of possible malware in
the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and
877 commonly used benign software samples from various categories for the
Windows OS. Our results show that the false positive rate for conventional ML
methods can reach 20%, and for modern DL methods it is usually below 6%.
However, the classification time for DL can be 100X longer than conventional ML
methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional
ML method) to 90.25%, and reduced the detection time by 54.86%. Further, the
percentage of software subjected to DL analysis was approximately 40% on
average. Further, the application of delays in software subjected to ML reduced
the detection time by approximately 10%. Finally, we found and discussed a
discrepancy between the detection accuracy offline (analysis after all traces
are collected) and on-the-fly (analysis in tandem with trace collection). Our
insights show that conventional ML and modern DL-based malware detectors in
isolation cannot meet the needs of efficient and effective malware detection:
high accuracy, low false positive rate, and short classification time.Comment: 17 pages, 7 figure
PENGEMBANGAN MULTIMEDIA BERBASIS ADOBE FLASH DALAM PEMBELAJARAN PENGENALAN ANGGOTA TUBUH MANUSIA UNTUK MURID KELAS I SDN 95 BONTO BULAENG DI BULUKUMBA
The learning process in the classroom is difficult for students
to understand the material being taught, because learning
does not apply learning material in using media in the
classroom environment, so the learning process in class is not
effective in understanding the material being taught. This
study aims to identify a description of the need for
developing multimedia learning of human limb recognition
using Adobe Flash, designing multimedia for learning
human limb recognition using Adobe Flash, measuring the
level of validity and practicality of multimedia for learning
human limb recognition using Adobe. This research is a type
of R&D (Research and Development) used ADDIE model
(analysis, design, development, implementation, evaluation).
The results showed that Adobe Flash-based multimedia
products were valid to be used based on the results of
validation by content/material experts obtaining a percentage
result of 100% being in very good qualifications, validation
results by design experts obtaining a percentage of 86% being
in good qualifications, validation results by media experts
obtained a percentage result of 94% being in very good
qualifications, then in practicality trials conducted on
students who obtained results 98.17% were in very good
qualifications, and the responses of subject teachers related to
Adobe Flash-based multimedia obtained 100% results which
were in very good qualification. Based on the results of the
analysis, it can be concluded that Adobe Flash-based
multimedia is valid and practical to use in the learning
process of Education, Physical, Sports and Health subjects
Effizientes Maschinelles Lernen für die Angriffserkennung
Detecting and fending off attacks on computer systems is an enduring
problem in computer security. In light of a plethora of different
threats and the growing automation used by attackers, we are in urgent
need of more advanced methods for attack detection.
In this thesis, we address the necessity of advanced attack detection
and develop methods to detect attacks using machine learning to
establish a higher degree of automation for reactive security. Machine
learning is data-driven and not void of bias. For the effective
application of machine learning for attack detection, thus, a periodic
retraining over time is crucial. However, the training complexity of
many learning-based approaches is substantial. We show that with the
right data representation, efficient algorithms for mining substring
statistics, and implementations based on probabilistic data structures,
training the underlying model can be achieved in linear time.
In two different scenarios, we demonstrate the effectiveness of
so-called language models that allow to generically portray the content
and structure of attacks: On the one hand, we are learning malicious
behavior of Flash-based malware using classification, and on the other
hand, we detect intrusions by learning normality in industrial control
networks using anomaly detection. With a data throughput of up to
580 Mbit/s during training, we do not only meet our expectations with
respect to runtime but also outperform related approaches by up to an
order of magnitude in detection performance. The same techniques that
facilitate learning in the previous scenarios can also be used for
revealing malicious content, embedded in passive file formats, such as
Microsoft Office documents. As a further showcase, we additionally
develop a method based on the efficient mining of substring statistics
that is able to break obfuscations irrespective of the used key length,
with up to 25 Mbit/s and thus, succeeds where related approaches fail.
These methods significantly improve detection performance and enable
operation in linear time. In doing so, we counteract the trend of
compensating increasing runtime requirements with resources. While the
results are promising and the approaches provide urgently needed
automation, they cannot and are not intended to replace human experts or
traditional approaches, but are designed to assist and complement them.Die Erkennung und Abwehr von Angriffen auf Endnutzer und Netzwerke ist
seit vielen Jahren ein anhaltendes Problem in der Computersicherheit.
Angesichts der hohen Anzahl an unterschiedlichen Angriffsvektoren und
der zunehmenden Automatisierung von Angriffen, bedarf es dringend
moderner Methoden zur Angriffserkennung.
In dieser Doktorarbeit werden Ansätze entwickelt, um Angriffe mit Hilfe
von Methoden des maschinellen Lernens zuverlässig, aber auch effizient
zu erkennen. Sie stellen der Automatisierung von Angriffen einen
entsprechend hohen Grad an Automatisierung von Verteidigungsmaßnahmen
entgegen. Das Trainieren solcher Methoden ist allerdings rechnerisch
aufwändig und erfolgt auf sehr großen Datenmengen. Laufzeiteffiziente
Lernverfahren sind also entscheidend. Wir zeigen, dass durch den Einsatz
von effizienten Algorithmen zur statistischen Analyse von Zeichenketten
und Implementierung auf Basis von probabilistischen Datenstrukturen, das
Lernen von effektiver Angriffserkennung auch in linearer Zeit möglich
ist.
Anhand von zwei unterschiedlichen Anwendungsfällen, demonstrieren wir
die Effektivität von Modellen, die auf der Extraktion von sogenannten
n-Grammen basieren: Zum einen, betrachten wir die Erkennung von
Flash-basiertem Schadcode mittels Methoden der Klassifikation, und zum
anderen, die Erkennung von Angriffen auf Industrienetzwerke bzw.
SCADA-Systeme mit Hilfe von Anomaliedetektion. Dabei erzielen wir
während des Trainings dieser Modelle einen Datendurchsatz von bis zu
580 Mbit/s und übertreffen gleichzeitig die Erkennungsleistung von
anderen Ansätzen deutlich. Die selben Techniken, um diese lernenden
Ansätze zu ermöglichen, können außerdem für die Erkennung von Schadcode
verwendet werden, der in anderen Dateiformaten eingebettet und mittels
einfacher Verschlüsselungen obfuskiert wurde. Hierzu entwickeln wir eine
Methode die basierend auf der statistischen Auswertung von Zeichenketten
einfache Verschlüsselungen bricht. Der entwickelte Ansatz arbeitet
unabhängig von der verwendeten Schlüssellänge, mit einem Datendurchsatz
von bis zu 25 Mbit/s und ermöglicht so die erfolgreiche Deobfuskierung
in Fällen an denen andere Ansätze scheitern.
Die erzielten Ergebnisse in Hinsicht auf Laufzeiteffizienz und
Erkennungsleistung sind vielversprechend. Die vorgestellten Methoden
ermöglichen die dringend nötige Automatisierung von
Verteidigungsmaßnahmen, sollen den Experten oder etablierte Methoden
aber nicht ersetzen, sondern diese unterstützen und ergänzen