11 research outputs found
An Evaluation Of N-gram System Call Sequence In Mobile Malware Detection
The rapid growth of Android-based mobile devices technology in recent years has increased the proliferation of mobile devices throughout the community at large. The ability of Android mobile devices has become similar to its desktop environment; users can do more than just a phone call and short text messaging. These days, Android mobile devices are used for various applications such as web browsing, ubiquitous services, social networking, MMS and many more. However, the rapid growth of Android mobile devices technology has also triggered the malware author to start
exploiting the vulnerabilities of the devices. Based on this reason, this paper explores mobile malware detection through an n-gram system call sequence which uses a sequence of system call invoked by the mobile application as the feature in classifying a benign and malicious mobile application. Several n-gram values are evaluated with Linear-SVM classifier to determine the best n system call sequence that produces the highest detection accuracy and highest True Positive Rate (TPR) with low False Positive Rate (FPR)
Review on Malware and Malware Detection Using Data Mining Techniques
البرمجيات الخبيثة هي اي نوع من البرمجيات او شفرات برمجية التي هدفها سرقة بعض المعلومات الخاصة او بيانات من نظام الكمبيوتر او عمليات الكمبيوتر او(و) فقط ببساطة لعمل المبتغيات غير المشروعة لصانع البرامجيات الخبيثة على نظام الكمبيوتر، وبدون الرخصة من مستخدمي الكمبيوتر. البرامجيات الخبيثة للمختصر القصير تعرف كملور. ومع ذلك، اكتشاف البرامجبات الخبيثة اصبحت واحدة من اهم المشاكل في مجال امن الكمبيوتر وذلك لان بنية الاتصال الحالية غير حصينه للاختراق من قبل عدة انواع من استراتيجيات الاصابات والهجومات للبرامجيات الخبيثة. فضلا على ذلك، البرامجيات الخبيثة متنوعة ومختلفة في المقدار والنوعيات وهذا يبطل بصورة تامة فعالية طرق الحماية القديمة والتقليدية مثل طريقة التواقيع والتي تكون غير قادرة على اكتشاف البرامجيات الخبيثة الجديدة. من ناحية أخرى، هذا الضعف سوف يودي الى نجاح اختراق (والهجوم) نظام الكمبيوتر بالإضافة الى نجاح هجومات أكثر تطوراً مثل هجوم منع الخدمة الموزع. طرق تنقيب البيانات يمكن ان تستخدم لتغلب على القصور في طريقة التواقيع لاكتشاف البرامجيات الخبيثة غير المعروفة. هذا البحث يقدم نظره عامة عن البرامجيات الخبيثة وانظمة اكتشاف البرامجيات الخبيثة باستخدام التقنيات الحديثة مثل تقنيات طريقة تعدين البيانات لاكتشاف عينات البرامجيات الخبيثة المعروفة وغير المعروفة.Malicious software is any type of software or codes which hooks some: private information, data from the computer system, computer operations or(and) merely just to do malicious goals of the author on the computer system, without permission of the computer users. (The short abbreviation of malicious software is Malware). However, the detection of malware has become one of biggest issues in the computer security field because of the current communication infrastructures are vulnerable to penetration from many types of malware infection strategies and attacks. Moreover, malwares are variant and diverse in volume and types and that strictly explode the effectiveness of traditional defense methods like signature approach, which is unable to detect a new malware. However, this vulnerability will lead to a successful computer system penetration (and attack) as well as success of more advanced attacks like distributed denial of service (DDoS) attack. Data mining methods can be used to overcome limitation of signature-based techniques to detect the zero-day malware. This paper provides an overview of malware and malware detection system using modern techniques such as techniques of data mining approach to detect known and unknown malware samples
A Comparative Study on Feature Selection Method for N-gram Mobile Malware Detection
Abstract In recent years, mobile device technology has become an important necessity in our community at large. The ability of the mobile technology today has become more similar to its desktop environment. Despite the advancement of the mobile devices technology provide, it has also exposes the mobile devices to the similar threat it predecessor possess. One of the anomaly based detection methods used in detecting mobile malware is the n-gram system call sequence. However, with the limited storage, memory and CPU processing power, mobile devices that provide this approach can exhaust the mobile device resources. This is due to the huge amount of system call to be collected and processed for the detection approach. To overcome the issues, this paper investigates the use of several different feature selection methods in optimizing the n-gram system call sequence feature in classifying benign and malicious mobile application. Several filter and wrapper feature selection methods are selected and their performance analyzed. The feature selection methods are evaluated based on the number of feature selected and the contribution it made to improve the True Positive Rate (TPR), False Positive Rate (FPR) and Accuracy of the Linear-SVM classifier in classifying benign and malicious mobile malware application
Feature selection and machine learning classification for malware detection
Malware is a computer security problem that can morph to evade traditional detection methods based on known signature matching. Since new malware variants contain patterns that are similar to those in observed malware, machine learning techniques can be used to identify new malware. This work presents a comparative study of several feature selection methods with four different machine learning classifiers in the context of static malware detection based on n-grams analysis. The result shows that the use of Principal Component Analysis (PCA) feature selection and Support Vector Machines (SVM) classification gives the best classification accuracy using a minimum number of feature
Sec-Lib: Protecting Scholarly Digital Libraries From Infected Papers Using Active Machine Learning Framework
Researchers from academia and the corporate-sector rely on scholarly digital libraries to access articles. Attackers take advantage of innocent users who consider the articles' files safe and thus open PDF-files with little concern. In addition, researchers consider scholarly libraries a reliable, trusted, and untainted corpus of papers. For these reasons, scholarly digital libraries are an attractive-target and inadvertently support the proliferation of cyber-attacks launched via malicious PDF-files. In this study, we present related vulnerabilities and malware distribution approaches that exploit the vulnerabilities of scholarly digital libraries. We evaluated over two-million scholarly papers in the CiteSeerX library and found the library to be contaminated with a surprisingly large number (0.3-2%) of malicious PDF documents (over 55% were crawled from the IPs of US-universities). We developed a two layered detection framework aimed at enhancing the detection of malicious PDF documents, Sec-Lib, which offers a security solution for large digital libraries. Sec-Lib includes a deterministic layer for detecting known malware, and a machine learning based layer for detecting unknown malware. Our evaluation showed that scholarly digital libraries can detect 96.9% of malware with Sec-Lib, while minimizing the number of PDF-files requiring labeling, and thus reducing the manual inspection efforts of security-experts by 98%
Sec-Lib: Protecting Scholarly Digital Libraries From Infected Papers Using Active Machine Learning Framework
Researchers from academia and the corporate-sector rely on scholarly digital libraries to access articles. Attackers take advantage of innocent users who consider the articles\u27 files safe and thus open PDF-files with little concern. In addition, researchers consider scholarly libraries a reliable, trusted, and untainted corpus of papers. For these reasons, scholarly digital libraries are an attractive-target and inadvertently support the proliferation of cyber-attacks launched via malicious PDF-files. In this study, we present related vulnerabilities and malware distribution approaches that exploit the vulnerabilities of scholarly digital libraries. We evaluated over two-million scholarly papers in the CiteSeerX library and found the library to be contaminated with a surprisingly large number (0.3-2%) of malicious PDF documents (over 55% were crawled from the IPs of US-universities). We developed a two layered detection framework aimed at enhancing the detection of malicious PDF documents, Sec-Lib, which offers a security solution for large digital libraries. Sec-Lib includes a deterministic layer for detecting known malware, and a machine learning based layer for detecting unknown malware. Our evaluation showed that scholarly digital libraries can detect 96.9% of malware with Sec-Lib, while minimizing the number of PDF-files requiring labeling, and thus reducing the manual inspection efforts of security-experts by 98%
Recommended from our members
A new model for worm detection and response. Development and evaluation of a new model based on knowledge discovery and data mining techniques to detect and respond to worm infection by integrating incident response, security metrics and apoptosis.
Worms have been improved and a range of sophisticated techniques have been
integrated, which make the detection and response processes much harder and
longer than in the past. Therefore, in this thesis, a STAKCERT (Starter Kit for
Computer Emergency Response Team) model is built to detect worms attack in
order to respond to worms more efficiently.
The novelty and the strengths of the STAKCERT model lies in the method
implemented which consists of STAKCERT KDD processes and the
development of STAKCERT worm classification, STAKCERT relational model
and STAKCERT worm apoptosis algorithm. The new concept introduced in this
model which is named apoptosis, is borrowed from the human immunology
system has been mapped in terms of a security perspective. Furthermore, the
encouraging results achieved by this research are validated by applying the
security metrics for assigning the weight and severity values to trigger the
apoptosis. In order to optimise the performance result, the standard operating
procedures (SOP) for worm incident response which involve static and dynamic
analyses, the knowledge discovery techniques (KDD) in modeling the
STAKCERT model and the data mining algorithms were used.
This STAKCERT model has produced encouraging results and outperformed
comparative existing work for worm detection. It produces an overall accuracy
rate of 98.75% with 0.2% for false positive rate and 1.45% is false negative rate.
Worm response has resulted in an accuracy rate of 98.08% which later can be
used by other researchers as a comparison with their works in future.Ministry of Higher Education, Malaysia
and Universiti Sains Islam Malaysia (USIM