465 research outputs found

    Malware Detection Based on Structural and Behavioural Features of API Calls

    In this paper, we propose a five-step approach to detect obfuscated malware by investigating the structural and behavioural features of API calls. We have developed a fully automated system to disassemble executables and extract API call features from them effectively. Using n-gram statistical analysis of binary content, we are able to classify whether an executable file is malicious or benign. Our experimental results with a dataset of 242 malware samples and 72 benign files have shown a promising accuracy of 96.5% for the unigram model. We also provide a preliminary analysis of our approach using a support vector machine (SVM); by varying n from 1 to 5, we have analysed performance in terms of accuracy, false positives and false negatives. By applying SVM, we propose to train the classifier and derive an optimum n-gram model for detecting both known and unknown malware efficiently.

    Most Recent Malicious Software Datasets and Machine Learning Detection Techniques: A Review

    Background: Within the context of cyber security, it has become crucial to monitor systems and analyze data to maintain data security and integrity. Recently, it has become important to create a system for analyzing and classifying data in order to block malicious programs such as malware. Materials and Methods: The latest malware dataset and the latest machine-learning techniques were used to detect malware, based on dynamic feature selection. Results: The results showed that the FFNN algorithm was the best algorithm for the SOREL-20M dataset, based on the research work discussed in this paper. Conclusion: The continuous increase in the number and types of attacks has led to a huge expansion in the variants of malware samples. Therefore, malware needs to be categorized into groups according to its behavior, influence, and characteristics. Given that research and training are essential elements of cyber security, its constantly changing nature poses a great challenge. This study mainly aims to present the most recent malware dataset and modern machine-learning techniques for malware detection, based on dynamic feature selection.

    Assessing and augmenting SCADA cyber security: a survey of techniques

    SCADA systems monitor and control critical infrastructures of national importance such as power generation and distribution, water supply, transportation networks, and manufacturing facilities. The pervasiveness, miniaturisation and declining costs of internet connectivity have transformed these systems from strictly isolated to highly interconnected networks. The connectivity provides immense benefits such as reliability, scalability and remote connectivity, but at the same time exposes an otherwise isolated and secure system to global cyber security threats. This inevitable transformation to highly connected systems thus necessitates effective security safeguards, as any compromise or downtime of SCADA systems can have severe economic, safety and security ramifications. One way to ensure vital asset protection is to adopt a viewpoint similar to an attacker's to determine weaknesses and loopholes in defences. Such a mindset helps to identify and fix potential breaches before their exploitation. This paper surveys tools and techniques to uncover SCADA system vulnerabilities. A comprehensive review of the selected approaches is provided along with their applicability.

    Using response action with Intelligent Intrusion detection and prevention System against web application malware

    Findings: After evaluating the new system, a better result was generated in line with detection efficiency and the false alarm rate. This demonstrates the value of direct response action in an intrusion detection system

    Malware classification using self organising feature maps and machine activity data

    In this article we use machine activity metrics to automatically distinguish between malicious and trusted portable executable software samples. The motivation stems from the growth of cyber attacks using techniques that have been employed to surreptitiously deploy Advanced Persistent Threats (APTs). APTs are becoming more sophisticated and able to obfuscate many of their identifiable features through encryption, custom code bases and in-memory execution. Our hypothesis is that we can achieve a high degree of accuracy in distinguishing malicious from trusted samples using Machine Learning with features derived from the inescapable footprint left behind on a computer system during execution. This includes CPU, RAM and Swap use, and network traffic at a count level of bytes and packets. These features are continuous and allow us to be more flexible with the classification of samples than discrete features such as API calls (which can also be obfuscated) that form the main feature of the extant literature. We use these continuous data and develop a novel classification method using Self Organizing Feature Maps to reduce overfitting during training through the ability to create unsupervised clusters of similar ‘behaviour’ that are subsequently used as features for classification, rather than using the raw data. We compare our method to a set of machine classification methods that have been applied in previous research and demonstrate, on an unseen dataset, an increase of between 7.24% and 25.68% in classification accuracy over the range of those methods.

    Metamorphic malware detection based on support vector machine classification of malware sub-signatures

    Achieving accurate and efficient metamorphic malware detection remains a challenge. Metamorphic malware is able to mutate and alter its code structure in each infection, which can circumvent signature-matching detection. However, some vital functionalities and code segments remain unchanged between mutations. We exploit these unchanged features by means of classification using a Support Vector Machine (SVM). N-gram features are extracted directly from malware binaries to avoid disassembly; these features are then masked with the n-grams extracted from known malware signatures. The masking reduces the number of selected n-gram features considerably. Our method is capable of accurately detecting metamorphic malware with ~99% accuracy and a low false positive rate. The proposed method is also superior to commercially available anti-viruses for detecting metamorphic malware.
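
    The masking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of n = 2 and the toy byte strings are all assumptions made for illustration, and the SVM training step is left out. It counts overlapping byte n-grams in a raw binary and keeps only those that also occur in a set of known signatures.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> Counter:
    """Count every overlapping n-byte window in a raw binary (no disassembly)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def signature_ngrams(signatures, n: int) -> set:
    """Collect the set of n-grams that occur in any known signature."""
    grams = set()
    for sig in signatures:
        grams.update(byte_ngrams(sig, n))
    return grams


def masked_features(data: bytes, mask: set, n: int) -> list:
    """Keep only the counts of n-grams present in the mask, in a fixed order."""
    counts = byte_ngrams(data, n)
    return [counts.get(gram, 0) for gram in sorted(mask)]


# Toy data (hypothetical): two "signatures" and one sample binary.
signatures = [bytes.fromhex("deadbeef"), bytes.fromhex("beefcafe")]
mask = signature_ngrams(signatures, n=2)

sample = bytes.fromhex("00deadbeef00beef")
vector = masked_features(sample, mask, n=2)
print(len(mask), vector)
```

    Each sample then yields a vector whose length equals the number of signature n-grams rather than the number of all possible n-grams, which is the feature reduction the abstract refers to. Such vectors could be passed to any SVM implementation for training.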

    Neural malware detection

    At the heart of today’s malware problem lies the theoretically infinite diversity created by metamorphism. The majority of conventional machine learning techniques tackle the problem with the assumptions that a sufficiently large number of training samples exist and that the training set is independent and identically distributed. However, the lack of semantic features, combined with models built under these wrong assumptions, results largely in overfitting with many false positives against real-world samples, leaving systems vulnerable to various adversarial attacks. A key observation is that modern malware authors write a script that automatically generates an arbitrarily large number of diverse samples that share similar characteristics in program logic, which is a very cost-effective way to evade detection with minimum effort. Given that many malware campaigns follow this paradigm of an economic malware manufacturing model, the samples within a campaign are likely to share coherent semantic characteristics. This opens up the possibility of one-to-many detection. Therefore, it is crucial to capture this non-linear metamorphic pattern unique to the campaign in order to detect these seemingly diverse but identically rooted variants. To address these issues, this dissertation proposes novel deep learning models, including a generative static malware outbreak detection model, a generative dynamic malware detection model using spatio-temporal isomorphic dynamic features, and instruction cognitive malware detection. A comparative study on metamorphic threats is also conducted as part of the thesis. A generative adversarial autoencoder (AAE) over a convolutional network with global average pooling is introduced as a fundamental deep learning framework for malware detection, which captures highly complex non-linear metamorphism through translation invariance and local variation insensitivity. A Generative Adversarial Network (GAN) used as part of the framework enables one-shot training, where semantically isomorphic malware campaigns are identified by a single malware instance sampled from the very initial outbreak. This is a major innovation because, to the best of our knowledge, no approach to this challenging training objective has been found against a malware distribution that consists of a large number of very sparse groups artificially driven by the arms race between attackers and defenders. In addition, we propose a novel method that extracts an instruction cognitive representation from uninterpreted raw binary executables, which can be used for one-to-many malware detection via one-shot training against the frequency spectrum of the Transformer’s encoded latent representation. The method works regardless of the presence of diverse malware variations while remaining resilient to adversarial attacks that mostly use random perturbation against raw binaries. Comprehensive performance analyses, including mathematical formulations and experimental evaluations, are provided, with the proposed deep learning framework for malware detection exhibiting superior performance over conventional machine learning methods. The methods proposed in this thesis are applicable to a variety of threat environments where artificially formed sparse distributions arise at the cyber battle fronts. Doctor of Philosophy

    Metamorphic Malware Detection Based on Support Vector Machine Classification of Malware Sub-Signatures

    Achieving accurate and efficient metamorphic malware detection remains a challenge. Metamorphic malware is able to mutate and alter its code structure in each infection, while some vital functionality and code segments remain unchanged. We exploit these unchanged features to detect metamorphic malware using a Support Vector Machine (SVM) classifier. N-gram features are extracted directly from sample malware binaries to avoid disassembly, and are then masked with the n-grams extracted from Snort signatures. The masking considerably reduces the number of selected n-gram features. Our method is capable of accurately detecting metamorphic malware with ~99% accuracy and a low false positive rate. The proposed method is also superior to commercially available anti-viruses in detecting metamorphic malware.