4 research outputs found

    Classification of malware based on string and function feature selection

    Full text link
    Anti-malware software producers are continually challenged to identify and counter new malware as it is released into the wild. A dramatic increase in malware production in recent years has rendered the conventional method of manually determining a signature for each new malware sample untenable. This paper presents a scalable, automated approach for detecting and classifying malware by using pattern recognition algorithms and statistical methods at various stages of the malware analysis life cycle. Our framework combines the static features of function length and printable string information extracted from malware samples into a single test which gives classification results better than those achieved by using either feature individually. In our testing we input feature information from close to 1400 unpacked malware samples to a number of different classification algorithms. Using k-fold cross validation on the malware, which includes Trojans and viruses, along with 151 clean files, we achieve an overall classification accuracy of over 98%.</p

    Malware Pattern of Life Analysis

    Get PDF
    Many malware classifications include viruses, worms, trojans, ransomware, bots, adware, spyware, rootkits, file-less downloaders, malvertising, and many more. Each type may share unique behavioral characteristics with its methods of operations (MO), a pattern of behavior so distinctive that it could be recognized as having the same creator. The research shows the extraction of malware methods of operation using the step-by-step process of Artificial-Based Intelligence (ABI) with built-in Density-based spatial clustering of applications with noise (DBSCAN) machine learning to quantify the actions for their similarities, differences, baseline behaviors, and anomalies. The collected data of the research is from the ransomware sample repositories of Malware Bazaar and Virus Share, totaling 1300 live malicious codes ingested into the CAPEv2 malware sandbox, allowing the capture of traces of static, dynamic, and network behavior features. The ransomware features have shown significant activity of varying identified functions used in encryption, file application programming interface (API), and network function calls. During the machine learning categorization phase, there are eight identified clusters that have similar and different features regarding function-call sequencing events and file access manipulation for dropping file notes and writing encryption. Having compared all the clusters using a “supervenn” pictorial diagram, the characteristics of the static and dynamic behavior of the ransomware give the initial baselines for comparison with other variants that may have been added to the collected data for intelligence gathering. The findings provide a novel practical approach for intelligence gathering to address ransomware or any other malware variants’ activity patterns to discern similarities, anomalies, and differences between malware actions under study
    corecore