306 research outputs found
Doctor of Philosophy
dissertationIn computer science, functional software testing is a method of ensuring that software gives expected output on specific inputs. Software testing is conducted to ensure desired levels of quality in light of uncertainty resulting from the complexity of software. Most of today's software is written by people and software development is a creative activity. However, due to the complexity of computer systems and software development processes, this activity leads to a mismatch between the expected software functionality and the implemented one. If not addressed in a timely and proper manner, this mismatch can cause serious consequences to users of the software, such as security and privacy breaches, financial loss, and adversarial human health issues. Because of manual effort, software testing is costly. Software testing that is performed without human intervention is automatic software testing and it is one way of addressing the issue. In this work, we build upon and extend several techniques for automatic software testing. The techniques do not require any guidance from the user. Goals that are achieved with the techniques are checking for yet unknown errors, automatically testing object-oriented software, and detecting malicious software. To meet these goals, we explored several techniques and related challenges: automatic test case generation, runtime verification, dynamic symbolic execution, and the type and size of test inputs for efficient detection of malicious software via machine learning. Our work targets software written in the Java programming language, though the techniques are general and applicable to other languages. We performed an extensive evaluation on freely available Java software projects, a flight collision avoidance system, and thousands of applications for the Android operating system. Evaluation results show to what extent dynamic symbolic execution is applicable in testing object-oriented software, they show correctness of the flight system on millions of automatically customized and generated test cases, and they show that simple and relatively small inputs in random testing can lead to effective malicious software detection
Um método supervisionado para encontrar variáveis discriminantes na análise de problemas complexos : estudos de caso em segurança do Android e em atribuição de impressora fonte
Orientadores: Ricardo Dahab, Anderson de Rezende RochaDissertação (mestrado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: A solução de problemas onde muitos componentes atuam e interagem simultaneamente requer modelos de representação nem sempre tratáveis pelos métodos analíticos tradicionais. Embora em muitos caso se possa prever o resultado com excelente precisão através de algoritmos de aprendizagem de máquina, a interpretação do fenómeno requer o entendimento de quais são e em que proporção atuam as variáveis mais importantes do processo. Esta dissertação apresenta a aplicação de um método onde as variáveis discriminantes são identificadas através de um processo iterativo de ranqueamento ("ranking") por eliminação das que menos contribuem para o resultado, avaliando-se em cada etapa o impacto da redução de características nas métricas de acerto. O algoritmo de florestas de decisão ("Random Forest") é utilizado para a classificação e sua propriedade de importância das características ("Feature Importance") para o ranqueamento. Para a validação do método, dois trabalhos abordando sistemas complexos de natureza diferente foram realizados dando origem aos artigos aqui apresentados. O primeiro versa sobre a análise das relações entre programas maliciosos ("malware") e os recursos requisitados pelos mesmos dentro de um ecossistema de aplicações no sistema operacional Android. Para realizar esse estudo, foram capturados dados, estruturados segundo uma ontologia definida no próprio artigo (OntoPermEco), de 4.570 aplicações (2.150 malware, 2.420 benignas). O modelo complexo produziu um grafo com cerca de 55.000 nós e 120.000 arestas, o qual foi transformado usando-se a técnica de bolsa de grafos ("Bag Of Graphs") em vetores de características de cada aplicação com 8.950 elementos. Utilizando-se apenas os dados do manifesto atingiu-se com esse modelo 88% de acurácia e 91% de precisão na previsão do comportamento malicioso ou não de uma aplicação, e o método proposto foi capaz de identificar 24 características relevantes na classificação e identificação de famílias de malwares, correspondendo a 70 nós no grafo do ecosistema. O segundo artigo versa sobre a identificação de regiões em um documento impresso que contém informações relevantes na atribuição da impressora laser que o imprimiu. O método de identificação de variáveis discriminantes foi aplicado sobre vetores obtidos a partir do uso do descritor de texturas (CTGF-"Convolutional Texture Gradient Filter") sobre a imagem scaneada em 600 DPI de 1.200 documentos impressos em 10 impressoras. A acurácia e precisão médias obtidas no processo de atribuição foram de 95,6% e 93,9% respectivamente. Após a atribuição da impressora origem a cada documento, 8 das 10 impressoras permitiram a identificação de variáveis discriminantes associadas univocamente a cada uma delas, podendo-se então visualizar na imagem do documento as regiões de interesse para uma análise pericial. Os objetivos propostos foram atingidos mostrando-se a eficácia do método proposto na análise de dois problemas em áreas diferentes (segurança de aplicações e forense digital) com modelos complexos e estruturas de representação bastante diferentes, obtendo-se um modelo reduzido interpretável para ambas as situaçõesAbstract: Solving a problem where many components interact and affect results simultaneously requires models which sometimes are not treatable by traditional analytic methods. Although in many cases the result is predicted with excellent accuracy through machine learning algorithms, the interpretation of the phenomenon requires the understanding of how the most relevant variables contribute to the results. This dissertation presents an applied method where the discriminant variables are identified through an iterative ranking process. In each iteration, a classifier is trained and validated discarding variables that least contribute to the result and evaluating in each stage the impact of this reduction in the classification metrics. Classification uses the Random Forest algorithm, and the discarding decision applies using its feature importance property. The method handled two works approaching complex systems of different nature giving rise to the articles presented here. The first article deals with the analysis of the relations between \textit{malware} and the operating system resources requested by them within an ecosystem of Android applications. Data structured according to an ontology defined in the article (OntoPermEco) were captured to carry out this study from 4,570 applications (2,150 malware, 2,420 benign). The complex model produced a graph of about 55,000 nodes and 120,000 edges, which was transformed using the Bag of Graphs technique into feature vectors of each application with 8,950 elements. The work accomplished 88% of accuracy and 91% of precision in predicting malicious behavior (or not) for an application using only the data available in the application¿s manifest, and the proposed method was able to identify 24 relevant features corresponding to only 70 nodes of the entire ecosystem graph. The second article is about to identify regions in a printed document that contains information relevant to the attribution of the laser printer that printed it. The discriminant variable determination method achieved average accuracy and precision of 95.6% and 93.9% respectively in the source printer attribution using a dataset of 1,200 documents printed on ten printers. Feature vectors were obtained from the scanned image at 600 DPI applying the texture descriptor Convolutional Texture Gradient Filter (CTGF). After the assignment of the source printer to each document, eight of the ten printers allowed the identification of discriminant variables univocally associated to each one of them, and it was possible to visualize in document's image the regions of interest for expert analysis. The work in both articles accomplished the objective of reducing a complex system into an interpretable streamlined model demonstrating the effectiveness of the proposed method in the analysis of two problems in different areas (application security and digital forensics) with complex models and entirely different representation structuresMestradoCiência da ComputaçãoMestre em Ciência da Computaçã
Techniques for advanced android malware triage
Mención Internacional en el título de doctorAndroid is the leading operating system in smartphones with a big difference.
Statistics show that 88% of all smartphones sold to end users in
the second quarter of 2018 were phones with the Android OS. Regardless
of the operating systems which are running on smartphones, most of
the functionalities of these devices are offered through applications. There
are currently over 2 million apps only on the official Google store, known
as Google Play. This huge market with billions of users is tempting for
attackers to develop and distribute their malicious apps (or malware).
Mobile malware has raised explosively since 2009. Symantec reported
an increase of 54% in the new mobile malware variants in 2017 as compared
to the previous year. Additionally, more incentive has been provided
for profit-driven malware by the growth of black markets. This rise has
happened for Android malware as well since only 20% of devices are running
the newest major version of Android OS based on Symantec report in
2018. Android continued to be the most targeted platform with the biggest
number of attacks in 2015. After that year, attacks against the Android
platform slowed for the first time as attackers were faced with improved
security architectures though Android is still the main appealing target OS
for attackers. Moreover, advanced types of Android malware are found
which make use of extensive anit-analysis techniques to evade static or
dynamic analysis.
To address the security and privacy concerns of complex Android malware,
this dissertation focuses on three main objectives. First of all, we
propose a light-weight yet efficient method to identify risky Android applications.
Next, we present a precise approach to characterize Android
malware based on their malicious behavior. Finally, we propose an adaptive learning system to address the security concerns of obfuscation in Android
malware.
Identifying potentially dangerous and risky applications is an important
step in Android malware analysis. To this end, we develop a triage system
to rank applications based on their potential risk. Our approach, called TriFlow, relies on static features which are quick to obtain. TriFlow combines
a probabilistic model to predict the existence of information flows with a
metric of how significant a flow is in benign and malicious apps. Based
on this, TriFlow provides a score for each application that can be used to
prioritize analysis. It also provides the analysts with an explanatory report
of the associated risk. Our tool can also be used as a complement with
computationally expensive static and dynamic analysis tools.
Another important step towards Android malware analysis lies in their
accurate characterization. Labeling Android malware is challenging yet
crucially important, as it helps to identify upcoming malware samples and
threats. A key challenge is that different researchers and anti-virus vendors
assign labels using their own criteria, and it is not known to what
extent these labels are aligned with the apps’ real behavior. Based on this,
we propose a new behavioral characterization method for Android apps
based on their extracted information flows. As information flows can be
used to track why and how apps use specific pieces of information, a flowbased
characterization provides a relatively easy-to-interpret summary of
the malware sample’s behavior.
Not all Android malware are easy to analyze due to advanced and easyto-apply anti-analysis techniques that are available nowadays. Obfuscation
is the most common anti-analysis technique that Android malware use to
evade detection. Obfuscation techniques modify an app’s source (or machine)
code in order to make it more difficult to analyze. This is typically
applied to protect intellectual property in benign apps, or to hinder the process
of extracting actionable information in the case of malware. Since
malware analysis often requires considerable resource investment, detecting
the particular obfuscation technique used may contribute to apply the
right analysis tools, thus leading to some savings.
Therefore, we propose AndrODet, a mechanism to detect three popular
types of obfuscation in Android applications, namely identifier renaming, string encryption, and control flow obfuscation. AndrODet leverages online
learning techniques, thus being suitable for resource-limited environments
that need to operate in a continuous manner. We compare our results
with a batch learning algorithm using a dataset of 34,962 apps from both
malware and benign apps. Experimental results show that online learning
approaches are not only able to compete with batch learning methods in
terms of accuracy, but they also save significant amount of time and computational
resources.
Finally, we present a number of open research directions based on the
outcome of this thesis.Android es el sistema operativo líder en teléfonos inteligentes (también
denominados con la palabra inglesa smartphones), con una gran diferencia
con respecto al resto de competidores. Las estadísticas muestran que el
88% de todos los smartphones vendidos a usuarios finales en el segundo
trimestre de 2018 fueron teléfonos con sistema operativo Android. Independientemente
de su sistema operativo, la mayoría de las funcionalidades
de estos dispositivos se ofrecen a través de aplicaciones. Actualmente hay
más de 2 millones de aplicaciones solo en la tienda oficial de Google, conocida
como Google Play. Este enorme mercado con miles de millones de
usuarios es tentador para los atacantes, que buscan distribuir sus aplicaciones
malintencionadas (o malware).
El malware para dispositivos móviles ha aumentado de forma exponencial
desde 2009. Symantec ha detectado un aumento del 54% en las nuevas
variantes de malware para dispositivos móviles en 2017 en comparación
con el año anterior. Además, el crecimiento del mercado negro (es decir,
plataformas no oficiales de descargas de aplicaciones) supone un incentivo
para los programas maliciosos con fines lucrativos. Este aumento también
ha ocurrido en el malware de Android, aprovechando la circunstancia de
que solo el 20% de los dispositivos ejecutan la versión mas reciente del sistema
operativo Android, de acuerdo con el informe de Symantec en 2018.
De hecho, Android ha sido la plataforma que ha centrado los esfuerzos de
los atacantes desde 2015, aunque los ataques decayeron ligeramente tras
ese año debido a las mejoras de seguridad incorporadas en el sistema operativo.
En todo caso, existen formas avanzadas de malware para Android
que hacen uso de técnicas sofisticadas para evadir el análisis estático o
dinámico.
Para abordar los problemas de seguridad y privacidad que causa el malware
en Android, esta Tesis se centra en tres objetivos principales. En
primer lugar, se propone un método ligero y eficiente para identificar aplicaciones
de Android que pueden suponer un riesgo. Por otra parte, se presenta
un mecanismo para la caracterización del malware atendiendo a su
comportamiento. Finalmente, se propone un mecanismo basado en aprendizaje
adaptativo para la detección de algunos tipos de ofuscación que son
empleados habitualmente en las aplicaciones maliciosas.
Identificar aplicaciones potencialmente peligrosas y riesgosas es un
paso importante en el análisis de malware de Android. Con este fin, en
esta Tesis se desarrolla un mecanismo de clasificación (llamado TriFlow)
que ordena las aplicaciones según su riesgo potencial. La aproximación
se basa en características estáticas que se obtienen rápidamente, siendo de
especial interés los flujos de información. Un flujo de información existe
cuando un cierto dato es recibido o producido mediante una cierta función
o llamada al sistema, y atraviesa la lógica de la aplicación hasta que
llega a otra función. Así, TriFlow combina un modelo probabilístico para
predecir la existencia de un flujo con una métrica de lo habitual que es
encontrarlo en aplicaciones benignas y maliciosas. Con ello, TriFlow proporciona
una puntuación para cada aplicación que puede utilizarse para
priorizar su análisis. Al mismo tiempo, proporciona a los analistas un informe
explicativo de las causas que motivan dicha valoración. Así, esta
herramienta se puede utilizar como complemento a otras técnicas de análisis
estático y dinámico que son mucho más costosas desde el punto de vista
computacional.
Otro paso importante hacia el análisis de malware de Android radica
en caracterizar su comportamiento. Etiquetar el malware de Android es
un desafío de crucial importancia, ya que ayuda a identificar las próximas
muestras y amenazas de malware. Una cuestión relevante es que los
diferentes investigadores y proveedores de antivirus asignan etiquetas utilizando
sus propios criterios, de modo no se sabe en qué medida estas etiquetas
están en línea con el comportamiento real de las aplicaciones. Sobre
esta base, en esta Tesis se propone un nuevo método de caracterización de
comportamiento para las aplicaciones de Android en función de sus flujos
de información. Como dichos flujos se pueden usar para estudiar el uso de
cada dato por parte de una aplicación, permiten proporcionar un resumen relativamente sencillo del comportamiento de una determinada muestra de
malware.
A pesar de la utilidad de las técnicas de análisis descritas, no todos los
programas maliciosos de Android son fáciles de analizar debido al uso de
técnicas anti-análisis que están disponibles en la actualidad. Entre ellas, la
ofuscación es la técnica más común que se utiliza en el malware de Android
para evadir la detección. Dicha técnica modifica el código de una
aplicación para que sea más difícil de entender y analizar. Esto se suele
aplicar para proteger la propiedad intelectual en aplicaciones benignas o
para dificultar la obtención de pistas sobre su funcionamiento en el caso
del malware. Dado que el análisis de malware a menudo requiere una inversión
considerable de recursos, detectar la técnica de ofuscación que se
ha utilizado en un caso particular puede contribuir a utilizar herramientas
de análisis adecuadas, contribuyendo así a un cierto ahorro de recursos.
Así, en esta Tesis se propone AndrODet, un mecanismo para detectar tres
tipos populares de ofuscación, a saber, el renombrado de identificadores,
cifrado de cadenas de texto y la modificación del flujo de control de la aplicación.
AndrODet se basa en técnicas de aprendizaje automático en línea
(online machine learning), por lo que es adecuado para entornos con recursos
limitados que necesitan operar de forma continua, sin interrupción.
Para medir su eficacia respecto de las técnicas de aprendizaje automático
tradicionales, se comparan los resultados con un algoritmo de aprendizaje
por lotes (batch learning) utilizando un dataset de 34.962 aplicaciones de
malware y benignas. Los resultados experimentales muestran que el enfoque
de aprendizaje en línea no solo es capaz de competir con el basado
en lotes en términos de precisión, sino que también ahorra una gran cantidad
de tiempo y recursos computacionales.
Tras la exposición de las contribuciones anteriormente mencionadas,
esta Tesis concluye con la identificación de una serie de líneas abiertas de
investigación con el fin de alentar el desarrollo de trabajos futuros en esta
dirección.Omid Mirzaei is a Ph.D. candidate in the Computer Security Lab (COSEC)
at the Department of Computer Science and Engineering of Universidad
Carlos III de Madrid (UC3M). His Ph.D. is funded by the Community
of Madrid and the European Union through the research project CIBERDINE
(Ref. S2013/ICE-3095).Programa Oficial de Doctorado en Ciencia y Tecnología InformáticaPresidente: Gregorio Martínez Pérez.- Secretario: Pedro Peris López.- Vocal: Pablo Picazo Sánche
Applying Deep Learning Techniques to the Analysis of Android APKs
Malware targeting mobile devices is a pervasive problem in modern life and as such tools to detect and classify malware are of great value. This paper seeks to demonstrate the effectiveness of Deep Learning Techniques, specifically Convolutional Neural Networks, in detecting and classifying malware targeting the Android operating system. Unlike many current detection techniques, which require the use of relatively rigid features to aid in detection, deep neural networks are capable of automatically learning flexible features which may be more resilient to obfuscation. We present a parsing for extracting sequences of API calls which can be used to describe a hypothetical execution of a given application. We then show how to use this sequence of API calls to successfully classify Android malware using a Convolutional Neural Network
Adversarial Detection of Flash Malware: Limitations and Open Issues
During the past four years, Flash malware has become one of the most
insidious threats to detect, with almost 600 critical vulnerabilities targeting
Adobe Flash disclosed in the wild. Research has shown that machine learning can
be successfully used to detect Flash malware by leveraging static analysis to
extract information from the structure of the file or its bytecode. However,
the robustness of Flash malware detectors against well-crafted evasion attempts
- also known as adversarial examples - has never been investigated. In this
paper, we propose a security evaluation of a novel, representative Flash
detector that embeds a combination of the prominent, static features employed
by state-of-the-art tools. In particular, we discuss how to craft adversarial
Flash malware examples, showing that it suffices to manipulate the
corresponding source malware samples slightly to evade detection. We then
empirically demonstrate that popular defense techniques proposed to mitigate
evasion attempts, including re-training on adversarial examples, may not always
be sufficient to ensure robustness. We argue that this occurs when the feature
vectors extracted from adversarial examples become indistinguishable from those
of benign data, meaning that the given feature representation is intrinsically
vulnerable. In this respect, we are the first to formally define and
quantitatively characterize this vulnerability, highlighting when an attack can
be countered by solely improving the security of the learning algorithm, or
when it requires also considering additional features. We conclude the paper by
suggesting alternative research directions to improve the security of
learning-based Flash malware detectors
A Pre-Trained BERT Model for Android Applications
The automation of an increasingly large number of software engineering tasks
is becoming possible thanks to Machine Learning (ML). One foundational building
block in the application of ML to software artifacts is the representation of
these artifacts (e.g., source code or executable code) into a form that is
suitable for learning. Many studies have leveraged representation learning,
delegating to ML itself the job of automatically devising suitable
representations. Yet, in the context of Android problems, existing models are
either limited to coarse-grained whole-app level (e.g., apk2vec) or conducted
for one specific downstream task (e.g., smali2vec). Our work is part of a new
line of research that investigates effective, task-agnostic, and fine-grained
universal representations of bytecode to mitigate both of these two
limitations. Such representations aim to capture information relevant to
various low-level downstream tasks (e.g., at the class-level). We are inspired
by the field of Natural Language Processing, where the problem of universal
representation was addressed by building Universal Language Models, such as
BERT, whose goal is to capture abstract semantic information about sentences,
in a way that is reusable for a variety of tasks. We propose DexBERT, a
BERT-like Language Model dedicated to representing chunks of DEX bytecode, the
main binary format used in Android applications. We empirically assess whether
DexBERT is able to model the DEX language and evaluate the suitability of our
model in two distinct class-level software engineering tasks: Malicious Code
Localization and Defect Prediction. We also experiment with strategies to deal
with the problem of catering to apps having vastly different sizes, and we
demonstrate one example of using our technique to investigate what information
is relevant to a given task
Resilient and Scalable Android Malware Fingerprinting and Detection
Malicious software (Malware) proliferation reaches hundreds of thousands daily. The manual analysis of such a large volume of malware is daunting and time-consuming. The diversity of targeted systems in terms of architecture and platforms compounds the challenges of Android malware detection and malware in general. This highlights the need to design and implement new scalable and robust methods, techniques, and tools to detect Android malware. In this thesis, we develop a malware fingerprinting framework to cover accurate Android malware detection and family attribution. In this context, we emphasize the following: (i) the scalability over a large malware corpus; (ii) the resiliency to common obfuscation techniques; (iii) the portability over different platforms and architectures.
In the context of bulk and offline detection on the laboratory/vendor level: First, we propose an approximate fingerprinting technique for Android packaging that captures the underlying static structure of the Android apps. We also propose a malware clustering framework on top of this fingerprinting technique to perform unsupervised malware detection and grouping by building and partitioning a similarity network of malicious apps. Second, we propose an approximate fingerprinting technique for Android malware's behavior reports generated using dynamic analyses leveraging natural language processing techniques. Based on this fingerprinting technique, we propose a portable malware detection and family threat attribution framework employing supervised machine learning techniques. Third, we design an automatic framework to produce intelligence about the underlying malicious cyber-infrastructures of Android malware. We leverage graph analysis techniques to generate relevant, actionable, and granular intelligence that can be used to identify the threat effects induced by malicious Internet activity associated to Android malicious apps.
In the context of the single app and online detection on the mobile device level, we further propose the following: Fourth, we design a portable and effective Android malware detection system that is suitable for deployment on mobile and resource constrained devices, using machine learning classification on raw method call sequences. Fifth, we elaborate a framework for Android malware detection that is resilient to common code obfuscation techniques and adaptive to operating systems and malware change overtime, using natural language processing and deep learning techniques.
We also evaluate the portability of the proposed techniques and methods beyond Android platform malware, as follows: Sixth, we leverage the previously elaborated techniques to build a framework for cross-platform ransomware fingerprinting relying on raw hybrid features in conjunction with advanced deep learning techniques
GRASE: Granulometry Analysis with Semi Eager Classifier to Detect Malware
Technological advancement in communication leading to 5G, motivates everyone to get connected to the internet including ‘Devices’, a technology named Web of Things (WoT). The community benefits from this large-scale network which allows monitoring and controlling of physical devices. But many times, it costs the security as MALicious softWARE (MalWare) developers try to invade the network, as for them, these devices are like a ‘backdoor’ providing them easy ‘entry’. To stop invaders from entering the network, identifying malware and its variants is of great significance for cyberspace. Traditional methods of malware detection like static and dynamic ones, detect the malware but lack against new techniques used by malware developers like obfuscation, polymorphism and encryption. A machine learning approach to detect malware, where the classifier is trained with handcrafted features, is not potent against these techniques and asks for efforts to put in for the feature engineering. The paper proposes a malware classification using a visualization methodology wherein the disassembled malware code is transformed into grey images. It presents the efficacy of Granulometry texture analysis technique for improving malware classification. Furthermore, a Semi Eager (SemiE) classifier, which is a combination of eager learning and lazy learning technique, is used to get robust classification of malware families. The outcome of the experiment is promising since the proposed technique requires less training time to learn the semantics of higher-level malicious behaviours. Identifying the malware (testing phase) is also done faster. A benchmark database like malimg and Microsoft Malware Classification challenge (BIG-2015) has been utilized to analyse the performance of the system. An overall average classification accuracy of 99.03 and 99.11% is achieved, respectively
MDFRCNN: Malware Detection using Faster Region Proposals Convolution Neural Network
Technological advancement of smart devices has opened up a new trend: Internet of Everything (IoE), where all devices are connected to the web. Large scale networking benefits the community by increasing connectivity and giving control of physical devices. On the other hand, there exists an increased ‘Threat’ of an ‘Attack’. Attackers are targeting these devices, as it may provide an easier ‘backdoor entry to the users’ network’.MALicious softWARE (MalWare) is a major threat to user security. Fast and accurate detection of malware attacks are the sine qua non of IoE, where large scale networking is involved. The paper proposes use of a visualization technique where the disassembled malware code is converted into gray images, as well as use of Image Similarity based Statistical Parameters (ISSP) such as Normalized Cross correlation (NCC), Average difference (AD), Maximum difference (MaxD), Singular Structural Similarity Index Module (SSIM), Laplacian Mean Square Error (LMSE), MSE and PSNR. A vector consisting of gray image with statistical parameters is trained using a Faster Region proposals Convolution Neural Network (F-RCNN) classifier. The experiment results are promising as the proposed method includes ISSP with F-RCNN training. Overall training time of learning the semantics of higher-level malicious behaviors is less. Identification of malware (testing phase) is also performed in less time. The fusion of image and statistical parameter enhances system performance with greater accuracy. The benchmark database from Microsoft Malware Classification challenge has been used to analyze system performance, which is available on the Kaggle website. An overall average classification accuracy of 98.12% is achieved by the proposed method
- …