Search CORE

50 research outputs found

Software for malicious macro detection

Author: Peidro Paredes Miguel Pedro
Publication venue
Publication date: 01/07/2020
Field of study

The objective of this work is to give a detailed study of the development process of a software tool for the detection of the Emotet virus in Microsoft Office files, Emotet is a virus that has been wreaking havoc mainly in the business environment, from its beginnings as a banking Trojan to nowadays. In fact, this polymorphic family has managed to generate evident, incalculable and global inconveniences in the business activity without discriminating by corporate typology, affecting any company regardless of its size or sector, even entering into government agencies, as well as the citizens themselves as a whole. The existence of two main obstacles for the detection of this virus, constitute an intrinsic reality to it, on the one hand, the obfuscation in its macros and on the other, its polymorphism, are essential pieces of the analysis, focusing our tool in facing precisely two obstacles, descending to the analysis of the macros features and the creation of a neuron network that uses machine learning to recognize the detection patterns and deliberate its malicious nature. With Emotet's in-depth nature analysis, our goal is to draw out a set of features from the malicious macros and build a machine learning model for their detection. After the feasibility study of this project, its design and implementation, the results that emerge endorse the intention to detect Emotet starting only from the static analysis and with the application of machine learning techniques. The detection ratios shown by the tests performed on the final model, present a accuracy of 84% and only 3% of false positives during this detection process.Grado en Ingeniería Informátic

Universidad Carlos III de Madrid e-Archivo

Big data analytics: a predictive analysis applied to cybersecurity in a financial organization

Author: Pereira Pedro Filipe Martins Tourais
Publication venue
Publication date: 05/07/2019
Field of study

Project Work presented as partial requirement for obtaining the Master’s degree in Information Management, with a specialization in Knowledge Management and Business IntelligenceWith the generalization of the internet access, cyber attacks have registered an alarming growth in frequency and severity of damages, along with the awareness of organizations with heavy investments in cybersecurity, such as in the financial sector. This work is focused on an organization’s financial service that operates on the international markets in the payment systems industry. The objective was to develop a predictive framework solution responsible for threat detection to support the security team to open investigations on intrusive server requests, over the exponentially growing log events collected by the SIEM from the Apache Web Servers for the financial service. A Big Data framework, using Hadoop and Spark, was developed to perform classification tasks over the financial service requests, using Neural Networks, Logistic Regression, SVM, and Random Forests algorithms, while handling the training of the imbalance dataset through BEV. The main conclusions over the analysis conducted, registered the best scoring performances for the Random Forests classifier using all the preprocessed features available. Using the all the available worker nodes with a balanced configuration of the Spark executors, the most performant elapsed times for loading and preprocessing of the data were achieved using the column-oriented ORC with native format, while the row-oriented CSV format performed the best for the training of the classifiers.Com a generalização do acesso à internet, os ciberataques registaram um crescimento alarmante em frequência e severidade de danos causados, a par da consciencialização das organizações, com elevados investimentos em cibersegurança, como no setor financeiro. Este trabalho focou-se no serviço financeiro de uma organização que opera nos mercados internacionais da indústria de sistemas de pagamento. O objetivo consistiu no desenvolvimento uma solução preditiva responsável pela detecção de ameaças, por forma a dar suporte à equipa de segurança na abertura de investigações sobre pedidos intrusivos no servidor, relativamente aos exponencialmente crescentes eventos de log coletados pelo SIEM, referentes aos Apache Web Servers, para o serviço financeiro. Uma solução de Big Data, usando Hadoop e Spark, foi desenvolvida com o objectivo de executar tarefas de classificação sobre os pedidos do serviço financeiros, usando os algoritmos Neural Networks, Logistic Regression, SVM e Random Forests, solucionando os problemas associados ao treino de um dataset desequilibrado através de BEV. As principais conclusões sobre as análises realizadas registaram os melhores resultados de classificação usando o algoritmo Random Forests com todas as variáveis pré-processadas disponíveis. Usando todos os nós do cluster e uma configuração balanceada dos executores do Spark, os melhores tempos para carregar e pré-processar os dados foram obtidos usando o formato colunar ORC nativo, enquanto o formato CSV, orientado a linhas, apresentou os melhores tempos para o treino dos classificadores

Advances in Data Mining Knowledge Discovery and Applications

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

Advances in Data Mining Knowledge Discovery and Applications aims to help data miners, researchers, scholars, and PhD students who wish to apply data mining techniques. The primary contribution of this book is highlighting frontier fields and implementations of the knowledge discovery and data mining. It seems to be same things are repeated again. But in general, same approach and techniques may help us in different fields and expertise areas. This book presents knowledge discovery and data mining applications in two different sections. As known that, data mining covers areas of statistics, machine learning, data management and databases, pattern recognition, artificial intelligence, and other areas. In this book, most of the areas are covered with different data mining applications. The eighteen chapters have been classified in two parts: Knowledge Discovery and Data Mining Applications