114 research outputs found

    Weakly supervised deep learning for the detection of domain generation algorithms

    Get PDF
    Domain generation algorithms (DGAs) have become commonplace in malware that seeks to establish command and control communication between an infected machine and the botmaster. DGAs dynamically and consistently generate large volumes of malicious domain names, only a few of which are registered by the botmaster, within a short time window around their generation time, and subsequently resolved when the malware on the infected machine tries to access them. Deep neural networks that can classify domain names as benign or malicious are of great interest in the real-time defense against DGAs. In contrast with traditional machine learning models, deep networks do not rely on human engineered features. Instead, they can learn features automatically from data, provided that they are supplied with sufficiently large amounts of suitable training data. Obtaining cleanly labeled ground truth data is difficult and time consuming. Heuristically labeled data could potentially provide a source of training data for weakly supervised training of DGA detectors. We propose a set of heuristics for automatically labeling domain names monitored in real traffic, and then train and evaluate classifiers with the proposed heuristically labeled dataset. We show through experiments on a dataset with 50 million domain names that such heuristically labeled data is very useful in practice to improve the predictive accuracy of deep learning-based DGA classifiers, and that these deep neural networks significantly outperform a random forest classifier with human engineered features

    Real-time detection of dictionary DGA network traffic using deep learning

    Get PDF
    Botnets and malware continue to avoid detection by static rules engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses. Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains. To combat this, we created a novel hybrid neural network, Bilbo the `bagging` model, that analyses domains and scores the likelihood they are generated by such algorithms and therefore are potentially malicious. Bilbo is the first parallel usage of a convolutional neural network (CNN) and a long short-term memory (LSTM) network for DGA detection. Our unique architecture is found to be the most consistent in performance in terms of AUC, F1 score, and accuracy when generalising across different dictionary DGA classification tasks compared to current state-of-the-art deep learning architectures. We validate using reverse-engineered dictionary DGA domains and detail our real-time implementation strategy for scoring real-world network logs within a large financial enterprise. In four hours of actual network traffic, the model discovered at least five potential command-and-control networks that commercial vendor tools did not flag

    An End-to-End Framework for Private DGA Detection as a Service

    Get PDF
    Domain Generation Algorithms (DGAs) are used by malware to generate pseudorandom domain names to establish communication between infected bots and Command and Control servers. While DGAs can be detected by machine learning (ML) models with great accuracy, offering DGA detection as a service raises privacy concerns when requiring network administrators to disclose their DNS traffic to the service provider. We propose the first end-to-end framework for privacy-preserving classification as a service of domain names into DGA (malicious) or non-DGA (benign) domains. We achieve this through a combination of two privacy-enhancing technologies (PETs), namely secure multi-party computation (MPC) and differential privacy (DP). Through MPC, our framework enables an enterprise network administrator to outsource the problem of classifying a DNS domain as DGA or non-DGA to an external organization without revealing any information about the domain name. Moreover, the service provider\u27s ML model used for DGA detection is never revealed to the network administrator. Furthermore, by using DP, we also ensure that the classification result cannot be used to learn information about individual entries of the training data. Finally, we leverage the benefits of quantization of deep learning models in the context of MPC to achieve efficient, secure DGA detection. We demonstrate that we achieve a significant speed-up resulting in a 15% reduction in detection runtime without reducing accuracy

    Deep Learning-Based Magnetic Coupling Detection for Advanced Induction Heating Appliances

    Get PDF
    Induction heating has become the reference technology for domestic heating applications due to its benefits in terms of performance, efficiency and safety, among others. In this context, recent design trends aim at providing highly flexible cooking surfaces composed of multi-coil structures. As in many other wireless power transfer systems, one of the main challenges to face is the proper detection of the magnetic coupling with the induction heating load in order to provide improved thermal performance and safe power electronic converter operation. This is specially challenging due to the high variability in the materials used in cookware as well as the random pot placement in flexible induction heating appliances. This paper proposes the use of deep learning techniques in order to provide accurate area overlap estimation regardless of the used pot and its position. An experimental test-bench composed of a complete power converter, multi-coil system and real-Time measurement system has been implemented and used in this study to characterize the parameter variation with overlapped area. Convolutional neural networks are then proposed as an effective method to estimate the covered area, and several implementations are studied and compared according to their computational cost and accuracy. As a conclusion, the presented deep learning-based technique is proposed as an effective tool to estimate the magnetic coupling between the coil and the induction heating load in advanced induction heating appliances

    Explainable Artificial Intelligence Applications in Cyber Security: State-of-the-Art in Research

    Get PDF
    This survey presents a comprehensive review of current literature on Explainable Artificial Intelligence (XAI) methods for cyber security applications. Due to the rapid development of Internet-connected systems and Artificial Intelligence in recent years, Artificial Intelligence including Machine Learning and Deep Learning has been widely utilized in the fields of cyber security including intrusion detection, malware detection, and spam filtering. However, although Artificial Intelligence-based approaches for the detection and defense of cyber attacks and threats are more advanced and efficient compared to the conventional signature-based and rule-based cyber security strategies, most Machine Learning-based techniques and Deep Learning-based techniques are deployed in the “black-box” manner, meaning that security experts and customers are unable to explain how such procedures reach particular conclusions. The deficiencies of transparencies and interpretability of existing Artificial Intelligence techniques would decrease human users’ confidence in the models utilized for the defense against cyber attacks, especially in current situations where cyber attacks become increasingly diverse and complicated. Therefore, it is essential to apply XAI in the establishment of cyber security models to create more explainable models while maintaining high accuracy and allowing human users to comprehend, trust, and manage the next generation of cyber defense mechanisms. Although there are papers reviewing Artificial Intelligence applications in cyber security areas and the vast literature on applying XAI in many fields including healthcare, financial services, and criminal justice, the surprising fact is that there are currently no survey research articles that concentrate on XAI applications in cyber security. Therefore, the motivation behind the survey is to bridge the research gap by presenting a detailed and up-to-date survey of XAI approaches applicable to issues in the cyber security field. Our work is the first to propose a clear roadmap for navigating the XAI literature in the context of applications in cyber security

    1st Symposium of Applied Science for Young Researchers: proceedings

    Get PDF
    SASYR, the rst Symposium of Applied Science for Young Researchers, welcomes works from young researchers (master students) covering any aspect of all the scienti c areas of the three research centres ADiT-lab (IPVC, Instituto Polit ecnico de Viana do Castelo), 2Ai (IPCA, Instituto Polit ecnico do C avado e do Ave) and CeDRI (IPB, Instituto Polit ecnico de Bragan ca). The main objective of SASYR is to provide a friendly and relaxed environment for young researchers to present their work, to discuss recent results and to develop new ideas. In this way, it will provide an opportunity to the ADiT-lab, 2Ai and CeDRI research communities to gather synergies and indicate possible paths for future joint work. We invite you to join SASYR on 7 July and to share your research!info:eu-repo/semantics/publishedVersio

    Distributed Processing Methods for Extra Large Scale MIMO

    Get PDF

    Segmentación semántica con modelos de deep learning y etiquetados no densos

    Get PDF
    La segmentación semántica es un problema muy estudiado dentro del campo de la visión por computador que consiste en la clasificación de imágenes a nivel de píxel. Es decir, asignar una etiqueta o valor a cada uno de los píxeles de la imagen. Tiene aplicaciones muy variadas, que van desde interpretar el contenido de escenas urbanas para tareas de conducción automática hasta aplicaciones médicas que ayuden al médico a analizar la información del paciente para realizar un diagnóstico o operaciones. Como en muchos otros problemas y tareas relacionados con la visión por computador, en los últimos años se han propuesto y demostrado grandes avances en los métodos para segmentación semántica gracias, en gran parte, al reciente auge de los métodos basados en aprendizaje profundo o deep learning.\\ A pesar de que en los últimos años se están realizando mejoras constantes, los modelos de \textit{deep learning} para segmentación semántica %así como otras áreas, tienen un problema presentan un reto que dificulta su aplicabilidad a problemas de la vida real: necesitan grandes cantidades de anotaciones para entrenar los modelos. Esto es muy costoso, sobre todo porque en este caso hay que realizarlo a nivel de píxel. Muchos conjuntos de datos reales, por ejemplo datos adquiridos para tareas de monitorización del medio ambiente (grabaciones de entornos naturales, imágenes de satélite) generalmente presentan tan solo unos pocos píxeles etiquetados por imagen, que suelen venir de algunos clicks de un experto, para indicar ciertas zonas de interés en esas imágenes. Este tipo de etiquetado hace %imposible que sea muy complicado el entrenamiento de modelos densos que permitan procesar y obtener de manera automática una mayor cantidad de información de todos estos conjuntos de datos.\\ El objetivo de este trabajo es proponer nuevos métodos para resolver este problema. La idea principal es utilizar una segmentación inicial de la imagen multi-nivel de la imagen para propagar la poca información disponible. Este enfoque novedoso permite aumentar la anotación, y demostramos que pese a ser algo ruidosa, permite aprender de manera efectiva un modelo que obtenga la segmentación deseada. Este método es aplicable a cualquier tipo de dispersión de las anotaciones, siendo independiente del número de píxeles anotados. Las principales tareas desarrolladas en este proyecto son: -Estudio del estado del arte en técnicas de segmentación semántica (la mayoría basadas en técnicas de deep learning) -Propuesta y evaluación de métodos para aumentar (propagar) las etiquetas de las imágenes de entrenamiento cuando estas son dispersas y escasas -Diseño y evaluación de las arquitecturas de redes neuronales más adecuadas para resolver este problema Para validar nuestras propuestas, nos centramos en un caso de aplicación en imágenes submarinas, capturadas para monitorización de las zonas de barreras de coral. También demostramos que el método propuesto se puede aplicar a otro tipo de imágenes, como imágenes aéreas, imágenes multiespectrales y conjuntos de datos de segmentación de instancias
    corecore