196 research outputs found

    Back To The Roots: Tree-Based Algorithms for Weakly Supervised Anomaly Detection

    Weakly supervised methods have emerged as a powerful tool for model-agnostic anomaly detection at the Large Hadron Collider (LHC). While these methods have shown remarkable performance on specific signatures such as di-jet resonances, their application in a more model-agnostic manner requires dealing with a larger number of potentially noisy input features. In this paper, we show that using boosted decision trees as classifiers in weakly supervised anomaly detection gives superior performance compared to deep neural networks. Boosted decision trees are well known for their effectiveness in tabular data analysis. Our results show that they not only offer significantly faster training and evaluation times, but they are also robust to a large number of noisy input features. By using advanced gradient boosted decision trees in combination with ensembling techniques and an extended set of features, we significantly improve the performance of weakly supervised methods for anomaly detection at the LHC. This advance is a crucial step towards a more model-agnostic search for new physics. Comment: 11 pages, 9 figures

    Network Intrusion Detection with Two-Phased Hybrid Ensemble Learning and Automatic Feature Selection

    The use of network-connected devices has grown exponentially in recent years, revolutionizing our daily lives. However, it has also attracted the attention of cybercriminals, and attacks on these devices have increased not only in number but also in sophistication. To detect such attacks, a Network Intrusion Detection System (NIDS) has become a vital component in network applications. However, network devices produce large-scale, high-dimensional data, which makes it difficult to accurately detect various known and unknown attacks. Moreover, the complex nature of network data makes the feature selection process of a NIDS a challenging task. In this study, we propose a machine learning based NIDS with Two-phased Hybrid Ensemble learning and Automatic Feature Selection. The proposed framework leverages four different machine learning classifiers to perform automatic feature selection based on their ability to detect the most significant features. The two-phased hybrid ensemble learning algorithm consists of two learning phases, with the first phase constructed using classifiers built from an adaptation of the One-vs-One framework, and the second phase constructed using classifiers built from combinations of attack classes. The proposed framework was evaluated on two well-referenced datasets for both wired and wireless applications, and the results demonstrate that the two-phased ensemble learning framework combined with the automatic feature selection engine has superior attack detection capability compared to other similar studies found in the literature.

    An Ensemble Learning-Based Architecture for Security Detection in IoT Infrastructures

    The Internet of Things has undergone significant development. However, security management is still a key challenge, in particular for deploying complex IoT systems that provide sophisticated services. In this paper, we design an ensemble learning-based architecture to support early security detection in the context of multi-step attacks, by leveraging the performance of different detection techniques. The architecture relies on a total of five major methods: process mining, elliptic envelope, one-class support vector machine, local outlier factor and isolation forest. We describe the main components of this architecture and their interactions, from data preprocessing to the generation of alerts, through the calculation of scores. The different detection methods are executed in parallel, and their results are combined by an ensemble learning strategy in order to improve the overall detection performance. We develop a proof-of-concept prototype and perform a large set of experiments to quantify the benefits and limits of this approach based on industrial datasets.
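    A minimal sketch of combining several of the detectors named above is given here, assuming scikit-learn. The synthetic data, the rank-averaging rule, and the helper `rank_normalise` are assumptions made for illustration; the abstract does not specify the combination strategy, and process mining is left out because it has no scikit-learn equivalent.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for preprocessed traffic features (assumption).
normal_train = rng.normal(0.0, 1.0, size=(500, 3))
normal_test = rng.normal(0.0, 1.0, size=(95, 3))
attacks = rng.normal(6.0, 1.0, size=(5, 3))  # rows 95..99 of x_test
x_test = np.vstack([normal_test, attacks])

detectors = {
    "elliptic_envelope": EllipticEnvelope(random_state=0),
    "one_class_svm": OneClassSVM(gamma="scale", nu=0.05),
    "local_outlier_factor": LocalOutlierFactor(novelty=True),
    "isolation_forest": IsolationForest(random_state=0),
}


def rank_normalise(values):
    """Map raw scores to [0, 1] by rank so detectors are comparable (hypothetical helper)."""
    ranks = values.argsort().argsort()
    return ranks / (len(values) - 1)


per_detector = []
for name, detector in detectors.items():
    detector.fit(normal_train)
    # decision_function is positive for inliers; negate so higher = more anomalous.
    anomaly = -detector.decision_function(x_test)
    per_detector.append(rank_normalise(anomaly))

# Assumed ensemble rule: plain average of the rank-normalised scores.
ensemble_score = np.mean(per_detector, axis=0)
top5 = {int(i) for i in np.argsort(ensemble_score)[-5:]}
print("Most anomalous test rows:", sorted(top5))
```

    Rank normalisation is used because the four detectors return scores on different scales; a weighted vote or a learned meta-classifier would be equally plausible choices.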

    Self organisation for 4G/5G networks

    Nowadays, the rapid growth of mobile communications is changing the world towards a fully connected society. Current 4G networks account for almost half of total mobile traffic, and in the forthcoming years overall mobile data traffic is expected to increase dramatically. To manage this increase in data traffic, operators adopt network topologies such as Heterogeneous Networks. Operators can thus deploy hundreds of small cells for each macro cell, allowing them to reduce coverage holes and/or lack of capacity. The advent of this technology is expected to tremendously increase the number of nodes in this new ecosystem, so that traditional network management activities based on, e.g., classic manual and field trial design approaches are simply no longer viable. As a consequence, the academic literature has dedicated a significant amount of effort to Self-Organising Network (SON) algorithms. These solutions aim to bring intelligence and autonomous adaptability into cellular networks, thereby reducing capital and operational expenditures (CAPEX/OPEX). Another aspect to take into account is that this type of network generates a large amount of data during normal operation in the form of control, management and data measurements. This data is expected to increase in 5G due to different aspects, such as densification, heterogeneity in layers and technologies, additional control and management complexity in Network Functions Virtualisation (NFV) and Software Defined Network (SDN), and the advent of the Internet of Things (IoT), among others. In this context, operators face the challenge of designing efficient technologies while introducing new services and meeting objectives in terms of customer satisfaction, with the overall goal of building networks that are self-aware, self-adaptive, and intelligent.
This dissertation provides a contribution to the design, analysis, and evaluation of SON solutions to improve network operator performance, expenses, and users' experience, by making the network more self-adaptive and intelligent. It also provides a contribution to the design of a self-aware network planning tool, which makes it possible to predict the Quality of Service (QoS) offered to end-users, based on data already available in the network. The main thesis contributions are divided into two parts. The first part presents a novel functional architecture based on an automatic and self-organised Reinforcement Learning (RL) approach to model SON functionalities, in which the main task is the self-coordination of the different actions taken by different SON functions to be automatically executed in a realistic self-organised Long Term Evolution (LTE) network. The proposed approach introduces a new paradigm to deal with the conflicts generated by the concurrent execution of multiple SON functions, and is shown to be general enough to model all the SON functions and their derived conflicts. The second part of the thesis is dedicated to the problem of QoS prediction. In particular, we aim at finding patterns of knowledge from physical layer data acquired from heterogeneous LTE networks. We propose an approach that is able not only to verify the QoS level experienced by the users, through physical layer measurements of the UEs, but also to predict it based on measurements collected at different times and from different regions of the heterogeneous network. We then propose to make predictions independently of the physical location, in order to exploit the experience gained in other sectors of the network to properly dimension and deploy heterogeneous nodes.
In this context, we use Machine Learning (ML) as a tool to allow the network to learn from experience, improving performance, and big data analytics to drive the network from reactive to predictive. Two key conclusions were drawn in the course of this thesis. First, we highlight the importance of designing efficient SON algorithms to deal effectively with several challenges, such as the most suitable placement of SON functions and algorithms to properly address the problem of distributed or centralised implementation, or the resolution of conflicts between SON functions executed at different nodes or networks. Second, in terms of network planning tools, a variety of tools can be found covering a wide range of systems and applications, oriented towards industry as well as research purposes. In this context, the solutions under investigation are continuously subject to major changes, one of the main drivers being to offer more cost-effective solutions

    Towards Effective Wireless Intrusion Detection using AWID Dataset

    In network security, intrusion detection systems (IDS) play a vital role, and machine learning (ML) techniques are commonly applied to intrusion datasets to build them. This study reviews the ML-based IDS literature that uses the AWID dataset. An analysis of the existing works shows a need to balance the dataset and to examine the approaches used to do so. Because the treatment of class imbalance has been lacking, a taxonomy of balancing techniques is introduced. The taxonomy is structured at all levels, and the collected papers are grouped hierarchically. A comparative study of the aspects proposed or treated in each paper is presented. The main findings from the surveyed papers are that existing taxonomies were not understood in detail and that the imbalance of the dataset was left untreated. The study consolidates the information gathered on these aspects. Nevertheless, weaknesses are observed in every adaptation of the intrusion detection system. The contributions are multifold; to the best of our knowledge, the study integrates the observation of a threshold limit with a feature-drop selection method based on random samples. The work thus contributes a better understanding of the imbalance techniques in the surveyed literature and should benefit the development of ML-based IDS.

    Explainable AI over the Internet of Things (IoT): Overview, State-of-the-Art and Future Directions

    Explainable Artificial Intelligence (XAI) is transforming the field of Artificial Intelligence (AI) by enhancing the trust of end-users in machines. As the number of connected devices keeps growing, the Internet of Things (IoT) market needs to be trustworthy for end-users. However, the existing literature still lacks a systematic and comprehensive survey on the use of XAI for IoT. To bridge this gap, in this paper we address XAI frameworks with a focus on their characteristics and support for IoT. We illustrate the widely used XAI services for IoT applications, such as security enhancement, Internet of Medical Things (IoMT), Industrial IoT (IIoT), and Internet of City Things (IoCT). We also suggest implementation choices of XAI models over IoT systems in these applications with appropriate examples and summarize the key inferences for future work. Moreover, we present cutting-edge developments in edge XAI structures and the support of sixth-generation (6G) communication services for IoT applications, along with key inferences. In a nutshell, this paper constitutes the first holistic compilation on the development of XAI-based frameworks tailored for the demands of future IoT use cases. Comment: 29 pages, 7 figures, 2 tables. IEEE Open Journal of the Communications Society (2022)

    Data analytics for mobile traffic in 5G networks using machine learning techniques

    This thesis collects the research work I pursued as a Ph.D. candidate at the Universitat Politecnica de Catalunya (UPC). Most of the work was carried out at the Mobile Network Department of the Centre Tecnologic de Telecomunicacions de Catalunya (CTTC). The main topic of my research is the study of mobile network traffic through the analysis of datasets from operational networks using machine learning techniques. Understanding actual network deployments first is fundamental for next-generation (5G) networks in order to improve performance and the Quality of Service (QoS) experienced by users. The work starts from the collection of a novel type of dataset, using an over-the-air monitoring tool that makes it possible to extract the control information from the radio-link channel without compromising users' identities. The subsequent analysis comprises a statistical characterization of the traffic and the derivation of prediction models for the network traffic. A wide group of algorithms is implemented and compared in order to identify the best-performing ones. Moreover, the thesis addresses a set of applications in the context of mobile networks that are priorities for future mobile networks. These include the detection of urban anomalies, user classification based on the demanded network services, and the design of a proactive wake-up scheme for energy-efficient devices. Postprint (published version)

    A Survey of Machine Learning Techniques for Video Quality Prediction from Quality of Delivery Metrics

    A growing number of video streaming networks are incorporating machine learning (ML) applications. The growth of video streaming services places enormous pressure on network and video content providers who need to proactively maintain high levels of video quality. ML has been applied to predict the quality of video streams. Quality of delivery (QoD) measurements, which capture the end-to-end performances of network services, have been leveraged in video quality prediction. The drive for end-to-end encryption, for privacy and digital rights management, has brought about a lack of visibility for operators who desire insights from video quality metrics. In response, numerous solutions have been proposed to tackle the challenge of video quality prediction from QoD-derived metrics. This survey provides a review of studies that focus on ML techniques for predicting the QoD metrics in video streaming services. In the context of video quality measurements, we focus on QoD metrics, which are not tied to a particular type of video streaming service. Unlike previous reviews in the area, this contribution considers papers published between 2016 and 2021. Approaches for predicting QoD for video are grouped under the following headings: (1) video quality prediction under QoD impairments, (2) prediction of video quality from encrypted video streaming traffic, (3) predicting the video quality in HAS applications, (4) predicting the video quality in SDN applications, (5) predicting the video quality in wireless settings, and (6) predicting the video quality in WebRTC applications. Throughout the survey, some research challenges and directions in this area are discussed, including (1) machine learning over deep learning; (2) adaptive deep learning for improved video delivery; (3) computational cost and interpretability; (4) self-healing networks and failure recovery. 
The survey findings reveal that traditional ML algorithms are the most widely adopted models for solving video quality prediction problems. This family of algorithms has a lot of potential because they are well understood, easy to deploy, and have lower computational requirements than deep learning techniques