11 research outputs found

    Generalized CMAC adaptive ensembles for concept-drifting data streams

    Get PDF
    In this paper we propose to use an adaptive ensemble learning framework with different levels of diversity to handle streams of data in non-stationary scenarios in which concept drifts are present. Our adaptive system consists of two ensembles, each one with a different level of diversity (from high to low), and, therefore, with different and complementary capabilities, that are adaptively combined to obtain an overall system of improved performance. In our approach, the ensemble members are generalized CMACs, a linear-in-the-parameters network. The ensemble of CMACs provides a reasonable trade-off between expressive power, simplicity, and fast learning speed. At the end of the paper, we provide a performance analysis of the proposed learning framework on benchmark datasets with concept drifts of different levels of severity and speed.This work is partially funded by grant CASI-CAM-CM (S2013/ICE-2845), DGUI-Comunidad de Madrid, and grants DAMA (TIN2015-70308-REDT), MINECO, and Macro-ADOBE (TEC 2015-67719-P), MINECO-FEDER-EU

    Enhancing Computer Network Security through Improved Outlier Detection for Data Streams

    Get PDF
    V několika posledních letech se metody strojového učení (zvláště ty zabývající se detekcí odlehlých hodnot - OD) v oblasti kyberbezpečnosti opíraly o zjišťování anomálií síťového provozu spočívajících v nových schématech útoků. Detekce anomálií v počítačových sítích reálného světa se ale stala stále obtížnější kvůli trvalému nárůstu vysoce objemných, rychlých a dimenzionálních průběžně přicházejících dat (SD), pro která nejsou k dispozici obecně uznané a pravdivé informace o anomalitě. Účinná detekční schémata pro vestavěná síťová zařízení musejí být rychlá a paměťově nenáročná a musejí být schopna se potýkat se změnami konceptu, když se vyskytnou. Cílem této disertace je zlepšit bezpečnost počítačových sítí zesílenou detekcí odlehlých hodnot v datových proudech, obzvláště SD, a dosáhnout kyberodolnosti, která zahrnuje jak detekci a analýzu, tak reakci na bezpečnostní incidenty jako jsou např. nové zlovolné aktivity. Za tímto účelem jsou v práci navrženy čtyři hlavní příspěvky, jež byly publikovány nebo se nacházejí v recenzním řízení časopisů. Zaprvé, mezera ve volbě vlastností (FS) bez učitele pro zlepšování již hotových metod OD v datových tocích byla zaplněna navržením volby vlastností bez učitele pro detekci odlehlých průběžně přicházejících dat označované jako UFSSOD. Následně odvozujeme generický koncept, který ukazuje dva aplikační scénáře UFSSOD ve spojení s online algoritmy OD. Rozsáhlé experimenty ukázaly, že UFSSOD coby algoritmus schopný online zpracování vykazuje srovnatelné výsledky jako konkurenční metoda upravená pro OD. Zadruhé představujeme nový aplikační rámec nazvaný izolovaný les založený na počítání výkonu (PCB-iForest), jenž je obecně schopen využít jakoukoliv online OD metodu založenou na množinách dat tak, aby fungovala na SD. Do tohoto algoritmu integrujeme dvě varianty založené na klasickém izolovaném lese. Rozsáhlé experimenty provedené na 23 multidisciplinárních datových sadách týkajících se bezpečnostní problematiky reálného světa ukázaly, že PCB-iForest jasně překonává už zavedené konkurenční metody v 61 % případů a dokonce dosahuje ještě slibnějších výsledků co do vyváženosti mezi výpočetními náklady na klasifikaci a její úspěšností. Zatřetí zavádíme nový pracovní rámec nazvaný detekce odlehlých hodnot a rozpoznávání schémat útoku proudovým způsobem (SOAAPR), jenž je na rozdíl od současných metod schopen zpracovat výstup z různých online OD metod bez učitele proudovým způsobem, aby získal informace o nových schématech útoku. Ze seshlukované množiny korelovaných poplachů jsou metodou SOAAPR vypočítány tři různé soukromí zachovávající podpisy podobné otiskům prstů, které charakterizují a reprezentují potenciální scénáře útoku s ohledem na jejich komunikační vztahy, projevy ve vlastnostech dat a chování v čase. Evaluace na dvou oblíbených datových sadách odhalila, že SOAAPR může soupeřit s konkurenční offline metodou ve schopnosti korelace poplachů a významně ji překonává z hlediska výpočetního času . Navíc se všechny tři typy podpisů ve většině případů zdají spolehlivě charakterizovat scénáře útoků tím, že podobné seskupují k sobě. Začtvrté představujeme algoritmus nepárového kódu autentizace zpráv (Uncoupled MAC), který propojuje oblasti kryptografického zabezpečení a detekce vniknutí (IDS) pro síťovou bezpečnost. Zabezpečuje síťovou komunikaci (autenticitu a integritu) kryptografickým schématem s podporou druhé vrstvy kódy autentizace zpráv, ale také jako vedlejší efekt poskytuje funkcionalitu IDS tak, že vyvolává poplach na základě porušení hodnot nepárového MACu. Díky novému samoregulačnímu rozšíření algoritmus adaptuje svoje vzorkovací parametry na základě zjištění škodlivých aktivit. Evaluace ve virtuálním prostředí jasně ukazuje, že schopnost detekce se za běhu zvyšuje pro různé scénáře útoku. Ty zahrnují dokonce i situace, kdy se inteligentní útočníci snaží využít slabá místa vzorkování.ObhájenoOver the past couple of years, machine learning methods - especially the Outlier Detection (OD) ones - have become anchored to the cyber security field to detect network-based anomalies rooted in novel attack patterns. Due to the steady increase of high-volume, high-speed and high-dimensional Streaming Data (SD), for which ground truth information is not available, detecting anomalies in real-world computer networks has become a more and more challenging task. Efficient detection schemes applied to networked, embedded devices need to be fast and memory-constrained, and must be capable of dealing with concept drifts when they occur. The aim of this thesis is to enhance computer network security through improved OD for data streams, in particular SD, to achieve cyber resilience, which ranges from the detection, over the analysis of security-relevant incidents, e.g., novel malicious activity, to the reaction to them. Therefore, four major contributions are proposed, which have been published or are submitted journal articles. First, a research gap in unsupervised Feature Selection (FS) for the improvement of off-the-shell OD methods in data streams is filled by proposing Unsupervised Feature Selection for Streaming Outlier Detection, denoted as UFSSOD. A generic concept is retrieved that shows two application scenarios of UFSSOD in conjunction with online OD algorithms. Extensive experiments have shown that UFSSOD, as an online-capable algorithm, achieves comparable results with a competitor trimmed for OD. Second, a novel unsupervised online OD framework called Performance Counter-Based iForest (PCB-iForest) is being introduced, which generalized, is able to incorporate any ensemble-based online OD method to function on SD. Two variants based on classic iForest are integrated. Extensive experiments, performed on 23 different multi-disciplinary and security-related real-world data sets, revealed that PCB-iForest clearly outperformed state-of-the-art competitors in 61 % of cases and even achieved more promising results in terms of the tradeoff between classification and computational costs. Third, a framework called Streaming Outlier Analysis and Attack Pattern Recognition, denoted as SOAAPR is being introduced that, in contrast to the state-of-the-art, is able to process the output of various online unsupervised OD methods in a streaming fashion to extract information about novel attack patterns. Three different privacy-preserving, fingerprint-like signatures are computed from the clustered set of correlated alerts by SOAAPR, which characterize and represent the potential attack scenarios with respect to their communication relations, their manifestation in the data's features and their temporal behavior. The evaluation on two popular data sets shows that SOAAPR can compete with an offline competitor in terms of alert correlation and outperforms it significantly in terms of processing time. Moreover, in most cases all three types of signatures seem to reliably characterize attack scenarios to the effect that similar ones are grouped together. Fourth, an Uncoupled Message Authentication Code algorithm - Uncoupled MAC - is presented which builds a bridge between cryptographic protection and Intrusion Detection Systems (IDSs) for network security. It secures network communication (authenticity and integrity) through a cryptographic scheme with layer-2 support via uncoupled message authentication codes but, as a side effect, also provides IDS-functionality producing alarms based on the violation of Uncoupled MAC values. Through a novel self-regulation extension, the algorithm adapts its sampling parameters based on the detection of malicious actions on SD. The evaluation in a virtualized environment clearly shows that the detection rate increases over runtime for different attack scenarios. Those even cover scenarios in which intelligent attackers try to exploit the downsides of sampling

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Get PDF
    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.https://digitalcommons.chapman.edu/scs_books/1014/thumbnail.jp

    Task Allocation in Foraging Robot Swarms:The Role of Information Sharing

    Get PDF
    Autonomous task allocation is a desirable feature of robot swarms that collect and deliver items in scenarios where congestion, caused by accumulated items or robots, can temporarily interfere with swarm behaviour. In such settings, self-regulation of workforce can prevent unnecessary energy consumption. We explore two types of self-regulation: non-social, where robots become idle upon experiencing congestion, and social, where robots broadcast information about congestion to their team mates in order to socially inhibit foraging. We show that while both types of self-regulation can lead to improved energy efficiency and increase the amount of resource collected, the speed with which information about congestion flows through a swarm affects the scalability of these algorithms

    Review of Particle Physics

    Get PDF
    The Review summarizes much of particle physics and cosmology. Using data from previous editions, plus 2,143 new measurements from 709 papers, we list, evaluate, and average measured properties of gauge bosons and the recently discovered Higgs boson, leptons, quarks, mesons, and baryons. We summarize searches for hypothetical particles such as supersymmetric particles, heavy bosons, axions, dark photons, etc. Particle properties and search limits are listed in Summary Tables. We give numerous tables, figures, formulae, and reviews of topics such as Higgs Boson Physics, Supersymmetry, Grand Unified Theories, Neutrino Mixing, Dark Energy, Dark Matter, Cosmology, Particle Detectors, Colliders, Probability and Statistics. Among the 120 reviews are many that are new or heavily revised, including a new review on Machine Learning, and one on Spectroscopy of Light Meson Resonances. The Review is divided into two volumes. Volume 1 includes the Summary Tables and 97 review articles. Volume 2 consists of the Particle Listings and contains also 23 reviews that address specific aspects of the data presented in the Listings

    Review of Particle Physics

    Get PDF
    The Review summarizes much of particle physics and cosmology. Using data from previous editions, plus 2,143 new measurements from 709 papers, we list, evaluate, and average measured properties of gauge bosons and the recently discovered Higgs boson, leptons, quarks, mesons, and baryons. We summarize searches for hypothetical particles such as supersymmetric particles, heavy bosons, axions, dark photons, etc. Particle properties and search limits are listed in Summary Tables. We give numerous tables, figures, formulae, and reviews of topics such as Higgs Boson Physics, Supersymmetry, Grand Unified Theories, Neutrino Mixing, Dark Energy, Dark Matter, Cosmology, Particle Detectors, Colliders, Probability and Statistics. Among the 120 reviews are many that are new or heavily revised, including a new review on Machine Learning, and one on Spectroscopy of Light Meson Resonances. The Review is divided into two volumes. Volume 1 includes the Summary Tables and 97 review articles. Volume 2 consists of the Particle Listings and contains also 23 reviews that address specific aspects of the data presented in the Listings. The complete Review (both volumes) is published online on the website of the Particle Data Group (pdg.lbl.gov) and in a journal. Volume 1 is available in print as the PDG Book. A Particle Physics Booklet with the Summary Tables and essential tables, figures, and equations from selected review articles is available in print, as a web version optimized for use on phones, and as an Android app.United States Department of Energy (DOE) DE-AC02-05CH11231government of Japan (Ministry of Education, Culture, Sports, Science and Technology)Istituto Nazionale di Fisica Nucleare (INFN)Physical Society of Japan (JPS)European Laboratory for Particle Physics (CERN)United States Department of Energy (DOE

    Review of Particle Physics

    Get PDF
    The Review summarizes much of particle physics and cosmology. Using data from previous editions, plus 2,143 new measurements from 709 papers, we list, evaluate, and average measured properties of gauge bosons and the recently discovered Higgs boson, leptons, quarks, mesons, and baryons. We summarize searches for hypothetical particles such as supersymmetric particles, heavy bosons, axions, dark photons, etc. Particle properties and search limits are listed in Summary Tables. We give numerous tables, figures, formulae, and reviews of topics such as Higgs Boson Physics, Supersymmetry, Grand Unified Theories, Neutrino Mixing, Dark Energy, Dark Matter, Cosmology, Particle Detectors, Colliders, Probability and Statistics. Among the 120 reviews are many that are new or heavily revised, including a new review on Machine Learning, and one on Spectroscopy of Light Meson Resonances. The Review is divided into two volumes. Volume 1 includes the Summary Tables and 97 review articles. Volume 2 consists of the Particle Listings and contains also 23 reviews that address specific aspects of the data presented in the Listings. The complete Review (both volumes) is published online on the website of the Particle Data Group (pdg.lbl.gov) and in a journal. Volume 1 is available in print as the PDG Book. A Particle Physics Booklet with the Summary Tables and essential tables, figures, and equations from selected review articles is available in print, as a web version optimized for use on phones, and as an Android app
    corecore