    A signal analysis of network traffic anomalies

    Adaptive real-time detection of multidimensional anomalies (Mukautuva moniulotteisten poikkeavuuksien tunnistaminen reaaliaikaisesti)

    Data volumes are growing rapidly as data emerges from millions of devices. This creates an increasing need for streaming analytics, that is, processing and analysing the data record by record. This work presents a comprehensive literature review of streaming analytics, focusing on the detection of anomalous behaviour. Challenges and approaches for streaming analytics are discussed. Different ways of defining and identifying anomalies are shown, and a large number of anomaly detection methods for streaming data are presented, along with existing software platforms and solutions for streaming analytics. Based on the literature survey, I chose one method for further investigation, namely the Lightweight on-line detector of anomalies (LODA). LODA is designed to detect anomalies in real time, even from high-dimensional data. It is also an adaptive method that updates its model on-line. LODA was tested on both synthetic and real data sets. This work shows how to choose the parameters used with LODA. I present several improvement ideas for LODA and show that three of them bring important benefits. First, I show a simple addition for handling special cases, which makes it possible to compute an anomaly score for every data point. Second, I show cases where LODA fails because the data has not been preprocessed. I suggest preprocessing schemes for streaming data and show that they improve the results significantly while requiring only a small subset of the data to determine the preprocessing parameters. Third, since LODA only gives anomaly scores, I suggest thresholding techniques for deciding which points are anomalies. This work shows that the suggested techniques work fairly well compared with the theoretical best performance. This makes it possible to use LODA in real streaming analytics situations.
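
To make the LODA abstract above more concrete, here is a minimal sketch of the general idea behind LODA as it is usually described: an ensemble of one-dimensional histograms built on sparse random projections, with the anomaly score taken as the mean negative log-density. It is not the thesis author's implementation. The class name `LodaSketch`, the parameter defaults, the batch (non-streaming) `fit`, and the add-one smoothing used to keep scores finite for points in empty or out-of-range bins are all assumptions made for illustration.

```python
import math
import random


class LodaSketch:
    """Toy batch variant of LODA (hypothetical name, illustration only)."""

    def __init__(self, dim, n_projections=50, n_bins=20, seed=0):
        rng = random.Random(seed)
        # Each projection vector has about sqrt(dim) non-zero Gaussian weights.
        n_nonzero = max(1, round(math.sqrt(dim)))
        self.n_bins = n_bins
        self.projections = []
        for _ in range(n_projections):
            w = [0.0] * dim
            for j in rng.sample(range(dim), n_nonzero):
                w[j] = rng.gauss(0.0, 1.0)
            self.projections.append(w)
        self.histograms = []

    @staticmethod
    def _project(w, x):
        return sum(wi * xi for wi, xi in zip(w, x))

    def fit(self, data):
        """Build one equal-width histogram per projection from a batch."""
        self.histograms = []
        for w in self.projections:
            z = [self._project(w, x) for x in data]
            lo, hi = min(z), max(z)
            width = (hi - lo) / self.n_bins or 1.0  # guard against zero range
            counts = [0] * self.n_bins
            for v in z:
                counts[min(int((v - lo) / width), self.n_bins - 1)] += 1
            self.histograms.append((lo, width, counts, len(z)))
        return self

    def score(self, x):
        """Mean negative log-density over all projections (higher = more anomalous)."""
        total = 0.0
        for w, (lo, width, counts, n) in zip(self.projections, self.histograms):
            idx = math.floor((self._project(w, x) - lo) / width)
            count = counts[idx] if 0 <= idx < self.n_bins else 0
            # Assumption: add-one smoothing so unseen regions get a finite score.
            density = (count + 1) / ((n + self.n_bins) * width)
            total -= math.log(density)
        return total / len(self.projections)
```

A point would then be flagged as anomalous when `score(x)` exceeds a chosen threshold. How to pick that threshold, and how to preprocess the stream first, are the questions the abstract says the thesis addresses; the sketch does not attempt either.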

    Computer Vision Applications for Autonomous Aerial Vehicles

    Undoubtedly, unmanned aerial vehicles (UAVs) have experienced a great leap forward over the last decade. It is no longer surprising to see a UAV being used to accomplish a task that was previously carried out by humans or by an older technology. The proliferation of special vision sensors, such as depth cameras, lidar sensors and thermal cameras, and major breakthroughs in the computer vision and machine learning fields have accelerated the advance of UAV research and technology. However, due to certain unique challenges imposed by UAVs, such as limited payload capacity, unreliable communication links with ground stations and data safety, UAVs are compelled to perform many tasks on their onboard embedded processing units, which makes it difficult to readily implement the most advanced algorithms on UAVs. This thesis focuses on computer vision and machine learning applications for UAVs equipped with onboard embedded platforms, and presents algorithms that utilize data from multiple modalities. The presented work covers a broad spectrum of algorithms and applications for UAVs, such as indoor UAV perception, 3D understanding with deep learning, UAV localization, and structural inspection with UAVs. Visual guidance and scene understanding without relying on pre-installed tags or markers is the desired approach for fully autonomous navigation of UAVs in conjunction with the global positioning system (GPS), especially when GPS information is either unavailable or unreliable. Thus, semantic and geometric understanding of the surroundings becomes vital in order to use vision as guidance in autonomous navigation pipelines. In this context, first, robust altitude measurement, safe landing zone detection and doorway detection methods are presented for autonomous UAVs operating indoors. These approaches are implemented on the Google Project Tango platform, which is an embedded platform equipped with various sensors including a depth camera. 
Next, a modified capsule network for 3D object classification is presented with weight optimization so that the network can fit and run on memory-constrained platforms. Then, a semantic segmentation method for 3D point clouds is developed for more general visual perception on a UAV equipped with a 3D vision sensor. Next, this thesis presents algorithms for structural health monitoring applications involving UAVs. First, a 3D point cloud-based, drift-free and lightweight localization method is presented for depth camera-equipped UAVs that perform bridge inspection, where the GPS signal is unreliable. Next, a thermal leakage detection algorithm is presented for detecting thermal anomalies on building envelopes using aerial thermography from UAVs. Then, building on the thermal anomaly identification expertise gained in the previous task, a novel performance anomaly identification metric (AIM) is presented for more reliable performance evaluation of thermal anomaly identification methods.

    Hybrid self-organizing feature map (SOM) for anomaly detection in cloud infrastructures using granular clustering based upon value-difference metrics

    We have witnessed an increase in the availability of data from diverse sources over the past few years. Cloud computing, big data and the Internet-of-Things (IoT) are distinctive cases of such an increase, which demand novel approaches to data analytics in order to process and analyze huge volumes of data for security and business use. Cloud computing has become popular for critical infrastructure IT, mainly due to cost savings and dynamic scalability. Current offerings, however, are not mature enough with respect to stringent security and resilience requirements. Mechanisms such as hybrid anomaly detection systems are required in order to protect against various challenges that include network-based attacks, performance issues and operational anomalies. Such hybrid AI systems include Neural Networks, blackboard systems, belief (Bayesian) networks, case-based reasoning and rule-based systems, and can be implemented in a variety of ways. Traffic in the cloud comes from multiple heterogeneous domains and changes rapidly due to the variety of operational characteristics of the tenants using the cloud and the elasticity of the provided services. The underlying detection mechanisms rely upon measurements drawn from multiple sources. However, the characteristics of the distribution of measurements within specific subspaces might be unknown. We argue in this paper that there is a need to cluster the observed data during normal network operation into multiple subspaces, each of them featuring specific local attributes, i.e. granules of information. Clustering is implemented by the inference engine of a model hybrid NN system. Several variations of the so-called value-difference metric (VDM) are investigated, such as local histograms and the Canberra distance for scalar attributes, the Jaccard distance for binary word attributes, rough sets, as well as local histograms over an aggregate ordering distance and the Canberra measure for vectorial attributes. 
Low-dimensional subspace representations of each group of points (measurements) in the context of anomaly detection in critical cloud implementations are based upon VD metrics and can be either parametric or non-parametric. A novel application of a Self-Organizing-Feature Map (SOFM) of reduced/aggregate ordered sets of objects featuring VD metrics (as obtained from distributed network measurements) is proposed. Each node of the SOFM stands for a structured local distribution of such objects within the input space. The so-called Neighborhood-based Outlier Factor (NOOF) is defined for such reduced/aggregate ordered sets of objects as a value-difference metric of histograms. Measurements that do not belong to local distributions are detected as anomalies, i.e. outliers of the trained SOFM. Several methods of subspace clustering using Expectation-Maximization Gaussian Mixture Models (a parametric approach) as well as local data densities (a non-parametric approach) are outlined and compared against the proposed method using data obtained from our cloud testbed in emulated anomalous traffic conditions. The results, which are obtained from a model NN system, indicate that the proposed method performs well in comparison with conventional techniques.
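
Two of the distance measures named in the abstract above have simple standard definitions, sketched here for orientation. This follows the textbook formulas, not the paper's own code. The function names are hypothetical, and treating "binary word attributes" as sets of the words present is an assumption.

```python
def canberra_distance(x, y):
    """Canberra distance: sum of |a - b| / (|a| + |b|) over coordinates."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:  # by convention a 0/0 term contributes nothing
            total += abs(a - b) / denom
    return total


def jaccard_distance(a, b):
    """Jaccard distance between two collections viewed as sets:
    1 - |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:  # two empty sets are treated as identical
        return 0.0
    return 1.0 - len(a & b) / len(union)
```

For example, `canberra_distance([1, 2, 3], [3, 2, 1])` sums 2/4 + 0 + 2/4, and `jaccard_distance({"tcp", "syn"}, {"syn", "ack"})` compares one shared word against a union of three.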

    Challenges in anomaly and change point detection

    This paper presents an introduction to the state of the art in anomaly and change-point detection. On the one hand, the main concepts needed to understand the vast scientific literature on those subjects are introduced. On the other hand, a selection of important surveys and books, as well as two selected active research topics in the field, are presented.

    Modeling and performance estimation for airborne minefield detection system

    Many programs aimed at airborne mine and minefield detection are being pursued, and different algorithms are being developed and evaluated to achieve performance specifications. Thus far, no single algorithm or detection architecture has been able to fulfill the performance specifications for different mine and minefield detection scenarios...a need exists for a simulation based approach. One such simulation system is developed and evaluated in this thesis. The factors affecting the performance of an airborne detection system include physical parameters (type of background, time of day), data collection parameters (swath width, number of steps, in-step and in-flight overlap), and minefield scenarios. Data collection parameters are included in the simulation tool. False alarms and mine statistics are modeled based on the available data collected as part of the developmental programs. Various mine and minefield detection algorithms are modeled and evaluated. Simulations are run, and Receiver Operating Characteristic (ROC) curves are used to evaluate the performance at both the mine and minefield levels. Analytical models for minefield detection performance are formulated and used to validate the simulated performance. (Abstract, page iii)
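
The abstract above evaluates detectors with ROC curves. As a reminder of what that involves, the following minimal sketch computes ROC points and the area under the curve from detector scores and ground-truth labels. It is a generic illustration, not the thesis's simulation tool. The function names and the toy inputs are hypothetical, and the code assumes at least one positive and one negative label.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs obtained by
    sweeping a threshold from the highest score down to the lowest."""
    pos = sum(1 for label in labels if label)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a piecewise-linear ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

With scores `[0.9, 0.8, 0.7, 0.6]` and labels `[1, 1, 0, 0]`, every true target outranks every false alarm, so the curve rises to a true-positive rate of 1 before any false positives occur.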