316 research outputs found

    A survey of outlier detection methodologies

    Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.

    Explaining Deep Learning-Based Driver Models

    Different systems based on Artificial Intelligence (AI) techniques are currently used in relevant areas such as healthcare, cybersecurity, natural language processing, and self-driving cars. However, many of these systems are developed with “black box” AI, which makes it difficult to explain how they work. For this reason, explainability and interpretability are key factors that need to be taken into consideration in the development of AI systems in critical areas. In addition, different contexts produce different explainability needs which must be met. Against this background, Explainable Artificial Intelligence (XAI) appears to be able to address and solve this situation. In the field of automated driving, XAI is particularly needed because the level of automation is constantly increasing as AI techniques develop. For this reason, the field of XAI in the context of automated driving is of particular interest. In this paper, we propose the use of an explainable intelligence technique to understand some of the tasks involved in the development of advanced driver-assistance systems (ADAS). Since ADAS assist drivers in driving functions, it is essential to know the reason for the decisions taken. In addition, trusted AI is the cornerstone of the confidence needed in this research area. Thus, due to the complexity and the different variables that are part of the decision-making process, this paper focuses on two specific tasks in this area: the detection of drivers' emotions and distractions. The results obtained are promising and show the capacity of explainable artificial intelligence techniques in the different tasks of the proposed environments. This work was supported under projects PEAVAUTO-CM-UC3M, PID2019-104793RB-C31, and RTI2018-096036-B-C22, and by the Region of Madrid’s Excellence Program (EPUC3M17).

    Automatic Labelling of Point Clouds Using Image Semantic Segmentation

    Autonomous driving is often seen as the next big breakthrough in artificial intelligence. Autonomous vehicles use a variety of sensors to obtain knowledge from the world, for example cameras and LiDARs. LiDAR provides 3D data about the surrounding world in the form of a point cloud. New deep learning models have emerged that allow for learning directly on point clouds, but obtaining labelled data for training these models is difficult and expensive. We propose to use semantically segmented camera images to project labels from 2D to 3D, thereby enabling the use of cheaper ground truth data to train the aforementioned models. Furthermore, we evaluate the use of mature 2D semantic segmentation models to automatically label vast amounts of point cloud data. This approach is tested on the KITTI dataset, as it provides corresponding camera and LiDAR data for each scene. The DeepLabv3+ semantic segmentation model is used to label the camera images with pixel-level labels, which are then projected onto the 3D point cloud; finally, a PointNet++ model is trained to perform segmentation from point clouds only. Experiments show that projected 2D labels can be learned reasonably well by PointNet++. Evaluating the results against the 3D ground truth provided with the KITTI dataset produced promising results, with accuracy being high for detecting pedestrians, but mediocre for cars.
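
The 2D-to-3D label projection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `project_labels`, the 3x4 projection matrix `P`, the assumption that points are already in camera coordinates, and the `unlabeled` marker are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def project_labels(
    points: Sequence[Point],
    P: Sequence[Sequence[float]],
    label_image: Sequence[Sequence[int]],
    unlabeled: int = -1,
) -> List[int]:
    """Assign each 3D point the class id of the pixel it projects onto.

    points      -- (x, y, z) tuples, assumed to be in camera coordinates
    P           -- hypothetical 3x4 camera projection matrix
    label_image -- rows of per-pixel class ids (a segmentation output)
    """
    height = len(label_image)
    width = len(label_image[0])
    labels = []
    for x, y, z in points:
        # Homogeneous projection: [u*w, v*w, w] = P @ [x, y, z, 1]
        u_h = P[0][0] * x + P[0][1] * y + P[0][2] * z + P[0][3]
        v_h = P[1][0] * x + P[1][1] * y + P[1][2] * z + P[1][3]
        w = P[2][0] * x + P[2][1] * y + P[2][2] * z + P[2][3]
        if w <= 0:
            # Point lies behind the camera, so no pixel sees it.
            labels.append(unlabeled)
            continue
        u = math.floor(u_h / w)  # pixel column
        v = math.floor(v_h / w)  # pixel row
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])
        else:
            # Point projects outside the image bounds.
            labels.append(unlabeled)
    return labels
```

Points that fall behind the camera or outside the image keep the `unlabeled` marker, so a downstream point cloud model could ignore them during training.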

    Classifiers accuracy improvement based on missing data imputation

    In this paper we extend our previous work on radar signal identification and classification, based on a data set comprising continuous, discrete and categorical data that represent radar pulse train characteristics such as signal frequencies, pulse repetition, type of modulation, intervals, scan period, scanning type, etc. Like most real-world datasets, it contains a high percentage of missing values, and to deal with this problem we investigate three imputation techniques: Multiple Imputation (MI); K-Nearest Neighbour Imputation (KNNI); and Bagged Tree Imputation (BTI). We apply these methods to data samples with up to 60% missingness, thereby doubling the number of instances with complete values in the resulting dataset. The imputation models' performance is assessed with Wilcoxon’s test for statistical significance and Cohen’s effect size metrics. To solve the classification task, we employ three intelligent approaches: Neural Networks (NN); Support Vector Machines (SVM); and Random Forests (RF). Subsequently, we critically analyse which imputation method most influences the classifiers’ performance, using a multiclass classification accuracy metric based on the area under the ROC curves. We consider two superclasses (‘military’ and ‘civil’), each containing several ‘subclasses’, and propose two new metrics: inner class accuracy (IA); and outer class accuracy (OA), in addition to the overall classification accuracy (OCA) metric. We conclude that they can be used as complements to the OCA when choosing the best classifier for the problem at hand.
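
As a rough illustration of the K-Nearest Neighbour Imputation idea named in this abstract, the sketch below fills each missing value with the mean of that feature over the k closest complete rows. It is a simplified version for continuous features only and is not the paper's implementation; the function name `knn_impute`, the use of `None` for missing values, and the Euclidean distance over observed features are assumptions.

```python
import math
from typing import List, Optional

Row = List[Optional[float]]


def knn_impute(rows: List[Row], k: int = 2) -> List[List[float]]:
    """Fill None entries using the k nearest complete rows.

    Distance is Euclidean, computed only over the features that are
    observed in the incomplete row (an assumption of this sketch).
    """
    complete = [r for r in rows if all(v is not None for v in r)]
    if not complete:
        raise ValueError("at least one complete row is required")

    result = []
    for row in rows:
        missing = [i for i, v in enumerate(row) if v is None]
        if not missing:
            result.append(list(row))
            continue
        observed = [i for i, v in enumerate(row) if v is not None]

        def distance(other: Row) -> float:
            return math.sqrt(sum((row[i] - other[i]) ** 2 for i in observed))

        # sorted() is stable, so ties keep their original row order.
        neighbours = sorted(complete, key=distance)[:k]
        filled = list(row)
        for i in missing:
            # Impute with the mean of the neighbours' values for feature i.
            filled[i] = sum(n[i] for n in neighbours) / len(neighbours)
        result.append(filled)
    return result
```

For example, with complete rows `[1.0, 10.0]`, `[2.0, 20.0]` and `[9.0, 90.0]`, the incomplete row `[1.5, None]` and `k=2`, the two nearest rows are the first two, and the missing entry is filled with their mean, `15.0`.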

    Adaptive Clustering-based Malicious Traffic Classification at the Network Edge
