413 research outputs found

    Dynamic Data Mining: Methodology and Algorithms

    Supervised data stream mining has become an important and challenging data mining task in modern organizations. The key challenges are threefold: (1) a possibly infinite number of streaming examples and time-critical analysis constraints; (2) concept drift; and (3) skewed data distributions. To address these three challenges, this thesis proposes the novel dynamic data mining (DDM) methodology, which applies supervised ensemble models to data stream mining. DDM can be loosely defined as categorization-organization-selection of supervised ensemble models. It is inspired by the idea that although the underlying concepts in a data stream are time-varying, their distinctions can be identified. Therefore, the models trained on the distinct concepts can be dynamically selected in order to classify incoming examples of similar concepts. First, following the general paradigm of DDM, we examine the different concept-drifting stream mining scenarios and propose corresponding effective and efficient data mining algorithms.
    • To address concept drift caused merely by changes of variable distributions, which we term pseudo concept drift, base models built on categorized streaming data are organized and selected in line with their corresponding variable distribution characteristics.
    • To address concept drift caused by changes of variable and class joint distributions, which we term true concept drift, an effective data categorization scheme is introduced. A group of working models is dynamically organized and selected for reacting to the drifting concept.
    Secondly, we introduce an integration stream mining framework, enabling the paradigm advocated by DDM to be widely applicable to other stream mining problems. As a result, we are able to easily introduce six effective algorithms for mining data streams with skewed class distributions. In addition, we introduce a new ensemble model approach for batch learning, following the same methodology. Both theoretical and empirical studies demonstrate its effectiveness. Future work will target improving the effectiveness and efficiency of the proposed algorithms. Meanwhile, we will explore the possibility of using the integration framework to solve other open stream mining research problems.

    Ensemble Feature Learning-Based Event Classification for Cyber-Physical Security of the Smart Grid

    Power grids are transforming into the cyber-physical smart grid with increasing two-way communications and abundant data flows. Despite the efficiency and reliability promised by this transformation, the growing threats and incidents of cyber attacks targeting the physical power systems have exposed severe vulnerabilities. To tackle such vulnerabilities, intrusion detection systems (IDS) are proposed to monitor threats for the cyber-physical security of electrical power and energy systems in the smart grid with increasing machine-to-machine communication. However, the multi-sourced, correlated, and often noisy data, which record various concurrent cyber and physical events, pose significant challenges to the accurate distinction by IDS between events of inadvertent and malicious natures. Hence, in this research, an ensemble learning-based feature learning and classification approach for the cyber-physical smart grid is designed and implemented. The contributions of this research are (i) the design, implementation and evaluation of an ensemble learning-based attack classifier using extreme gradient boosting (XGBoost) to effectively detect and identify attack threats from the heterogeneous cyber-physical information in the smart grid; (ii) the design, implementation and evaluation of a stacked denoising autoencoder (SDAE) to extract a highly representative feature space that allows reconstruction of a noise-free input from noise-corrupted perturbations; (iii) the design, implementation and evaluation of novel ensemble learning-based feature extractors that combine multiple autoencoder (AE) feature extractors and random forest base classifiers, so as to enable accurate reconstruction of each feature and reliable classification against malicious events. The simulation results validate the usefulness of the ensemble learning approach in detecting malicious events in the cyber-physical smart grid.

    Decoding Neural Signals with Computational Models: A Systematic Review of Invasive BMI

    There are significant milestones in human civilization at which mankind stepped into a different level of life with a new spectrum of possibilities and comfort. From fire-lighting technology and wheeled wagons to writing, electricity and the Internet, each one changed our lives dramatically. In this paper, we take a deep look into the invasive Brain Machine Interface (BMI), an ambitious and cutting-edge technology which has the potential to be another important milestone in human civilization. Not only beneficial for patients with severe medical conditions, invasive BMI technology can significantly impact different technologies and almost every aspect of human life. We review the biological and engineering concepts that underpin the implementation of BMI applications. Various techniques are essential for making invasive BMI applications a reality. We review these by providing an analysis of (i) possible applications of invasive BMI technology, (ii) the methods and devices for detecting and decoding brain signals, as well as (iii) possible options for stimulating signals into the human brain. Finally, we discuss the challenges and opportunities of invasive BMI for further development in the area. Comment: 51 pages, 14 figures, review article

    Identifying and Detecting Attacks in Industrial Control Systems

    The integrity of industrial control systems (ICS) found in utilities, oil and natural gas pipelines, manufacturing plants and transportation is critical to national wellbeing and security. Such systems depend on hundreds of field devices to manage and monitor a physical process. Previously, these devices were specific to ICS but they are now being replaced by general purpose computing technologies and, increasingly, these are being augmented with Internet of Things (IoT) nodes. Whilst there are benefits to this approach in terms of cost and flexibility, it has attracted a wider community of adversaries. These include those with significant domain knowledge, such as those responsible for attacks on Iran's Nuclear Facilities, a Steel Mill in Germany, and Ukraine's power grid; however, non-specialist attackers are becoming increasingly interested in the physical damage it is possible to cause. At the same time, the approach increases the number and range of vulnerabilities to which ICS are subject; regrettably, conventional techniques for analysing such a large attack space are inadequate, a cause of major national concern. In this thesis we introduce a generalisable approach based on evolutionary multiobjective algorithms to assist in identifying vulnerabilities in complex heterogeneous ICS systems. This is both challenging and an area that is currently lacking research. Our approach has been to review the security of currently deployed ICS systems, and then to make use of an internationally recognised ICS simulation testbed for experiments, assuming that the attacking community largely lacks specific ICS knowledge. Using the simulator, we identified vulnerabilities in individual components and then made use of these to generate attacks. A defence against these attacks, in the form of novel intrusion detection systems, was developed, based on a range of machine learning models. Finally, this was further subject to attacks created using the evolutionary multiobjective algorithms, demonstrating, for the first time, the feasibility of creating sophisticated attacks against a well-protected adversary using automated mechanisms.

    CLADAG 2021 BOOK OF ABSTRACTS AND SHORT PAPERS

    The book collects the short papers presented at the 13th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS). The meeting has been organized by the Department of Statistics, Computer Science and Applications of the University of Florence, under the auspices of the Italian Statistical Society and the International Federation of Classification Societies (IFCS). CLADAG is a member of the IFCS, a federation of national, regional, and linguistically-based classification societies. It is a non-profit, non-political scientific organization, whose aims are to further classification research

    Vision based environment perception system for next generation off-road ADAS : innovation report

    Advanced Driver Assistance Systems (ADAS) aid the driver by providing information or automating driving-related tasks to improve driver comfort, reduce workload and improve safety. The vehicle senses its external environment using sensors, building a representation of the world used by the control systems. In on-road applications, the perception focuses on establishing the location of other road participants such as vehicles and pedestrians and identifying the road trajectory. Perception in the off-road environment is more complex, as the structure found in urban environments is absent. Off-road perception deals with the estimation of surface topography and surface type, which are the factors that will affect vehicle behaviour in unstructured environments. Off-road perception has seldom been explored in the automotive context. For autonomous off-road driving, the perception solutions are primarily related to robotics and not directly applicable in the ADAS domain due to the different goals of unmanned autonomous systems, their complexity and the cost of the employed sensors. Such applications consider only the impact of the terrain on vehicle safety and progress but do not account for driver comfort and assistance. This work addresses the problem of processing vision sensor data to extract the required information about the terrain. The main focus of this work is on the perception task with the constraints of automotive sensors and the requirements of ADAS. By providing a semantic representation of the off-road environment including terrain attributes such as terrain type, description of the terrain topography and surface roughness, the perception system can cater for the requirements of the next generation of off-road ADAS proposed by Land Rover. Firstly, a novel and computationally efficient terrain recognition method was developed. The method facilitates recognition of low-friction grass surfaces in real time with high accuracy, by applying a Support Vector Machine classifier to illumination-invariant normalised RGB colour descriptors. The proposed method was analysed and its performance was evaluated experimentally in off-road environments. Terrain recognition performance was evaluated on a variety of different surface types including grass, gravel and tarmac, showing high grass detection performance with an accuracy of 97%. Secondly, a terrain geometry identification method was proposed which facilitates semantic representation of the terrain in terms of macro terrain features such as slopes, crests and ditches. The terrain geometry identification method processes 3D information reconstructed from stereo imagery and constructs a compact grid representation of the surface topography. This representation is further processed to extract object representations of slopes, ditches and crests. Thirdly, a novel method for surface roughness identification was proposed. Surface roughness is described by the Power Spectral Density of the surface profile, which correlates with the acceleration experienced by the vehicle. The surface roughness descriptor is then mapped onto a vehicle speed recommendation that will maintain passenger comfort, so that the speed of the vehicle can be adapted in anticipation of the surface roughness. Terrain geometry and surface roughness identification performance were evaluated on a range of off-road courses with varying topology, showing the capability of the system to correctly identify terrain features up to 20 m ahead of the vehicle and analyse surface roughness up to 15 m ahead of the vehicle. The speed was recommended correctly within +/- 5 kph. Further, the impact of the perception system on the speed adaptation was evaluated, showing improvements in speed adaptation that allow for greater passenger comfort. The developed perception components facilitated the development of new off-road ADAS and were successfully applied in prototype vehicles. The proposed off-road ADAS are planned to be introduced in future generations of Land Rover products. The benefits of this research also included new Intellectual Property generated for Jaguar Land Rover. In the wider context, the enhanced off-road perception capability may facilitate further development of off-road automated driving and off-road autonomy within the constraints of the automotive platform.

    Automatic Extraction and Assessment of Entities from the Web

    The search for information about entities, such as people or movies, plays an increasingly important role on the Web. This information is still scattered across many Web pages, making it more time consuming for a user to find all relevant information about an entity. This thesis describes techniques to extract entities and information about these entities from the Web, such as facts, opinions, questions and answers, interactive multimedia objects, and events. The findings of this thesis are that it is possible to create a large knowledge base automatically using a manually-crafted ontology. The precision of the extracted information was found to be between 75–90% (facts and entities respectively) after using assessment algorithms. The algorithms from this thesis can be used to create such a knowledge base, which can be used in various research fields, such as question answering, named entity recognition, and information retrieval.

    Fusion of Data from Heterogeneous Sensors with Distributed Fields of View and Situation Evaluation for Advanced Driver Assistance Systems

    In order to develop a driver assistance system for pedestrian protection, pedestrians in the environment of a truck are detected by radars and a camera and are tracked across distributed fields of view using a Joint Integrated Probabilistic Data Association filter. A robust approach for predicting the system vehicle's trajectory is presented. It supports the computation of a probabilistic collision risk based on reachable sets, where different sources of uncertainty are taken into account.