1,199 research outputs found

    Machine Learning-based Approaches for Advanced Monitoring of Smart Glasses

With today's growing demand for productivity, product quality, and effectiveness, the importance of machine learning-based functionalities and services has increased dramatically. This paradigm shift can be attributed mainly to the growing availability of Internet of Things (IoT) sensors and devices, the volume of data collected in IoT scenarios, and the rising popularity and accessibility of machine learning approaches. One of the most appealing applications of ML-based solutions is Predictive Maintenance (PdM), which aims to improve maintenance management by estimating the health status of a piece of equipment. One of the main formalizations of the PdM problem is the prediction of the Remaining Useful Life (RUL), defined as the time or process iterations remaining before a device component loses the ability to perform its task. This work investigates a possible application of predictive maintenance techniques to the monitoring of the battery of Smart Glasses. The work starts with a description of the considered devices, the modalities of data collection, and an Exploratory Data Analysis for better understanding the task. The first experimental part consists of the application of an unsupervised anomaly detection technique, useful for initially dealing with the partial and unlabeled data. The last part of the work contains the results of applying both classical machine learning and deep learning approaches to the estimation of the RUL of the devices' battery. A section on the interpretation of the machine learning models is included for both the anomaly detection and RUL estimation approaches.
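As a rough illustration of the classical-ML branch of such a pipeline, the sketch below frames RUL estimation as supervised regression over per-cycle battery features. The feature names and the synthetic degradation model are hypothetical stand-ins; the abstract does not specify the actual telemetry or models used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)

# Synthetic stand-in for battery telemetry: each row is one charge cycle
# described by hypothetical features (e.g., voltage sag, temperature, load).
n_cycles = 2000
X = rng.normal(size=(n_cycles, 3))
# Hypothetical ground truth: RUL shrinks as degradation features grow.
rul = np.maximum(0, 500 - 40 * X[:, 0] - 25 * X[:, 1]
                 + rng.normal(scale=10, size=n_cycles))

X_train, X_test, y_train, y_test = train_test_split(X, rul, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("MAE (cycles):", mean_absolute_error(y_test, model.predict(X_test)))
```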

    LogEvent2vec : LogEvent-to-vector based anomaly detection for large-scale logs in internet of things

Funding: This work was funded by the National Natural Science Foundation of China (No. 61802030), the Research Foundation of the Education Bureau of Hunan Province, China (No. 19B005), the International Cooperative Project for "Double First-Class", CSUST (No. 2018IC24), the open research fund of the Key Lab of Broadband Wireless Communication and Sensor Network Technology (Nanjing University of Posts and Telecommunications), Ministry of Education (No. JZNY201905), and the Open Research Fund of the Hunan Provincial Key Laboratory of Network Investigational Technology (No. 2018WLZC003). This work was also funded by the Researchers Supporting Project No. (RSP-2019/102), King Saud University, Riyadh, Saudi Arabia. Acknowledgments: We thank the Researchers Supporting Project No. (RSP-2019/102), King Saud University, Riyadh, Saudi Arabia, for funding this research. We thank Francesco Cauteruccio for proofreading this paper.
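The title suggests a word2vec-style embedding of parsed log events. A minimal sketch of that general idea, assuming logs have already been parsed into event-template IDs (the paper's actual pipeline and downstream detector may differ):

```python
import numpy as np
from gensim.models import Word2Vec

# Hypothetical parsed log: each session is a sequence of event-template IDs.
sessions = [
    ["E1", "E2", "E3", "E2", "E5"],
    ["E1", "E2", "E4", "E2", "E5"],
    ["E1", "E9", "E9", "E7"],  # an unusual session
]

# Train skip-gram embeddings over event co-occurrence, as in word2vec.
w2v = Word2Vec(sessions, vector_size=16, window=3, min_count=1, sg=1, seed=0)

# Represent each session as the mean of its event vectors; an anomaly
# detector (e.g., isolation forest, one-class SVM) can then score these.
session_vecs = np.array([
    np.mean([w2v.wv[e] for e in s], axis=0) for s in sessions
])
print(session_vecs.shape)  # (3, 16)
```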

    Modeling the Abnormality: Machine Learning-based Anomaly and Intrusion Detection in Software-defined Networks

Modern software-defined networks (SDNs) provide additional control and optimal functionality over large-scale computer networks. With the rise in networking applications, cyber attacks have also increased progressively. Modern cyber attacks wreak havoc on large-scale SDNs, many of which are part of critical national infrastructures. Artifacts of these attacks may present as network anomalies within the core network or as edge anomalies at the SDN edge. As protection, intrusion and anomaly detection must be implemented in both the edge and the core. In this dissertation, we investigate and create novel network intrusion and anomaly detection techniques that can handle the next generation of network attacks. We collect and use new network metrics and statistics to perform network intrusion detection. We demonstrate that machine learning models like Random Forest classifiers can effectively use network port statistics to differentiate between normal and attack traffic with up to 98% accuracy. These collected metrics are augmented to create a new open-sourced dataset that improves upon class imbalance. The developed dataset outperforms other contemporary datasets with an Fμ score of 94% and a minimum F score of 86%. We also propose SDN intrusion detection approaches that provide high confidence scores and explainability, offering additional insights and suitability for real-time environments. Through this, we observe that network byte and packet transmissions, and their robust statistics, can be significant indicators of the presence of an attack. Additionally, we propose an anomaly detection technique for time-series data from SDN edge devices. We observe that precision and recall scores inversely correlate as Δ increases, with Δ = 6.0 yielding the best F score. Results also highlight that the best performance was achieved on data that had been moderately smoothed (0.4 ≤ α ≤ 0.8), compared to intensely smoothed or non-smoothed data. In addition, we investigate and analyze the impact that adversarial attacks can have on machine learning-based network intrusion detection systems for SDNs. Results show that the proposed attacks cause substantial deterioration of classifier performance in single SDNs, with some classifiers deteriorating by up to ≈60%. Finally, we propose an adversarial attack detection framework for multi-controller SDN setups that uses inherent network architecture features to make decisions. Results indicate efficient detection performance by the framework in determining and localizing the presence of adversarial attacks; however, performance begins to deteriorate when more than 30% of the SDN controllers have been compromised. The work performed in this dissertation provides multiple contributions to the network security research community: equitable open-sourced SDN datasets, the use of core network statistics for intrusion detection, robust anomaly detection techniques for time-series data, and analyses of how adversarial attacks can compromise the machine learning algorithms that protect our SDNs. The results of this dissertation can catalyze future developments in network security.
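The α (smoothing factor) and Δ (deviation threshold) above suggest a smoothed-baseline-plus-threshold scheme for the edge-device detector. A minimal sketch of that generic idea follows; the robust MAD scaling of residuals is my assumption, not necessarily the dissertation's exact scoring.

```python
import numpy as np

def ewma_anomalies(x, alpha=0.6, delta=6.0):
    """Flag points whose residual from an exponentially smoothed baseline
    exceeds delta robust-sigmas. Roles of alpha/delta are assumed from
    the abstract; the dissertation's exact method may differ."""
    smoothed = np.empty_like(x, dtype=float)
    smoothed[0] = x[0]
    for t in range(1, len(x)):
        smoothed[t] = alpha * x[t] + (1 - alpha) * smoothed[t - 1]
    resid = x - smoothed
    sigma = 1.4826 * np.median(np.abs(resid - np.median(resid)))  # MAD scale
    return np.abs(resid) > delta * sigma

rng = np.random.default_rng(1)
traffic = rng.normal(100.0, 2.0, 500)
traffic[250] += 40                          # injected spike
print(np.where(ewma_anomalies(traffic))[0])  # indices near 250
```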

    Real-Time Machine Learning for Quickest Detection

Safety-critical Cyber-Physical Systems (CPS) require real-time machine learning for control and decision making. One promising solution is to use deep learning to discover useful patterns for event detection from heterogeneous data. However, deep learning algorithms encounter challenges in CPS with assurability requirements: 1) decision explainability, 2) real-time and quickest event detection, and 3) time-efficient incremental learning. To address these obstacles, I developed a real-time Machine Learning framework for Quickest Detection (MLQD). Specifically, I first propose the zero-bias neural network, which removes decision bias and preferences from regular neural networks and provides an interpretable decision process. Second, I characterize the latent space of the zero-bias neural network and present a method to mathematically convert a Deep Neural Network (DNN) classifier into a performance-assured binary abnormality detector. In this way, I can seamlessly integrate deep neural networks' data processing capability with Quickest Detection (QD) and provide a real-time sequential event detection paradigm. Third, having discovered that a critical factor impeding the incremental learning of neural networks is concept interference (confusion) in latent space, I prove that, to minimize interference, the concept representation vectors (class fingerprints) within the latent space need to be organized orthogonally; using these findings, I devise a new incremental learning strategy that enables deep neural networks in CPS to evolve efficiently without retraining. All my algorithms are evaluated on real-world applications: ADS-B (Automatic Dependent Surveillance-Broadcast) signal identification and spoofing detection in the aviation communication system. Finally, I discuss the current trends in MLQD and conclude this dissertation by presenting future research directions and applications. In summary, the innovations of this dissertation are as follows: i) I propose the zero-bias neural network, which provides transparent latent space characteristics, and apply it to the wireless device identification problem. ii) I discover and prove the orthogonal memory organization mechanism in artificial neural networks and apply this mechanism to time-efficient incremental learning. iii) I discover and mathematically prove the converging point theorem, with which we can predict the latent space topological characteristics and estimate the topological maturity of neural networks. iv) I bridge the gap between machine learning and quickest detection with assurable performance.
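For context on the QD side, the classical workhorse of quickest detection is the CUSUM statistic, which accumulates evidence of a distribution shift in a stream and alarms when it crosses a threshold. The sketch below applies textbook one-sided CUSUM to a stream of anomaly scores; the dissertation's integration with zero-bias DNN outputs is more involved, and the drift/threshold values here are illustrative only.

```python
import numpy as np

def cusum_alarms(scores, drift=0.5, threshold=8.0):
    """One-sided CUSUM: S_t = max(0, S_{t-1} + (score_t - drift)).
    Alarm when S_t exceeds threshold; drift/threshold are illustrative."""
    s, alarms = 0.0, []
    for t, x in enumerate(scores):
        s = max(0.0, s + (x - drift))
        if s > threshold:
            alarms.append(t)
            s = 0.0  # reset after each alarm
    return alarms

rng = np.random.default_rng(2)
scores = np.concatenate([rng.normal(0.0, 1.0, 300),    # pre-change regime
                         rng.normal(1.5, 1.0, 100)])   # post-change regime
print(cusum_alarms(scores))  # first alarm shortly after t = 300
```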

    Dynamic Fraud Detection via Sequential Modeling

The impacts of the information revolution are omnipresent, from life to work. Web services have significantly changed our daily living styles, with, for example, Facebook for communication and Wikipedia for knowledge acquisition. Besides, varieties of information systems, such as data management systems and management information systems, make us work more efficiently. However, this is usually a double-edged sword. With the popularity of web services, relevant security issues are arising, such as fake news on Facebook and vandalism on Wikipedia, which impose severe security threats on online social networks (OSNs) and their legitimate participants. Likewise, office automation incurs another challenging security issue, insider threat, which may involve the theft of confidential information, the theft of intellectual property, or the sabotage of computer systems. A recent survey says that 27% of all cyber crime incidents are suspected to be committed by insiders. As a result, how to flag these malicious web users or insiders is an urgent question. The fast development of machine learning (ML) techniques offers an unprecedented opportunity to build ML models that can assist humans in automatically detecting individuals who misbehave. However, unlike some static outlier detection scenarios where ML models have achieved promising performance, malicious behaviors conducted by humans are often dynamic. Such dynamic behaviors lead to various unique challenges of dynamic fraud detection. Unavailability of sufficient labeled data: traditional machine learning approaches usually require a balanced training dataset consisting of normal and abnormal samples; in practice, however, there are far fewer abnormal labeled samples than normal ones. Lack of high-quality labels: labeled training records often carry a time gap between the time that fraudulent users commit fraudulent actions and the time that they are suspended by the platforms. Time-evolving nature: users are always changing their behaviors over time. To address the aforementioned challenges, in this dissertation we conduct a systematic study of dynamic fraud detection, with a focus on: (1) unavailability of labeled data: we present (a) a few-shot learning framework to handle the extremely imbalanced dataset in which abnormal samples are far fewer than normal ones and (b) a one-class fraud detection method using a complementary GAN (Generative Adversarial Network) to adaptively generate potential abnormal samples; (2) lack of high-quality labels: we develop a neural survival analysis model for fraud early detection to deal with the time gap; (3) time-evolving nature: we propose (a) a hierarchical neural temporal point process model and (b) a dynamic Dirichlet marked Hawkes process model for fraud detection.
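For context on the last contribution: a standard (unmarked, exponential-kernel) Hawkes process models self-exciting event dynamics through a conditional intensity of the form

λ(t) = μ + Σ_{t_i < t} α · exp(−β (t − t_i)),

where μ is the baseline event rate, α the excitation each past event t_i adds, and β the decay of that excitation. The dissertation's dynamic Dirichlet marked variant extends this basic form with event marks; the equation above is the textbook version, not the authors' exact model.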

    End-to-end anomaly detection in stream data

Nowadays, huge volumes of data are generated with increasing velocity through various systems, applications, and activities. This increases the demand for stream and time series analysis to react to changing conditions in real time, for enhanced efficiency and quality of service delivery as well as improved safety and security in the private and public sectors. Despite its very rich history, time series anomaly detection is still one of the vital topics in machine learning research and is receiving increasing attention. Identifying hidden patterns and selecting an appropriate model that fits the observed data well and also carries over to unobserved data is not a trivial task. Due to the increasing diversity of data sources and associated stochastic processes, this pivotal data analysis topic is loaded with challenges like complex latent patterns, concept drift, and overfitting that may mislead the model and cause a high false alarm rate. Handling these challenges pushes advanced anomaly detection methods toward sophisticated decision logic, which turns them into opaque and inexplicable black boxes. Contrary to this trend, end-users expect transparency and verifiability in order to trust a model and the outcomes it produces. Also, pointing users to the most anomalous/malicious regions of a time series, and to the causal features, could save them time, energy, and money. For these reasons, this thesis addresses the crucial challenges in an end-to-end pipeline of stream-based anomaly detection through three essential phases: behavior prediction, inference, and interpretation. The first step is focused on devising a time series model that leads to high average accuracy as well as small error deviation. On this basis, we propose higher-quality anomaly detection and scoring techniques that utilize the related context to reclassify observations and post-prune unjustified events. Last but not least, we make the predictive process transparent and verifiable by providing meaningful, human-understandable reasoning behind its generated results. The provided insight can pinpoint the anomalous regions of a time series and explain why the current status of a system has been flagged as anomalous. Stream-based anomaly detection research is a principal area of innovation to support our economy, security, and even the safety and health of societies worldwide. We believe our proposed analysis techniques can contribute to building a situational awareness platform and open new perspectives in a variety of domains like cybersecurity and healthcare.
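As a toy version of the predict-score-reclassify pipeline described above, the sketch below scores each point against a rolling robust baseline and then post-prunes isolated single-point alarms using their neighborhood context. All thresholds and the pruning rule are illustrative assumptions, not the thesis's actual techniques.

```python
import numpy as np

def detect_and_prune(x, window=20, k=4.0, min_run=2):
    """Score points against a rolling median baseline (MAD-scaled),
    then drop alarm runs shorter than `min_run` as unjustified."""
    x = np.asarray(x, dtype=float)
    flags = np.zeros(len(x), dtype=bool)
    for t in range(window, len(x)):
        hist = x[t - window:t]
        med = np.median(hist)
        mad = 1.4826 * np.median(np.abs(hist - med))  # robust scale
        flags[t] = abs(x[t] - med) > k * (mad + 1e-9)
    pruned, t = flags.copy(), 0
    while t < len(pruned):
        if pruned[t]:
            end = t
            while end + 1 < len(pruned) and pruned[end + 1]:
                end += 1
            if end - t + 1 < min_run:      # isolated blip: prune it
                pruned[t:end + 1] = False
            t = end + 1
        else:
            t += 1
    return pruned

rng = np.random.default_rng(3)
sig = rng.normal(0.0, 1.0, 400)
sig[200:205] += 8                           # sustained anomaly
print(np.where(detect_and_prune(sig))[0])   # expect ~[200 ... 204]
```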

    Addressing Pragmatic Challenges in Utilizing AI for Security of Industrial IoT

Industrial control systems (ICSs) are an essential part of every nation's critical infrastructure and have long been used to supervise industrial machines and processes. Today's ICSs are substantially different from the information technology (IT) devices of a decade ago. The integration of Internet of Things (IoT) technology has made them more efficient and optimized, improved automation, and increased quality and compliance. Now, they form a sub-domain (and arguably the most critical part) of IoT, called the industrial IoT (IIoT). In the past, to secure ICSs from malicious outside attack, these systems were isolated from the outside world. However, recent advances, increased connectivity with corporate networks, and the use of internet communications to transmit information more conveniently have introduced the possibility of cyber-attacks against these systems. Due to the sensitive nature of industrial applications, security is the foremost concern. We discuss why, despite the exceptional performance of artificial intelligence (AI) and machine learning (ML), industry leaders still have a hard time utilizing these models in practice as standalone units. The goal of this dissertation is to address some of these challenges to help pave the way for smarter and more modern security solutions in these systems. Specifically, we focus on data scarcity for the AI, the black-box nature of the AI, and the high computational load of the AI. Industrial companies almost never release their network data, because they are obligated to follow confidentiality laws and user privacy restrictions. Hence, real-world IIoT datasets are not available for security research, and we face a data scarcity challenge in the IIoT security research community. In this domain, researchers usually have to resort to commercial or public datasets that are not specific to IIoT. In our work, we have developed a real-world testbed that resembles an actual industrial plant, emulating a popular industrial system used in water treatment processes, so that we could collect datasets containing realistic traffic to conduct our research. Several characteristics of IIoT networks are unique to them; we provide an extensive study to identify these characteristics and incorporate them into the design. We have gathered information on relevant cyber-attacks on IIoT systems and run them against the testbed to collect realistic datasets containing both normal and attack traffic, analogous to real industrial network traffic. IIoT communication protocols are also specific to the domain, and we have implemented one of the most popular ones in our dataset. Another attribute that distinguishes the security of these systems from others is imbalanced data: the number of attack samples is significantly lower than the enormous volume of normal traffic that flows through the system daily. We have made sure our datasets comply with all of these specific attributes of an IIoT. Another challenge that we address here is the "black-box" nature of learning models, which creates hurdles in generating adequate trust in their decisions; thus, they are seldom utilized as standalone units in high-risk IIoT applications. Explainable AI (XAI) has gained increasing interest in recent years to help with this problem. However, most of the research so far focuses on image applications or is very slow. For applications such as the security of IIoT, we deal with numerical data, and low latency is of the utmost importance. In this dissertation, we propose a universal XAI model named Transparency Relying Upon Statistical Theory (TRUST). TRUST is model-agnostic, high-performing, and suitable for numerical applications. We demonstrate its superiority over another popular XAI model in terms of speed and the ability to successfully explain the AI's behavior. When dealing with IoT technology, especially industrial IoT, we face a massive amount of data streaming to and from the IoT devices. In addition, the availability and reliability constraints of industrial systems require them to operate at a fast pace and avoid creating any bottleneck in the system. The high computational load of complex AI models can become a burden when dealing with large volumes of data, producing results more slowly than required. In this dissertation, we utilize distributed computing in the form of an edge/cloud structure to address these problems. We propose Anomaly Detection using Distributed AI (ADDAI), which can easily span out geographically to cover a large number of IoT sources. Due to its distributed nature, it guarantees critical IIoT requirements such as high speed, robustness against a single point of failure, low communication overhead, privacy, and scalability. We formulate the communication cost, which is minimized, and quantify the resulting improvement in performance.
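As a schematic of the edge/cloud split such designs rely on, the sketch below has each edge node score its own stream locally and ship only a compact summary (node id, score, flag) to a cloud-side aggregator, which is what keeps communication overhead low. The abstract does not specify ADDAI's actual algorithm or message format, so everything here, including the quorum rule, is illustrative.

```python
import numpy as np

class EdgeNode:
    """Scores its local stream; ships only (node_id, score, flag) upstream."""
    def __init__(self, node_id, threshold=4.0):
        self.node_id, self.threshold = node_id, threshold
        self.n, self.mean, self.var = 0, 0.0, 0.0

    def score(self, x):
        # Online z-score via a running (Welford-style) mean/variance update.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.var += (delta * (x - self.mean) - self.var) / self.n
        z = abs(x - self.mean) / (self.var ** 0.5 + 1e-9)
        return {"node": self.node_id, "score": z, "flag": z > self.threshold}

def cloud_aggregate(messages, quorum=2):
    """Plant-level alert only when several edges flag at the same time."""
    return sum(m["flag"] for m in messages) >= quorum

rng = np.random.default_rng(4)
edges = [EdgeNode(i) for i in range(5)]
for _ in range(200):                      # warm up running statistics
    for e in edges:
        e.score(rng.normal(0.0, 1.0))
readings = rng.normal(0.0, 1.0, 5)
readings[1:3] += 10                       # two edges see a disturbance
msgs = [e.score(x) for e, x in zip(edges, readings)]
print(cloud_aggregate(msgs))              # True: quorum of flags reached
```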

    Real-Time Diagnostic Integrity Meets Efficiency: A Novel Platform-Agnostic Architecture for Physiological Signal Compression

Head-based signals such as EEG, EMG, EOG, and ECG collected by wearable systems will play a pivotal role in the clinical diagnosis, monitoring, and treatment of important brain disorders. However, the real-time transmission of this significant corpus of physiological signals over extended periods consumes substantial power and time, limiting the viability of battery-dependent physiological monitoring wearables. This paper presents a novel deep-learning framework employing a variational autoencoder (VAE) for physiological signal compression to reduce wearables' computational complexity and energy consumption. Our approach achieves an impressive compression ratio of 1:293 specifically for spectrogram data, surpassing state-of-the-art compression techniques such as JPEG2000, H.264, the Discrete Cosine Transform (DCT), and Huffman encoding, which do not excel at handling physiological signals. We validate the efficacy of the compression algorithms using physiological signals collected from real patients in the hospital and deploy the solution on commonly used embedded AI chips (i.e., ARM Cortex V8 and Jetson Nano). The proposed framework achieves 91% seizure detection accuracy using XGBoost, confirming the approach's reliability, practicality, and scalability.
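A minimal sketch of the compression idea, assuming spectrogram patches are flattened and squeezed through a small VAE bottleneck so that only the latent code needs to be transmitted. The authors' architecture, input size, and training details are not given here, so the dimensions below are illustrative (a 4096-dimensional patch with a 14-dimensional latent happens to give a nominal 4096/14 ≈ 293 ratio, but the real ratio depends on latent quantization).

```python
import torch
import torch.nn as nn

class SpecVAE(nn.Module):
    """Tiny VAE over flattened spectrogram patches (sizes are hypothetical)."""
    def __init__(self, in_dim=64 * 64, latent_dim=14):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction error plus KL divergence to the unit-Gaussian prior.
    rec = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

x = torch.rand(8, 64 * 64)      # a batch of normalized spectrogram patches
model = SpecVAE()
recon, mu, logvar = model(x)
print(vae_loss(recon, x, mu, logvar).item())
```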
