431 research outputs found

    Learning-based Decision Making in Wireless Communications

    Fueled by emerging applications and an exponential increase in data traffic, wireless networks have recently grown significantly and become more complex. In such large-scale complex wireless networks, it is challenging and, oftentimes, infeasible for conventional optimization methods to quickly solve critical decision-making problems. With this motivation, in this thesis, machine learning methods are developed and utilized for obtaining optimal/near-optimal solutions for timely decision making in wireless networks. Content caching at the edge nodes is a promising technique to reduce the data traffic in next-generation wireless networks. In this context, in the first part of the thesis, we study content caching at the wireless network edge using a deep reinforcement learning framework with Wolpertinger architecture. Initially, we develop a learning-based caching policy for a single base station aiming at maximizing the long-term cache hit rate. Then, we extend this study to a wireless communication network with multiple edge nodes. In particular, we propose deep actor-critic reinforcement learning based policies for both centralized and decentralized content caching. Next, with the purpose of making efficient use of limited spectral resources, we develop a deep actor-critic reinforcement learning based framework for dynamic multichannel access. We consider both a single-user case and a scenario in which multiple users attempt to access channels simultaneously. In the single-user model, in order to evaluate the performance of the proposed channel access policy and the framework's tolerance against uncertainty, we explore different channel switching patterns and different switching probabilities. In the case of multiple users, we analyze the probabilities of each user accessing channels with favorable channel conditions and the probability of collision. 
Following the analysis of the proposed learning-based dynamic multichannel access policy, we consider adversarial attacks on it. In particular, we propose two adversarial policies, one based on feed-forward neural networks and the other based on deep reinforcement learning policies. Both attack strategies aim at minimizing the accuracy of a deep reinforcement learning based dynamic channel access agent, and we demonstrate and compare their performances. Next, anomaly detection as an active hypothesis test problem is studied. Specifically, we study deep reinforcement learning based active sequential testing for anomaly detection. We assume that there is an unknown number of abnormal processes at a time and the agent can only check with one sensor in each sampling step. To maximize the confidence level of the decision and minimize the stopping time concurrently, we propose a deep actor-critic reinforcement learning framework that can dynamically select the sensor based on the posterior probabilities. Separately, we also regard the detection of threshold crossing as an anomaly detection problem, and analyze it via hierarchical generative adversarial networks (GANs). In the final part of the thesis, to address state estimation and detection problems in the presence of noisy sensor observations and probing costs, we develop a soft actor-critic deep reinforcement learning framework. Moreover, considering Byzantine attacks, we design a GAN-based framework to identify the Byzantine sensors. To evaluate the proposed framework, we measure the performance in terms of detection accuracy, stopping time, and the total probing cost needed for detection

    Deep Learning-Powered Computational Intelligence for Cyber-Attacks Detection and Mitigation in 5G-Enabled Electric Vehicle Charging Station

    An electric vehicle charging station (EVCS) infrastructure is the backbone of transportation electrification. However, the EVCS has various cyber-attack vulnerabilities in software, hardware, supply chain, and incumbent legacy technologies such as network, communication, and control. Therefore, proactively monitoring, detecting, and defending against these attacks is very important. The state-of-the-art approaches are not agile and intelligent enough to detect, mitigate, and defend against various cyber-physical attacks in the EVCS system. To overcome these limitations, this dissertation primarily designs, develops, implements, and tests data-driven deep learning-powered computational intelligence to detect and mitigate cyber-physical attacks at the network and physical layers of 5G-enabled EVCS infrastructure. Also, the 5G slicing application to ensure the security and service level agreement (SLA) in the EVCS ecosystem has been studied. Various cyber-attacks such as distributed denial of service (DDoS), false data injection (FDI), advanced persistent threats (APT), and ransomware attacks on the network in a standalone 5G-enabled EVCS environment have been considered. Mathematical models for the mentioned cyber-attacks have been developed. The impact of cyber-attacks on the EVCS operation has been analyzed. Various deep learning-powered intrusion detection systems have been proposed to detect attacks using local electrical and network fingerprints. Furthermore, a novel detection framework has been designed and developed to deal with ransomware threats in high-speed, high-dimensional, multimodal data and assets from eccentric stakeholders of the connected automated vehicle (CAV) ecosystem. To mitigate the adverse effects of cyber-attacks on EVCS controllers, novel data-driven digital clones based on Twin Delayed Deep Deterministic Policy Gradient (TD3) Deep Reinforcement Learning (DRL) have been developed. 
Also, various brute-force and controller-clone-based methods have been devised and tested to aid the defense against the attacks and the mitigation of their impact on the EVCS operation. The performance of the proposed mitigation method has been compared with that of a benchmark Deep Deterministic Policy Gradient (DDPG)-based digital clones approach. Simulation results obtained from the Python, Matlab/Simulink, and NetSim software demonstrate that the cyber-attacks are disruptive and detrimental to the operation of EVCS. The proposed detection and mitigation methods are effective and perform better than the conventional and benchmark techniques for the 5G-enabled EVCS

    Exploring the adoption of a conceptual data analytics framework for subsurface energy production systems: a study of predictive maintenance, multi-phase flow estimation, and production optimization

    As technology continues to advance and become more integrated in the oil and gas industry, a vast amount of data is now available across various scientific disciplines, providing new opportunities to gain insightful and actionable information. The convergence of digital transformation with the physics of fluid flow through porous media and pipelines has driven the advancement and application of machine learning (ML) techniques to extract further value from this data. As a result, digital transformation and its associated machine-learning applications have become a new area of scientific investigation. The transformation of brownfields into digital oilfields can aid in energy production by accomplishing various objectives, including increased operational efficiency, production optimization, collaboration, data integration, decision support, and workflow automation. 
This work aims to present a framework of these applications, specifically through the implementation of virtual sensing, predictive analytics using predictive maintenance on production hydraulic systems (with a focus on electrical submersible pumps), and prescriptive analytics for production optimization in steam and waterflooding projects. In terms of virtual sensing, the accurate estimation of multi-phase flow rates is crucial for monitoring and improving production processes. This study presents a data-driven approach for calculating multi-phase flow rates using sensor measurements from electrical submersible pumped wells. An exhaustive exploratory data analysis is conducted, including a univariate study of the target outputs (liquid rate and water cut), a multivariate study of the relationships between inputs and outputs, and data grouping based on principal component projections and clustering algorithms. Feature prioritization experiments are performed to identify the most influential parameters in the prediction of flow rates. Model comparison is done using the mean absolute error, mean squared error and coefficient of determination. The results indicate that the CNN-LSTM network architecture is particularly effective in time series analysis for ESP sensor data, as the 1D-CNN layers are capable of extracting features and generating informative representations of time series data automatically. Subsequently, the study presents a methodology for implementing predictive maintenance on artificial lift systems, specifically regarding the maintenance of Electrical Submersible Pumps (ESPs). Conventional maintenance practices for ESPs require extensive resources and manpower and are often initiated through reactive monitoring of multivariate sensor data. 
To address this issue, the study employs principal component analysis (PCA) and extreme gradient boosting trees (XGBoost) to analyze real-time sensor data and predict potential failures in ESPs. PCA is utilized as an unsupervised technique and its output is further processed by the XGBoost model for prediction of system status. The resulting predictive model has been shown to provide signals of potential failures up to seven days in advance, with an F1 score greater than 0.71 on the test set. In addition to the data-driven modeling approach, the present study also incorporates model-free reinforcement learning (RL) algorithms to aid in decision-making in production optimization. The task of determining the optimal injection strategy poses challenges due to the complexity of the underlying dynamics, including nonlinear formulation, temporal variations, and reservoir heterogeneity. To tackle these challenges, the problem was reformulated as a Markov decision process and RL algorithms were employed to determine actions that maximize production yield. The results of the study demonstrate that the RL agent was able to significantly enhance the net present value (NPV) by continuously interacting with the environment and iteratively refining the dynamic process through multiple episodes. This showcases the potential for RL algorithms to provide effective and efficient solutions for complex optimization problems in the production domain. In conclusion, this study represents an original contribution to the field of data-driven applications in subsurface energy systems. It proposes a data-driven method for determining multi-phase flow rates in electrical submersible pumped (ESP) wells utilizing sensor measurements. The methodology includes conducting exploratory data analysis, running experiments to prioritize features, and evaluating models based on mean absolute error, mean squared error, and coefficient of determination. 
The findings indicate that a convolutional neural network-long short-term memory (CNN-LSTM) network is an effective approach for time series analysis in ESPs. In addition, the study implements principal component analysis (PCA) and extreme gradient boosting trees (XGBoost) to perform predictive maintenance on ESPs and anticipate potential failures up to a seven-day horizon. Furthermore, the study applies model-free reinforcement learning (RL) algorithms to aid decision-making in production optimization and enhance net present value (NPV)

    PoPS: Policy Pruning and Shrinking for Deep Reinforcement Learning

    The recent success of deep neural networks (DNNs) for function approximation in reinforcement learning has triggered the development of Deep Reinforcement Learning (DRL) algorithms in various fields, such as robotics, computer games, natural language processing, computer vision, sensing systems, and wireless networking. Unfortunately, DNNs suffer from high computational cost and memory consumption, which limits the use of DRL algorithms in systems with limited hardware resources. In recent years, pruning algorithms have demonstrated considerable success in reducing the redundancy of DNNs in classification tasks. However, existing algorithms suffer from a significant performance reduction in the DRL domain. In this paper, we develop the first effective solution to the performance reduction problem of pruning in the DRL domain, and establish a working algorithm, named Policy Pruning and Shrinking (PoPS), to train DRL models with strong performance while achieving a compact representation of the DNN. The framework is based on a novel iterative policy pruning and shrinking method that leverages the power of transfer learning when training the DRL model. We present an extensive experimental study that demonstrates the strong performance of PoPS using the popular Cartpole, Lunar Lander, Pong, and Pacman environments. Finally, we develop open-source software for the benefit of researchers and developers in related fields. Comment: This paper has been accepted for publication in the IEEE Journal of Selected Topics in Signal Processing

    Anomaly detection and dynamic decision making for stochastic systems

    Thesis (Ph.D.)--Boston University. This dissertation focuses on two types of problems, both of which are related to systems with uncertainties. The first problem concerns network system anomaly detection. We present several stochastic and deterministic methods for anomaly detection of networks whose normal behavior is not time-varying. Our methods cover most of the common techniques in the anomaly detection field. We evaluate all methods in a simulated network that consists of nominal data, three flow-level anomalies and one packet-level attack. Through analyzing the results, we summarize the advantages and the disadvantages of each method. As a next step, we propose two robust stochastic anomaly detection methods for networks whose normal behavior is time-varying. We develop a procedure for learning the underlying family of patterns that characterize a time-varying network. This procedure first estimates a large class of patterns from network data and then refines it to select a representative subset. The latter part formulates the refinement problem using ideas from set covering via integer programming. Then we propose two robust methods, one model-free and one model-based, to evaluate whether a sequence of observations is drawn from the learned patterns. Simulation results show that the robust methods have significant advantages over the alternative stationary methods in time-varying networks. The final anomaly detection setting we consider targets the detection of botnets before they launch an attack. Our method analyzes the social graph of the nodes in a network and consists of two stages: (i) network anomaly detection based on large deviations theory and (ii) community detection based on a refined modularity measure. We apply our method on real-world botnet traffic and compare its performance with other methods. 
The second problem considered by this dissertation concerns sequential decision making under uncertainty, which can be modeled by Markov Decision Processes (MDPs). We focus on methods with an actor-critic structure, where the critic part estimates the gradient of the overall objective with respect to tunable policy parameters and the actor part optimizes a policy with respect to these parameters. Most existing actor-critic methods use Temporal Difference (TD) learning to estimate the gradient and steepest gradient ascent to update the policies. Our first contribution is to propose an actor-critic method that uses a Least Squares Temporal Difference (LSTD) method, which is known to converge faster than the TD methods. Our second contribution is to develop a new Newton-like actor-critic method that performs better, especially for ill-conditioned problems. We evaluate our methods in problems motivated by robot motion control

    USING REINFORCEMENT LEARNING TO SPOOF A MONITORED KALMAN FILTER

    Modern hardware systems rely on state estimators such as Kalman filters to monitor key variables for feedback and performance monitoring. The performance of the hardware system can be monitored using a chi-squared fault detection test. Previous work has shown that Kalman filters are susceptible to false data injection attacks. In a false data injection attack, intentional noise and/or bias is added to sensor measurement data to mislead a Kalman filter in a way that goes undetected by the chi-squared test. This thesis proposes a method to deceive a Kalman filter where the attack data is generated using reinforcement learning. It is shown that reinforcement learning can be used to train an agent to manipulate the output of a Kalman filter via false data injection and without being detected by the chi-squared test. This result shows that machine learning can be used to successfully perform a cyber-physical attack by an actor who does not need to have in-depth knowledge and understanding of mathematics governing the operation of the target system. This result has significant real-world impact as modern smart power grids, aircraft, car, and spacecraft control systems are all cyber-physical systems that rely on trustworthy sensor data to function safely and reliably. A machine learning derived false data injection attack against any of these systems could lead to an undetected and potentially catastrophic failure. DoD Space. Lieutenant, United States Navy. Approved for public release. Distribution is unlimited.
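
    The chi-squared monitoring idea in the abstract above can be illustrated with a minimal sketch. This is not the thesis's method: it assumes a scalar random-walk Kalman filter, hypothetical noise variances `q` and `r`, and a hand-written slow ramp standing in for the attack data that the thesis generates with reinforcement learning. The names `kalman_step` and `run_filter` are illustrative only. The true state is taken to be zero, so any drift in the estimate is caused by the injected bias.

```python
# 95th percentile of a chi-squared distribution with 1 degree of freedom.
CHI2_THRESHOLD = 3.841


def kalman_step(x, p, z, q=0.01, r=0.25):
    """One predict/update step of a scalar random-walk Kalman filter.

    x, p: prior state estimate and its variance
    z:    new measurement
    q, r: assumed process and measurement noise variances (hypothetical)
    Returns the updated estimate, updated variance, and the normalized
    innovation squared that the chi-squared test compares to a threshold.
    """
    p_pred = p + q                 # predict: random-walk state, variance grows
    innovation = z - x             # measurement residual
    s = p_pred + r                 # innovation variance
    g = innovation ** 2 / s        # chi-squared test statistic (1 dof)
    k = p_pred / s                 # Kalman gain
    x_new = x + k * innovation
    p_new = (1.0 - k) * p_pred
    return x_new, p_new, g


def run_filter(measurements, x0=0.0, p0=1.0):
    """Filter a measurement sequence; return (alarm count, final estimate)."""
    x, p, alarms = x0, p0, 0
    for z in measurements:
        x, p, g = kalman_step(x, p, z)
        if g > CHI2_THRESHOLD:
            alarms += 1
    return alarms, x


# Slowly growing injected bias: each residual stays small, yet the estimate drifts.
stealthy_alarms, stealthy_estimate = run_filter([0.05 * t for t in range(100)])

# Abrupt injected bias: a single large residual trips the test.
abrupt_alarms, _ = run_filter([0.0] * 10 + [5.0])
```

    Under these assumptions the ramp moves the estimate well away from the true value without exceeding the threshold, while the abrupt jump is flagged. A learned attacker, as described in the abstract, would search for such undetected perturbations rather than use a fixed ramp.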

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face a few open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between human and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired (i.e., flocking, foraging) techniques with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP), and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collision among UAVs and guarantee flocking and navigation, the reward function combines a global flocking maintenance term, a mutual reward, and a collision penalty. 
We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation

    Self-Adaptation in SDN-based IoT Networks

    In the digital age, alarming patterns in digital threats are emerging, and threats to IoT networks cannot be ignored. They take typical forms including Denial-of-Service (DoS), Distributed Denial-of-Service (DDoS), virus attacks, Man-in-the-Middle (MitM) attacks, Advanced Persistent Threats (APT), password attacks, and more. It is crucial to eliminate all threats from IoT networks and devices. This thesis, "Self-Adaptation of SDN-based IoT Networks," argues that reinforcement learning for anomaly detection is the best option for correcting risks in an IoT network and thereby repairing the affected nodes. The design is based on Markov decision process (MDP) policies and the properties of the MAPE-K loop in self-aware systems. The network system exhibited self-adaptability features, which make it self-correcting and self-healing. The objective of this research is to propose a means to secure the devices in an IoT network by protecting them from any form of threat and ensuring that the devices function normally. Even when a node in the network begins to function abnormally, the system should be able to correct itself. A Software Defined Network (SDN) architecture is proposed for the design in a later section, which explains the kind of SDN that should be in place for the intrusion detection system. The thesis then gives a general overview of deep reinforcement learning. The implementation section describes the reinforcement learning policy used in the work and how the results were derived. The final section discusses the results and compares them with those of a traditional machine learning algorithm