
    Incorporating Deep Q-Network with Multiclass Classification Algorithms

    In this study, we explore how a Deep Q-Network (DQN) might improve the performance of multiclass classification algorithms. We use a benchmark dataset from Kaggle to create a framework that incorporates DQN with existing supervised multiclass classification algorithms. The findings of this study offer insight into how deep reinforcement learning strategies may be used to increase multiclass classification accuracy. Multiclass classification algorithms have been used in a number of fields, including image recognition, natural language processing, and bioinformatics. Beyond the wider application of DQN to multiclass classification, this study focuses on the prediction of financial distress in companies. Identifying businesses that are likely to experience financial distress is a crucial task in finance and risk management. A business is said to be in financial distress when it faces serious challenges in keeping its operations going and meeting its financial obligations; this commonly happens when a company suffers a sharp and sustained decline in profitability, cash flow problems, or an unsustainable level of debt.
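    The abstract does not describe an implementation, but the basic idea of coupling a DQN with multiclass classification can be sketched as a one-step reinforcement learning problem: each sample is a state, each class label an action, and the reward signals whether the chosen class was correct. The architecture, reward values, and hyperparameters below are illustrative assumptions, not the authors' configuration.

```python
# Sketch: DQN-style agent for multiclass classification (one-step episodes).
# State = feature vector, action = predicted class, reward = +1 correct / -1 wrong.
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, n_features, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, n_classes),   # one Q-value per class/action
        )

    def forward(self, x):
        return self.net(x)

def train_step(qnet, optimizer, x, y, epsilon=0.1):
    """One gradient step on a mini-batch; episodes are terminal, so the Q-target is just the reward."""
    q_values = qnet(x)                                       # (batch, n_classes)
    greedy = q_values.argmax(dim=1)
    random_actions = torch.randint(0, q_values.shape[1], greedy.shape)
    explore = torch.rand(greedy.shape) < epsilon
    actions = torch.where(explore, random_actions, greedy)   # epsilon-greedy class choice
    rewards = (actions == y).float() * 2.0 - 1.0             # +1 for a correct prediction, -1 otherwise
    chosen_q = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(chosen_q, rewards)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

    At prediction time the agent simply takes the argmax over Q-values, so the trained network can be evaluated like an ordinary classifier.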

    Understanding Reinforcement Learning Control in Cyber-Physical Energy Systems

    The possibility of modeling a renewable energy system as a Cyber-Physical Energy System (CPES) opens up new options for control. More precisely, this document discusses the applicability of Reinforcement Learning (RL) techniques to CPES. Using a benchmark algorithm, we focus on conceptual and implementation details and on how such details relate to the problem of interest. In this case, we simulate how an RL model can optimize energy storage control in order to reduce energy costs. The work also discusses the issues that arise in RL models and possible approaches to these difficulties. Specifically, we propose investigating a better exploitation of the memory mechanism.
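    As a concrete illustration of the kind of storage-control loop discussed here, the sketch below applies plain tabular Q-learning to a discretized battery state of charge and electricity price. The tariff levels, transition model, and discretization are assumptions made for the example; they are not the paper's benchmark algorithm.

```python
# Sketch: tabular Q-learning for battery storage control under a time-varying price.
# All numerical values (price levels, discretization, learning rates) are illustrative.
import numpy as np

N_SOC_LEVELS = 11          # state of charge discretized into 0%, 10%, ..., 100%
N_PRICE_LEVELS = 3         # low / medium / high electricity price
ACTIONS = [-1, 0, +1]      # discharge one level, idle, charge one level

Q = np.zeros((N_SOC_LEVELS, N_PRICE_LEVELS, len(ACTIONS)))
alpha, gamma, epsilon = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def step(soc, price_level, action):
    """Toy environment: reward is the negative energy cost of the chosen action."""
    new_soc = int(np.clip(soc + ACTIONS[action], 0, N_SOC_LEVELS - 1))
    price = [0.1, 0.2, 0.4][price_level]             # assumed tariff per SoC level moved
    energy_moved = new_soc - soc                     # >0 buying from the grid, <0 discharging
    reward = -price * energy_moved
    next_price = int(rng.integers(N_PRICE_LEVELS))   # placeholder price process
    return new_soc, next_price, reward

soc, price_level = 5, 0
for t in range(50_000):
    a = int(rng.integers(len(ACTIONS))) if rng.random() < epsilon else int(np.argmax(Q[soc, price_level]))
    new_soc, new_price, r = step(soc, price_level, a)
    td_target = r + gamma * Q[new_soc, new_price].max()
    Q[soc, price_level, a] += alpha * (td_target - Q[soc, price_level, a])
    soc, price_level = new_soc, new_price
```

    A deep RL model replaces the Q-table with a neural network but keeps the same interaction loop; the memory (replay) mechanism mentioned above stores past transitions and resamples them during training.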

    A Federated DRL Approach for Smart Micro-Grid Energy Control with Distributed Energy Resources

    The prevalence of Internet of Things (IoT) and smart meter devices in smart grids provides key support for measuring and analyzing power consumption patterns. This enables end-users to act as prosumers in the market and subsequently contributes to reducing the carbon footprint and the burden on utility grids. Coordinating the trading of energy surpluses generated by household renewable energy resources (RERs) with the supply of shortages from the external network (main grid) is a necessity. This paper proposes a hierarchical architecture that manages energy in multiple smart buildings with dynamic loads in a distributed manner, leveraging federated deep reinforcement learning (FDRL). Within the developed FDRL-based framework, each agent hosted in a local building energy management system (BEMS) trains a local deep reinforcement learning (DRL) model and shares its experience, in the form of model hyperparameters, with the federation layer in the energy management system (EMS). Simulation studies are conducted using one EMS and up to twenty smart houses equipped with photovoltaic (PV) systems and batteries. This iterative training approach enables the proposed discretized soft actor-critic (SAC) agents to aggregate the collected knowledge, expediting the overall learning procedure and reducing costs and CO2 emissions, while the federation approach mitigates privacy breaches. The numerical results confirm the performance of the proposed framework under different daytime periods, loads, and temperatures.
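    The federation layer is described here only at a high level; the sketch below shows one plausible aggregation step, plain federated averaging (FedAvg) over the local agents' network parameters. The use of PyTorch state_dicts and the averaging rule are assumptions for illustration, not the paper's exact protocol.

```python
# Sketch: federated averaging of local BEMS agents' network weights at the EMS federation layer.
import copy
import torch

def federated_average(local_state_dicts):
    """Average the parameters of several local agents into one global model."""
    global_state = copy.deepcopy(local_state_dicts[0])
    for key in global_state:
        stacked = torch.stack([sd[key].float() for sd in local_state_dicts])
        global_state[key] = stacked.mean(dim=0).to(local_state_dicts[0][key].dtype)
    return global_state

# Assumed federation round:
#   1. each BEMS agent trains its local discretized-SAC networks on its building's data
#   2. agents send their parameters to the EMS federation layer (raw data stays local)
#   3. global_state = federated_average([agent.policy.state_dict() for agent in agents])
#   4. each agent loads global_state and continues local training
```

    Because only model parameters leave the building, the raw consumption data never has to be shared, which is the privacy argument made above.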

    Adversarial Sample Generation using the Euclidean Jacobian-based Saliency Map Attack (EJSMA) and Classification for IEEE 802.11 using the Deep Deterministic Policy Gradient (DDPG)

    Wireless networking is one of today's most promising developments, as it enables people across the globe to stay connected. Because the transmission medium of wireless networks is open, there are potential issues in safeguarding the privacy of information. Although several security protocols exist in the literature for the preservation of information, most fail against a simple spoofing attack. Intrusion detection systems are therefore vital in wireless networks, as they help identify harmful traffic. One of the challenges in wireless intrusion detection systems (WIDS) is finding a balance between accuracy and false alarm rate. The purpose of this study is to provide a practical classification scheme for newer forms of attack. The experiments use the AWID dataset, and the study proposes a feature selection strategy that combines Elastic Net and recursive feature elimination. The best feature subset is obtained with 22 features, and a deep deterministic policy gradient learning algorithm is then used to classify attacks based on those features. Adversarial samples are generated using the Euclidean Jacobian-based Saliency Map Attack (EJSMA) to evaluate classification outcomes under adversarial conditions. The meta-analysis reveals improved results in terms of feature production (22 features), classification accuracy (98.75% for testing samples and 85.24% for adversarial samples), and false alarm rate (0.35%).
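    The exact EJSMA scoring rule is defined in the paper; the following is only a JSMA-style sketch in which the most salient feature (by gradient magnitude) is perturbed toward the target class under a Euclidean (L2) distance budget. The model interface, step size, and budget are assumed for the example.

```python
# Sketch: Jacobian/saliency-based adversarial perturbation with an L2 (Euclidean) budget.
# Not the paper's EJSMA definition; an illustrative variant only.
import torch

def saliency_attack(model, x, target_class, eps=0.05, max_l2=0.5, steps=50):
    """Iteratively perturb the most salient feature toward the target class,
    stopping when the Euclidean distance budget is exhausted."""
    x_adv = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        logits = model(x_adv.unsqueeze(0)).squeeze(0)
        grad = torch.autograd.grad(logits[target_class], x_adv)[0]   # d(target logit)/d(input)
        feature = torch.argmax(grad.abs())                           # most salient input feature
        with torch.no_grad():
            x_adv[feature] += eps * torch.sign(grad[feature])
            if torch.norm(x_adv - x) > max_l2:                       # enforce Euclidean budget
                break
        x_adv = x_adv.detach().requires_grad_(True)
    return x_adv.detach()
```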

    UAV Maneuvering Target Tracking in Uncertain Environments based on Deep Reinforcement Learning and Meta-learning

    This paper combines Deep Reinforcement Learning (DRL) with Meta-learning and proposes a novel approach, named Meta Twin Delayed Deep Deterministic policy gradient (Meta-TD3), for the control of an Unmanned Aerial Vehicle (UAV), allowing the UAV to quickly track a target whose motion is uncertain. This approach can be applied to a variety of scenarios, such as wildlife protection, emergency aid, and remote sensing. We use a multi-task experience replay buffer to provide data for multi-task learning of the DRL algorithm, and we combine it with Meta-learning to develop a multi-task reinforcement learning update method that ensures the generalization capability of reinforcement learning. Compared with the state-of-the-art algorithms Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic policy gradient (TD3), experimental results show that Meta-TD3 achieves a substantial improvement in terms of both convergence value and convergence rate. In a UAV target tracking problem, Meta-TD3 requires only a few training steps to enable the UAV to adapt quickly to a new target movement mode and maintain better tracking effectiveness.
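    The precise Meta-TD3 update is given in the paper; the sketch below only illustrates the general structure of a multi-task meta-update in a Reptile-style form: adapt a copy of the shared policy on each task's replay data, then move the meta-parameters toward the average of the adapted parameters. The function names, the inner-update callable, and the interpolation rule are assumptions.

```python
# Sketch: Reptile-style meta-update over per-task replay data (illustrative, not the paper's rule).
import copy
import torch
import torch.nn as nn

def meta_update(policy: nn.Module, task_batches, inner_update, inner_steps=10, meta_lr=0.1):
    """policy: shared actor network; task_batches: one iterable of mini-batches per task;
    inner_update(policy, batch): assumed callable doing one task-specific (e.g. TD3) gradient step."""
    meta_params = copy.deepcopy(policy.state_dict())
    deltas = {k: torch.zeros_like(v, dtype=torch.float32) for k, v in meta_params.items()}
    for batches in task_batches:
        policy.load_state_dict(meta_params)                 # start each task from the meta-parameters
        for _, batch in zip(range(inner_steps), batches):
            inner_update(policy, batch)                     # task-specific adaptation
        adapted = policy.state_dict()
        for k in deltas:
            deltas[k] += adapted[k].float() - meta_params[k].float()
    for k in meta_params:                                   # move toward the average adapted parameters
        new_val = meta_params[k].float() + meta_lr * deltas[k] / len(task_batches)
        meta_params[k] = new_val.to(meta_params[k].dtype)
    policy.load_state_dict(meta_params)
```

    The per-task replay buffers correspond to different target movement modes, which is what lets the meta-trained policy adapt to a new mode in only a few steps.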

    Deep Reinforcement Learning for Computation Offloading in Mobile Edge Computing

    As 5G networks are deployed worldwide, mobile edge computing (MEC) has been developed to relieve applications of resource-intensive computations. IoT devices can offload their computation to an MEC server and receive the computed result. This offloading scheme can be viewed as an optimization problem whose complexity increases quickly as more devices join the system. In this thesis, we solve the optimization problem and introduce different strategies that are compared to the optimal solution. The strategies implemented are full local computing, full offloading to an MEC server, random search, the optimal solution, Q-learning, and a deep Q-network (DQN). The main objective of each strategy is to minimize the total cost of the system, where the cost is a combination of energy consumption and delay. However, as the number of devices in the system increases, the results reveal numerous challenges. This thesis shows that the performance of the random search, Q-learning, and DQN strategies is very close to the optimal solution for up to 20 devices, but generally poor for the strategies that must handle more than 20 devices. Finally, we further discuss the performance and convergence of a DQN in MEC.
    Master's thesis in Informatics (INF399, MAMN-INF, MAMN-PRO).
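    One way to obtain the "optimal solution" baseline for a small number of devices is to enumerate all binary offload decisions, which scales as 2^N and quickly becomes intractable as more devices join the system. The cost model below (a weighted sum of energy and delay, with no MEC-server congestion) and its parameters are illustrative assumptions, not the thesis's implementation.

```python
# Sketch: brute-force optimal offloading baseline with an assumed energy-plus-delay cost model.
from itertools import product

def device_cost(offload, local_energy, local_delay, tx_energy, mec_delay,
                w_energy=0.5, w_delay=0.5):
    """Cost of one device: weighted energy and delay, either computed locally or offloaded."""
    if offload:
        return w_energy * tx_energy + w_delay * mec_delay
    return w_energy * local_energy + w_delay * local_delay

def optimal_offloading(devices):
    """devices: list of (local_energy, local_delay, tx_energy, mec_delay) tuples.
    Enumerates all 2^N offload decisions, so it is only feasible for small N."""
    best_cost, best_plan = float("inf"), None
    for plan in product([0, 1], repeat=len(devices)):
        cost = sum(device_cost(o, *d) for o, d in zip(plan, devices))
        if cost < best_cost:
            best_cost, best_plan = cost, plan
    return best_plan, best_cost

# Example with three devices: (local_energy, local_delay, tx_energy, mec_delay)
plan, cost = optimal_offloading([(2.0, 1.5, 0.5, 0.8), (1.0, 0.4, 0.6, 0.9), (3.0, 2.0, 0.7, 1.0)])
```

    Q-learning and DQN approximate this baseline by learning offload decisions from interaction rather than enumeration; as noted above, keeping them close to optimal beyond roughly 20 devices is the remaining difficulty.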

    Training & acceleration of deep reinforcement learning agents

    National Technical University of Athens -- Master's Thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Programme (D.P.M.S.) "Data Science and Machine Learning".

    Prioritized experience replay based deep distributional reinforcement learning for battery operation in microgrids

    This is the author accepted manuscript; the final version is available on open access from Elsevier via the DOI in this record. Data availability: data will be made available on request. Funded by the European Regional Development Fund.
    Reinforcement Learning (RL) provides a pathway for efficiently utilizing battery storage in a microgrid. However, traditional value-based RL algorithms used in battery management formulate policies from the expectation of the reward rather than its probability distribution, so the scheduling strategy is based solely on expected rewards. This paper focuses on a scheduling strategy based on the probability distribution of the rewards, which better reflects the uncertainties in the incoming dataset. Furthermore, prioritized experience replay sampling of the training experience is used to enhance the quality of learning by reducing bias. Results are obtained with different variants of distributional RL algorithms, namely C51, Quantile Regression Deep Q-Network (QR-DQN), Fully parameterized Quantile Function (FQF), Implicit Quantile Networks (IQN), and Rainbow, and are compared with a traditional deep Q-learning algorithm with prioritized experience replay. The convergence results on the training dataset are further analyzed by varying the action spaces, using randomized experience replay, and excluding the tariff-based action while enforcing penalties for violating battery state-of-charge (SoC) limits. The best trained Q-network is tested with different load and PV profiles to obtain the battery operation and costs. The performance of the distributional RL algorithms is analyzed under different Time of Use (ToU) tariff schemes. QR-DQN with prioritized experience replay is found to be the best performing algorithm in terms of convergence on the training dataset, with the least fluctuation on the validation dataset and in battery operation across the different tariff regimes during the day.
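    A common proportional formulation of prioritized experience replay is sketched below (priority proportional to the absolute TD error, corrected with importance-sampling weights). Whether the paper uses the proportional or rank-based variant is not stated in the abstract, and the buffer size, alpha, and beta here are chosen only for illustration.

```python
# Sketch: proportional prioritized experience replay (priority ~ |TD error|),
# with importance-sampling weights to reduce the bias introduced by non-uniform sampling.
import numpy as np

class PrioritizedReplayBuffer:
    def __init__(self, capacity=10_000, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.storage, self.priorities, self.pos = [], np.zeros(capacity), 0

    def add(self, transition):
        max_p = self.priorities.max() if self.storage else 1.0   # new samples get the current max priority
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.pos] = transition
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        p = self.priorities[:len(self.storage)] ** self.alpha
        p /= p.sum()
        idx = np.random.choice(len(self.storage), batch_size, p=p)
        weights = (len(self.storage) * p[idx]) ** (-beta)        # importance-sampling correction
        weights /= weights.max()
        return [self.storage[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        self.priorities[idx] = np.abs(td_errors) + eps           # priority proportional to |TD error|
```

    In the distributional setting, the "TD error" used for the priority is typically derived from the distributional loss (for example, the quantile regression loss in QR-DQN) rather than from a scalar Bellman error.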