6,989 research outputs found
Advancing the Applicability of Reinforcement Learning to Autonomous Control
With data-efficient reinforcement learning (RL) methods, impressive
results could be achieved, e.g., in the context of gas turbine control.
However, in practice the application of RL still requires much human
intervention, which hinders the application of RL to autonomous control.
This thesis addresses some of the remaining problems, particularly regarding
the reliability of the policy generation process.
The thesis first discusses RL problems with discrete state and action
spaces. In that context, often an MDP is estimated from observations. It is
described how to incorporate the estimators' uncertainties into the policy
generation process. This information can then be used to reduce the risk of
obtaining a poor policy due to flawed MDP estimates. Moreover, it is
discussed how to use the knowledge of uncertainty for efficient exploration
and the assessment of policy quality without requiring the policy's
execution.
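As a rough illustration of the estimation step described above, the following
minimal sketch estimates transition probabilities from observed transitions
and attaches a standard error to each estimate. It is not the thesis's
method: the function name `estimate_transitions`, the toy observations, and
the binomial standard-error formula are assumptions chosen for illustration.

```python
from collections import defaultdict
from math import sqrt


def estimate_transitions(observations):
    """Estimate P(s_next | s, a) from (s, a, s_next) tuples.

    Returns {(s, a): {s_next: (p_hat, std_err)}}, where std_err is the
    binomial standard error sqrt(p_hat * (1 - p_hat) / n). This is an
    illustrative choice of uncertainty measure, not the one from the thesis.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for s, a, s_next in observations:
        counts[(s, a)][s_next] += 1

    model = {}
    for state_action, successors in counts.items():
        n = sum(successors.values())  # number of visits to (s, a)
        model[state_action] = {}
        for s_next, c in successors.items():
            p_hat = c / n
            model[state_action][s_next] = (p_hat, sqrt(p_hat * (1 - p_hat) / n))
    return model


# Hypothetical observations: four visits to (0, "a"), one visit to (1, "a").
observations = [(0, "a", 1), (0, "a", 1), (0, "a", 0), (0, "a", 1), (1, "a", 1)]
model = estimate_transitions(observations)
for state_action, successors in model.items():
    print(state_action, successors)
```

Rarely visited state-action pairs yield coarse estimates, so a policy
generation step that knows the standard errors could, for instance, treat
those pairs more cautiously or prefer them for exploration.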
The thesis then moves on to continuous state problems and focuses on
methods based on fitted Q-iteration (FQI), particularly neural fitted
Q-iteration (NFQ). Although NFQ has proven to be very data-efficient, it is
not as reliable as required for autonomous control. The thesis proposes to
use ensembles to increase reliability. Several ways of ensemble usage in an
NFQ context are discussed and evaluated on a number of benchmark domains.
The results show that, in all considered domains, ensembles produce good
policies more reliably.
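The idea of combining several fitted Q-functions can be sketched as follows.
This is a minimal illustration under stated assumptions, not the thesis's
implementation: a lookup-table regressor (`TableRegressor`, hypothetical)
stands in for the neural network used in NFQ, the toy chain problem is
invented, each member is trained on a different subset of the data, and the
ensemble acts greedily on the averaged Q-values, which is only one of several
possible ways to use an ensemble.

```python
from collections import defaultdict

GAMMA = 0.9
ACTIONS = (-1, +1)


class TableRegressor:
    """Averages targets per input; a trivial stand-in for a neural network."""

    def fit(self, inputs, targets):
        sums, counts = defaultdict(float), defaultdict(int)
        for x, y in zip(inputs, targets):
            sums[x] += y
            counts[x] += 1
        self.table = {x: sums[x] / counts[x] for x in sums}
        return self

    def predict(self, x):
        return self.table.get(x, 0.0)  # unseen inputs default to 0


def fitted_q_iteration(transitions, n_iterations=20):
    """Batch FQI: repeatedly regress Q(s, a) onto r + gamma * max_b Q(s', b)."""
    q = TableRegressor().fit([], [])
    for _ in range(n_iterations):
        inputs = [(s, a) for s, a, r, s_next, done in transitions]
        targets = [
            r + (0.0 if done else GAMMA * max(q.predict((s_next, b)) for b in ACTIONS))
            for s, a, r, s_next, done in transitions
        ]
        q = TableRegressor().fit(inputs, targets)
    return q


def ensemble_greedy_action(members, state):
    """Pick the action with the highest Q-value averaged over all members."""
    def mean_q(action):
        return sum(m.predict((state, action)) for m in members) / len(members)
    return max(ACTIONS, key=mean_q)


# Toy chain: states 0..4, reward 1 on reaching the terminal state 4.
transitions = []
for s in range(4):
    for a in ACTIONS:
        s_next = max(0, s + a)
        transitions.append((s, a, float(s_next == 4), s_next, s_next == 4))

# Each member is trained without a different quarter of the transitions.
n_members = 4
members = [
    fitted_q_iteration([t for i, t in enumerate(transitions) if i % n_members != k])
    for k in range(n_members)
]
policy = [ensemble_greedy_action(members, s) for s in range(4)]
print(policy)
```

In this toy setup, the member trained without the rewarding transition learns
nothing useful on its own, yet the averaged ensemble still moves toward the
goal from every state.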
Next, policy assessment in continuous domains is discussed. The thesis
proposes to use fitted policy evaluation (FPE), an adaptation of FQI to
policy evaluation, combined with a different function approximator and/or a
different dataset to obtain a measure for policy quality. Results of
experiments show that extra-tree FPE, applied to policies generated by NFQ,
produces value functions that can well be used to reason about the true
policy quality.
Finally, the thesis combines ensembles and policy assessment to derive
methods that can deal with changing environments. The major contribution is
the evolving ensemble. The policy of the evolving ensemble changes slowly as
new policies are added and old policies removed. It turns out that the
evolving ensemble approaches work considerably better than simpler
approaches like single policies learned with recent observations or simple
ensembles.
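The add-and-remove mechanism of an evolving ensemble can be pictured with the
following minimal sketch. It rests on assumptions that the abstract does not
confirm: the class `EvolvingEnsemble`, the majority vote, the fixed capacity,
and the rule of dropping the lowest-scored member are all hypothetical, and
the scores are placeholders for whatever policy assessment is available.

```python
from collections import Counter


class EvolvingEnsemble:
    """Fixed-capacity set of policies whose joint behavior changes gradually."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.members = []  # list of (policy, score) pairs

    def add(self, policy, score):
        """Add a policy; if over capacity, drop the lowest-scored member."""
        self.members.append((policy, score))
        if len(self.members) > self.max_size:
            worst = min(range(len(self.members)), key=lambda i: self.members[i][1])
            del self.members[worst]

    def act(self, state):
        """Return the action chosen by the majority of member policies."""
        votes = Counter(policy(state) for policy, _ in self.members)
        return votes.most_common(1)[0][0]


# Hypothetical policies and quality scores for illustration only.
ensemble = EvolvingEnsemble(max_size=3)
ensemble.add(lambda state: "left", score=0.2)
ensemble.add(lambda state: "left", score=0.3)
ensemble.add(lambda state: "right", score=0.9)
action_before = ensemble.act(state=0)

# A new policy arrives; the weakest old policy (score 0.2) is removed.
ensemble.add(lambda state: "right", score=0.8)
action_after = ensemble.act(state=0)
print(action_before, action_after)
```

Because only one member is exchanged at a time, the majority decision shifts
only once enough newer policies agree, which is one way to read "changes
slowly" in the abstract.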
Learning a model is paramount for sample efficiency in reinforcement learning control of PDEs
The goal of this paper is to make a strong point for the usage of dynamical
models when using reinforcement learning (RL) for feedback control of dynamical
systems governed by partial differential equations (PDEs). To bridge the gap
between the immense promises we see in RL and the applicability in complex
engineering systems, the main challenges are the massive requirements in terms
of the training data, as well as the lack of performance guarantees. We present
a solution for the first issue using a data-driven surrogate model in the form
of a convolutional LSTM with actuation. We demonstrate that learning an
actuated model in parallel to training the RL agent significantly reduces the
total amount of required data sampled from the real system. Furthermore, we
show that iteratively updating the model is of major importance to avoid biases
in the RL training. Detailed ablation studies reveal the most important
ingredients of the modeling process. We use the chaotic Kuramoto-Sivashinsky
equation to demonstrate our findings.
Performance Analysis Of Data-Driven Algorithms In Detecting Intrusions On Smart Grid
The traditional power grid is no longer a practical solution for power delivery due to several shortcomings, including chronic blackouts, energy storage issues, high cost of assets, and high carbon emissions. Therefore, there is a serious need for better, cheaper, and cleaner power grid technology that addresses these limitations. A smart grid is a holistic solution to these issues that consists of a variety of operations and energy measures. This technology can deliver energy to end-users through a two-way flow of communication. It is expected to generate reliable, efficient, and clean power by integrating multiple technologies. It promises reliability, improved functionality, and economical means of power transmission and distribution. This technology also decreases greenhouse emissions by transferring clean, affordable, and efficient energy to users. Smart grids provide several benefits, such as increased grid resilience, self-healing, and improved system performance. Despite these benefits, this network has been the target of a number of cyber-attacks that violate the availability, integrity, confidentiality, and accountability of the network. For instance, in 2021, a cyber-attack targeted a U.S. power system and shut down the power grid, leaving approximately 100,000 people without power. Another incident on U.S. smart grids, in March 2018, targeted multiple nuclear power plants and water equipment. These instances show why strong security approaches are needed in smart grids to detect and mitigate sophisticated cyber-attacks. For this purpose, the US National Electric Sector Cybersecurity Organization and the Department of Energy have joined their efforts with other federal agencies, including the Cybersecurity for Energy Delivery Systems and the Federal Energy Regulatory Commission, to investigate the security risks of smart grid networks. 
Their investigation shows that smart grids require reliable solutions to defend against and prevent cyber-attacks and vulnerability issues. This investigation also shows that with emerging technologies, including 5G and 6G, smart grids may become more vulnerable to multistage cyber-attacks. A number of studies have been done to identify, detect, and investigate the vulnerabilities of smart grid networks. However, the existing techniques have fundamental limitations, such as low detection rates, high rates of false positives, high rates of misdetection, data poisoning, data quality and processing, lack of scalability, and issues regarding handling huge volumes of data. As a result, these techniques cannot ensure safe, efficient, and dependable communication for smart grid networks. The goal of this dissertation is therefore to investigate the efficiency of machine learning in detecting cyber-attacks on smart grids. The proposed methods are based on supervised and unsupervised machine learning, deep learning, reinforcement learning, and online learning models. These models have to be trained, tested, and validated using a reliable dataset. In this dissertation, CICDDoS 2019 was used to train, test, and validate the efficiency of the proposed models. The results show that, for supervised machine learning models, the ensemble models outperform other traditional models. Among the deep learning models, the dense neural network family provides satisfactory results for detecting and classifying intrusions on smart grids. Among unsupervised models, the variational auto-encoder provides the highest performance. In reinforcement learning, the proposed Capsule Q-learning provides higher detection and lower misdetection rates compared to the other model in the literature. In online learning, the Online Sequential Euclidean Distance Routing Capsule Network model provides significantly better results in detecting intrusion attacks on smart grids compared to the other deep online models.
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research for the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of the building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimation, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large-scale labeled data to train and validate the model, (2) lack of model transferability, which prevents a model trained on one data-rich building from being used in another building with limited data, (3) lack of strong justification of the costs and benefits of deploying machine learning, and (4) performance that might not be reliable and robust for the stated goals, as a method might work for some buildings but not generalize to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, and can inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science.