PASTA: Pretrained Action-State Transformer Agents
Self-supervised learning has brought about a revolutionary paradigm shift in
various computing domains, including NLP, vision, and biology. Recent
approaches involve pre-training transformer models on vast amounts of unlabeled
data, serving as a starting point for efficiently solving downstream tasks. In
the realm of reinforcement learning, researchers have recently adapted these
approaches by developing models pre-trained on expert trajectories, enabling
them to address a wide range of tasks, from robotics to recommendation systems.
However, existing methods mostly rely on intricate pre-training objectives
tailored to specific downstream applications. This paper presents a
comprehensive investigation of models we refer to as Pretrained Action-State
Transformer Agents (PASTA). Our study uses a unified methodology and covers an
extensive set of general downstream tasks including behavioral cloning, offline
RL, sensor failure robustness, and dynamics change adaptation. Our goal is to
systematically compare various design choices and provide valuable insights to
practitioners for building robust models. Key highlights of our study include
tokenization at the action and state component level, using fundamental
pre-training objectives like next token prediction, training models across
diverse domains simultaneously, and using parameter efficient fine-tuning
(PEFT). The developed models in our study contain fewer than 10 million
parameters and the application of PEFT enables fine-tuning of fewer than 10,000
parameters during downstream adaptation, allowing a broad community to use
these models and reproduce our experiments. We hope that this study will
encourage further research into the use of transformers with first-principles
design choices to represent RL trajectories and contribute to robust policy
learning.
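As a rough illustration of what tokenization "at the action and state component level" could look like, the sketch below flattens a trajectory so that every scalar component of each state and action becomes its own token. The function name `tokenize_trajectory` and the tuple layout are hypothetical choices made for this example; the abstract does not specify the actual token format or embedding scheme.

```python
from typing import List, Sequence, Tuple

# Hypothetical token layout: (kind, timestep, component index, value),
# where kind is "s" for a state component and "a" for an action component.
Token = Tuple[str, int, int, float]


def tokenize_trajectory(
    states: Sequence[Sequence[float]],
    actions: Sequence[Sequence[float]],
) -> List[Token]:
    """Flatten a trajectory into one token per scalar state/action component."""
    tokens: List[Token] = []
    for t, (state, action) in enumerate(zip(states, actions)):
        # Each state component becomes its own token...
        for i, value in enumerate(state):
            tokens.append(("s", t, i, float(value)))
        # ...followed by each action component of the same timestep.
        for j, value in enumerate(action):
            tokens.append(("a", t, j, float(value)))
    return tokens


if __name__ == "__main__":
    demo = tokenize_trajectory([[0.1, 0.2], [0.3, 0.4]], [[1.0], [2.0]])
    for token in demo:
        print(token)
```

With this layout, a next-token-prediction objective would operate over individual components rather than whole state or action vectors, so the sequence length grows with the state and action dimensions.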
Guarantees on Robot System Performance Using Stochastic Simulation Rollouts
We provide finite-sample performance guarantees for control policies executed
on stochastic robotic systems. Given an open- or closed-loop policy and a
finite set of trajectory rollouts under the policy, we bound the expected
value, value-at-risk, and conditional-value-at-risk of the trajectory cost, and
the probability of failure in a sparse rewards setting. The bounds hold, with
user-specified probability, for any policy synthesis technique and can be seen
as a post-design safety certification. Generating the bounds only requires
sampling simulation rollouts, without assumptions on the distribution or
complexity of the underlying stochastic system. We adapt these bounds to also
give a constraint satisfaction test to verify safety of the robot system.
Furthermore, we extend our method to apply when selecting the best policy from
a set of candidates, requiring a multi-hypothesis correction. We show the
statistical validity of our bounds in the Ant, Half-cheetah, and Swimmer MuJoCo
environments and demonstrate our constraint satisfaction test with the Ant.
Finally, using the 20 degree-of-freedom MuJoCo Shadow Hand, we show the
necessity of the multi-hypothesis correction.
Comment: Submitted to IEEE-TR
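To make the idea of a finite-sample guarantee from rollouts concrete, here is a minimal sketch for the failure-probability case only. It uses the textbook Hoeffding inequality with a Bonferroni correction as stand-ins; the abstract does not say which bounds or which multi-hypothesis correction the authors use, so this should not be read as their method. The function names are hypothetical.

```python
import math
from typing import List, Tuple


def hoeffding_upper_bound(failures: int, n: int, delta: float) -> float:
    """Upper bound on the true failure probability from n i.i.d. rollouts.

    Holds with probability at least 1 - delta by Hoeffding's inequality,
    since each rollout outcome is a {0, 1} indicator of failure.
    """
    p_hat = failures / n
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return min(1.0, p_hat + slack)


def best_policy_bound(
    failure_counts: List[int], n: int, delta: float
) -> Tuple[int, float]:
    """Pick the policy with the smallest bound among several candidates.

    A Bonferroni correction (delta / k per policy) keeps the overall
    confidence at 1 - delta even though the best candidate is selected
    after looking at the data.
    """
    k = len(failure_counts)
    per_policy_delta = delta / k
    bounds = [hoeffding_upper_bound(f, n, per_policy_delta) for f in failure_counts]
    best = min(range(k), key=lambda i: bounds[i])
    return best, bounds[best]


if __name__ == "__main__":
    print(hoeffding_upper_bound(failures=0, n=1000, delta=0.05))
    print(best_policy_bound([5, 2, 9], n=1000, delta=0.05))
```

The corrected bound for the selected policy is looser than the bound that policy would get if it were evaluated on its own, which is the price paid for choosing among candidates using the same rollouts.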
Improving Electricity Distribution System State Estimation with AMR-Based Load Profiles
The ongoing battle against global warming is rapidly increasing the amount of renewable power generation, and smart solutions are needed to integrate these new generation units into the existing distribution systems. Smart grids answer this call by introducing intelligent ways of controlling the network and the active resources connected to it. However, before the network can be controlled, the automation system must know the node voltages and line currents that define the network state. Distribution system state estimation (DSSE) is needed to find the most likely state of the network when the number and accuracy of measurements are limited. Typically, two types of measurements are used in DSSE: real-time measurements and pseudo-measurements. In recent years, finding cost-efficient ways to improve DSSE accuracy has been a popular subject in the literature. While others have focused on optimizing the type, amount and location of real-time measurements, the main hypothesis of this thesis is that DSSE accuracy can be enhanced by using interval measurements collected with automatic meter reading (AMR) to improve the load profiles used as pseudo-measurements. The work done in this thesis can be divided into three stages. In the first stage, methods for creating new AMR-based load profiles are studied. AMR measurements from thousands of customers are used to test and compare the different options for improving load profiling accuracy. Different clustering algorithms are tested and a novel two-stage clustering method for load profiling is developed. In the second stage, a DSSE algorithm suited to a smart grid environment is developed. Simulations and real-life demonstrations are conducted to verify the accuracy and applicability of the developed state estimator. In the third and final stage, the AMR-based load profiling and DSSE are combined. MATLAB simulations with real AMR data and a real distribution network model are made, and the developed load profiles are compared with other commonly used pseudo-measurements. The results indicate that clustering is an efficient way to improve load profiling accuracy. With the help of clustering, both the customer classification and the customer class load profiles can be updated simultaneously. Several of the tested clustering algorithms were suited to clustering electricity customers, but the best results were achieved with a modified k-means algorithm. Results from the third-stage simulations supported the main hypothesis that the new AMR-based load profiles improve DSSE accuracy. The results presented in this thesis should motivate distribution system operators and other actors in the field of electricity distribution to utilize AMR data and clustering algorithms in load profiling. Doing so improves not only DSSE accuracy but also many other functions that rely on load flow calculation and need accurate load estimates or forecasts.
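The statement that clustering updates the customer classification and the class load profiles at the same time can be illustrated with plain k-means, sketched below. This is the standard algorithm, not the modified k-means or the two-stage method developed in the thesis, whose details the abstract does not give. The function name, the toy four-value profiles, and the deterministic initialization from the first `k` profiles are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def kmeans_load_profiles(
    profiles: Sequence[Sequence[float]], k: int, iters: int = 20
) -> Tuple[List[int], List[List[float]]]:
    """Cluster customer load profiles with standard k-means.

    Returns (labels, centroids): labels give each customer's class, and
    centroids are the mean profiles that serve as class load profiles.
    """
    # Assumption: initialize from the first k profiles for reproducibility.
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iters):
        # Assignment step: each customer joins the nearest class profile.
        for i, p in enumerate(profiles):
            labels[i] = min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
        # Update step: each class profile becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Toy profiles: two evening-peaking and two morning-peaking customers.
    data = [
        [1.0, 1.0, 5.0, 5.0],
        [5.0, 5.0, 1.0, 1.0],
        [1.2, 0.8, 5.1, 4.9],
        [4.8, 5.2, 0.9, 1.1],
    ]
    labels, centroids = kmeans_load_profiles(data, k=2)
    print(labels)
    print(centroids)
```

The returned labels play the role of the customer classification, and the centroids play the role of the class load profiles that would feed into DSSE as pseudo-measurements.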