PASTA: Pretrained Action-State Transformer Agents
Self-supervised learning has brought about a revolutionary paradigm shift in
various computing domains, including NLP, vision, and biology. Recent
approaches involve pre-training transformer models on vast amounts of unlabeled
data, serving as a starting point for efficiently solving downstream tasks. In
the realm of reinforcement learning, researchers have recently adapted these
approaches by developing models pre-trained on expert trajectories, enabling
them to address a wide range of tasks, from robotics to recommendation systems.
However, existing methods mostly rely on intricate pre-training objectives
tailored to specific downstream applications. This paper presents a
comprehensive investigation of models we refer to as Pretrained Action-State
Transformer Agents (PASTA). Our study uses a unified methodology and covers an
extensive set of general downstream tasks including behavioral cloning, offline
RL, sensor failure robustness, and dynamics change adaptation. Our goal is to
systematically compare various design choices and provide valuable insights to
practitioners for building robust models. Key highlights of our study include
tokenization at the action and state component level, using fundamental
pre-training objectives like next token prediction, training models across
diverse domains simultaneously, and using parameter efficient fine-tuning
(PEFT). The developed models in our study contain fewer than 10 million
parameters and the application of PEFT enables fine-tuning of fewer than 10,000
parameters during downstream adaptation, allowing a broad community to use
these models and reproduce our experiments. We hope that this study will
encourage further research into the use of transformers with first-principles
design choices to represent RL trajectories and contribute to robust policy
learning.
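As a rough illustration of what component-level tokenization with next token prediction could look like, the sketch below flattens a trajectory so that every scalar component of a state or action becomes its own token, then builds (context, target) pairs. The function names and the tuple layout of a token are hypothetical choices made for this example; the abstract does not specify the authors' data format or model.

```python
def tokenize_components(trajectory):
    """Flatten (state, action) pairs into one token per scalar component.

    Each token is a hypothetical tuple: (kind, timestep, component index, value).
    """
    tokens = []
    for t, (state, action) in enumerate(trajectory):
        tokens += [("state", t, i, x) for i, x in enumerate(state)]
        tokens += [("action", t, i, x) for i, x in enumerate(action)]
    return tokens


def next_token_pairs(tokens):
    """Build (context, target) pairs: predict token k from tokens[:k]."""
    return [(tokens[:k], tokens[k]) for k in range(1, len(tokens))]


# Two timesteps, a 2-dimensional state and a 1-dimensional action (made-up values).
trajectory = [([0.1, 0.2], [1.0]), ([0.3, 0.4], [0.0])]
tokens = tokenize_components(trajectory)
pairs = next_token_pairs(tokens)
```

A real model would additionally discretize or embed the values; that step is left out here.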
Survey Assessment for Decision Support Using Self-Organizing Maps Profile Characterization with an Odds and Cluster Heat Map: Application to Children’s Perception of Urban School Environments
The interpretation of opinion and satisfaction surveys based exclusively on statistical analysis often faces difficulties due to the nature of the information and the requirements of the available statistical methods. These difficulties include the concurrence of categorical information with answers based on Likert scales with only a few levels, or a disconnect from the heuristic approach that a decision support system (DSS) requires. The artificial neural network known as the Kohonen map or self-organizing map (SOM), although rarely used for survey analysis, has been applied in many fields, facilitating the graphical representation and simple interpretation of high-dimensional data. This clustering method, based on unsupervised learning, also yields respondent profiles without requiring additional information to create the clusters. In this work, we propose the identification of profiles using SOM for evaluating opinion surveys. Subsequently, non-parametric chi-square tests were conducted to test whether each answer was independent of the profiles found, and in the case of statistical significance (p ≤ 0.05), the odds ratio was evaluated as an indicator of the effect size of such dependence. Finally, all results were displayed in an odds and cluster heat map so that they could be easily interpreted and used to make decisions regarding the survey results. The methodology was applied to the analysis of a survey based on forms administered to children (N = 459) about their perception of the urban environment close to their school, obtaining relevant results, facilitating their interpretation, and supporting the decision process.
This research was funded by Campus de Excelencia Internacional BIOTIC Granada, University of Granada, grant number V1.2015, and the APC was funded by the University of Granada.
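The test-then-effect-size step described in this abstract can be sketched for the simplest case, a 2x2 table of one profile against one answer. This is a simplification assumed for illustration (the study's tables may have more categories), and the counts below are invented. For one degree of freedom, the chi-square p-value equals erfc(sqrt(chi2 / 2)), so no external library is needed.

```python
import math


def chi2_and_odds_ratio(a, b, c, d):
    """Pearson chi-square test and odds ratio for a 2x2 table.

    Rows: in profile / not in profile. Columns: gave answer / did not.
    Assumes every cell count is positive.
    """
    n = a + b + c + d
    observed = ((a, b), (c, d))
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 d.o.f.
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Invented counts: 30 of 40 respondents in the profile gave the answer,
# against 20 of 60 respondents outside it.
chi2, p_value, odds_ratio = chi2_and_odds_ratio(30, 10, 20, 40)
if p_value <= 0.05:
    print(f"dependent on profile, odds ratio = {odds_ratio:.1f}")
```

Following the abstract, the odds ratio is only reported when the test is significant.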
Guarantees on Robot System Performance Using Stochastic Simulation Rollouts
We provide finite-sample performance guarantees for control policies executed
on stochastic robotic systems. Given an open- or closed-loop policy and a
finite set of trajectory rollouts under the policy, we bound the expected
value, value-at-risk, and conditional-value-at-risk of the trajectory cost, and
the probability of failure in a sparse rewards setting. The bounds hold, with
user-specified probability, for any policy synthesis technique and can be seen
as a post-design safety certification. Generating the bounds only requires
sampling simulation rollouts, without assumptions on the distribution or
complexity of the underlying stochastic system. We adapt these bounds to also
give a constraint satisfaction test to verify safety of the robot system.
Furthermore, we extend our method to apply when selecting the best policy from
a set of candidates, requiring a multi-hypothesis correction. We show the
statistical validity of our bounds in the Ant, Half-cheetah, and Swimmer MuJoCo
environments and demonstrate our constraint satisfaction test with the Ant.
Finally, using the 20 degree-of-freedom MuJoCo Shadow Hand, we show the
necessity of the multi-hypothesis correction.
Comment: Submitted to IEEE-TR
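To make the idea of a finite-sample guarantee from rollouts concrete, here is a minimal sketch for the failure-probability case. It uses the one-sided Hoeffding inequality and a Bonferroni (union bound) correction, which are standard textbook tools swapped in for illustration and not necessarily the bounds derived in the paper. It assumes independent rollouts, and the function names are hypothetical.

```python
import math


def failure_upper_bound(failures, rollouts, delta):
    """Upper bound on the true failure probability.

    Holds with probability at least 1 - delta over i.i.d. rollouts
    (one-sided Hoeffding inequality for a Bernoulli outcome).
    """
    if rollouts <= 0 or not 0 <= failures <= rollouts:
        raise ValueError("need 0 <= failures <= rollouts and rollouts > 0")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * rollouts))
    return min(1.0, failures / rollouts + slack)


def corrected_delta(delta, num_candidates):
    """Bonferroni correction: split delta across the candidate policies."""
    return delta / num_candidates


# One policy, 200 rollouts, no observed failures, 95% confidence.
single = failure_upper_bound(0, 200, 0.05)
# Best of 5 candidate policies: each bound must hold at the stricter level.
best_of_five = failure_upper_bound(0, 200, corrected_delta(0.05, 5))
```

The corrected bound is looser than the single-policy bound, which is the price paid for selecting the best policy after looking at the rollouts.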
European Strategies for Adaptation to Climate Change with the Mayors Adapt Initiative by Self-Organizing Maps
Featured Application: The methodology developed in this research has direct application in understanding European initiatives and policies for adaptation to climate change through the identification of differentiated strategic adaptation frameworks.
The European Union (EU) has assigned municipal governments a key role in the transformations needed to achieve its climate and energy objectives. One of the main initiatives of the EU has been the “Covenant of Mayors”, launched in 2008, with impacts beyond Europe due to integration with the “Global Covenant of Mayors for Climate and Energy”. This research focuses on local measures to adapt to climate change, examining how they differ from one another, and aims to identify and characterize patterns in the different adaptation strategies examined. Further aims are (i) the collection of good practices, framed in the Mayors Adapt initiative, managing multidimensional data from the context and from its adaptation proposals; (ii) the classification of strategies into profiles and patterns using artificial neural networks based on the previous variables; (iii) the characterization and comparison of such profiles. The results substantiate the existence of several well-differentiated approaches, connected with their geographical context, vulnerability and politics. These results provide valuable information for their interpretation and for the planning of climate change adaptation actions, highlighting the value of creating networks of institutional collaboration targeted at each strategic framework.
This research was funded by Consejería de Economía, Innovación, Ciencia y Empleo, Andalusian Regional Government (Spain), grant number P12-RNM-1514, and the APC was funded by the University of Granada (Spain).
Improving Electricity Distribution System State Estimation with AMR-Based Load Profiles
The ongoing battle against global warming is rapidly increasing the amount of renewable power generation, and smart solutions are needed to integrate these new generation units into the existing distribution systems. Smart grids answer this call by introducing intelligent ways of controlling the network and active resources connected to it. However, before the network can be controlled, the automation system must know the node voltages and line currents that define the network state.
Distribution system state estimation (DSSE) is needed to find the most likely state of the network when the number and accuracy of measurements are limited. Typically, two types of measurements are used in DSSE: real-time measurements and pseudo-measurements. In recent years, finding cost-efficient ways to improve the DSSE accuracy has been a popular subject in the literature. While others have focused on optimizing the type, amount and location of real-time measurements, the main hypothesis of this thesis is that it is possible to enhance the DSSE accuracy by using interval measurements collected with automatic meter reading (AMR) to improve the load profiles used as pseudo-measurements.
The work done in this thesis can be divided into three stages. In the first stage, methods for creating new AMR-based load profiles are studied. AMR measurements from thousands of customers are used to test and compare the different options for improving the load profiling accuracy. Different clustering algorithms are tested and a novel two-stage clustering method for load profiling is developed. In the second stage, a DSSE algorithm suited for the smart grid environment is developed. Simulations and real-life demonstrations are conducted to verify the accuracy and applicability of the developed state estimator. In the third and final stage, the AMR-based load profiling and DSSE are combined. Matlab simulations with real AMR data and a real distribution network model are made and the developed load profiles are compared with other commonly used pseudo-measurements.
The results indicate that clustering is an efficient way to improve the load profiling accuracy. With the help of clustering, both the customer classification and customer class load profiles can be updated simultaneously. Several of the tested clustering algorithms were suited for clustering electricity customers, but the best results were achieved with a modified k-means algorithm. Results from the third stage simulations supported the main hypothesis that the new AMR-based load profiles improve the DSSE accuracy.
The results presented in this thesis should motivate distribution system operators and other actors in the field of electricity distribution to utilize AMR data and clustering algorithms in load profiling. Doing so improves not only the DSSE accuracy but also many other functions that rely on load flow calculation and need accurate load estimates or forecasts.
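The remark that clustering updates the customer classification and the class load profiles at the same time can be seen in a plain k-means sketch: the labels are the classification and the centroids are the class profiles. This is ordinary k-means written in Python for illustration, not the modified k-means or the two-stage method developed in the thesis, and the four-value consumption profiles are invented.

```python
def kmeans(profiles, k, iterations=20):
    """Plain k-means on equal-length load profiles (lists of floats).

    Centroids start at the first k profiles, an arbitrary deterministic choice.
    Returns (labels, centroids).
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each customer joins the nearest class profile.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in profiles
        ]
        # Update step: each class profile becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:  # keep the old centroid if a class ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Invented consumption values (kW) for night, morning, afternoon, evening.
profiles = [
    [0.2, 1.0, 0.8, 1.5],
    [2.0, 0.5, 0.4, 2.2],
    [0.3, 1.1, 0.7, 1.4],
    [2.1, 0.4, 0.5, 2.0],
]
labels, class_profiles = kmeans(profiles, k=2)
```

Here `labels` groups customers 0 and 2 together and customers 1 and 3 together, while `class_profiles` holds the mean profile of each group.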
ANUÁRIO CIENTÍFICO IPG 2009 (IPG Scientific Yearbook 2009)
This document summarizes the scientific output, innovation work, and projects carried out during 2009 by the teaching and technical staff of the Instituto Politécnico da Guarda.