
11 research outputs found

Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning

Author: Lindner David
Montesinos Victoriano
Nava Elvis
Perez Ethan
Rocamonde Juan
Publication date: 19/10/2023

Reinforcement learning (RL) requires either manually specifying a reward function, which is often infeasible, or learning a reward model from a large amount of human feedback, which is often very expensive. We study a more sample-efficient alternative: using pretrained vision-language models (VLMs) as zero-shot reward models (RMs) to specify tasks via natural language. We propose a natural and general approach to using VLMs as reward models, which we call VLM-RMs. We use VLM-RMs based on CLIP to train a MuJoCo humanoid to learn complex tasks without a manually specified reward function, such as kneeling, doing the splits, and sitting in a lotus position. For each of these tasks, we only provide a single sentence text prompt describing the desired task with minimal prompt engineering. We provide videos of the trained agents at: https://sites.google.com/view/vlm-rm. We can improve performance by providing a second "baseline" prompt and projecting out parts of the CLIP embedding space that are irrelevant for distinguishing between goal and baseline. Further, we find a strong scaling effect for VLM-RMs: larger VLMs trained with more compute and data are better reward models. The failure modes of VLM-RMs we encountered are all related to known capability limitations of current VLMs, such as limited spatial reasoning ability or visually unrealistic environments that are far off-distribution for the VLM. We find that VLM-RMs are remarkably robust as long as the VLM is large enough. This suggests that future VLMs will become more and more useful reward models for a wide range of RL applications.

arXiv.org e-Print Archive

Meta-Learning via Classifier(-free) Guidance

Author: Grewe Benjamin F
Katzschmann Robert K
Kobayashi Seijin
Nava Elvis
Yin Yifei
Publication date: 01/01/2022

State-of-the-art meta-learning techniques do not optimize for zero-shot adaptation to unseen tasks, a setting in which humans excel. On the contrary, meta-learning algorithms learn hyperparameters and weight initializations that explicitly optimize for few-shot learning performance. In this work, we take inspiration from recent advances in generative modeling and language-conditioned image synthesis to propose meta-learning techniques that use natural language guidance to achieve higher zero-shot performance compared to the state-of-the-art. We do so by recasting the meta-learning problem as a multi-modal generative modeling problem: given a task, we consider its adapted neural network weights and its natural language description as equivalent multi-modal task representations. We first train an unconditional generative hypernetwork model to produce neural network weights; then we train a second "guidance" model that, given a natural language task description, traverses the hypernetwork latent space to find high-performance task-adapted weights in a zero-shot manner. We explore two alternative approaches for latent space guidance: "HyperCLIP"-based classifier guidance and a conditional Hypernetwork Latent Diffusion Model ("HyperLDM"), which we show to benefit from the classifier-free guidance technique common in image generation. Finally, we demonstrate that our approaches outperform existing meta-learning methods with zero-shot learning experiments on our Meta-VQA dataset, which we specifically constructed to reflect the multi-modal meta-learning setting.

Repository for Publications and Research Data

ZORA

Authenticated encryption of pmu data

Author: Gaona García Elvis Eduardo
Mojica Nava Eduardo Alirio
Rojas Martínez Sergio Leonardo
Trujillo Rodríguez Cesar Leonardo
Publication venue: Universidad Distrital Francisco José de Caldas. Colombia
Publication date: 01/12/2014

This paper presents the implementation of an encryption board in order to provide confidentiality, authenticity and integrity of data collected at any point in a power grid, as a potential solution to the Smart Grid cyber security issues. This board consists of a Freescale microcontroller which enables the connection between a PMU (Phasor Measurement Unit) and a ZigBee transmitter. Encryption is done using the SHA256, HMAC-SHA256, KDF-SHA256 and AES256-CBC algorithms. This architecture reads and transmits voltage and current phasors, energy consumption, frequency, power, power factor and power outage measurements, and sends this information in real time to a data concentrator, where it can be displayed and subsequently stored.

Universidad Distrital de la ciudad de Bogotá: Open Journal Systems

Diversified Sampling for Batched Bayesian Optimization with Determinantal Point Processes

Author: Krause Andreas
Mutný Mojmír
Nava Elvis
Publication venue: PMLR
Publication date: 01/01/2022

In Bayesian Optimization (BO) we study black-box function optimization with noisy point evaluations and Bayesian priors. Convergence of BO can be greatly sped up by batching, where multiple evaluations of the black-box function are performed in a single round. The main difficulty in this setting is to propose at the same time diverse and informative batches of evaluation points. In this work, we introduce DPP-Batch Bayesian Optimization (DPP-BBO), a universal framework for inducing batch diversity in sampling-based BO by leveraging the repulsive properties of Determinantal Point Processes (DPP) to naturally diversify the batch sampling procedure. We illustrate this framework by formulating DPP-Thompson Sampling (DPP-TS) as a variant of the popular Thompson Sampling (TS) algorithm and introducing a Markov Chain Monte Carlo procedure to sample from it. We then prove novel Bayesian simple regret bounds for both classical batched TS as well as our counterpart DPP-TS, with the latter bound being tighter. Our real-world, as well as synthetic, experiments demonstrate improved performance of DPP-BBO over classical batching methods with Gaussian process and Cox process models. ISSN: 2640-349

arXiv.org e-Print Archive

Repository for Publications and Research Data

Diversified Sampling for Batched Bayesian Optimization with Determinantal Point Processes

Author: Krause Andreas
Mutný Mojmír
Nava Elvis
Publication venue: PMLR
Publication date: 01/01/2022

Repository for Publications and Research Data

Meta-Learning via Classifier(-free) Diffusion Guidance

Author: Grewe Benjamin
Katzschmann Robert K.
Kobayashi Seijin
Nava Elvis
Yin Yifei
Publication venue: OpenReview
Publication date: 01/01/2023

We introduce meta-learning algorithms that perform zero-shot weight-space adaptation of neural network models to unseen tasks. Our methods repurpose the popular generative image synthesis techniques of natural language guidance and diffusion models to generate neural network weights adapted for tasks. We first train an unconditional generative hypernetwork model to produce neural network weights; then we train a second "guidance" model that, given a natural language task description, traverses the hypernetwork latent space to find high-performance task-adapted weights in a zero-shot manner. We explore two alternative approaches for latent space guidance: "HyperCLIP"-based classifier guidance and a conditional Hypernetwork Latent Diffusion Model ("HyperLDM"), which we show to benefit from the classifier-free guidance technique common in image generation. Finally, we demonstrate that our approaches outperform existing multi-task and meta-learning methods in a series of zero-shot learning experiments on our Meta-VQA dataset. ISSN: 2835-885

Repository for Publications and Research Data

Authenticated encryption of pmu data

Author: Gaona García Elvis Eduardo
Mojica Nava Eduardo Alirio
Rojas Martínez Sergio Leonardo
Trujillo Rodríguez Cesar Leonardo
Publication venue: Universidad Distrital Francisco Jose de Caldas
Publication date: 19/09/2019

Repositorio Institucional Universidad Distrital - RIUD

Fast Aquatic Swimmer Optimization with Differentiable Projective Dynamics and Neural Network Hydrodynamic Models

Author: Du Tao
Grewe Benjamin F
Katzschmann Robert K
Ma Pingchuan
Matusik Wojciech
Michelis Mike Y
Nava Elvis
Zhang John Z
Publication venue: arXiv
Publication date: 01/01/2022

ZORA


Comparison of dust forecast (GEOS-5 and WRF-Chem), satellite observations and ground-based aerosol measurements in the Caribbean region during the 2020 Summer African dust season

Author: Anonymous
Colarco Peter Richard
Holben Brent N.
Leon Pedro
Martinez-Huertas Bianca Lee
Mayol-Bracero Olga L.
Otis Daniel Brooks
Perez Sheilynette
Reyes Ashford
Rios Angel
Rosas-Nava Josele
Sarangi Bighnaraj
Sealy Andrea M.
Torres-Delgado Elvis
Yu Hongbin
Zuidema Paquita
Publication venue: American Geophysical Union (AGU)
Publication date: 01/12/2020

North African dust reaches the Caribbean region every summer, supplying mineral dust particles which play an important role in the regional weather and public health. During the African dust season of summer 2020 several events, including the "Godzilla" mega dust event, were identified over the Caribbean. Under the framework of the NASA-funded project Caribbean Air-quality Alert and Management Assistance System-Public Health (CALIMA-PH), we compare results of the dust forecast models with the ground-based and satellite observations for events that happened in parallel with large convective systems over the region during June-July 2020. The models used are the global dust forecast model Goddard Earth Observing System-5 (GEOS-5) and the regional dust forecast model Weather Research and Forecasting model coupled with Chemistry (WRF-Chem). Satellite observations are from the Visible Infrared Imaging Radiometer Suite (VIIRS), the Moderate Resolution Imaging Spectroradiometer (MODIS), and the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO). Ground-based observations (e.g., aerosol optical depth (AOD), depolarization ratio, particulate matter, scattering Angstrom exponent (SAE), dust surface concentration, height of dust layer) were performed at seven different locations (Cayenne, Martinique, Guadeloupe--French Territories, Barbados, Puerto Rico, Merida--Mexico and Miami--USA) over the Caribbean to provide a better understanding of African dust dispersal patterns over the region with a unique "Lagrangian" measurement, including the Godzilla mega dust event and tropical storms developed in the area. Results show that the dust forecast models were not always in agreement with the observations, and this was particularly the case during the presence of tropical storms like Cristobal and Gonzalo.
We will show the differences between the forecasts provided by both models and the result of another run in which the models ingested available aerosol data such as AOD.

University of Miami: Scholarship Miami


"Godzilla" African dust event of June 2020; impacts of air quality in the Greater Caribbean Basin, the Gulf of Mexico and the United States

Author: Anonymous
Colarco Peter Richard
Gaston Cassandra
Holben Brent N.
Ladino Luis
Leon Pedro
Martinez-Huertas Bianca Lee
Martinez-Sanchez Odalys
Mayol-Bracero Olga L.
Mendez-Lazaro Pablo
Molinie Jack
Muller-Karger Frank E.
Ogren John A.
Otis Daniel Brooks
Perez Sheilynette
Prospero Joseph M.
Raga Graciela
Rios Angel
Rosas Daniel
Rosas-Nava Josele
Rueda-Roa Digna T.
Sarangi Bighnaraj
Sealy Andrea M.
Sheridan Patrick J.
Torres-Delgado Elvis
Xian Peng
Yu Hongbin
Zuidema Paquita
Publication venue: American Geophysical Union (AGU)
Publication date: 01/12/2020

On June 19, 2020, the Caribbean region started to feel the effects of an historic African (Saharan) dust plume that has been called "Godzilla" due to its large geographic extent and record amount of dust. This plume, with an area close to the size of the continental USA (8,080,464 km²), blanketed areas in the greater Caribbean Basin, the Gulf of Mexico and the southern United States. The occurrence and progression of this "Godzilla" event was predicted by several dust forecast models, among them, the global Goddard Earth Observing System-5 (GEOS-5) and the regional dust forecast model Weather Research and Forecasting model coupled with Chemistry (WRF-Chem). According to data from the NASA Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP Lidar), the dust plume extended from the Earth's surface up to about 5 km altitude. As part of the NASA-funded summer 2020 intensive field phase of the Caribbean Air-quality Alert and Management Assistance System-Public Health (CALIMA-PH) project, eight ground-based stations in the Greater Caribbean Basin (French Guiana, Trinidad and Tobago, Martinique, Guadeloupe, Puerto Rico, Merida-Mexico and Miami-USA) collected surface aerosol data (e.g., PM10 and PM2.5 mass concentrations, light scattering and absorption coefficients, visibility, dust concentrations) and column aerosol data (i.e., aerosol optical depth--AOD) during the event. Using these data, together with satellite observations from the Moderate Resolution Imaging Spectroradiometer (MODIS), the Visible Infrared Imaging Radiometer Suite (VIIRS), and CALIOP, we describe the movement of the dust plume through the region and assess its impact. The event caused a decrease in visibility in the atmosphere's boundary layer of less than 3 miles in some locations, showed record values for the aerosol optical properties, and exhibited exceedances in both the US EPA air quality standard and the World Health Organization (WHO) air quality guidelines.
For several days, the locations impacted by the "Godzilla" dust plume were exposed to air quality conditions ranging from "Unhealthy for sensitive groups" to "Hazardous", in cases reaching PM10 values ca. 500 µg/m³.

University of Miami: Scholarship Miami