
Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets

Author: Badrinath Anirudhan
Brunskill Emma
Flet-Berliac Yannis
Nie Allen
Publication date: 18/11/2023

Despite the recent advancements in offline reinforcement learning via supervised learning (RvS) and the success of the decision transformer (DT) architecture in various domains, DTs have fallen short in several challenging benchmarks. The root cause of this underperformance lies in their inability to seamlessly connect segments of suboptimal trajectories. To overcome this limitation, we present a novel approach to enhance RvS methods by integrating intermediate targets. We introduce the Waypoint Transformer (WT), whose architecture builds upon the DT framework and is conditioned on automatically generated waypoints. The results show a significant increase in the final return compared to existing RvS methods, with performance on par with or greater than existing state-of-the-art temporal difference learning-based methods. Additionally, the performance and stability improvements are largest in the most challenging environments and data configurations, including AntMaze Large Play/Diverse and Kitchen Mixed/Partial.

Comment: Accepted to the Conference on Neural Information Processing Systems 2023 (NeurIPS 2023)

arXiv.org e-Print Archive

LLF-Bench: Benchmark for Interactive Learning from Language Feedback

Author: Cheng Ching-An
Kolobov Andrey
Misra Dipendra
Nie Allen
Swaminathan Adith
Publication date: 13/12/2023

We introduce a new benchmark, LLF-Bench (Learning from Language Feedback Benchmark; pronounced as "elf-bench"), to evaluate the ability of AI agents to interactively learn from natural language feedback and instructions. Learning from language feedback (LLF) is essential for people, largely because the rich information this feedback provides can help a learner avoid much of the trial and error and thereby speed up the learning process. Large Language Models (LLMs) have recently enabled AI agents to comprehend natural language, and hence AI agents can potentially benefit from language feedback during learning as humans do. But existing interactive benchmarks do not assess this crucial capability: they either use numeric reward feedback or require no learning at all (only planning or information retrieval). LLF-Bench is designed to fill this gap. LLF-Bench is a diverse collection of sequential decision-making tasks that includes user recommendation, poem writing, navigation, and robot control. The objective of an agent is to interactively solve these tasks based on their natural-language instructions and the feedback received after taking actions. Crucially, to ensure that the agent actually "learns" from the feedback, LLF-Bench implements several randomization techniques (such as paraphrasing and environment randomization) so that the task isn't familiar to the agent and the agent is robust to various verbalizations. In addition, LLF-Bench provides a unified OpenAI Gym interface for all its tasks and allows users to easily configure the information the feedback conveys (among suggestion, explanation, and instantaneous performance) to study how agents respond to different types of feedback. Together, these features make LLF-Bench a unique research platform for developing and testing LLF agents.

arXiv.org e-Print Archive

N′-(5-Bromo-2-methoxybenzylidene)-2-hydroxybenzohydrazide

Author: Allen
He
Jiu-Fu Lu
Lu
Lu
Lu
Nie
Sheldrick
Shi
Publication venue: International Union of Crystallography
Publication date: 01/10/2008

The title Schiff base compound, C15H13BrN2O3, is derived from the condensation of 5-bromo-2-methoxybenzaldehyde with 2-hydroxybenzohydrazide in an ethanol solution. The dihedral angle between the two aromatic rings is 6.9 (9)°. The methoxy group is coplanar with the attached ring [C—O—C—C = 3.1 (12)°]. An intramolecular N—H⋯O hydrogen bond is observed. In the crystal structure, the molecules are linked into chains along the [001] direction by intermolecular O—H⋯N, O—H⋯O and C—H⋯O hydrogen bonds.

Crossref

Directory of Open Access Journals

PubMed Central

N′-(2-Hydroxybenzylidene)-2-methoxybenzohydrazide monohydrate

Author: Alhadi
Ali
Allen
Bedia
Fun
He
Nie
Shan
Sheldrick
Shi
Terzioglu
Zou
Publication venue: International Union of Crystallography
Publication date: 01/09/2008

In the title compound, C15H14N2O3·H2O, the Schiff base molecule is approximately planar, with a dihedral angle between the two aromatic rings of 10.2 (3)°. The molecular structure is stabilized by O—H⋯N and N—H⋯O hydrogen bonds. In the crystal structure, the Schiff base and water molecules are linked together by intermolecular O—H⋯O hydrogen bonds, forming chains parallel to the a axis.

Crossref

Directory of Open Access Journals

PubMed Central

Trichlorido{2-[2-(η5-cyclopentadienyl)-2-methylpropyl]-1-trimethylsilyl-1H-imidazole-κN3}titanium(IV) tetrahydrofuran hemisolvate

Author: Allen
Andrei V. Churakov
Dolomanov
Enders
Enders
Fang Ge
Herrmann
Krut'ko
Maxim V. Borzov
Nie
Sheldrick
Spek
Wang
Wanli Nie
Publication venue: International Union of Crystallography
Publication date: 01/05/2010

The title compound, [Ti(C15H23N2Si)Cl3]·0.5C4H8O, has been prepared from {2-[2-(η5-cyclopentadienyl)-2-methylpropyl]-1H-imidazolyl-κN1}bis(N,N-diethylamido-κN)titanium(IV), (C12H14N2)Ti(NEt2)2, by reaction with an excess of Me3SiCl in tetrahydrofuran (THF) at 353 K. The crystal structure contains THF as adduct solvent, disordered around a center of inversion. The presence of THF and the adduct ratio have been independently supported by 1H NMR spectroscopy. The coordination polyhedron of the Ti atom is distorted square-pyramidal, assuming the cyclopentadienyl (Cp) ring occupies one coordination site. The Ti, Si and CH2 group C atoms deviate only slightly from the imidazole ring plane [by 0.021 (4), 0.133 (4) and 0.094 (4) Å, respectively]. Comparison of the principal geometric parameters with those of the few known structurally characterized analogues reveals small differences in bond lengths and angles at the Ti atom. The title complex is only stable in THF-d8 in the presence of excess Me3SiCl; otherwise it exists in an equilibrium with equimolar amounts of dichlorido{2-[2-(η5-cyclopentadienyl)-2-methylpropyl]-1H-imidazolyl-κN3}titanium(IV) and chlorotrimethylsilane.

Crossref

Directory of Open Access Journals

PubMed Central

4-Chloro-N′-(2-hydroxybenzylidene)benzohydrazide monohydrate

Author: Alhadi
Ali
Allen
Bedia
Fun
He
Nie
Shan
Sheldrick
Shi
Terzioglu
Zou
Publication venue: International Union of Crystallography
Publication date: 01/09/2008

The asymmetric unit of the title compound, C14H11ClN2O2·H2O, contains a Schiff base molecule and a water molecule of crystallization. The dihedral angle between the two aromatic rings is 27.3 (4)°. In the crystal structure, molecules are linked into a two-dimensional network parallel to the bc plane by intermolecular O—H⋯O and N—H⋯O hydrogen bonds involving the water molecules.

Crossref

Directory of Open Access Journals

PubMed Central