A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis
Pre-trained large language models (LLMs) have recently achieved better
generalization and sample efficiency in autonomous web navigation. However, the
performance on real-world websites has still suffered from (1) open domainness,
(2) limited context length, and (3) lack of inductive bias on HTML. We
introduce WebAgent, an LLM-driven agent that can complete the tasks on real
websites following natural language instructions. WebAgent plans ahead by
decomposing instructions into canonical sub-instructions, summarizes long HTML
documents into task-relevant snippets, and acts on websites via generated
Python programs from those. We design WebAgent with Flan-U-PaLM, for grounded
code generation, and HTML-T5, new pre-trained LLMs for long HTML documents
using local and global attention mechanisms and a mixture of long-span
denoising objectives, for planning and summarization. We empirically
demonstrate that our recipe improves the success on a real website by over 50%,
and that HTML-T5 is the best model to solve HTML-based tasks; achieving 14.9%
higher success rate than prior SoTA on the MiniWoB web navigation benchmark and
better accuracy on offline task planning evaluation
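The three stages named in this abstract (plan a sub-instruction, summarize the HTML into a task-relevant snippet, generate a program) can be pictured as a simple data flow. The sketch below is only an illustration under loud assumptions: `plan`, `summarize`, `synthesize`, `run_agent`, and `PAGE` are hypothetical names built from toy string rules. They stand in for the learned models and are not HTML-T5, Flan-U-PaLM, or the authors' code. The emitted "program" is a plain comment string that is never executed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    sub_instruction: str  # output of the planning stage
    snippet: str          # output of the summarization stage
    program: str          # output of the program-synthesis stage


def plan(instruction: str, done: int) -> Optional[str]:
    """Toy planner (assumption): sub-instructions are separated by ' then '."""
    parts = [p.strip() for p in instruction.split(" then ")]
    return parts[done] if done < len(parts) else None


def summarize(html_lines: List[str], sub_instruction: str) -> str:
    """Toy summarizer (assumption): keep lines that mention any instruction word."""
    words = sub_instruction.lower().split()
    kept = [ln for ln in html_lines if any(w in ln.lower() for w in words)]
    return "\n".join(kept)


def synthesize(sub_instruction: str, snippet: str) -> str:
    """Toy synthesizer (assumption): emit a descriptive comment, not real code."""
    return f"# TODO: {sub_instruction} using {len(snippet.splitlines())} HTML line(s)"


def run_agent(instruction: str, html_lines: List[str]) -> List[Step]:
    """Repeat plan -> summarize -> synthesize until the planner has nothing left."""
    steps: List[Step] = []
    while True:
        sub = plan(instruction, len(steps))
        if sub is None:
            return steps
        snippet = summarize(html_lines, sub)
        steps.append(Step(sub, snippet, synthesize(sub, snippet)))


# Hypothetical three-line page used only for this illustration.
PAGE = [
    '<input id="city" type="text">',
    '<button id="search">Search</button>',
    '<footer>About us</footer>',
]
steps = run_agent("type city then click search", PAGE)
```

The point of the structure is that each stage sees only what it needs: the synthesizer receives a short snippet instead of the full page, which is the role the abstract assigns to summarization under limited context length.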
Augmented Behavioral Annotation Tools, with Application to Multimodal Datasets and Models: A Systematic Review
Annotation tools are an essential component in the creation of datasets for machine learning purposes. Annotation tools have evolved greatly since the turn of the century, and now commonly include collaborative features to divide labor efficiently, as well as automation employed to amplify human efforts. Recent developments in machine learning models, such as Transformers, allow for training upon very large and sophisticated multimodal datasets and enable generalization across domains of knowledge. These models also herald an increasing emphasis on prompt engineering to provide qualitative fine-tuning upon the model itself, adding a novel emerging layer of direct machine learning annotation. These capabilities enable machine intelligence to recognize, predict, and emulate human behavior with much greater accuracy and nuance, shortfalls in which have contributed to algorithmic injustice in previous techniques. However, the scale and complexity of training data required for multimodal models present engineering challenges. Best practices for conducting annotation for large multimodal models in the safest and most ethical, yet efficient, manner have not been established. This paper presents a systematic literature review of crowd and machine learning augmented behavioral annotation methods to distill practices that may have value in multimodal implementations, cross-correlated across disciplines. Research questions were defined to provide an overview of the evolution of augmented behavioral annotation tools in the past, in relation to the present state of the art. (Contains five figures and four tables)
Bildung in der digitalen Transformation
The coronavirus pandemic, and the temporary shift from in-person to distance teaching that it forced, have greatly accelerated the digitalization of education. Both the positive and the negative aspects of this development became more apparent than before. While universities managed the switch with comparatively little friction, the difficulties were far more evident in schools. Despite all the adversity, one thing seems clear: the temporary changes will have lasting effects. A complete return to the status quo ante is hardly conceivable. Against this background, two questions define the dual nature of the theme of the 29th annual conference of the Gesellschaft für Medien in der Wissenschaft (GMW). First: how does education 'work' in the digital transformation currently under way, and what challenges does it face? And second: is education itself possibly undergoing transformation? These conference proceedings bring together contributions on these and other questions
CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society
The rapid advancement of conversational and chat-based language models has
led to remarkable progress in complex task-solving. However, their success
heavily relies on human input to guide the conversation, which can be
challenging and time-consuming. This paper explores the potential of building
scalable techniques to facilitate autonomous cooperation among communicative
agents and provide insight into their "cognitive" processes. To address the
challenges of achieving autonomous cooperation, we propose a novel
communicative agent framework named role-playing. Our approach involves using
inception prompting to guide chat agents toward task completion while
maintaining consistency with human intentions. We showcase how role-playing can
be used to generate conversational data for studying the behaviors and
capabilities of chat agents, providing a valuable resource for investigating
conversational language models. Our contributions include introducing a novel
communicative agent framework, offering a scalable approach for studying the
cooperative behaviors and capabilities of multi-agent systems, and
open-sourcing our library to support research on communicative agents and
beyond. The GitHub repository of this project is made publicly available on:
https://github.com/lightaime/camel
On Transforming Reinforcement Learning by Transformer: The Development Trajectory
Transformer, originally devised for natural language processing, has also
achieved significant success in computer vision. Thanks to its strong expressive
power, researchers are investigating ways to deploy transformers to
reinforcement learning (RL) and the transformer-based models have manifested
their potential in representative RL benchmarks. In this paper, we collect and
dissect recent advances on transforming RL by transformer (transformer-based RL
or TRL), in order to explore its development trajectory and future trend. We
group existing developments in two categories: architecture enhancement and
trajectory optimization, and examine the main applications of TRL in robotic
manipulation, text-based games, navigation and autonomous driving. For
architecture enhancement, these methods consider how to apply the powerful
transformer structure to RL problems under the traditional RL framework, which
model agents and environments much more precisely than deep RL methods, but
they are still limited by the inherent defects of traditional RL algorithms,
such as bootstrapping and "deadly triad". For trajectory optimization, these
methods treat RL problems as sequence modeling and train a joint state-action
model over entire trajectories under the behavior cloning framework, which are
able to extract policies from static datasets and fully use the long-sequence
modeling capability of the transformer. Given these advancements, extensions
and challenges in TRL are reviewed and proposals about future direction are
discussed. We hope that this survey can provide a detailed introduction to TRL
and motivate future research in this rapidly developing field. Comment: 26 page
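The trajectory-optimization view in this abstract, where an RL problem is treated as sequence modeling over whole trajectories under behavior cloning, can be made concrete with a small data-preparation sketch. It is a minimal illustration under assumptions: the `s:`/`a:` token prefixes and the helper names `flatten` and `cloning_pairs` are hypothetical, and no model is trained here. The code only shows how a static trajectory becomes (context, next-action) training examples.

```python
from typing import List, Sequence, Tuple

# A trajectory is a sequence of (state, action) pairs from a static dataset.
Trajectory = Sequence[Tuple[str, str]]


def flatten(trajectory: Trajectory) -> List[str]:
    """Interleave states and actions into one token sequence: s0, a0, s1, a1, ..."""
    tokens: List[str] = []
    for state, action in trajectory:
        tokens.append(f"s:{state}")
        tokens.append(f"a:{action}")
    return tokens


def cloning_pairs(tokens: List[str]) -> List[Tuple[List[str], str]]:
    """Build behavior-cloning examples: predict each action token from everything before it."""
    return [
        (tokens[:i], tok)
        for i, tok in enumerate(tokens)
        if tok.startswith("a:")
    ]


# Hypothetical two-step trajectory used only for this illustration.
demo = [("s0", "left"), ("s1", "right")]
demo_tokens = flatten(demo)
demo_pairs = cloning_pairs(demo_tokens)
```

Because every target is conditioned on the full preceding sequence and not only on the latest state, a long-context sequence model can in principle use the whole history, which is the property the abstract attributes to this family of methods.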
One-sided differentiability: a challenge for computer algebra systems
Computer Algebra Systems (CASs) are extremely powerful and widely used digital tools. Focusing on differentiation, CASs include a command that computes the derivative of functions in one variable (and also the partial derivative of functions in several variables). We will focus in this article on real-valued functions of one real variable. Since CASs usually compute the derivative of real-valued functions as a whole, the value of the computed derivative at points where the left derivative and the right derivative are different (that we will call conflicting points) should be something like "undefined", although this isn't always the case: the output can differ strongly depending on the chosen CAS. We have analysed and compared in this article how some well-known CASs behave when addressing differentiation at the conflicting points of five different functions chosen by the authors. Finally, because CASs can calculate one-sided limits, the result in these cumbersome cases can be computed directly from the formal definition of the one-sided derivative, an approach we have also analysed and compared for the selected CASs. Regarding teaching, this is an important issue, as it is a topic of Secondary Education and nowadays the use of CASs as an auxiliary digital tool for teaching mathematics is very common
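The idea of checking a conflicting point through the definition of the one-sided derivative can be illustrated numerically. The sketch below is an assumption-laden stand-in: it approximates the left and right difference quotients with a small fixed step and does not take symbolic one-sided limits as a CAS would. The helper names `one_sided_quotients` and `is_conflicting`, the step `h`, and the tolerance `tol` are all hypothetical choices made for this example.

```python
from typing import Callable, Tuple


def one_sided_quotients(
    f: Callable[[float], float], a: float, h: float = 1e-6
) -> Tuple[float, float]:
    """Approximate the left and right derivatives of f at a with step h > 0."""
    right = (f(a + h) - f(a)) / h   # quotient approached from the right
    left = (f(a) - f(a - h)) / h    # quotient approached from the left
    return left, right


def is_conflicting(
    f: Callable[[float], float], a: float, h: float = 1e-6, tol: float = 1e-3
) -> bool:
    """Flag a as a conflicting point when the two one-sided estimates disagree."""
    left, right = one_sided_quotients(f, a, h)
    return abs(left - right) > tol


# |x| has different one-sided derivatives at 0; x**2 does not.
abs_left, abs_right = one_sided_quotients(abs, 0.0)
abs_conflict = is_conflicting(abs, 0.0)
square_conflict = is_conflicting(lambda x: x * x, 0.0)
```

For the absolute value at 0 the two estimates come out near -1 and 1, so the point is flagged, while the square function is not. A fixed step can mislead for functions that oscillate or blow up near the point, which is one reason the symbolic one-sided limit is the more reliable tool.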
Towards a Burning Method: how might the contemporary performer build on the legacy of Grotowski's Total Act?
Towards a Burning Method is a practical-theoretical investigation in the context of the contemporary theatre practice of Jerzy Grotowski's ideas, particularly the Total Act from the Theatre of Productions period. The study employs a practice of confrontation with Grotowski's ideas rather than identification, and text and practice are inseparable, interpenetrating each other. The method of confrontation (which is one of Grotowski's ideas) means, in the context of my research, a dialogue which through its dynamics creates a performative situation, stimulates the deepening of knowledge, and seeks its own answers. The resulting performance Burning Method - Four Lectures on Conditional Love is also integral to the theoretical reflections, both representing and containing Grotowski's ideas and my response to them. Other practical activities include video notes of process, classes with students and workshops. One of the important results of the research is to broaden the understanding of what performance is, what kind of forms it takes in relation to the Total Act.
The thesis consists of four chapters. Chapter One introduces the cultural-historical context of the emergence of the Total Act, while the following chapters are a guide to understanding the Act in practice. The final chapter is entirely devoted to my practical activities through a dialogue on contemporary performance, its lineage of origin and projective reflections for the future. A reference point or dialogue partner on contemporary theatre is mainly Grotowski's book Towards a Poor Theatre but also Artaud's The Theatre and its Double. The research has helped me understand what performance work might result from the continuation of a distinctly masculine Polish Romantic tradition, and what new possibilities emerge from my perspective as a female performer and theatre-maker. In reading Grotowski (and confronting it in practice) I discover the potential for creative freedom, with the rigour of attention to the execution of ideas. At the same time, working with video projections I verify my thinking about the performer's body and space by finding another dimension of expression in the image in relation to the audience. I also believe that actor training from this tradition has no temporal, cultural or other limitations as its essence is the search for the living impulse in the performer's body.
Summing up the research is an opening for further explorations revealing the potential and currency of Grotowski's ideas for today
The ARETE Ecosystem for the Creation and Delivery of Open Augmented Reality Educational Resources: The PBIS Case Study
Augmented reality (AR) is rapidly emerging as an increasingly useful technology in educational settings. In the ARETE (Augmented Reality Interactive Educational System) H2020 project, consortium members designed and implemented an ecosystem aimed at supporting teachers in building a collaborative learning environment through the use of AR in order to improve educational experiences. In particular, one of the pilot projects aims to introduce AR into school behavior lessons for the first time, leveraging the Positive Behaviour Intervention and Support (PBIS) methodology. Specifically, in this paper we will discuss the proposed architecture within the ARETE project that incorporates AR technology into the learning process of behavior lessons to support the teaching, practice and reinforcement phases of expected behaviors. Through the combination of different technologies and systems, it is possible to create an example of a technological and innovative ecosystem designed for creating behavioral lessons in AR