228 research outputs found

    AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation

    We propose a novel framework for learning high-level cognitive capabilities in robot manipulation tasks, such as making a smiley face using building blocks. These tasks often involve complex multi-step reasoning, presenting significant challenges due to the limited paired data connecting human instructions (e.g., making a smiley face) and robot actions (e.g., end-effector movement). Existing approaches relieve this challenge by adopting an open-loop paradigm decomposing high-level instructions into simple sub-task plans, and executing them step-by-step using low-level control models. However, these approaches are short of instant observations in multi-step reasoning, leading to sub-optimal results. To address this issue, we propose to automatically collect a cognitive robot dataset by Large Language Models (LLMs). The resulting dataset AlphaBlock consists of 35 comprehensive high-level tasks of multi-step text plans and paired observation sequences. To enable efficient data acquisition, we employ elaborated multi-round prompt designs that effectively reduce the burden of extensive human involvement. We further propose a closed-loop multi-modal embodied planning model that autoregressively generates plans by taking image observations as input. To facilitate effective learning, we leverage MiniGPT-4 with a frozen visual encoder and LLM, and finetune additional vision adapter and Q-former to enable fine-grained spatial perception for manipulation tasks. We conduct experiments to verify the superiority over existing open and closed-loop methods, and achieve a significant increase in success rate by 21.4% and 14.5% over ChatGPT and GPT-4 based robot tasks. Real-world demos are shown in https://www.youtube.com/watch?v=ayAzID1_qQk

    Trendswatch 2013: Back to the Future

    TrendsWatch 2013 highlights six trends that CFM's staff and advisors believe are highly significant to museums and their communities, based on our scanning and analysis over the past year. For each trend, we provide a brief summary, list examples of how the trend is playing out in the world, comment on the trend's significance to society and to museums specifically, and suggest ways that museums might respond. We also provide links to additional readings. TrendsWatch provides valuable background and context for your museum's planning and implementation

    Aligning Large Language Models through Synthetic Feedback

    Aligning large language models (LLMs) to human values has become increasingly important as it enables sophisticated steering of LLMs, e.g., making them follow given instructions while keeping them less toxic. However, it requires a significant amount of human demonstrations and feedback. Recently, open-sourced models have attempted to replicate the alignment learning process by distilling data from already aligned LLMs like InstructGPT or ChatGPT. While this process reduces human efforts, constructing these datasets has a heavy dependency on the teacher models. In this work, we propose a novel framework for alignment learning with almost no human labor and no dependency on pre-aligned LLMs. First, we perform reward modeling (RM) with synthetic feedback by contrasting responses from vanilla LLMs with various sizes and prompts. Then, we use the RM for simulating high-quality demonstrations to train a supervised policy and for further optimizing the model with reinforcement learning. Our resulting model, Aligned Language Model with Synthetic Training dataset (ALMoST), outperforms open-sourced models, including Alpaca, Dolly, and OpenAssistant, which are trained on the outputs of InstructGPT or human-annotated instructions. Our 7B-sized model outperforms the 12-13B models in the A/B tests using GPT-4 as the judge with about 75% winning rate on average. Comment: Preprint, 9 pages (with 10 pages of supplementary

    Natural interaction with a virtual guide in a virtual environment: A multimodal dialogue system

    This paper describes the Virtual Guide, a multimodal dialogue system represented by an embodied conversational agent that can help users to find their way in a virtual environment, while adapting its affective linguistic style to that of the user. We discuss the modular architecture of the system, and describe the entire loop from multimodal input analysis to multimodal output generation. We also describe how the Virtual Guide detects the level of politeness of the user’s utterances in real-time during the dialogue and aligns its own language to that of the user, using different politeness strategies. Finally we report on our first user tests, and discuss some potential extensions to improve the system

    Reimagining Communication Studies in a Digital Age

    The COVID-19 pandemic challenged society in many ways. In schools and universities, classrooms and campuses were vacated as teaching moved online. In this massive, forced leap of digitalization, the flexibility of digital solutions was harnessed on a large scale, leading to formal education continuing despite the lockdowns. However, despite technology providing possibilities to teach remotely, the overall organization of courses was still stuck in a bureaucratic system not built on the flexible affordances of the digital age. This master’s thesis presents an alternative way of organizing university studies, through the presentation and testing of the Communication Studies Tracker (CST). The CST is designed to allow students to complete their obligatory communications studies at university without having to attend any specific language or communication courses. Instead, the system would track and process communicative tasks done by students, until a sufficient amount of successful experience is gathered in the required languages and focus areas. To gauge the viability of an approach based on the above concept, a usability test of the CST was organized, in which five students and five teachers tested and discussed the system and its underlying concept. The results show participants felt that the CST would provide increased flexibility and meaningfulness in communication studies. Further, participants felt the concept was viable and suitable for implementing at the University of Turku. The main challenges discovered were related to support for weaker students and coordination of group work. The teacher testers also expressed concern regarding how a course-free system would be implemented by the university administration, especially concerning resource allocation

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 tabl

    Feasibility report: Delivering case-study based learning using artificial intelligence and gaming technologies

    This document describes an investigation into the technical feasibility of a game to support learning based on case studies. Information systems students using the game will conduct fact-finding interviews with virtual characters. We survey relevant technologies in computational linguistics and games. We assess the applicability of the various approaches and propose an architecture for the game based on existing techniques. We propose a phased development plan for the development of the game

    A Multiple Case Study of the Training of Instructors Teaching Inclusive Postsecondary Education Students in Typical College Courses

    Problem: Limited research exists regarding the professional development program processes and components used to train instructors to equip students with intellectual and developmental disabilities to receive greater support and access to the benefits of a postsecondary educational experience. Purpose of the Study: The principal purpose of this research was to conduct a multiple case study of inclusive postsecondary education (IPSE) programs known as Transition Programs for Students with Intellectual Disabilities (TPSID) and/or Comprehensive Transition Programs (CTP) at institutions of higher education across the United States to examine training provided to instructors teaching students with intellectual and developmental disabilities (IDD) who were enrolled in typical college courses. Method: A qualitative, multiple case study design was used. Five IPSE programs across the United States comprised the sample for this study. Two types of sampling were used: convenience sampling and non-probability or purposeful sampling. Convenience sampling was used to select the five IPSE programs based on their willingness to participate and provide the needed documents for analysis. Purposeful sampling was used to select the interview participants based on their ability to provide the most insight and understanding of the instructor training processes. To provide a comprehensive examination of the four research questions related to the training development, components, implementation, and evaluation processes for instructors teaching IPSE program students in typical college courses, interviews of the training affiliates, training observations, and document analysis were conducted within the five programs. Results: There is no unified approach to the training of instructors teaching students with IDD in IPSE programs. However, similarities exist in the training development, implementation, and evaluation processes used across programs. In conjunction with knowledge, skills, and practices, potential barriers to success such as the attitudes of instructors must be addressed. The roles of training affiliates in the development, implementation, and evaluation of the training were described. Conclusions: The landscape of higher education is changing to provide access and inclusive learning opportunities to a more diverse group of students. There is hope that the institutions of higher education will begin to adopt the teaching and learning practices that best meet the needs of the new and growing group of learner types. Although there has been some progress, much work remains to be done to ensure that instructors are equipped to support the success of students with IDD and other diverse learners

    Full Issue: vol. 65, no. 4
