21,810 research outputs found

    Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models

    In this perspective paper, we first comprehensively review existing evaluations of Large Language Models (LLMs) using both standardized tests and ability-oriented benchmarks. We pinpoint several problems with current evaluation methods that tend to overstate the capabilities of LLMs. We then articulate what artificial general intelligence should encompass beyond the capabilities of LLMs. We propose four characteristics of generally intelligent agents: 1) they can perform unlimited tasks; 2) they can generate new tasks within a context; 3) they operate based on a value system that underpins task generation; and 4) they have a world model reflecting reality, which shapes their interaction with the world. Building on this viewpoint, we highlight the missing piece in artificial general intelligence, namely the unity of knowing and acting. We argue that active engagement with objects in the real world delivers more robust signals for forming conceptual representations. Additionally, knowledge acquisition does not rely solely on passive input but requires repeated trial and error. We conclude by outlining promising future research directions in the field of artificial general intelligence.

    InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback

    Humans write code in a fundamentally interactive manner and rely on constant execution feedback to correct errors, resolve ambiguities, and decompose tasks. While LLMs have recently exhibited promising coding capabilities, current coding benchmarks mostly consider a static instruction-to-code sequence transduction process, which has the potential for error propagation and a disconnect between the generated code and its final execution environment. To address this gap, we introduce InterCode, a lightweight, flexible, and easy-to-use framework of interactive coding as a standard reinforcement learning (RL) environment, with code as actions and execution feedback as observations. Our framework is language and platform agnostic, uses self-contained Docker environments to provide safe and reproducible execution, and is compatible out-of-the-box with traditional seq2seq coding methods, while enabling the development of new methods for interactive code generation. We use InterCode to create two interactive code environments with Bash and SQL as action spaces, leveraging data from the static Spider and NL2Bash datasets. We demonstrate InterCode's viability as a testbed by evaluating multiple state-of-the-art LLMs configured with different prompting strategies such as ReAct and Plan & Solve. Our results showcase the benefits of interactive code generation and demonstrate that InterCode can serve as a challenging benchmark for advancing code understanding and generation capabilities. InterCode is designed to be easily extensible and can even be used to incorporate new tasks such as Capture the Flag, a popular coding puzzle that is inherently multi-step and involves multiple programming languages. Project site with code and data: https://intercode-benchmark.github.io
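The framing in this abstract, code as actions and execution feedback as observations, can be illustrated with a minimal sketch. This is not InterCode's actual API: `ToyCodeEnv`, `StepResult`, `scripted_agent`, and `run_episode` are hypothetical names introduced here, Python's `exec` stands in for the Docker-isolated Bash and SQL execution the abstract describes, and a scripted function stands in for an LLM.

```python
import contextlib
import io
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str  # execution feedback returned to the agent
    reward: float     # 1.0 once the expected output is produced, else 0.0
    done: bool        # True when solved or the turn budget is spent


class ToyCodeEnv:
    """Hypothetical toy environment: Python snippets are the actions."""

    def __init__(self, instruction, expected_output, max_turns=5):
        self.instruction = instruction
        self.expected_output = expected_output
        self.max_turns = max_turns
        self.turns = 0
        self.namespace = {}

    def reset(self):
        # Start a fresh episode and return the task instruction.
        self.turns = 0
        self.namespace = {}
        return self.instruction

    def step(self, action):
        # Run the submitted code and capture stdout or the error message.
        self.turns += 1
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(action, self.namespace)  # NOT sandboxed; illustration only
            observation = buffer.getvalue().strip()
        except Exception as exc:
            observation = f"{type(exc).__name__}: {exc}"
        solved = observation == self.expected_output
        done = solved or self.turns >= self.max_turns
        return StepResult(observation, 1.0 if solved else 0.0, done)


def scripted_agent(history):
    """Stand-in for an LLM: a buggy first attempt, then a corrected one."""
    if not history:
        return "print(total)"  # fails: `total` is undefined
    return "total = sum(range(1, 11))\nprint(total)"


def run_episode(env, agent):
    # Alternate between agent actions and environment feedback until done.
    env.reset()
    history = []
    while True:
        action = agent(history)
        result = env.step(action)
        history.append((action, result.observation))
        if result.done:
            return history, result.reward
```

In this sketch the first action raises a `NameError`, which comes back to the agent as the observation, and the second action succeeds. A static instruction-to-code benchmark would have scored only the first attempt.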

    Understanding Collaborative Sensemaking for System Design: An Investigation of Musicians' Practice

    There is surprisingly little written in the information science and technology literature about the design of tools used to support the collaboration of creators. Understanding collaborative sensemaking through the use of language has traditionally been applied to non-work domains, but this method is also well suited for informing hypotheses about the design of collaborative systems. The presence of ubiquitous, mobile technology and the development of multi-user virtual spaces invite investigation of design based on naturalistic, real-world, creative group behaviors, including the collaborative work of musicians. This thesis considers the co-construction of new (musical) knowledge by small groups. Co-construction of new knowledge is critical to the definition of an information system because it emphasizes coordination and resource sharing among group members (versus individual members independently doing their own tasks and only coming together to collate their contributions as a final product). This work situates the locus of creativity in the process itself, rather than in the output (the musical result) or the individuals (members of the band). This thesis describes a way to apply quantitative observations to inform qualitative assessment of the characteristics of collaborative sensemaking in groups. Conversational data were obtained from nine face-to-face collaborative composing sessions involving three separate bands, producing 18 hours of recorded interactions. Topical characteristics of the discussion (objects, plans, properties, and performance), as well as emergent patterns of generative, evaluative, revision, and management conversational acts within the group, were seen as indicative of knowledge construction. The findings report the use of collaborative pathways: iterative cycles of generation, evaluation, and revision of temporary solutions used to move the collaboration forward. In addition, bracketing of temporary solutions helped collaborators reuse content and offload attentional resources. Ambiguity in language, evaluation criteria, goal formation, and group awareness meant that existing knowledge representations were insufficient for making sense of incoming data and had to be reformulated. Further, strategic use of affective language was found to be instrumental in bridging knowledge gaps. Based on these findings, features of a collaborative system are proposed to help facilitate sensemaking routines at various stages of a creative task. This research contributes to the theoretical understanding of collaborative sensemaking during non-work, creative activities in order to inform the design of systems for supporting these activities. By studying an environment which forms a potential microcosm of virtual interaction between groups, it provides a framework for understanding and automating collaborative discussion content in terms of the features of dialogue.

    Triggering physics lecturers' reflections on the instructional affordance of their use of representations: a design-based study

    There is growing awareness in the physics education research community of the importance of using representations in physics teaching and of the need for lecturers to reflect on their practice. This research study adopted a design-based research approach in an attempt to design a reliable, valid and practically useful artefact (framework/strategy) that could be used to trigger introductory physics lecturers’ reflections on their instructional use of representations. The artefact, which was instantiated with physics lecturers, comprised an observation protocol, an accompanying definitions key, a communication platform, and an instrument to assess the outcome (the levels of reflection). The video data of lecturer practice were analysed using a priori codes to generate profiles of teaching practice. The resulting profiles were used to trigger individual video-stimulated reflection. The levels of reflection were assessed using a purpose-designed ‘Expectations of Reflection’ taxonomy. Thereafter, a set of design guidelines and design principles was generated to guide further similar design-based educational studies. The process was validated via interview data but, while it was deemed a valid and reliable solution to the research problem, the participating lecturers varied in how valuable they perceived the artefact to be.

    Exploring Natural User Abstractions For Shared Perceptual Manipulator Task Modeling & Recovery

    State-of-the-art domestic robot assistants are essentially autonomous mobile manipulators capable of exerting human-scale precision grasps. To maximize utility and economy, non-technical end-users would need to be nearly as efficient as trained roboticists in control and collaboration of manipulation task behaviors. However, this remains a significant challenge, given that many WIMP-style tools require superficial proficiency in robotics, 3D graphics, and computer science for rapid task modeling and recovery. Research on robot-centric collaboration has nevertheless gained momentum in recent years; robots are now planning in partially observable environments that maintain geometries and semantic maps, presenting opportunities for non-experts to cooperatively control task behavior with autonomous-planning agents exploiting that knowledge. However, as autonomous systems are not immune to errors under perceptual difficulty, a human-in-the-loop is needed to bias autonomous planning towards recovery conditions that resume the task and avoid similar errors. In this work, we explore interactive techniques allowing non-technical users to model task behaviors and perceive cooperatively with a service robot under robot-centric collaboration. We evaluate stylus and touch modalities with which users can intuitively and effectively convey natural abstractions of high-level tasks, semantic revisions, and geometries about the world. Experiments are conducted with 'pick-and-place' tasks in an ideal 'Blocks World' environment using a Kinova JACO six degree-of-freedom manipulator. Possibilities for the architecture and interface are demonstrated with the following features: (1) semantic 'Object' and 'Location' grounding that describe function and ambiguous geometries, (2) task specification with an unordered list of goal predicates, and (3) guiding task recovery with implied scene geometries and trajectory via symmetry cues and configuration space abstraction. Empirical results from four user studies show that our interface was strongly preferred over the control condition, demonstrating high learnability and ease of use that enable our non-technical participants to model complex tasks, provide effective recovery assistance, and exercise teleoperative control.

    Fünf evidenzbasierte Heuristiken für den Einsatz von Video in der universitären Lehrerausbildung (Five evidence-based heuristics for the use of video in university teacher education)

    This article provides a research synthesis on the use of video in pre-service teacher education. Common ideas and evidence concerning the use of video in pre-service teacher education are reviewed. Based on the state of the art in using video, five research-based heuristics are derived. Findings from a number of selected studies are then used to specify the heuristics further. Specifically, a set of rules of thumb about when, how, and why to use video is presented to clarify the strengths and limitations of video as a medium to support pre-service teacher learning. (DIPF/Orig.)

    Collaborative trails in e-learning environments

    This deliverable focuses on collaboration within groups of learners, and hence on collaborative trails. We begin by reviewing the theoretical background to collaborative learning and looking at the kinds of support that computers can give to groups of learners working collaboratively. We then look more deeply at some of the issues in designing environments to support collaborative learning trails, and at tools and techniques, including collaborative filtering, that can be used for analysing collaborative trails. Next, we review the state of the art in supporting collaborative learning in three different areas: experimental academic systems, systems using mobile technology (which are also generally academic), and commercially available systems. The final part of the deliverable presents three scenarios that show where technology that supports groups working collaboratively and producing collaborative trails may be heading in the near future.