309 research outputs found

    Learning to Speak and Act in a Fantasy Text Adventure Game

    We introduce a large-scale crowdsourced text adventure game as a research platform for studying grounded dialogue. In it, agents can perceive, emote, and act whilst conducting dialogue with other agents. Models and humans can both act as characters within the game. We describe the results of training state-of-the-art generative and retrieval models in this setting. We show that in addition to using past dialogue, these models are able to effectively use the state of the underlying world to condition their predictions. In particular, we show that grounding on the details of the local environment, including location descriptions, and the objects (and their affordances) and characters (and their previous actions) present within it allows better predictions of agent behavior and dialogue. We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relates to agents that can talk and act successfully.

    Written Labels in Primary School Science Diagrams: Linguistic Structures and Discourse Relations

    Communication, by nature, is multimodal: it uses various forms (modes) of communication, such as spoken language, written language, illustrations, and many others to create meaning. Multimodality research is the study of communicative situations that rely on such various modes and their combinations. One form of multimodality very commonly seen in everyday life comes in diagrams, which can convey very complex concepts by combining visual expressive resources (such as illustrations or photographs), written language, and diagrammatic elements such as lines and arrows. The primary aim of my thesis is to establish whether the linguistic structures of written labels – that is, textual elements – in diagrams can inform the decomposition of visual expressive resources. Put simply, I seek to find if said visual elements can more accurately be divided into further, more granular units in accordance with linguistic patterns in their accompanying textual elements. To answer my main research question, I posit three sub-questions. First, if certain diagram types (macro-structures), such as tables, cycles, or cross-sections co-occur with specific linguistic patterns; second, if different rhetorical functions found in diagrams employ different structures in their written labels as well; and third, if these functions are signaled by other means in tandem with written language. Answering these questions can help in designing future multimodal corpora and their annotation schemata, increasing annotation accuracy and possibilities for their processing. The theoretical framework used in this thesis synthesizes theories from multimodality theory, discourse studies, and diagrams research. I approach diagrams from the perspective of multimodality, highlighting them as discursive artefacts. 
This is enabled by the diagrammatic mode, which establishes how discourse semantics can function in the context of diagrams and how their interpretation is dynamic; that is, each element or combination of multiple elements can in turn contextualize or be a part of other elements and their combinations on a different scale. I also discuss the discourse-semantic concepts of coherence and cohesion as they relate to multimodal artefacts: different elements, even if not linguistic, can combine to create semantically meaningful connections between constituents in such an artefact. To exemplify this, I also apply Rhetorical Structure Theory (RST), which seeks to formalize how units of discourse are interconnected and work towards a shared communicative goal. RST employs rhetorical relations such as ELABORATION and IDENTIFICATION to describe how units and their combinations relate to other parts of a text (or other communicative whole). The data I use consists of two interrelated and complementary multimodal corpora: AI2D and AI2D-RST. AI2D is a collection of primary-school textbook science diagrams, annotated for blobs (visual expressive resources), labels, and diagrammatic elements, created for question-answering purposes. It also contains the linguistic data in each of the corpus’s diagrams. AI2D-RST contains a subset of the diagrams in AI2D, expanding them with additional annotation layers for information on macro-structures, visual connectivity, and RST, describing each element’s rhetorical relation in the diagram. I computationally find each rhetorical relation containing a label in AI2D-RST, noting its type, the type of the diagram it appears in, and fetching the labels’ linguistic content from AI2D. I then process each label’s contents with spaCy, a library for natural language processing, for linguistic elements such as phrase types, part-of-speech patterns, and average word counts. 
The output contains data on each label’s rhetorical relation, the possible macro-structure it is contained in, and said linguistic structures. The results show that there are indeed some differences in how distinct rhetorical relations and macro-groups use language: for example, cycles contain the most verb phrases and highest word count, indicating the use of written language to explicate certain processes to viewers. As linguistic patterns differ across these classes and are contextualized by surrounding diagrammatic elements, approaching diagrams from a discursive standpoint may be beneficial for future empirical multimodality research as well as designing annotation schemata to be more intuitive for annotators. With larger datasets and further research, precise sets of rules containing linguistic structures and layout information may be developed to increase accuracy in probability-based computational analysis of diagrams.

    Social Media Analytics in Disaster Response: A Comprehensive Review

    Social media has emerged as a valuable resource for disaster management, revolutionizing the way emergency response and recovery efforts are conducted during natural disasters. This review paper aims to provide a comprehensive analysis of social media analytics for disaster management. The abstract begins by highlighting the increasing prevalence of natural disasters and the need for effective strategies to mitigate their impact. It then emphasizes the growing influence of social media in disaster situations, discussing its role in disaster detection, situational awareness, and emergency communication. The abstract explores the challenges and opportunities associated with leveraging social media data for disaster management purposes. It examines methodologies and techniques used in social media analytics, including data collection, preprocessing, and analysis, with a focus on data mining and machine learning approaches. The abstract also presents a thorough examination of case studies and best practices that demonstrate the successful application of social media analytics in disaster response and recovery. Ethical considerations and privacy concerns related to the use of social media data in disaster scenarios are addressed. The abstract concludes by identifying future research directions and potential advancements in social media analytics for disaster management. The review paper aims to provide practitioners and researchers with a comprehensive understanding of the current state of social media analytics in disaster management, while highlighting the need for continued research and innovation in this field. Comment: 11 pages

    Understanding Events: A Diversity-driven Human-Machine Approach


    Tools and Methods to Analyze Multimodal Data in Collaborative Design Ideation

    Collaborative design ideation is typically characterized by informal acts of sketching, annotation, and discussion. Designers have always used the pencil-and-paper medium for this activity, partly because of the flexibility of the medium, and partly because the ambiguous and ill-defined nature of conceptual design cannot easily be supported by computers. However, recent computational tools for conceptual design have leveraged the availability of hand-held computing devices for creating and sharing ideas. In order to provide computer support for collaborative ideation in a way that augments traditional media rather than imitates it, it is necessary to study the affordances made available by digital media for this process, and to study designers' cognitive and collaborative processes when using such media. In this thesis, we present tools and methods to help make sense of unstructured verbal and sketch data generated during collaborative design, with a view to better understand these collaborative and cognitive processes. This thesis has three main contributions