Modality-Balanced Models for Visual Dialogue
The Visual Dialog task requires a model to exploit both image and
conversational context information to generate the next response to the
dialogue. However, via manual analysis, we find that a large number of
conversational questions can be answered by only looking at the image without
any access to the context history, while others still need the conversation
context to predict the correct answers. We demonstrate that, as a result,
previous joint-modality (history and image) models over-rely on and are more
prone to memorizing the dialogue history (e.g., by extracting certain keywords
or patterns in the context information), whereas image-only models are more
generalizable (because they cannot memorize or extract keywords from history)
and perform substantially better at the primary normalized discounted
cumulative gain (NDCG) task metric, which allows multiple correct answers.
This observation therefore encourages us to explicitly maintain two models, i.e.,
an image-only model and an image-history joint model, and combine their
complementary abilities for a more balanced multimodal model. We present
multiple methods for this integration of the two models, via ensemble and
consensus dropout fusion with shared parameters. Empirically, our models
achieve strong results on the Visual Dialog challenge 2019 (rank 3 on NDCG and
high balance across metrics), and substantially outperform the winner of the
Visual Dialog challenge 2018 on most metrics.
Comment: AAAI 2020 (11 pages)
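The "consensus dropout fusion" integration can be pictured with a short sketch. The PyTorch-style module below is a minimal illustration under assumed names and interfaces (image_model, joint_model, a 0.25 instance-drop probability); it is not the authors' released implementation. During training it randomly zeroes the joint branch's answer scores per instance, so the fused scorer cannot lean exclusively on history patterns:

```python
# Hedged sketch of consensus dropout fusion between an image-only model and an
# image-history joint model. All names and the drop probability are assumptions.
import torch
import torch.nn as nn

class ConsensusDropoutFusion(nn.Module):
    def __init__(self, image_model: nn.Module, joint_model: nn.Module, p_drop: float = 0.25):
        super().__init__()
        self.image_model = image_model    # scores answers from the image alone
        self.joint_model = joint_model    # scores answers from image + history
        self.p_drop = p_drop

    def forward(self, image, history, question):
        logits_img = self.image_model(image, question)
        logits_joint = self.joint_model(image, history, question)
        if self.training:
            # Drop the joint branch for a random subset of instances so the fused
            # model keeps the image-only branch's generalization.
            keep = (torch.rand(logits_joint.size(0), 1,
                               device=logits_joint.device) > self.p_drop).float()
            logits_joint = logits_joint * keep
        # Consensus: average the two branches' answer scores.
        return (logits_img + logits_joint) / 2
```

Both branches can share parameters (e.g. a common answer scorer), which is what distinguishes this fusion from the plain two-model ensemble the abstract also evaluates.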
A network model of interpersonal alignment in dialog
In dyadic communication, both interlocutors adapt to each other linguistically, that is, they align interpersonally. In this article, we develop a framework for modeling interpersonal alignment in terms of the structural similarity of the interlocutors' dialog lexica. This is done by means of so-called two-layer time-aligned network series, that is, a time-adjusted graph model. The graph model is partitioned into two layers, so that the interlocutors' lexica are captured as subgraphs of an encompassing dialog graph. Each constituent network of the series is updated utterance-wise. Thus, both the inherent bipartition of dyadic conversations and their gradual development are modeled. The notion of alignment is then operationalized within a quantitative model of structure formation based on the mutual information of the subgraphs that represent the interlocutors' dialog lexica. By adapting and further developing several models of complex network theory, we show that dialog lexica evolve as a novel class of graphs that have not been considered before in the area of complex (linguistic) networks. Additionally, we show that our framework allows for classifying dialogs according to their alignment status. To the best of our knowledge, this is the first approach to measuring alignment in communication that explores the similarities of graph-like cognitive representations.
Keywords: alignment in communication; structural coupling; linguistic networks; graph distance measures; mutual information of graphs; quantitative network analysis
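To make the mutual-information operationalization concrete, here is a minimal sketch assuming a networkx graph whose nodes carry a "layer" attribute (which interlocutor's lexicon they belong to) and a "word" attribute linking the two layers; the MI estimate over joint degree counts is a simplified proxy, not the article's exact measure:

```python
# Simplified proxy for layer-wise mutual information in a two-layer dialog graph.
# Node attributes "layer" ("A"/"B") and "word" are modeling assumptions.
import math
from collections import Counter

import networkx as nx

def layer_alignment_mi(g: nx.Graph) -> float:
    """Estimate MI between the degree profiles of the two interlocutor layers."""
    a = g.subgraph(n for n, d in g.nodes(data=True) if d.get("layer") == "A")
    b = g.subgraph(n for n, d in g.nodes(data=True) if d.get("layer") == "B")
    words_a = {g.nodes[n]["word"]: a.degree(n) for n in a}
    words_b = {g.nodes[n]["word"]: b.degree(n) for n in b}
    # Joint distribution over (degree in A, degree in B) for shared lexicon items.
    pairs = Counter((words_a[w], words_b[w]) for w in set(words_a) & set(words_b))
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    px, py = Counter(), Counter()
    for (da, db), c in pairs.items():
        px[da] += c
        py[db] += c
    return sum((c / total) * math.log2((c / total) / ((px[da] / total) * (py[db] / total)))
               for (da, db), c in pairs.items())
```

Updating such a graph utterance by utterance and tracking this score over the series gives the gradual development the article models.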
ISAR: An Authoring System for Interactive Tabletops
Developing augmented reality systems involves several challenges that prevent end users and experts from non-technical domains, such as education, from experimenting with this technology. In this research we introduce ISAR, an authoring system for augmented reality tabletops targeting users from non-technical domains. ISAR allows non-technical users to create their own interactive tabletop applications and experiment with the use of this technology in domains such as education, industrial training, and medical rehabilitation.
Argumentation dialogues in web-based GDSS: an approach using machine learning techniques
Doctoral thesis in Informatics.
Decision-making is present in anyone's daily life, even if they are often unaware of it. Decisions can be
related to everyday problems, or they can be related to more complex issues, such as organizational
issues. Normally, in the organizational context, decisions are made in groups.
Group Decision Support Systems have been studied over the past decades with the aim of improving
the support provided to decision-makers in the most diverse situations and/or problems to be solved.
There are two main approaches to implementing Group Decision Support Systems: the classical approach,
based on the mathematical aggregation of the preferences of the different elements of the group, and the
approaches based on automatic negotiation (e.g. Game Theory, Argumentation, among others).
Current argumentation-based Group Decision Support Systems can generate an enormous amount
of data. The objective of this research work is to study and develop models using machine learning
techniques to extract knowledge from the argumentative dialogues carried out by decision-makers; more
specifically, the aim is to create models to analyze, classify, and process these data, enhancing the
generation of new knowledge that will be used both by intelligent agents and by real decision-makers,
thereby promoting consensus among the members of the group. Based on the literature study
and the open challenges in this domain, the following research hypothesis was formulated: it is possible
to use machine learning techniques to support argumentative dialogues in web-based Group Decision
Support Systems.
As part of the work developed, supervised classification algorithms were applied to a data set containing
arguments extracted from online debates, creating an argumentative sentence classifier that can
automatically classify (For/Against) argumentative sentences exchanged in the context of decision-making.
A dynamic clustering model was developed to organize conversations based on the arguments used. In
addition, a web-based Group Decision Support System was proposed that makes it possible to support
groups of decision-makers regardless of their geographic location. The system allows the creation of multicriteria
problems and the configuration of preferences, intentions, and interests of each decision-maker.
This web-based decision support system includes dashboards of intelligent reports that are generated
from the results achieved by the models described above. The achievement of each objective allowed
validation of the identified research questions and thus answered the defined hypothesis positively.
The author also thanks the Fundação para a Ciência e a Tecnologia for the Ph.D. grant with the
reference SFRH/BD/137150/2018.
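A minimal sketch of the For/Against sentence classifier described above, assuming a TF-IDF plus logistic-regression pipeline and two toy training sentences; the thesis's actual features, algorithm, and corpus may differ:

```python
# Hedged sketch: supervised For/Against classification of argumentative sentences.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = [
    "This proposal cuts costs without hurting quality.",  # stance: for
    "The plan ignores the risks raised by the audit.",    # stance: against
]
labels = ["for", "against"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(sentences, labels)
print(clf.predict(["Adopting this option strengthens our position."]))
```

The predicted stances could then feed the dynamic clustering model and the report dashboards the abstract mentions.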
STICKERCONV: Generating Multimodal Empathetic Responses from Scratch
Stickers, while widely recognized for enhancing empathetic communication in
online interactions, remain underexplored in current empathetic dialogue
research, notably due to the lack of comprehensive datasets. In
this paper, we introduce the Agent for STICKERCONV (Agent4SC), which uses
collaborative agent interactions to realistically simulate human behavior with
sticker usage, thereby enhancing multimodal empathetic communication. Building
on this foundation, we develop a multimodal empathetic dialogue dataset,
STICKERCONV, comprising 12.9K dialogue sessions, 5.8K unique stickers, and 2K
diverse conversational scenarios. This dataset serves as a benchmark for
multimodal empathetic generation. To advance further, we propose PErceive and
Generate Stickers (PEGS), a multimodal empathetic response generation
framework, complemented by a comprehensive set of empathy evaluation metrics
based on LLMs. Our experiments demonstrate PEGS's effectiveness in generating
contextually relevant and emotionally resonant multimodal empathetic responses,
contributing to the advancement of more nuanced and engaging empathetic
dialogue systems.
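The "empathy evaluation metrics based on LLMs" can be pictured as an LLM-as-judge scorer. The sketch below assumes a hypothetical ask_llm completion client and an invented 1-to-5 rubric; it shows the shape of such a metric, not the paper's actual prompts or criteria:

```python
# Hedged sketch of an LLM-based empathy metric; `ask_llm` is a hypothetical
# stand-in for any chat-completion client, and the rubric is an assumption.
from typing import Callable

RUBRIC = (
    "Rate the assistant reply for empathy on a 1-5 scale, considering emotional "
    "understanding, relevance to the user's situation, and warmth. "
    "Answer with the number only.\n\nDialogue:\n{dialogue}\n\nReply:\n{reply}"
)

def empathy_score(dialogue: str, reply: str, ask_llm: Callable[[str], str]) -> int:
    """Ask a judge model to score one (dialogue, reply) pair."""
    answer = ask_llm(RUBRIC.format(dialogue=dialogue, reply=reply))
    return int(answer.strip()[0])  # keep the leading digit of the judge's answer
```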
Collaborative geographic visualization
Dissertation presented at the Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, for the
degree of Master in Environmental Engineering, profile in Environmental Management and Systems.
The present document is a review of essential references to take into account when developing
ubiquitous Geographical Information Systems (GIS) with collaborative
visualization purposes.
Its chapters focus, respectively, on general principles of GIS, their multimedia components and
ubiquitous practices; geo-referenced information visualization and its graphical components of virtual
and augmented reality; collaborative environments, their technological requirements, architectural
specificities, and models for collective information management; and some final considerations about
the future and challenges of collaborative visualization of GIS in ubiquitous environments.
Continually improving grounded natural language understanding through human-robot dialog
As robots become ubiquitous in homes and workplaces such as hospitals and factories, they must be able to communicate with humans. Several kinds of knowledge are required to understand and respond to a human's natural language commands and questions. If a person asks an assistant robot to "take me to Alice's office", the robot must know that Alice is a person who owns some unique office, and that "take me" means it should navigate there. Similarly, if a person requests "bring me the heavy, green mug", the robot must have accurate mental models of the physical concepts "heavy", "green", and "mug". To avoid forcing humans to use key phrases or words robots already know, this thesis focuses on helping robots understand new language constructs through interactions with humans and with the world around them.
To understand a command in natural language, a robot must first convert that command to an internal representation that it can reason with. Semantic parsing is a method for performing this conversion, and the target representation is often a semantic form expressed as predicate logic with lambda calculus. Traditional semantic parsing relies on hand-crafted resources from a human expert: an ontology of concepts, a lexicon connecting language to those concepts, and training examples of language with abstract meanings. One thrust of this thesis is to perform semantic parsing with sparse initial data. We use the conversations between a robot and human users to induce pairs of natural language utterances with the target semantic forms a robot discovers through its questions, reducing the annotation effort of creating training examples for parsing. We use this data to build more dialog-capable robots in new domains with much less expert human effort (Thomason et al., 2015; Padmakumar et al., 2017).
Meanings of many language concepts are bound to the physical world. Understanding object properties and categories, such as "heavy", "green", and "mug", requires interacting with and perceiving the physical world. Embodied robots can use manipulation capabilities, such as pushing, picking up, and dropping objects, to gather sensory data about them. This data can be used to understand non-visual concepts like "heavy" and "empty" (e.g. "get the empty carton of milk from the fridge"), and to assist with concepts that have both visual and non-visual expression (e.g. tall things look big and also exert force sooner than short things when pressed down on). A second thrust of this thesis focuses on strategies for learning these concepts using multi-modal sensory information. We use human-in-the-loop learning to gather labels linking concept words to actual objects in the environment (Thomason et al., 2016, 2017). We also explore ways to tease out polysemy and synonymy in concept words (Thomason and Mooney, 2017) such as "light", which can refer to a weight or a color, the latter sense being synonymous with "pale". Additionally, pushing, picking up, and dropping objects to gather sensory information is prohibitively time-consuming, so we investigate strategies for using linguistic information and human input to expedite exploration when learning a new concept (Thomason et al., 2018).
Finally, we build an integrated agent with both parsing and perception capabilities that learns from conversations with users to improve both components over time.
We demonstrate that parser learning from conversations (Thomason et al., 2015) can be combined with multi-modal perception (Thomason et al., 2016), using predicate-object labels gathered through opportunistic active learning (Thomason et al., 2017) during those conversations, to improve performance for understanding natural language commands from humans. Human users also qualitatively rate this integrated learning agent as more usable after it has improved from conversation-based learning.
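To make the parsing target concrete, here is a toy lexicon-plus-template parser in the spirit of the predicate-logic forms described above. The entries and predicates are illustrative assumptions; the thesis induces such language/semantic-form pairs from human-robot conversations rather than hand-coding them:

```python
# Toy semantic parser: commands are rewritten into predicate-logic-style forms.
# The lexicon, templates, and predicates are invented for illustration.
LEXICON = {
    "alice's office": "the(lambda x: office(x) and owns(alice, x))",
    "the heavy, green mug": "the(lambda x: mug(x) and heavy(x) and green(x))",
}

TEMPLATES = {
    "take me to ": "navigate({})",
    "bring me ": "bring(speaker, {})",
}

def parse(command: str) -> str:
    """Match a command template, then look up its argument in the lexicon."""
    cmd = command.lower().rstrip(".")
    for prefix, template in TEMPLATES.items():
        if cmd.startswith(prefix):
            return template.format(LEXICON.get(cmd[len(prefix):], "unknown"))
    return "unparsed"

print(parse("Take me to Alice's office"))
# -> navigate(the(lambda x: office(x) and owns(alice, x)))
```

Learning from conversation, as the thesis does, amounts to growing the lexicon and the parser's statistics from clarification questions instead of fixing them in advance.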