Probing Language Models' Gesture Understanding for Enhanced Human-AI Interaction
The rise of Large Language Models (LLMs) has affected various disciplines
in ways that reach beyond mere text generation. Going beyond their textual nature, this
project proposal aims to investigate the interaction between LLMs and
non-verbal communication, specifically focusing on gestures. The proposal sets
out a plan to examine the proficiency of LLMs in deciphering both explicit and
implicit non-verbal cues within textual prompts and their ability to associate
these gestures with various contextual factors. The research proposes to test
established psycholinguistic study designs to construct a comprehensive dataset
that pairs textual prompts with detailed gesture descriptions, encompassing
diverse regional variations and semantic labels. To assess LLMs' comprehension
of gestures, experiments are planned, evaluating their ability to simulate
human behaviour in order to replicate psycholinguistic experiments. These
experiments consider cultural dimensions and measure the agreement between
LLM-identified gestures and the dataset, shedding light on the models'
contextual interpretation of non-verbal cues (e.g. gestures).
Comment: Preprint
Framing COVID-19: How we conceptualize and discuss the pandemic on Twitter
Doctors and nurses in these weeks are busy in the trenches, fighting against
a new invisible enemy: Covid-19. Cities are locked down and civilians are
besieged in their own homes, to prevent the spreading of the virus. War-related
terminology is commonly used to frame the discourse around epidemics and
diseases. Arguably the discourse around the current epidemic will make use of
war-related metaphors too, not only in public discourse and the media, but also
in the tweets written by non-experts of mass communication. We hereby present
an analysis of the discourse around #Covid-19, based on a corpus of 200k tweets
posted on Twitter during March and April 2020. Using topic modelling we first
analyze the topics around which the discourse can be classified. Then, we show
that the WAR framing is used to talk about specific topics, such as the virus
treatment, but not others, such as the effects of social distancing on the
population. We then measure and compare the popularity of the WAR frame to
three alternative figurative frames (MONSTER, STORM and TSUNAMI) and a literal
frame used as control (FAMILY). The results show that while the FAMILY literal
frame covers a wider portion of the corpus, among the figurative framings WAR
is the most frequently used, and thus arguably the most conventional one.
However, we conclude that this frame is not apt to elaborate the discourse around
many aspects involved in the current situation. Therefore, in line
with previous suggestions, a plethora of framing options, or a metaphor menu,
may facilitate the communication of various aspects involved in the
Covid-19-related discourse on social media, and thus support civilians in
the expression of their feelings, opinions and ideas during the current
pandemic.
Comment: 41 pages, 6 figures
Immune Moral Models? Pro-Social Rule Breaking as a Moral Enhancement Approach for Ethical AI
The world is heading towards a state in which Artificial Intelligence (AI)
based agents make most decisions on behalf of humans. From healthcare decision
making to social media censoring, these agents face problems and make decisions
that have ethical and societal implications. Hence, ethical behaviour is a
critical characteristic of a human-centric AI. A common observation in
human-centric industries, like the service industry and healthcare, is that
their professionals tend to break rules, if necessary, for pro-social reasons.
To make AI agents more human-centric, we argue that there is a need for a
mechanism that helps AI agents to identify when and how to break rules set by
their designers. In this paper, we examine the when, i.e., conditions under
which humans break rules for pro-social reasons. In the presented study, we
introduce a 'vaccination strategy dilemma' where one needs to decide whether
they would distribute Covid-19 vaccines only to members of a high-risk group
(follow the rule) or, in selected cases, administer the vaccine to a few social
influencers (break the rule), which might yield an overall greater benefit to
society. Results of the empirical study suggest a relationship between
stakeholder utilities and pro-social rule breaking (PSRB), which neither
deontological nor utilitarian ethics can completely explain. Finally, the
paper discusses the design characteristics of an ethical agent capable of PSRB
and the future research directions on PSRB in the AI realm.
Comment: 15 pages, 2 figures
A Crosslingual Investigation of Conceptualization in 1335 Languages
Languages differ in how they divide up the world into concepts and words;
e.g., in contrast to English, Swahili has a single concept for ‘belly’ and
‘womb’. We investigate these differences in conceptualization across 1,335
languages by aligning concepts in a parallel corpus. To this end, we propose
Conceptualizer, a method that creates a bipartite directed alignment graph
between source language concepts and sets of target language strings. In a
detailed linguistic analysis across all languages for one concept (‘bird’) and
an evaluation on gold standard data for 32 Swadesh concepts, we show that
Conceptualizer has good alignment accuracy. We demonstrate the potential of
research on conceptualization in NLP with two experiments. (1) We define
crosslingual stability of a concept as the degree to which it has 1-1
correspondences across languages, and show that concreteness predicts
stability. (2) We represent each language by its conceptualization pattern for
83 concepts, and define a similarity measure on these representations. The
resulting measure for the conceptual similarity of two languages is
complementary to standard genealogical, typological, and surface similarity
measures. For four out of six language families, we can assign languages to
their correct family based on conceptual similarity with accuracy between 54%
and 87%.
Comment: ACL 202
Towards Language-Based Modulation of Assistive Robots through Multimodal Models
In the field of Geriatronics, enabling effective and transparent
communication between humans and robots is crucial for enhancing the acceptance
and performance of assistive robots. Our early-stage research project
investigates the potential of language-based modulation as a means to improve
human-robot interaction. We propose to explore real-time modulation during task
execution, leveraging language cues, visual references, and multimodal inputs.
By developing transparent and interpretable methods, we aim to enable robots to
adapt and respond to language commands, enhancing their usability and
flexibility. Through the exchange of insights and knowledge at the workshop, we
seek to gather valuable feedback to advance our research and contribute to the
development of interactive robotic systems for Geriatronics and beyond.
Comment: GERIATRONICS SUMMIT 202
TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks
We present a framework for specifying, training, evaluating, and deploying
machine learning models. Our focus is on simplifying cutting-edge machine
learning for practitioners in order to bring such technologies into production.
Recognizing the fast evolution of the field of deep learning, we make no
attempt to capture the design space of all possible model architectures in a
domain-specific language (DSL) or similar configuration language. We allow
users to write code to define their models, but provide abstractions that guide
developers to write models in ways conducive to productionization. We also
provide a unifying Estimator interface, making it possible to write downstream
infrastructure (e.g. distributed training, hyperparameter tuning) independent
of the model implementation. We balance the competing demands for flexibility
and simplicity by offering APIs at different levels of abstraction, making
common model architectures available out of the box, while providing a library
of utilities designed to speed up experimentation with model architectures. To
make out-of-the-box models flexible and usable across a wide range of problems,
these canned Estimators are parameterized not only over traditional
hyperparameters, but also using feature columns, a declarative specification
describing how to interpret input data. We discuss our experience in using this
framework in research and production environments, and show the impact on
code health, maintainability, and development speed.
Comment: 8 pages, appeared at KDD 2017, August 13-17, 2017, Halifax, NS,
Canada
LoHoRavens: A Long-Horizon Language-Conditioned Benchmark for Robotic Tabletop Manipulation
The convergence of embodied agents and large language models (LLMs) has
brought significant advancements to embodied instruction following.
Particularly, the strong reasoning capabilities of LLMs make it possible for
robots to perform long-horizon tasks without expensive annotated
demonstrations. However, public benchmarks for testing the long-horizon
reasoning capabilities of language-conditioned robots in various scenarios are
still missing. To fill this gap, this work focuses on the tabletop manipulation
task and releases a simulation benchmark, LoHoRavens, which covers
various long-horizon reasoning aspects spanning color, size, space, arithmetic
and reference. Furthermore, there is a key modality bridging problem for
long-horizon manipulation tasks with LLMs: how to incorporate the observation
feedback during robot execution for the LLM's closed-loop planning, which has,
however, been less studied in prior work. We investigate two methods of bridging the
modality gap: caption generation and learnable interface for incorporating
explicit and implicit observation feedback to the LLM, respectively. These
methods serve as the two baselines for our proposed benchmark. Experiments show
that both methods struggle to solve some tasks, indicating long-horizon
manipulation tasks are still challenging for current popular models. We expect
that the proposed public benchmark and baselines can help the community develop
better models for long-horizon tabletop manipulation tasks.
Comment: 6 pages, 4 figures. The video and code of LoHoRavens are available at
https://cisnlp.github.io/lohoravens-webpage
A Crosslingual Investigation of Conceptualization in 1335 Languages
Languages differ in how they divide up the world into concepts and words; e.g., in contrast to English, Swahili has a single concept for ‘belly’ and ‘womb’. We investigate these differences in conceptualization across 1,335 languages by aligning concepts in a parallel corpus. To this end, we propose Conceptualizer, a method that creates a bipartite directed alignment graph between source language concepts and sets of target language strings. In a detailed linguistic analysis across all languages for one concept (‘bird’) and an evaluation on gold standard data for 32 Swadesh concepts, we show that Conceptualizer has good alignment accuracy. We demonstrate the potential of research on conceptualization in NLP with two experiments. (1) We define crosslingual stability of a concept as the degree to which it has 1-1 correspondences across languages, and show that concreteness predicts stability. (2) We represent each language by its conceptualization pattern for 83 concepts, and define a similarity measure on these representations. The resulting measure for the conceptual similarity between two languages is complementary to standard genealogical, typological, and surface similarity measures. For four out of six language families, we can assign languages to their correct family based on conceptual similarity with accuracies between 54% and 87%.
Perspectives of Ultra Cold Atoms Trapped in Magnetic Micro Potentials
Recent work on magnetic micro traps for ultracold atoms is briefly reviewed.
The basic principles of operation are described together with the loading
methods and some of the realized trap geometries. Experiments are discussed
that study the interaction between atoms and the surface of micro traps as well
as the dynamics of ultracold gases in wave guides. The results
allow for an outlook towards future directions of research.
Computational Storytelling as an Embodied Robot Performance with Gesture and Spatial Metaphor
A story comes to life when it is turned into a performance. Computational approaches to storytelling have primarily focused on stories as textual artifacts and not as performances. But stories can become much more when they are augmented with actors, dialogue, movements and gestures. Where artificial intelligence research has previously investigated these individual layers, this thesis presents an overarching framework of computational storytelling as an embodied robot performance with a focus on gesture and spatial metaphor. This work regards storytelling as a performative act, one that combines linguistic (spoken) and physical (embodied) actions to communicate concepts from performer to audience. The performances can feature multiple robotic agents that distribute the different storytelling tasks across themselves. The robots narrate the story, move across the stage, use appropriate gestures, interpret the actions of the story, present dialogue or give the audience an opportunity to interact with verbal or non-verbal cues, while an underlying system provides the story in an act of computational creativity. The performances are used to evaluate the links between concepts, words and embodied actions. In particular, the robots connect two movement types with the underlying plot: Gestures to enhance theatricality, and spatial movements to mirror character relations in the plot. For both types, we present a comprehensive taxonomy of robotic movement. Moreover, we argue that image schemas play a profound role in the understanding of movement and that, based on this claim, the coherent use of schematic movement is beneficial for our performances and for researchers in the field of robotic performances. To test these claims, the thesis outlines the Scéalability framework for turning generated stories into performances, which are then evaluated in a series of studies. 
In particular, we show that audiences are sensitive to the coherent use of space, and appreciate the schematic use of spatial movements as much as gestures.
2022-06-02 JG: author's signature has been removed from PDF