Conversational affective social robots for ageing and dementia support
Socially assistive robots (SAR) hold significant potential to assist older adults and people with dementia in human engagement and clinical contexts by supporting mental health and independence at home. While SAR research has recently experienced prolific growth, long-term trust, clinical translation and patient benefit remain immature. Affective human-robot interaction remains an unresolved challenge, and the deployment of robots with conversational abilities is fundamental for robustness and human-robot engagement. In this paper, we review the state of the art within the past two decades, design trends, and current applications of conversational affective SAR for ageing and dementia support. A horizon scan of AI voice technology for healthcare, including ubiquitous smart speakers, is further introduced to address current gaps inhibiting home use. We discuss the role of user-centred approaches in the design of voice systems, including the capacity to handle communication breakdowns for effective use by target populations. We summarise the state of development in interactions using speech and natural language processing, which forms a baseline for longitudinal health monitoring and cognitive assessment. Drawing from this foundation, we identify open challenges and propose future directions to advance conversational affective social robots for: 1) user engagement, 2) deployment in real-world settings, and 3) clinical translation.
Large Language Models for in Situ Knowledge Documentation and Access With Augmented Reality
Augmented reality (AR) has become a powerful tool for assisting operators in complex environments, such as shop floors, laboratories, and industrial settings. By displaying synthetic visual elements anchored in the real environment and providing information for specific tasks, AR helps to improve efficiency and accuracy. However, a common bottleneck in these environments is introducing all the necessary information, which often requires predefined structured formats and offers limited support for multimodal and Natural Language (NL) interaction. This work proposes a new method for dynamically documenting complex environments using AR in a multimodal, non-structured, and interactive manner. Our method employs Large Language Models (LLMs) to allow experts to describe elements of the real environment in NL and select corresponding AR elements in a dynamic and iterative process. This enables experts to describe the environment in their own words rather than being constrained by a predetermined structure. Any operator can then ask about any aspect of the environment in NL to receive a response and visual guidance from the AR system, allowing for a more natural and flexible way of introducing and retrieving information. These capabilities ultimately improve the effectiveness and efficiency of tasks in complex environments.
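The documentation-then-query loop described in this abstract can be illustrated with a minimal sketch. In a real system an LLM would interpret the operator's question; here a simple keyword-overlap score stands in for the model so the example stays self-contained, and all class, method, and anchor names are hypothetical.

```python
# Sketch of the idea: experts attach free-text descriptions to AR anchors;
# operators later ask questions in natural language and get back the anchor
# (and description) that best matches. A keyword-overlap score stands in
# for the LLM; every name here is illustrative, not from the paper.

class ARDocumentation:
    def __init__(self):
        self.elements = {}  # anchor_id -> free-text description

    def document(self, anchor_id, description):
        """Expert describes a real-world element in natural language."""
        self.elements[anchor_id] = description

    def query(self, question):
        """Return the anchor whose description best overlaps the question."""
        q = set(question.lower().replace("?", "").split())

        def overlap(anchor_id):
            return len(q & set(self.elements[anchor_id].lower().split()))

        best = max(self.elements, key=overlap, default=None)
        return best, self.elements.get(best)

docs = ARDocumentation()
docs.document("valve-3", "red emergency shut-off valve for the coolant line")
docs.document("panel-1", "main control panel with the start and stop buttons")
anchor, desc = docs.query("where is the emergency valve for the coolant?")
```

In the paper's setting, the returned anchor would drive the AR overlay that visually guides the operator to the element.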
Virtual Reality and Its Application in Education
Virtual reality is a set of technologies that enables two-way communication between computer and user. In one direction, technologies are used to synthesize visual, auditory, tactile, and sometimes other sensory experiences in order to provide the illusion that practically non-existent things can be seen, heard, touched, or otherwise felt. In the other direction, technologies are used to adequately record human movements, sounds, or other potential input data that computers can process and use. This book contains six chapters that cover topics including definitions and principles of VR, devices, educational design principles for the effective use of VR, technology education, and the use of VR in the technical and natural sciences.
Multi-modal post-editing of machine translation
As machine translation (MT) quality continues to improve, more and more translators switch from traditional translation from scratch to post-editing (PE) of MT output, which has been shown to save time and reduce errors. Instead of mainly generating text, translators are now asked to correct errors within otherwise helpful translation proposals, where repetitive MT errors make the process tiresome, while hard-to-spot errors make PE a cognitively demanding activity. Our contribution is three-fold: first, we explore whether interaction modalities other than mouse and keyboard could better support PE by creating and testing the MMPE translation environment. MMPE allows translators to cross out or hand-write text, drag and drop words for reordering, use spoken commands or hand gestures to manipulate text, or combine any of these input modalities. Second, our interviews revealed that translators see value in automatically receiving additional translation support when a high cognitive load (CL) is detected during PE. We therefore developed a sensor framework using a wide range of physiological and behavioral data to estimate perceived CL and tested it in three studies, showing that multimodal eye, heart, and skin measures can be used to make translation environments cognition-aware. Third, we present two multi-encoder Transformer architectures for automatic post-editing (APE) and discuss how these can adapt MT output to a domain and thereby avoid correcting repetitive MT errors.
Deutsche Forschungsgemeinschaft (DFG), project MMP
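The cognition-aware idea, combining eye, heart, and skin measures into a single cognitive-load estimate, can be sketched as a simple logistic combination of normalized features. The weights, features, and thresholds below are illustrative placeholders, not the thesis's actual model.

```python
# Hedged sketch of multimodal cognitive-load (CL) estimation: combine
# normalized eye, heart, and skin features into one score in [0, 1].
# Weights and sign conventions are illustrative assumptions only.
import math

def estimate_cl(pupil_dilation, heart_rate_var, skin_conductance,
                weights=(0.5, -0.4, 0.3), bias=-0.1):
    """Logistic combination of multimodal features -> CL score in [0, 1].

    Higher pupil dilation and skin conductance, and lower heart-rate
    variability, are commonly associated with higher cognitive load,
    hence the negative weight on heart_rate_var.
    """
    z = (weights[0] * pupil_dilation
         + weights[1] * heart_rate_var
         + weights[2] * skin_conductance
         + bias)
    return 1.0 / (1.0 + math.exp(-z))

low = estimate_cl(pupil_dilation=0.1, heart_rate_var=0.9, skin_conductance=0.1)
high = estimate_cl(pupil_dilation=0.9, heart_rate_var=0.1, skin_conductance=0.9)
```

A cognition-aware environment would compare such a score against a threshold to decide when to offer additional translation support.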
Impacts of Human Computer Interaction (HCI) on Translation in the Hospitality Industry
More than ever, globalization affects the hospitality industry through tremendous growth in international travel: in the United States, inbound international visitation rose from around 25 million in 1985 to almost 67 million in 2012 (Mowforth & Munt, 2016). The concept of globalization, according to Robertson (1992), refers to the integration and connectivity of cultures and communities while minimizing preformed boundaries. Half of all United States visitation comes from overseas travelers, with the largest shares of foreign visitors coming from Japan, China, Brazil and South Korea (U.S. Travel Association, 2016). The United States predicts visitation from Brazil to increase by 70% between 2013 and 2018, while over the same period visitation from China is expected to increase by 220% (U.S. Department of Commerce, 2014).
The United States, however, is not prepared to fully accommodate international visitors; Korean and Japanese travelers to the U.S. find few on-property adaptations relating to their cultural behavior, while other visitors struggle to communicate with staff untrained in multiple languages (Heo, Jogaratnam, & Buchanan, 2003). Despite advances in speech translation technology, there remain limits to language, recognition, and simultaneous interpretation in such tools (Nakamura, 2009). International visitors to the United States rank language barriers as a high area of continued frustration and generally look for hospitality services that communicate in their language, understand local customs, and have a familiarity with their culture (Li, Lai, Harrill, Kline, & Wang, 2011). Despite such issues, little to no research exists on the implementation of technology-based hospitality solutions.
Automatic translation of formal data specifications to voice data-input applications.
This thesis introduces a complete solution for the automatic translation of formal data specifications to voice data-input applications. The objective of the research is to automatically generate applications for inputting data through speech from specifications of the structure of the data. The formal data specifications are XML DTDs. A new formalization called Grammar-DTD (G-DTD) is introduced as an extended DTD that contains grammars describing the valid values of the DTD elements and attributes. G-DTDs facilitate the automatic generation of VoiceXML applications that correspond to the original DTD structure. The development of the automatic application generator included identifying constraints on the G-DTD to ensure a feasible translation, using predicate calculus to build a knowledge base of inference rules that describes the mapping procedure, and writing an algorithm for the automatic translation based on the inference rules.
Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2006 .H355. Source: Masters Abstracts International, Volume: 45-01, page: 0354. Thesis (M.Sc.)--University of Windsor (Canada), 2006.
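The core mapping, from data elements with associated grammars to VoiceXML input fields, can be sketched in a few lines. The element list and grammar file names below are hypothetical, and the thesis's G-DTD formalism and inference rules are far richer than this flat example.

```python
# Illustrative sketch of the DTD-to-VoiceXML mapping idea: each data
# element becomes a VoiceXML <field> that prompts the user for a value
# and references a grammar constraining valid spoken input.
# Element names and grammar URIs are made up for the example.

def dtd_to_vxml(elements):
    """elements: list of (name, grammar_uri) pairs from a data spec."""
    fields = []
    for name, grammar in elements:
        fields.append(
            f'  <field name="{name}">\n'
            f'    <prompt>Please say the {name}.</prompt>\n'
            f'    <grammar src="{grammar}"/>\n'
            f'  </field>'
        )
    body = "\n".join(fields)
    return (f'<vxml version="2.1">\n'
            f'<form id="data_input">\n{body}\n</form>\n'
            f'</vxml>')

vxml = dtd_to_vxml([("date", "date.grxml"), ("amount", "currency.grxml")])
```

In the thesis, the grammars come from the G-DTD itself, and inference rules decide how nested DTD structure maps onto dialog structure rather than a single flat form.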
Multimodal interaction: contributions to simplify the development of applications
PhD in Informatics Engineering. The way we interact with the devices around us in everyday life is constantly changing, driven by emerging technologies and methods that provide better and more engaging ways to interact with applications. Nevertheless, integrating these technologies to enable their widespread use in current systems presents a notable challenge and requires considerable know-how from developers. While the recent literature has made some advances in supporting the design and development of multimodal interactive systems, several key aspects have yet to be addressed to realize their full potential. Among these, a relevant example is the difficulty of developing and integrating multiple interaction modalities.
In this work, we propose, design and implement a framework enabling easier development of multimodal interaction. Our proposal fully decouples the interaction modalities from the application, allowing the separate development of each part. The proposed framework already includes a set of generic modalities and modules ready to be used in novel applications. Among the proposed generic modalities, the speech modality received particular attention, given the increasing relevance of speech interaction, for example in scenarios such as Ambient Assisted Living (AAL), and the complexity behind its development. Additionally, our proposal also tackles support for managing multi-device applications and includes a method and corresponding module for fusing events.
The development of the architecture and framework profited from a rich R&D context including several projects, scenarios, and international partners. The framework successfully supported the design and development of a wide set of multimodal applications, a notable example being AALFred, the personal assistant of the PaeLife project. These applications, in turn, served the continuous improvement of the framework by supporting the iterative collection of novel requirements, enabling the proposed framework to show its versatility and potential.
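The event-fusion idea, decoupled modalities publishing events that a fusion module combines, can be sketched with a classic speech-plus-gesture pairing. The time window and event shapes below are illustrative assumptions, not the framework's actual API.

```python
# Minimal sketch of multimodal event fusion: modalities stay decoupled
# and publish timestamped events; a fusion module pairs each spoken
# command with the closest pointing gesture within a time window
# ("open" + point-at-door -> open that door). All names are illustrative.

FUSION_WINDOW = 1.0  # seconds; events further apart are not fused

def fuse(events):
    """Pair each speech event with the nearest-in-time gesture event."""
    speech = [e for e in events if e["modality"] == "speech"]
    gestures = [e for e in events if e["modality"] == "gesture"]
    fused = []
    for s in speech:
        near = [g for g in gestures if abs(g["t"] - s["t"]) <= FUSION_WINDOW]
        if near:
            g = min(near, key=lambda g: abs(g["t"] - s["t"]))
            fused.append({"command": s["value"], "target": g["value"], "t": s["t"]})
    return fused

events = [
    {"modality": "speech", "value": "open", "t": 10.2},
    {"modality": "gesture", "value": "door-icon", "t": 10.5},
    {"modality": "speech", "value": "close", "t": 20.0},  # no nearby gesture
]
result = fuse(events)
```

Because the fusion module only sees generic timestamped events, new modalities can be added without touching the application, which is the decoupling the abstract emphasizes.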
Inclusive Intelligent Learning Management System Framework - Application of Data Science in Inclusive Education
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science.
As a disabled student, the author faced higher education with a handicap, and the author's experience studying during COVID-19 confinement periods matched the findings of recent research on the importance of digital accessibility as academic experiences become more e-learning intensive. Narrative and systematic literature reviews provided context on the World Health Organization's International Classification of Functioning, Disability and Health, the legal and standards framework, and the state of the art in information and communication technology. Assessing the websites of Portuguese higher education institutions revealed that only outlying institutions had implemented near-perfect websites in terms of accessibility.
A gap was therefore identified between how accessible Portuguese higher education websites are, the needs of all students, including those with disabilities, and even the minimum legal accessibility requirements for digital products and services provided by public or publicly funded organizations.
Identifying a problem in society and exploring the scientific base of knowledge for context and state of the art was the first stage of the Design Science Research methodology, followed by development and validation cycles of an Inclusive Intelligent Learning Management System Framework. The framework blends contributions from various Data Science fields with accessibility-guidelines-compliant interface design and assessment of the accessibility compliance of uploaded content.
Validation was provided by a focus group whose inputs were considered for the version presented in this dissertation. As the purpose of the research was not to deliver a complete implementation of the framework, and consistent data to make all the modules interact with each other was lacking, the most relevant modules were tested with open data as a proof of concept.
The rigor cycle of DSR started with the inclusion of the previous thesis in the Atlântica University Institute Scientific Repository and is to be completed with the publication of this thesis and the already started PhD's findings in relevant journals and conferences.