Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5
This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions from researchers working in different fields of application and in mathematics, and is available in open access. The contributions collected in this volume have either been published or presented in international conferences, seminars, workshops and journals since the dissemination of the fourth volume in 2015, or they are new. The contributions of each part of this volume are chronologically ordered.
The first part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution (PCR) rules of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence, with their Matlab codes.
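The PCR rules mentioned above redistribute partial conflicts back to the focal elements that produced them. As a rough illustration only, here is a minimal sketch of the two-source PCR5 rule on a toy frame; the set names and masses are invented for the example and real DSmT implementations handle far richer hyper-power sets:

```python
# Minimal two-source PCR5 sketch: conjunctive combination plus
# proportional redistribution of each partial conflict.
from itertools import product

def pcr5(m1, m2):
    """Combine two basic belief assignments (dicts: frozenset -> mass)."""
    combined = {}
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        inter = x & y
        prod = mx * my
        if inter:
            # Non-empty intersection: standard conjunctive term.
            combined[inter] = combined.get(inter, 0.0) + prod
        else:
            # Partial conflict: give X back mx^2*my/(mx+my) and
            # Y back my^2*mx/(mx+my), the PCR5 proportions.
            combined[x] = combined.get(x, 0.0) + mx * prod / (mx + my)
            combined[y] = combined.get(y, 0.0) + my * prod / (mx + my)
    return combined

A, B = frozenset({"A"}), frozenset({"B"})
m1 = {A: 0.6, B: 0.4}
m2 = {A: 0.3, B: 0.7}
result = pcr5(m1, m2)   # masses still sum to 1; no mass lost to conflict
```

Note how, unlike Dempster's rule, no mass is discarded through normalization: the conflicting products are returned to the two sets involved, in proportion to the masses that generated them.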
Because more applications of DSmT have emerged since the fourth volume appeared in 2015, the second part of this volume covers selected applications of DSmT, mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender systems, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), the Silx-Furtif RUST code library for information fusion including PCR rules, and networks for ship classification.
Finally, the third part presents interesting contributions related to belief functions in general, published or presented over the years since 2015. These contributions are related to decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes' theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negators of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions.
A robotic platform for precision agriculture and applications
Agricultural techniques have been improved over the centuries to meet the growing demand of an increasing global population. Farming applications face new challenges in satisfying global needs, and recent technological advancements in robotic platforms can be exploited to address them.
Orchard management is one of the most challenging of these applications because of the tree structure and the required interaction with the environment; it was therefore targeted by the University of Bologna research group, which developed a customized solution embodying a new concept for agricultural vehicles.
The result of this research has blossomed into a new lightweight tracked vehicle capable of performing autonomous navigation both in the open-field scenario and while travelling inside orchards, in what has been called in-row navigation. The mechanical design concept, together with the customized software implementation, has been detailed to highlight the strengths of the platform, along with some further improvements envisioned to increase overall performance.
Static stability testing has proved that the vehicle can withstand steep-slope scenarios. Some improvements have also been investigated to refine the estimation of the slippage that occurs during turning maneuvers and that is typical of skid-steering tracked vehicles.
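The slip estimation mentioned above can be sketched as a differential-drive odometry update with per-track slip coefficients. This is an illustrative simplification, not the platform's actual estimator, and all parameter names and values are assumptions for the example:

```python
import math

def skid_steer_step(x, y, theta, v_left, v_right, track_width, dt,
                    slip_left=0.0, slip_right=0.0):
    """One odometry update for a skid-steering tracked vehicle.
    Slip coefficients (0 = no slip, 1 = full slip) shrink each track's
    effective ground speed, refining the pose and yaw-rate estimate."""
    vl = v_left * (1.0 - slip_left)
    vr = v_right * (1.0 - slip_right)
    v = 0.5 * (vl + vr)                 # forward speed of the body
    omega = (vr - vl) / track_width     # yaw rate from track speed difference
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

# Straight-line check: equal track speeds and no slip advance x only.
pose = skid_steer_step(0.0, 0.0, 0.0, 1.0, 1.0, 0.5, 1.0)
```

In practice the slip coefficients would themselves be estimated online (e.g. from IMU or GNSS residuals), which is precisely the refinement the abstract refers to.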
The software architecture has been implemented using the Robot Operating System (ROS) framework, so as to exploit community-available packages for common basic functions, such as sensor interfaces, while allowing a dedicated custom implementation of the navigation algorithms developed.
Real-world testing inside the university's experimental orchards has proven the robustness and stability of the solution, with more than 800 hours of fieldwork.
The vehicle has also enabled a wide range of autonomous tasks such as spraying, mowing, and in-field data collection. The latter can be exploited to automatically estimate relevant orchard properties, such as fruit counts and sizes and canopy characteristics, and to support autonomous fruit harvesting with post-harvest estimations.
Wearable Sensor Gait Analysis for Fall Detection Using Deep Learning Methods
World Health Organization (WHO) data show that around 684,000 people die from falls yearly, making falls the second leading cause of injury-related death after traffic accidents [1]. Early detection of falls, followed by pneumatic protection, is one of the most effective means of ensuring the safety of the elderly. In light of the recent widespread adoption of wearable sensors, it has become increasingly important to develop fall detection models that can effectively process large, sequential sensor signal datasets. Several researchers have recently developed fall detection algorithms based on wearable sensor data. However, real-time fall detection remains challenging because of the wide range of gait variations in older adults. Choosing the appropriate sensor and placing it in the most suitable location are essential components of a robust real-time fall detection system.
This dissertation implements various detection models to analyze and mitigate injuries due to falls in the senior community. It presents different methods for detecting falls in real time using deep learning networks. Several sliding-window segmentation techniques are developed and compared in the first study. Next, various methods are implemented to mitigate the sampling imbalance inherent in the real-world collection of fall data. A study is also conducted to determine whether accelerometers and gyroscopes can distinguish between falls and near-falls.
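Sliding-window segmentation of the kind compared in the first study can be sketched in a few lines; the window size and step below are arbitrary example values, not the dissertation's actual settings:

```python
import numpy as np

def sliding_windows(signal, window_size, step):
    """Segment a (T, C) sensor stream into overlapping fixed-length
    windows: the usual preprocessing step before a deep fall detector."""
    starts = range(0, len(signal) - window_size + 1, step)
    return np.stack([signal[s:s + window_size] for s in starts])

stream = np.arange(20).reshape(10, 2)            # 10 samples, 2 channels
windows = sliding_windows(stream, window_size=4, step=2)
# windows has shape (4, 4, 2): four overlapping windows of four samples
```

The step-to-window ratio controls the overlap, which trades off redundancy between training examples against the latency at which a fall can first be detected.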
According to the literature survey, machine learning algorithms produce varying degrees of accuracy when applied to different datasets. An algorithm's performance depends on several factors, including the type and location of the sensors, the fall pattern, the dataset's characteristics, and the methods used for preprocessing and sliding-window segmentation. Another challenge in fall detection is the lack of centralized datasets for comparing the results of different algorithms. This dissertation compares the performance of different fall detection methods using deep learning algorithms across multiple datasets.
Furthermore, deep learning has been explored in a second application: an ECG-based virtual pathology stethoscope detection system. A novel real-time virtual pathology stethoscope (VPS) detection method has been developed, and several deep learning methods are evaluated for classifying the location of the stethoscope by taking advantage of subtle differences in the ECG signals. This study would significantly extend the simulation capabilities of standardized patients by allowing medical students and trainees to perform realistic cardiac auscultation in a clinical environment.
Machine-Learning-Powered Cyber-Physical Systems
In the last few years, we have witnessed the revolution of the Internet of Things (IoT) paradigm and the consequent growth of Cyber-Physical Systems (CPSs). IoT devices, which include a plethora of smart interconnected sensors, actuators, and microcontrollers, can sense physical phenomena occurring in an environment and provide copious amounts of heterogeneous data about the functioning of a system. As a consequence, the large amounts of generated data represent an opportunity to adopt artificial intelligence and machine learning techniques that can be used to make informed decisions aimed at the optimization of such systems, thus enabling a variety of services and applications across multiple domains. Machine learning processes and analyzes such data to generate feedback, which represents the status the environment is in. Feedback given to the user in order to make an informed decision is called open-loop feedback; an open-loop CPS is thus characterized by the lack of an actuation directed at improving the system itself. Feedback used by the system itself to actuate a change aimed at optimizing the system is called closed-loop feedback; a closed-loop CPS thus pairs feedback based on sensing data with an actuation that impacts the system directly. In this dissertation, we propose several applications in the context of CPSs. We propose open-loop CPSs designed for the early prediction, diagnosis, and persistency detection of Bovine Respiratory Disease (BRD) in dairy calves, and for gait activity recognition in horses. These works use sensor data, such as pedometers and automated feeders, to perform valuable real-field data collection. Data are then processed by a mix of state-of-the-art approaches as well as novel techniques, before being fed to machine learning algorithms for classification, which informs the user on the status of their animals. Our work further evaluates a variety of trade-offs.
In the context of BRD, we adopt optimization techniques to explore the trade-offs of using sensor data as opposed to manual examination performed by domain experts. Similarly, we carry out an extensive analysis of the cost-accuracy trade-offs, which farmers can use to make informed decisions on their barn investments. In the context of horse gait recognition, we evaluate the benefits of lighter classification algorithms for improving energy and storage usage, and their impact on classification accuracy. With respect to closed-loop CPSs, we propose an incentive-based demand response approach for Heating, Ventilation and Air Conditioning (HVAC) designed for peak load reduction in the context of smart grids. Specifically, our approach uses machine learning to process power data from smart thermostats deployed in user homes, along with the users' personal temperature preferences. Our machine learning models predict power savings due to thermostat changes, which are then plugged into our optimization problem, which uses auction theory coupled with behavioral science. This framework selects the set of users who fulfill the power-saving requirement while minimizing the financial incentives paid to the users and, as a consequence, their discomfort. Our work on BRD has been published in IEEE DCOSS 2022 and Frontiers in Animal Science. Our work on gait recognition has been published in IEEE SMARTCOMP 2019 and Elsevier PMC 2020, and our work on energy management and energy prediction has been published in IEEE PerCom 2022 and IEEE SMARTCOMP 2022. Several other works were under submission when this thesis was written and are included in this document as well.
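The user-selection step described above (meet a power-saving target while minimizing the incentives paid out) can be approximated by a simple greedy heuristic. This is an illustrative sketch only, not the auction-theoretic mechanism of the dissertation, and the offer values are invented:

```python
def select_users(offers, required_saving):
    """Greedy sketch: rank users by incentive cost per kW of predicted
    saving and pick them until the saving target is met.
    offers: list of (user_id, predicted_saving_kw, incentive_cost)."""
    chosen, total_saving, total_cost = [], 0.0, 0.0
    # Cheapest saving first (cost per kW saved).
    for user, saving, cost in sorted(offers, key=lambda o: o[2] / o[1]):
        if total_saving >= required_saving:
            break
        chosen.append(user)
        total_saving += saving
        total_cost += cost
    return chosen, total_saving, total_cost

# Hypothetical offers: (user, predicted saving in kW, incentive in $)
offers = [("u1", 2.0, 1.0), ("u2", 1.0, 2.0), ("u3", 3.0, 1.2)]
selected, saving, cost = select_users(offers, required_saving=4.0)
```

A real auction would additionally enforce truthful bidding and fold in the behavioral-science model of user discomfort; the greedy ratio rule only conveys the flavor of the cost-minimizing selection.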
Toward Real-Time, Robust Wearable Sensor Fall Detection Using Deep Learning Methods: A Feasibility Study
Real-time fall detection using a wearable sensor remains a challenging problem due to high gait variability. Furthermore, finding the right type of sensor and the optimal sensor location are also essential for real-time fall-detection systems. This work presents real-time fall-detection methods using deep learning models. Early detection of falls, followed by pneumatic protection, is one of the most effective means of ensuring the safety of the elderly. First, we developed and compared different sliding-window data-segmentation techniques. Next, we implemented various techniques to balance the datasets, because fall datasets collected in real-world settings are imbalanced by nature. Moreover, we designed a deep learning model that combines a convolution-based feature extractor with deep neural network blocks (an LSTM block and a transformer encoder block), followed by a position-wise feed-forward layer. We found that combining the input sequence with the convolution-learned features of different kernels tends to increase the performance of the fall-detection model. Last, we showed that the signals collected by both accelerometer and gyroscope sensors can be leveraged to develop an effective classifier that accurately detects falls and, in particular, differentiates falls from near-falls. Furthermore, we compared data from sixteen different body parts to determine the best sensor position for fall detection. We found that the shank is the optimal position for placing our sensors, with an F1 score of 0.97, and this could help other researchers collect high-quality fall datasets.
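The idea of combining the input sequence with convolution features of different kernel widths can be sketched with fixed smoothing kernels; in the actual model the kernels are learned by the convolutional feature extractor, so the code below only illustrates the data flow:

```python
import numpy as np

def multi_kernel_features(x, kernels):
    """Concatenate a raw 1-D input sequence with same-length 1-D
    convolution outputs of several kernel widths, mirroring the idea of
    feeding both the sequence and its conv features to the model."""
    feats = [x] + [np.convolve(x, k, mode="same") for k in kernels]
    return np.stack(feats, axis=-1)      # shape (T, 1 + n_kernels)

signal = np.sin(np.linspace(0.0, 6.0, 50))       # toy sensor channel
kernels = [np.ones(3) / 3.0, np.ones(5) / 5.0]   # two fixed smoothing kernels
features = multi_kernel_features(signal, kernels)
```

Each kernel width captures structure at a different time scale, which is the intuition behind letting the classifier see both the raw sequence and its multi-scale convolution features.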
Speech-Based Analysis of Major Depressive Disorder: Focusing on Acoustic Changes in Continuous Speech
Doctoral dissertation, Graduate School of Convergence Science and Technology, Department of Convergence Science (Digital Information Convergence major), Seoul National University, February 2023. Advisor: Kyogu Lee.
Major depressive disorder (commonly referred to as depression) is a common disorder that affects 3.8% of the world's population. Depression stems from various causes, such as genetics, aging, social factors, and abnormalities in the neurotransmitter system; thus, early detection and monitoring are essential. The human voice is considered a representative biomarker for observing depression; accordingly, several studies have developed automatic depression diagnosis systems based on speech.
However, constructing a speech corpus is challenging, existing studies focus on adults under 60 years of age, and medical hypotheses grounded in psychiatrists' clinical findings are insufficient, limiting the evolution of such systems into medical diagnostic tools. Moreover, the effect of antipsychotic drugs on speech characteristics during the treatment phase has been overlooked.
Thus, this thesis studies a speech-based automatic depression diagnosis system at the semantic (sentence) level. First, to analyze depression among the elderly, whose emotional changes are not adequately reflected in speech characteristics, it developed mood-inducing sentences to build an elderly depression speech corpus and designed an automatic depression diagnosis system for the elderly.
Second, it constructed an extrapyramidal symptom speech corpus to investigate extrapyramidal symptoms, a typical side effect that can appear after an antipsychotic drug overdose, and found a strong correlation between the antipsychotic dose and speech characteristics. The study paved the way for a comprehensive examination of automatic diagnosis systems for depression.
Chapter 1 Introduction
1.1 Research Motivations
1.1.1 Bridging the Gap Between Clinical View and Engineering
1.1.2 Limitations of Conventional Depressed Speech Corpora
1.1.3 Lack of Studies on Depression Among the Elderly
1.1.4 Depression Analysis on Semantic Level
1.1.5 How Antipsychotic Drugs Affect the Human Voice
1.2 Thesis Objectives
1.3 Outline of the Thesis
Chapter 2 Theoretical Background
2.1 Clinical View of Major Depressive Disorder
2.1.1 Types of Depression
2.1.2 Major Causes of Depression
2.1.3 Symptoms of Depression
2.1.4 Diagnosis of Depression
2.2 Objective Diagnostic Markers of Depression
2.3 Speech in Mental Disorder
2.4 Speech Production and Depression
2.5 Automatic Depression Diagnostic System
2.5.1 Acoustic Feature Representation
2.5.2 Classification / Prediction
Chapter 3 Developing Sentences for New Depressed Speech Corpus
3.1 Introduction
3.2 Building Depressed Speech Corpus
3.2.1 Elements of Speech Corpus Production
3.2.2 Conventional Depressed Speech Corpora
3.2.3 Factors Affecting Depressed Speech Characteristics
3.3 Motivations
3.3.1 Limitations of Conventional Depressed Speech Corpora
3.3.2 Attitude of Subjects to Depression: Masked Depression
3.3.3 Emotions in Reading
3.3.4 Objectives of this Chapter
3.4 Proposed Methods
3.4.1 Selection of Words
3.4.2 Structure of Sentence
3.5 Results
3.5.1 Mood-Inducing Sentences (MIS)
3.5.2 Neutral Sentences for Extrapyramidal Symptom Analysis
3.6 Summary
Chapter 4 Screening Depression in The Elderly
4.1 Introduction
4.2 Korean Elderly Depressive Speech Corpus
4.2.1 Participants
4.2.2 Recording Procedure
4.2.3 Recording Specification
4.3 Proposed Methods
4.3.1 Voice-based Screening Algorithm for Depression
4.3.2 Extraction of Acoustic Features
4.3.3 Feature Selection System and Distance Computation
4.3.4 Classification and Statistical Analyses
4.4 Results
4.5 Discussion
4.6 Summary
Chapter 5 Correlation Analysis of Antipsychotic Dose and Speech Characteristics
5.1 Introduction
5.2 Korean Extrapyramidal Symptoms Speech Corpus
5.2.1 Participants
5.2.2 Recording Process
5.2.3 Extrapyramidal Symptoms Annotation and Equivalent Dose Calculations
5.3 Proposed Methods
5.3.1 Acoustic Feature Extraction
5.3.2 Speech Characteristics Analysis According to Eq. Dose
5.4 Results
5.5 Discussion
5.6 Summary
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
Bibliography
Abstract (in Korean)
LIPIcs, Volume 277, GIScience 2023, Complete Volume
Acoustic-based Smart Tactile Sensing in Social Robots
International Doctorate mention. The sense of touch is a crucial component of human social interaction and is unique among the five senses. As the only proximal sense, touch requires close or direct physical contact to register information. This fact makes touch an interaction modality full of possibilities regarding social communication. Through touch, we are able to ascertain the other person's intention and communicate emotions. From this idea emerges the concept of social touch as the act of touching another person in a social context. It can serve various purposes, such as greeting, showing affection, persuasion, and regulating emotional and physical well-being.
Recently, the number of people interacting with artificial systems and agents has increased, mainly due to the rise of technological devices such as smartphones and smart speakers. Still, these devices are limited in their interaction capabilities. To deal with this issue, recent developments in social robotics have improved the interaction possibilities to make agents more seamless and useful. In this sense, social robots are designed to facilitate natural interactions between humans and artificial agents. In this context, the sense of touch is revealed as a natural interaction vehicle that can improve Human-Robot Interaction (HRI) due to its communicative relevance. Moreover, for a social robot, the relationship between social touch and its embodiment is direct, as it has a physical body with which to apply or receive touches.
From a technical standpoint, tactile sensing systems have recently been the subject of further research, mostly devoted to comprehending this sense to create intelligent systems that can improve people's lives. Currently, social robots are popular devices that include technologies for touch sensing. This is motivated by the fact that robots may encounter expected or unexpected physical contact with humans, which can either enhance or interfere with the execution of their behaviours. There is, therefore, a need to detect human touch in robot applications. Some methods even include touch-gesture recognition, although they often require significant hardware deployments with multiple sensors. Additionally, the dependability of those sensing technologies is constrained, because the majority of them still struggle with issues like false positives or poor recognition rates. Acoustic sensing, in this sense, can provide a set of features that can alleviate the aforementioned shortcomings. Even though it is a technology that has been utilised in various research fields, it has yet to be integrated into human-robot touch interaction.
Therefore, in this work, we propose the Acoustic Touch Recognition (ATR) system, a smart tactile sensing system based on acoustic sensing designed to improve human-robot social interaction. Our system is developed to classify touch gestures and locate their source. It is also integrated into real social robotic platforms and tested in real-world applications. Our proposal is approached from two standpoints, one technical and the other related to social touch. Firstly, the technical motivation of this work centred on achieving a cost-efficient, modular and portable tactile system. For that, we explore the fields of touch sensing technologies, smart tactile sensing systems and their application in HRI. On the other hand, part of the research is centred around the affective impact of touch during human-robot interaction, resulting in two studies exploring this idea.
Doctoral Programme in Electrical, Electronic and Automation Engineering, Universidad Carlos III de Madrid. Committee — President: Pedro Manuel Urbano de Almeida Lima; Secretary: María Dolores Blanco Rojas; Member: Antonio Fernández Caballer
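A bare-bones version of acoustic touch-gesture classification (spectral features plus a nearest-centroid rule) might look like the following. The signals, band count, and gesture classes are invented for illustration; the ATR system itself is considerably more elaborate:

```python
import numpy as np

def spectral_features(signal, n_bands=4):
    """Band-averaged magnitude spectrum: a compact acoustic descriptor
    of a touch event picked up by a contact microphone."""
    spectrum = np.abs(np.fft.rfft(signal))
    return np.array([band.mean()
                     for band in np.array_split(spectrum, n_bands)])

def classify(feature, centroids):
    """Nearest-centroid assignment of a feature vector to a gesture class."""
    labels = list(centroids)
    dists = [np.linalg.norm(feature - centroids[lab]) for lab in labels]
    return labels[int(np.argmin(dists))]

# Synthetic stand-ins: a low-frequency thump vs. a higher-frequency rub.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 1024)
tap = np.sin(2 * np.pi * 20 * t)
stroke = np.sin(2 * np.pi * 400 * t)
centroids = {"tap": spectral_features(tap),
             "stroke": spectral_features(stroke)}
noisy_tap = tap + 0.1 * rng.standard_normal(t.size)
gesture = classify(spectral_features(noisy_tap), centroids)
```

Because different touch gestures excite the robot's shell at different frequencies, even such coarse band energies separate them surprisingly well; a deployed system would add onset detection and a trained classifier.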
12th International Conference on Geographic Information Science: GIScience 2023, September 12–15, 2023, Leeds, UK
No abstract available
Presentation adaptation for multimodal interface systems: Three essays on the effectiveness of user-centric content and modality adaptation
The use of devices is becoming increasingly ubiquitous and the contexts of their users more and more dynamic. This often leads to situations where one communication channel is rather impractical. Text-based communication is particularly inconvenient when the hands are already occupied with another task. Audio messages induce privacy risks and may disturb other people if used in public spaces. Multimodal interfaces thus offer users the flexibility to choose between multiple interaction modalities. While the choice of a suitable input modality lies in the hands of the users, they may also require output in a different modality depending on their situation. To adapt the output of a system to a particular context, rules are needed that specify how information should be presented given the users’ situation and state. Therefore, this thesis tests three adaptation rules that – based on observations from cognitive science – have the potential to improve the interaction with an application by adapting the presented content or its modality.
Following modality alignment, the output (audio versus visual) of a smart home display is matched with the user’s input (spoken versus manual) to the system. Experimental evaluations reveal that preferences for an input modality are initially too unstable to infer a clear preference for either interaction modality. Thus, the data shows no clear relation between the users’ modality choice for the first interaction and their attitude towards output in different modalities.
To apply multimodal redundancy, information is displayed in multiple modalities. An application of the rule in a video conference reveals that captions can significantly reduce confusion. However, the effect is limited to confusion resulting from language barriers, whereas contradictory auditory reports leave the participants in a state of confusion regardless of whether captions are available. We therefore suggest activating captions only when the facial expression of a user — captured by action units, expressions of positive or negative affect, and a reduced blink rate — implies that the captions effectively improve comprehension.
Content filtering in movies puts into the spotlight the character that the users prefer, according to the distribution of their gaze across elements in the previous scene. If preferences are predicted with machine-learning classifiers, this has the potential to significantly improve the users' involvement compared to scenes featuring elements that the user does not prefer. Focused attention is additionally higher compared to scenes in which multiple characters take a lead role.