Multimodal Affect Models: An Investigation of Relative Salience of Audio and Visual Cues for Emotion Prediction
People perceive emotions via multiple cues, predominantly speech and visual cues, and a number of emotion recognition systems utilize both audio and visual cues. Moreover, the static aspects of emotion (the speaker's arousal level is high/low) and the dynamic aspects of emotion (the speaker is becoming more aroused) might be perceived via different expressive cues, and these two aspects are integrated to provide a unified sense of emotion state. However, existing multimodal systems focus on only a single aspect of emotion perception, and the contributions of different modalities toward modeling static and dynamic emotion aspects are not well explored. In this paper, we investigate the relative salience of audio and video modalities to emotion state prediction and emotion change prediction using a Multimodal Markovian affect model. Experiments conducted on the RECOLA database showed that the audio modality is better at modeling the emotion state of arousal and video the emotion state of valence, whereas audio shows clear advantages over video in modeling emotion changes for both arousal and valence.
Recognizing emotions in spoken dialogue with acoustic and lexical cues
Automatic emotion recognition has long been a focus of Affective Computing. It has
become increasingly apparent that awareness of human emotions in Human-Computer
Interaction (HCI) is crucial for advancing related technologies, such as dialogue
systems. However, performance of current automatic emotion recognition is
disappointing compared to human performance. Current research on emotion
recognition in spoken dialogue focuses on identifying better feature representations
and recognition models from a data-driven point of view. The goal of this thesis
is to explore how incorporating prior knowledge of human emotion recognition
in the automatic model can improve state-of-the-art performance of automatic
emotion recognition in spoken dialogue. Specifically, we study this by proposing
knowledge-inspired features representing occurrences of disfluency and non-verbal
vocalisation in speech, and by building a multimodal recognition model that combines
acoustic and lexical features in a knowledge-inspired hierarchical structure. In our
study, emotions are represented with the Arousal, Expectancy, Power, and Valence
emotion dimensions. We build unimodal and multimodal emotion recognition
models to study the proposed features and modelling approach, and perform emotion
recognition on both spontaneous and acted dialogue.
Psycholinguistic studies have suggested that DISfluency and Non-verbal
Vocalisation (DIS-NV) in dialogue are related to emotions. However, these affective
cues in spoken dialogue are overlooked by current automatic emotion recognition
research. Thus, we propose features for recognizing emotions in spoken dialogue
which describe five types of DIS-NV in utterances, namely filled pause, filler, stutter,
laughter, and audible breath. Our experiments show that this small set of features
is predictive of emotions. Our DIS-NV features achieve better performance than
benchmark acoustic and lexical features for recognizing all emotion dimensions in
spontaneous dialogue. Consistent with Psycholinguistic studies, the DIS-NV features
are especially predictive of the Expectancy dimension of emotion, which relates to
speaker uncertainty. Our study illustrates the relationship between DIS-NVs and
emotions in dialogue, which contributes to Psycholinguistic understanding of them
as well. Note that our DIS-NV features are based on manual annotations, yet our
long-term goal is to apply our emotion recognition model to HCI systems. Thus, we
conduct preliminary experiments on automatic detection of DIS-NVs, and on using
automatically detected DIS-NV features for emotion recognition. Our results show
that DIS-NVs can be automatically detected from speech with stable accuracy, and
auto-detected DIS-NV features remain predictive of emotions in spontaneous dialogue.
This suggests that our emotion recognition model can be applied to a fully automatic
system in the future, and holds the potential to improve the quality of emotional
interaction in current HCI systems.
To study the robustness of the DIS-NV features, we conduct cross-corpora
experiments on both spontaneous and acted dialogue. We identify how dialogue
type influences the performance of DIS-NV features and emotion recognition models.
DIS-NVs contain additional information beyond acoustic characteristics or lexical
contents. Thus, we study the gain of modality fusion for emotion recognition with the
DIS-NV features. Previous work combines different feature sets by fusing modalities
at the same level using two types of fusion strategies: Feature-Level (FL) fusion,
which concatenates feature sets before recognition; and Decision-Level (DL) fusion,
which makes the final decision based on outputs of all unimodal models. However,
features from different modalities may describe data at different time scales or levels
of abstraction. Moreover, Cognitive Science research indicates that when perceiving
emotions, humans make use of information from different modalities at different
cognitive levels and time steps. Therefore, we propose a HierarchicaL (HL) fusion
strategy for multimodal emotion recognition, which incorporates features that describe
data at a longer time interval or which are more abstract at higher levels of its
knowledge-inspired hierarchy. Compared to FL and DL fusion, HL fusion incorporates
both inter- and intra-modality differences. Our experiments show that HL fusion
consistently outperforms FL and DL fusion on multimodal emotion recognition in both
spontaneous and acted dialogue. The HL model combining our DIS-NV features with
benchmark acoustic and lexical features improves current performance of multimodal
emotion recognition in spoken dialogue.
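The difference between the two baseline strategies described above, FL and DL fusion, can be sketched in a few lines. This is a minimal illustration only, not the thesis's implementation: the toy linear scorer, the feature values, the weights, and the averaging rule used for DL fusion are all assumptions made for the example, and HL fusion is not shown.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
Model = Callable[[Vector], float]


def linear_model(weights: Vector) -> Model:
    """Return a toy linear scorer standing in for a trained recognizer (hypothetical)."""
    def predict(features: Vector) -> float:
        if len(features) != len(weights):
            raise ValueError("feature/weight length mismatch")
        return sum(w * x for w, x in zip(weights, features))
    return predict


def feature_level_fusion(model: Model, feature_sets: Sequence[Vector]) -> float:
    """FL fusion: concatenate all feature sets, then run a single model."""
    fused = [x for features in feature_sets for x in features]
    return model(fused)


def decision_level_fusion(models: Sequence[Model],
                          feature_sets: Sequence[Vector]) -> float:
    """DL fusion: run one model per modality, then combine their outputs.

    Averaging is an assumed combination rule, chosen here for simplicity.
    """
    outputs = [m(f) for m, f in zip(models, feature_sets)]
    return sum(outputs) / len(outputs)


acoustic = [0.5, 1.0]  # hypothetical acoustic features for one utterance
lexical = [2.0]        # hypothetical lexical feature for the same utterance

fl_score = feature_level_fusion(linear_model([1.0, 1.0, 0.5]), [acoustic, lexical])
dl_score = decision_level_fusion(
    [linear_model([1.0, 1.0]), linear_model([0.5])], [acoustic, lexical]
)
print(fl_score, dl_score)
```

In FL fusion a single model sees every feature at once, whereas in DL fusion each modality is scored separately and only the outputs are merged. In both cases all modalities enter at the same level, which is the limitation the hierarchical strategy is meant to address.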
To study how other emotion-related tasks of spoken dialogue can benefit from the
proposed approaches, we apply the DIS-NV features and the HL fusion strategy to
recognize movie-induced emotions. Our experiments show that although designed
for recognizing emotions in spoken dialogue, DIS-NV features and HL fusion
remain effective for recognizing movie-induced emotions. This suggests that other
emotion-related tasks can also benefit from the proposed features and model structure.
Techniques for Decentralized and Dynamic Resource Allocation
This thesis investigates three different resource allocation problems, aiming to achieve two common goals: i) adaptivity to a fast-changing environment, ii) distribution of the computation tasks to achieve a favorable solution. The motivation for this work is the modern-era proliferation of sensors and devices in the Data Acquisition Systems (DAS) layer of the Internet of Things (IoT) architecture. To avoid congestion and enable low-latency services, limits have to be imposed on the number of decisions that can be centralized (i.e. solved in the "cloud") and/or the amount of control information that devices can exchange. This has been the motivation to develop i) a lightweight PHY Layer protocol for time synchronization and scheduling in Wireless Sensor Networks (WSNs), ii) an adaptive receiver that enables Sub-Nyquist sampling, for efficient spectrum sensing at high frequencies, and iii) an SDN-scheme for resource-sharing across different technologies and operators, to harmoniously and holistically respond to fluctuations in demands at the eNodeB's layer.
The proposed solution for time synchronization and scheduling is a new protocol, called PulseSS, which is completely event-driven and is inspired by biological networks. The results on convergence and accuracy for locally connected networks, presented in this thesis, constitute the theoretical foundation for the protocol in terms of performance guarantee. The derived limits provided guidelines for ad-hoc solutions in the actual implementation of the protocol.
The proposed receiver for Compressive Spectrum Sensing (CSS) aims at tackling the noise folding phenomenon, i.e., the accumulation of noise from different sub-bands that are folded, prior to sampling and baseband processing, when an analog front-end aliasing mixer is utilized.
The sensing phase design has been conducted via a utility maximization approach, so the derived scheme has been called Cognitive Utility Maximization Multiple Access (CUMMA).
The framework described in the last part of the thesis is inspired by stochastic network optimization tools and dynamics.
While convergence of the proposed approach remains an open problem, the numerical results presented here suggest that the algorithm can handle traffic fluctuations across operators while respecting different time and economic constraints.
The scheme has been named Decomposition of Infrastructure-based Dynamic Resource Allocation (DIDRA). Dissertation/Thesis, Doctoral Dissertation, Electrical Engineering, 201
2019 EURēCA Abstract Book
Listing of student participant abstracts
National Stereotyping, Identity Politics, European Crises (Volume 27)
The articulation of collective identity by means of a stereotyped repertoire of exclusionary characterizations of Self and Other is one of the longest-standing literary traditions in Europe and as such has become part of a global modernity. Recently, this discourse of Othering and national stereotyping has gained fresh political virulence as a result of the rise of “Identity Politics”. What is more, this newly politicized self/other discourse has affected Europe itself as that continent has been weathering a series of economic and political crises in recent years. The present volume traces the conjunction between cultural and literary traditions and contemporary ideologies during the crisis of European multilateralism. Contributors: Aelita Ambrulevičiūtė, Jürgen Barkhoff, Stefan Berger, Zrinka Blažević, Daniel Carey, Ana María Fraile, Wulf Kansteiner, Joep Leerssen, Hercules Millas, Zenonas Norkus, Aidan O’Malley, Raúl Sánchez Prieto, Karel Šima, Luc Van Doorslaer, Ruth Woda
7th International Conference on Higher Education Advances (HEAd'21)
Information and communication technologies together with new teaching paradigms are reshaping the learning environment. The International Conference on Higher Education Advances (HEAd) aims to become a forum for researchers and practitioners to exchange ideas, experiences, opinions and research results relating to the preparation of students and the organization of educational systems. Doménech I De Soria, J.; Merello Giménez, P.; Poza Plaza, EDL. (2021). 7th International Conference on Higher Education Advances (HEAd'21). Editorial Universitat Politècnica de València. https://doi.org/10.4995/HEAD21.2021.13621 EDITORIA
The Language of Paul Muldoon
This book interprets the multifarious writing of the Irish-American word wizard, Paul Muldoon, who has been described by The Times Literary Supplement as ‘the most significant English-language poet born since the second World War’. Readership: All interested in poetry and writing from Ireland and the English-speaking world, and in the enigma of language.
Library websites popularity: does Facebook really matter?
The purpose of this paper is to determine whether the utilization of social media (Facebook) is an important factor in increasing the visibility and usage of library websites in Malaysian public universities. Nine top-ranked Malaysian public universities were involved in this research, and the number of Facebook followers for each library website is listed. Alexa software was used to study the issue of visibility. Alexa is able to determine website usage by showing the percentage of visitors to library-related subdomain(s) as listed in the top subdomains for each university website (domain) over a month. It is found that the Universiti Utara Malaysia library website scored the highest percentage of visitors based on the library-related subdomain(s) listed in the top subdomains for the university website in Alexa. To check irregularities in access, this paper uses EvalAccess 2.0, and it is found that Universiti Sains Malaysia’s library website scored higher irregularities. In terms of the number of Facebook followers, the University of Malaya library has the highest score. It is shown that the utilization of social media (Facebook) is not yet an important factor in increasing the visibility of the library websites. However, as expected, top-ranked universities’ library websites are more visible and popular. This research is limited to the situation in Malaysia, where public universities are more noticeable and, unlike private universities, seldom face financial constraints. It is highly important for those universities whose library websites are not highly visible to initiate the necessary measures to improve the development of their websites, as the usage of a website is an indicator of online quality.
Safety and Reliability - Safe Societies in a Changing World
The contributions cover a wide range of methodologies and application areas for safety and reliability that contribute to safe societies in a changing world. These methodologies and applications include:
- foundations of risk and reliability assessment and management
- mathematical methods in reliability and safety
- risk assessment
- risk management
- system reliability
- uncertainty analysis
- digitalization and big data
- prognostics and system health management
- occupational safety
- accident and incident modeling
- maintenance modeling and applications
- simulation for safety and reliability analysis
- dynamic risk and barrier management
- organizational factors and safety culture
- human factors and human reliability
- resilience engineering
- structural reliability
- natural hazards
- security
- economic analysis in risk management