
    Planning Based System for Child-Robot Interaction in Dynamic Play Environments

    This paper describes the initial steps towards the design of a robotic system that performs actions autonomously in a naturalistic play environment while supporting social human-robot interaction (HRI) with children. We draw on existing theories of child development and on dimensional models of emotion to explore the design of a dynamic interaction framework for natural child-robot interaction. In this dynamic setting, social HRI is defined by the system's ability to take the socio-emotional state of the user into account and to plan accordingly by selecting suitable strategies for execution. The robot therefore needs a temporal planning system that combines task-oriented actions with principles of social human-robot interaction. We present initial results of an empirical study evaluating the proposed framework in the context of a collaborative sorting game.

    Towards Speech Emotion Recognition "in the wild" using Aggregated Corpora and Deep Multi-Task Learning

    One of the challenges in Speech Emotion Recognition (SER) "in the wild" is the large mismatch between training and test data (e.g. speakers and tasks). To improve the generalisation capabilities of the emotion models, we propose to use Multi-Task Learning (MTL) with gender and naturalness as auxiliary tasks in deep neural networks. This method was evaluated in within-corpus and various cross-corpus classification experiments that simulate conditions "in the wild". In comparison to state-of-the-art methods based on Single-Task Learning (STL), we found that our proposed MTL method improved performance significantly. In particular, models using both gender and naturalness achieved larger gains than those using either gender or naturalness alone. This benefit was also visible in the high-level representations of the feature space obtained with our proposed method, where discriminative emotional clusters could be observed. Comment: Published in the proceedings of INTERSPEECH, Stockholm, September, 201
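
    The idea of training a primary emotion task together with auxiliary gender and naturalness tasks can be sketched as a weighted sum of per-task losses. This is only an illustrative sketch: the function names, the use of cross-entropy for every task, and the `aux_weight` hyperparameter are assumptions made here, not details taken from the paper.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -math.log(probs[target_index])


def multitask_loss(emotion, gender, naturalness, aux_weight=0.5):
    """Combine the primary emotion loss with two auxiliary task losses.

    Each argument is a (predicted_probabilities, true_class_index) pair.
    aux_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    primary = cross_entropy(*emotion)
    auxiliary = cross_entropy(*gender) + cross_entropy(*naturalness)
    return primary + aux_weight * auxiliary
```

    In a shared network, minimising such a combined loss lets the auxiliary tasks shape the shared layers; setting `aux_weight` to 0 recovers plain single-task training.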

    Learning spectro-temporal features with 3D CNNs for speech emotion recognition

    In this paper, we propose to use deep 3-dimensional convolutional neural networks (3D CNNs) to address the challenge of modelling spectro-temporal dynamics for speech emotion recognition (SER). Compared to a hybrid of a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM), our proposed 3D CNNs simultaneously extract short-term and long-term spectral features with a moderate number of parameters. We evaluated our proposed method and other state-of-the-art methods in a speaker-independent manner using aggregated corpora that provide a large and diverse set of speakers. We found that 1) shallow temporal and moderately deep spectral kernels of a homogeneous architecture are optimal for the task; and 2) our 3D CNNs are more effective for spectro-temporal feature learning than the other methods. Finally, we visualised the feature space obtained with our proposed method using t-distributed stochastic neighbour embedding (t-SNE) and observed distinct clusters of emotions. Comment: ACII, 2017, San Antoni

    Dirichlet process approach for radio-based simultaneous localization and mapping

    Because 5G millimeter wave (mmWave) signals make spatial channel parameters highly resolvable, accurate vehicle localization and mapping become possible. We propose a novel method of radio simultaneous localization and mapping (SLAM) based on the Dirichlet process (DP). The DP, which estimates the number of clusters as well as the cluster assignments, can identify the locations of reflectors by classifying 5G signals that are reflected by various objects before being received. We generate birth points from the 5G mmWave measurements received by the vehicle and classify objects by clustering the birth points generated over time. With DP clustering, landmarks in the environment can be mapped even in challenging situations where false alarms in the measurements change the cardinality of the received signals. Simulation results demonstrate the performance of the proposed scheme. Compared with SLAM based on the Rao-Blackwellized probability hypothesis density filter, the proposed method shows a slight drop in SLAM performance but a significant reduction in computational complexity.
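
    To illustrate how a clustering method can decide the number of clusters on its own, the sketch below uses DP-means, a simple hard-assignment relative of Dirichlet process mixture models. It is not the inference procedure of the paper: the function name, the 2D points standing in for birth points, and the distance threshold `lam` are all assumptions made for this example.

```python
import math


def dp_means(points, lam, max_iter=100):
    """Cluster points, opening a new cluster whenever a point lies farther
    than lam (a hypothetical distance threshold) from every existing centre."""
    centres = [points[0]]
    assignment = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [math.dist(p, c) for c in centres]
            k = min(range(len(centres)), key=dists.__getitem__)
            if dists[k] > lam:
                # No existing centre is close enough: start a new cluster here.
                centres.append(p)
                k = len(centres) - 1
            if assignment[i] != k:
                assignment[i] = k
                changed = True
        # Move each centre to the mean of the points assigned to it.
        for k in range(len(centres)):
            members = [points[i] for i, a in enumerate(assignment) if a == k]
            if members:
                centres[k] = tuple(sum(c) / len(members) for c in zip(*members))
        if not changed:
            break
    return centres, assignment
```

    The number of clusters is never passed in; it follows from the data and from `lam`, which loosely plays the role that the concentration parameter plays in a full DP model.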

    Robot response behaviors to accommodate hearing problems

    One requirement for a social (semi-autonomous telepresence) robot aimed at conversations with the elderly is to accommodate hearing problems. In this paper we compare two approaches to this requirement: (1) moving closer, mimicking the leaning behavior commonly observed in elderly people with hearing problems, and (2) turning up the volume, which is a more mechanical solution. Our findings with elderly participants show that they preferred turning up the volume, rating it significantly higher.

    Targeting histone deacetylases to modulate graft-versus-host disease and graft-versus-leukemia

    Allogeneic hematopoietic stem cell transplantation (allo-HSCT) is the main therapeutic strategy for patients with both malignant and nonmalignant disorders. The therapeutic benefits of allo-HSCT in malignant disorders are primarily derived from the graft-versus-leukemia (GvL) effect, in which T cells in the donor graft recognize and eradicate residual malignant cells. However, the same donor T cells can also recognize normal host tissues as foreign, leading to the development of graft-versus-host disease (GvHD), which is difficult to separate from GvL and is the most frequent and serious complication following allo-HSCT. Inhibition of donor T cell toxicity helps to reduce GvHD but also restricts GvL activity. Therefore, developing a novel therapeutic strategy that selectively suppresses GvHD without affecting GvL is essential. Recent studies have shown that inhibition of histone deacetylases (HDACs) not only inhibits the growth of tumor cells but also regulates the cytotoxic activity of T cells. Here, we compile the known therapeutic potential of HDAC inhibitors in preventing several stages of GvHD pathogenesis. Furthermore, we review the current clinical features of HDAC inhibitors in preventing and treating GvHD as well as maintaining GvL.

    Cooperative mmWave PHD-SLAM with Moving Scatterers

    Millimeter wave (mmWave) radio simultaneous localization and mapping (SLAM) with the multiple-model (MM) probability hypothesis density (PHD) filter is susceptible, in vehicular scenarios, to the movement of objects, in particular vehicles driving in parallel with the ego vehicle. We propose and evaluate two countermeasures to track vehicle scatterers (VSs) in mmWave radio MM-PHD-SLAM. First, locally at each vehicle, we generate and treat the VS map PHD in the context of Bayesian recursion, and modify the vehicle state correction with the VS map PHD. Second, in the global map fusion process at the base station, we average the VS map PHD and upload it with the self-vehicle posterior density, compute fusion weights, and prune targets with low Gaussian weights in the context of arithmetic average-based map fusion. Simulation results show that the proposed cooperative mmWave radio MM-PHD-SLAM filter outperforms the previous filter in VS scenarios.
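
    Arithmetic average fusion of Gaussian-mixture PHDs can be sketched in a few lines: the fused intensity is a weighted sum of the local intensities, so each Gaussian component keeps its mean and covariance while its weight is scaled by the fusion weight of the map it came from, and low-weight components are then pruned. The function name, the tuple layout, and the pruning threshold below are assumptions for illustration; the paper's own computation of fusion weights and its handling of the VS map are not reproduced here.

```python
def aa_fuse(local_maps, fusion_weights, prune_threshold=1e-3):
    """Fuse Gaussian-mixture PHDs by arithmetic averaging.

    local_maps: one mixture per vehicle, each a list of (weight, mean, cov).
    fusion_weights: one non-negative weight per vehicle, summing to 1.
    prune_threshold: hypothetical cut-off for discarding weak components.
    """
    if abs(sum(fusion_weights) - 1.0) > 1e-9:
        raise ValueError("fusion weights must sum to 1")
    fused = []
    for omega, mixture in zip(fusion_weights, local_maps):
        for weight, mean, cov in mixture:
            scaled = omega * weight
            # Keep only components that still carry meaningful weight.
            if scaled >= prune_threshold:
                fused.append((scaled, mean, cov))
    return fused
```

    A practical implementation would usually also merge components that lie close together after fusion; that step is left out to keep the sketch short.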

    Automatic analysis of children’s engagement using interactional network features

    We explored the automatic analysis of vocal non-verbal cues of a group of children in the context of engagement and collaborative play. For the current study, we defined two types of engagement for groups of children: harmonised and unharmonised. We collected a spontaneous audiovisual corpus of groups of children collaboratively building a 3D puzzle. With this corpus, we modelled the interactions among children using network-based features representing the centrality and similarity of interactions. Centrality measures how strongly interactions among group members are concentrated on a specific speaker, while similarity measures how similar the interactions are. We examined the discriminative characteristics of these features in harmonised and unharmonised engagement situations. High centrality and low similarity values were found in unharmonised engagement situations, whereas low centrality and high similarity values were found in harmonised engagement situations. These results suggest that interactional network features are promising for the development of automatic detection of engagement at the group level.
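
    One way to make the two feature types concrete is to compute them from a matrix of interaction counts, where entry `counts[i][j]` is the number of interactions directed from child `i` to child `j`. The measures below are hypothetical stand-ins chosen for this sketch (share-based concentration and mean pairwise cosine similarity); the abstract does not specify the exact definitions used in the study.

```python
import math
from itertools import combinations


def concentration(counts):
    """Return 0.0 when all children initiate equally often and 1.0 when a
    single child initiates every interaction."""
    n = len(counts)
    total = sum(map(sum, counts))
    if n < 2 or total == 0:
        return 0.0
    top_share = max(sum(row) for row in counts) / total
    return (top_share - 1 / n) / (1 - 1 / n)


def cosine(u, v):
    """Cosine similarity of two count vectors; 0.0 if either is all zeros."""
    norm = math.hypot(*u) * math.hypot(*v)
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def mean_similarity(counts):
    """Average cosine similarity over all pairs of children's outgoing rows."""
    pairs = list(combinations(counts, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)
```

    For a group in which every child addresses every other child equally often, `concentration` is zero; for a group in which only one child speaks, it reaches one and the pairwise similarity collapses to zero.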