
    Toward a social signaling framework : activity and emphasis in speech

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 67-70). By William T. Stoltzman.
    Language is not the only form of verbal communication. Loudness, pitch, speaking rate, and other non-linguistic speech features are crucial aspects of human spoken interaction. In this thesis, we separate these speech features into two categories -- vocal Activity and vocal Emphasis -- and propose a framework for classifying high-level social behavior according to those metrics. We present experiments showing that non-linguistic speech analysis alone can account for appreciable portions of social phenomena. We report statistically significant results in measuring the persuasiveness of pitches, the effectiveness of customer service representatives, and the severity of depression. Effect sizes of these studies explain up to 60% of the sample variances and yield binary decision accuracies nearing 90%.

    Proceedings of Abstracts Engineering and Computer Science Research Conference 2019

    © 2019 The Author(s). This is an open-access work distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. For further details please see https://creativecommons.org/licenses/by/4.0/. Note: Keynote: Fluorescence visualisation to evaluate effectiveness of personal protective equipment for infection control is © 2019 Crown copyright and so is licensed under the Open Government Licence v3.0. Under this licence users are permitted to copy, publish, distribute and transmit the Information; adapt the Information; and exploit the Information commercially and non-commercially, for example by combining it with other Information, or by including it in your own product or application. Where you do any of the above you must acknowledge the source of the Information in your product or application by including or linking to any attribution statement specified by the Information Provider(s) and, where possible, provide a link to this licence: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    This book is the record of abstracts submitted and accepted for presentation at the Inaugural Engineering and Computer Science Research Conference, held 17th April 2019 at the University of Hertfordshire, Hatfield, UK. This conference is a local event that aims to bring together research students, staff and eminent external guests to celebrate Engineering and Computer Science Research at the University of Hertfordshire. The ECS Research Conference aims to showcase the broad landscape of research taking place in the School of Engineering and Computer Science. The 2019 conference was articulated around three topical cross-disciplinary themes: Make and Preserve the Future; Connect the People and Cities; and Protect and Care.

    Multimodal analysis of depression in unconstrained environments

    Mental health disorders, such as depression and anxiety, are a significant global problem affecting millions of people, leading to disability, increased mortality from suicide, and reduced quality of life. Traditional diagnostic and evaluation methods rely on subjective approaches and are limited by resource availability, driving the need for more accessible and efficient methods using technology. Digital mental health, a rapidly growing field, merges digital technologies into mental health care, utilizing the Internet and mobile phone software to deliver mental health services. The use of mobile health technologies, such as Ecological Momentary Assessments and digital phenotyping, can improve depression diagnostics by generating objectively measurable markers in natural environments. Technological progress in computer vision, natural language processing, and affective computing has also led to the emergence of automated behavior analysis methods, improving depression assessment and understanding. This thesis addresses the problem of mood assessment and analysis for detecting depression from multimodal data in unconstrained, natural environments. It presents a novel, multi-modal dataset collected from a purpose-built smartphone app for depression recognition in real-world, unconstrained environments and proposes a state-of-the-art, automated depression recognition system leveraging advancements in multimodal analysis. The research outcomes have the potential to be applied in automated patient monitoring or therapy administering platforms.
The thesis contributes by: 1) collecting a novel, longitudinal, and multi-modal Mood-Seasons dataset in real-world settings, 2) benchmarking state-of-the-art video analysis techniques on newly collected and publicly available datasets, 3) building a multimodal spatio-temporal transformer model for automated depression severity prediction, 4) presenting a new framework for face generation that learns to synthesize novel face images that adhere to a given pose and appearance from an exemplar image in a semantically meaningful way, and 5) applying the face manipulation method for anonymizing the Mood-Seasons dataset for privacy preservation. In conclusion, this thesis addresses the limitations of current depression diagnostics and assessments by integrating smartphone-driven digital phenotyping technologies to advance and personalize depression care. By collecting a novel dataset, proposing state-of-the-art methods for depression recognition, and addressing privacy concerns, this work has the potential to significantly improve mental health care delivery and accessibility.

    On the Impact of Voice Anonymization on Speech-Based COVID-19 Detection

    With advances seen in deep learning, voice-based applications are burgeoning, ranging from personal assistants and affective computing to remote disease diagnostics. As the voice contains both linguistic and paralinguistic information (e.g., vocal pitch, intonation, speech rate, loudness), there is growing interest in voice anonymization to preserve speaker privacy and identity. Voice privacy challenges have emerged over the last few years and focus has been placed on removing speaker identity while keeping linguistic content intact. For affective computing and disease monitoring applications, however, the paralinguistic content may be more critical. Unfortunately, the effects that anonymization may have on these systems are still largely unknown. In this paper, we fill this gap and focus on one particular health monitoring application: speech-based COVID-19 diagnosis. We test two popular anonymization methods and their impact on five different state-of-the-art COVID-19 diagnostic systems using three public datasets. We validate the effectiveness of the anonymization methods, compare their computational complexity, and quantify the impact across different testing scenarios for both within- and across-dataset conditions. Lastly, we show the benefits of anonymization as a data augmentation tool to help recover some of the COVID-19 diagnostic accuracy loss seen with anonymized data.

    Automatic Framework to Aid Therapists to Diagnose Children who Stutter


    Beyond mobile apps: a survey of technologies for mental well-being

    Mental health problems are on the rise globally and strain national health systems worldwide. Mental disorders are closely associated with fear of stigma, structural barriers such as financial burden, and lack of available services and resources, which often prohibit the delivery of frequent clinical advice and monitoring. Technologies for mental well-being exhibit a range of attractive properties, which facilitate the delivery of state-of-the-art clinical monitoring. This review article provides an overview of traditional techniques followed by their technological alternatives, sensing devices, behaviour changing tools, and feedback interfaces. The challenges presented by these technologies are then discussed, with data collection, privacy, and battery life being some of the key issues which need to be carefully considered for the successful deployment of mental health toolkits. Finally, the opportunities this growing research area presents are discussed, including the use of portable tangible interfaces combining sensing and feedback technologies. Capitalising on the data these ubiquitous devices can record, state-of-the-art machine learning algorithms can lead to the development of robust clinical decision support tools towards diagnosis and improvement of mental well-being delivery in real time.

    An Ordinal Approach to Affective Computing

    Both depression prediction and emotion recognition systems are often based on ordinal ground truth due to subjectively annotated datasets. Yet, both have so far been posed as classification or regression problems. These naive approaches have fundamental issues because they are not focused on ordering, unlike ordinal regression, which is the most appropriate for truly ordinal ground truth. Ordinal regression to date offers fewer and more limited methods than other branches of machine learning, and its usage has been limited to specific research domains. Accordingly, this thesis presents investigations into ordinal approaches for affective computing by describing a consistent framework to understand all ordinal system designs, proposing ordinal systems for large datasets, and introducing tools and principles to select suitable system designs and evaluation methods. First, three learning approaches are compared using the support vector framework to establish the empirical advantages of ordinal regression, a comparison that is lacking from the current literature. Results on depression and emotion corpora indicate that ordinal regression with proper tuning can improve existing depression and emotion systems. Ordinal logistic regression (OLR), which is an extension of logistic regression for ordinal scales, admits a number of model structures, from which the best structure must be chosen. Exploiting the newly proposed computationally efficient greedy algorithm for model structure selection (GREP), OLR outperformed or was comparable with state-of-the-art depression systems on two benchmark depression speech datasets. Deep learning has dominated many affective computing fields, and hence ordinal deep learning is an attractive prospect. However, it is under-studied even in the machine learning literature, which motivates an in-depth analysis of appropriate network architectures and loss functions.
One of the significant outcomes of this analysis is the introduction of RankCNet, a novel ordinal network which utilises a surrogate loss function of rank correlation. Not only the modelling algorithm but also the choice of evaluation measure depends on the nature of the ground truth. Rank correlation measures, which are sensitive to ordering, are more apt for ordinal problems than common classification or regression measures that ignore ordering information. Although rank-based evaluation for ordinal problems is not new, so far in affective computing, ordinality of the ground truth has been widely ignored during evaluation. Hence, a systematic analysis in the affective computing context is presented, to provide clarity and encourage careful choice of evaluation measures. Another contribution is a neural network framework with a novel multi-term loss function to assess the ordinality of ordinally-annotated datasets, which can guide the selection of suitable learning and evaluation methods. Experiments on multiple synthetic and affective speech datasets reveal that the proposed system can offer reliable and meaningful predictions about the ordinality of a given dataset. Overall, the novel contributions and findings presented in this thesis not only improve prediction accuracy but also encourage future research towards ordinal affective computing: a different paradigm, but often the most appropriate.
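The abstract's point that rank correlation measures respond to ordering while common regression measures do not can be illustrated with a small sketch. This is not code from the thesis: the function names, the toy labels, and the choice of Spearman's rank correlation and mean squared error as the two measures being contrasted are all assumptions made for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_squared_error(truth, pred):
    """A common regression measure that ignores ordering."""
    return mean((t - p) ** 2 for t, p in zip(truth, pred))


# Hypothetical ordinal labels and two hypothetical sets of predictions.
labels = [0, 1, 2, 3, 4]
order_preserving = [10, 20, 30, 40, 50]  # wrong scale, correct order
near_but_shuffled = [1, 0, 3, 2, 4]      # close values, two pairs swapped

for name, pred in [("order_preserving", order_preserving),
                   ("near_but_shuffled", near_but_shuffled)]:
    print(name, spearman(labels, pred), mean_squared_error(labels, pred))
```

In this toy case the two measures disagree about which prediction is better: the order-preserving prediction has a far larger squared error but a perfect rank correlation, while the shuffled prediction has a small squared error and a lower rank correlation.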

    ATHENA Research Book

    The ATHENA European University is an alliance of nine Higher Education Institutions with the mission of fostering excellence in research and innovation by facilitating international cooperation. The ATHENA acronym stands for Advanced Technologies in Higher Education Alliance. The founding partner institutions are from France, Germany, Greece, Italy, Lithuania, Portugal, and Slovenia: the University of Orléans, the University of Siegen, the Hellenic Mediterranean University, the Niccolò Cusano University, the Vilnius Gediminas Technical University, the Polytechnic Institute of Porto, and the University of Maribor. In 2022, institutions from Poland and Spain joined the alliance: the Maria Curie-Skłodowska University and the University of Vigo. This research book presents a selection of the ATHENA university partners' research activities. It incorporates peer-reviewed original articles, reprints and student contributions. The ATHENA Research Book provides a platform that promotes joint and interdisciplinary research projects of both advanced and early-career researchers.