81,389 research outputs found

    Prerequisites for Affective Signal Processing (ASP)

    Get PDF
    Although emotions are embraced by science, their recognition has not reached a satisfying level. Through a concise overview of affect, its signals, features, and classification methods, we provide understanding for the problems encountered. Next, we identify the prerequisites for successful Affective Signal Processing: validation (e.g., mapping of constructs on signals), triangulation, a physiology-driven approach, and contributions of the signal processing community. Using these directives, a critical analysis of a real-world case is provided. This illustrates that the prerequisites can become a valuable guide for Affective Signal Processing (ASP)

    Learning to Resolve Natural Language Ambiguities: A Unified Approach

    Full text link
    We analyze a few of the commonly used statistics based and machine learning algorithms for natural language disambiguation tasks and observe that they can be re-cast as learning linear separators in the feature space. Each of the methods makes a priori assumptions, which it employs, given the data, when searching for its hypothesis. Nevertheless, as we show, it searches a space that is as rich as the space of all linear separators. We use this to build an argument for a data driven approach which merely searches for a good linear separator in the feature space, without further assumptions on the domain or a specific problem. We present such an approach - a sparse network of linear separators, utilizing the Winnow learning algorithm - and show how to use it in a variety of ambiguity resolution problems. The learning approach presented is attribute-efficient and, therefore, appropriate for domains having very large number of attributes. In particular, we present an extensive experimental comparison of our approach with other methods on several well studied lexical disambiguation tasks such as context-sensitive spelling correction, prepositional phrase attachment and part of speech tagging. In all cases we show that our approach either outperforms other methods tried for these tasks or performs comparably to the best

    Simulating dysarthric speech for training data augmentation in clinical speech applications

    Full text link
    Training machine learning algorithms for speech applications requires large, labeled training data sets. This is problematic for clinical applications where obtaining such data is prohibitively expensive because of privacy concerns or lack of access. As a result, clinical speech applications are typically developed using small data sets with only tens of speakers. In this paper, we propose a method for simulating training data for clinical applications by transforming healthy speech to dysarthric speech using adversarial training. We evaluate the efficacy of our approach using both objective and subjective criteria. We present the transformed samples to five experienced speech-language pathologists (SLPs) and ask them to identify the samples as healthy or dysarthric. The results reveal that the SLPs identify the transformed speech as dysarthric 65% of the time. In a pilot classification experiment, we show that by using the simulated speech samples to balance an existing dataset, the classification accuracy improves by about 10% after data augmentation.Comment: Will appear in Proc. of ICASSP 201

    I hear you eat and speak: automatic recognition of eating condition and food type, use-cases, and impact on ASR performance

    Get PDF
    We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i. e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6 k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start with demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i. e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i. e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient

    Anticipatory Mobile Computing: A Survey of the State of the Art and Research Challenges

    Get PDF
    Today's mobile phones are far from mere communication devices they were ten years ago. Equipped with sophisticated sensors and advanced computing hardware, phones can be used to infer users' location, activity, social setting and more. As devices become increasingly intelligent, their capabilities evolve beyond inferring context to predicting it, and then reasoning and acting upon the predicted context. This article provides an overview of the current state of the art in mobile sensing and context prediction paving the way for full-fledged anticipatory mobile computing. We present a survey of phenomena that mobile phones can infer and predict, and offer a description of machine learning techniques used for such predictions. We then discuss proactive decision making and decision delivery via the user-device feedback loop. Finally, we discuss the challenges and opportunities of anticipatory mobile computing.Comment: 29 pages, 5 figure

    Use of nonintrusive sensor-based information and communication technology for real-world evidence for clinical trials in dementia

    Get PDF
    Cognitive function is an important end point of treatments in dementia clinical trials. Measuring cognitive function by standardized tests, however, is biased toward highly constrained environments (such as hospitals) in selected samples. Patient-powered real-world evidence using information and communication technology devices, including environmental and wearable sensors, may help to overcome these limitations. This position paper describes current and novel information and communication technology devices and algorithms to monitor behavior and function in people with prodromal and manifest stages of dementia continuously, and discusses clinical, technological, ethical, regulatory, and user-centered requirements for collecting real-world evidence in future randomized controlled trials. Challenges of data safety, quality, and privacy and regulatory requirements need to be addressed by future smart sensor technologies. When these requirements are satisfied, these technologies will provide access to truly user relevant outcomes and broader cohorts of participants than currently sampled in clinical trials

    Prerequisites for Affective Signal Processing (ASP) - Part V: A response to comments and suggestions

    Get PDF
    In four papers, a set of eleven prerequisites for affective signal processing (ASP) were identified (van den Broek et al., 2010): validation, triangulation, a physiology-driven approach, contributions of the signal processing community, identification of users, theoretical specification, integration of biosignals, physical characteristics, historical perspective, temporal construction, and real-world baselines. Additionally, a review (in two parts) of affective computing was provided. Initiated by the reactions on these four papers, we now present: i) an extension of the review, ii) a post-hoc analysis based on the eleven prerequisites of Picard et al.(2001), and iii) a more detailed discussion and illustrations of temporal aspects with ASP

    Structural Material Property Tailoring Using Deep Neural Networks

    Full text link
    Advances in robotics, artificial intelligence, and machine learning are ushering in a new age of automation, as machines match or outperform human performance. Machine intelligence can enable businesses to improve performance by reducing errors, improving sensitivity, quality and speed, and in some cases achieving outcomes that go beyond current resource capabilities. Relevant applications include new product architecture design, rapid material characterization, and life-cycle management tied with a digital strategy that will enable efficient development of products from cradle to grave. In addition, there are also challenges to overcome that must be addressed through a major, sustained research effort that is based solidly on both inferential and computational principles applied to design tailoring of functionally optimized structures. Current applications of structural materials in the aerospace industry demand the highest quality control of material microstructure, especially for advanced rotational turbomachinery in aircraft engines in order to have the best tailored material property. In this paper, deep convolutional neural networks were developed to accurately predict processing-structure-property relations from materials microstructures images, surpassing current best practices and modeling efforts. The models automatically learn critical features, without the need for manual specification and/or subjective and expensive image analysis. Further, in combination with generative deep learning models, a framework is proposed to enable rapid material design space exploration and property identification and optimization. The implementation must take account of real-time decision cycles and the trade-offs between speed and accuracy
    corecore