81,414 research outputs found
Prerequisites for Affective Signal Processing (ASP)
Although emotions are embraced by science, their recognition has not reached a satisfying level. Through a concise overview of affect, its signals, features, and classification methods, we provide understanding for the problems encountered. Next, we identify the prerequisites for successful Affective Signal Processing: validation (e.g., mapping of constructs on signals), triangulation, a physiology-driven approach, and contributions of the signal processing community. Using these directives, a critical analysis of a real-world case is provided. This illustrates that the prerequisites can become a valuable guide for Affective Signal Processing (ASP)
Learning to Resolve Natural Language Ambiguities: A Unified Approach
We analyze a few of the commonly used statistics based and machine learning
algorithms for natural language disambiguation tasks and observe that they can
be re-cast as learning linear separators in the feature space. Each of the
methods makes a priori assumptions, which it employs, given the data, when
searching for its hypothesis. Nevertheless, as we show, it searches a space
that is as rich as the space of all linear separators. We use this to build an
argument for a data driven approach which merely searches for a good linear
separator in the feature space, without further assumptions on the domain or a
specific problem.
We present such an approach - a sparse network of linear separators,
utilizing the Winnow learning algorithm - and show how to use it in a variety
of ambiguity resolution problems. The learning approach presented is
attribute-efficient and, therefore, appropriate for domains having very large
number of attributes.
In particular, we present an extensive experimental comparison of our
approach with other methods on several well studied lexical disambiguation
tasks such as context-sensitive spelling correction, prepositional phrase
attachment and part of speech tagging. In all cases we show that our approach
either outperforms other methods tried for these tasks or performs comparably
to the best
Simulating dysarthric speech for training data augmentation in clinical speech applications
Training machine learning algorithms for speech applications requires large,
labeled training data sets. This is problematic for clinical applications where
obtaining such data is prohibitively expensive because of privacy concerns or
lack of access. As a result, clinical speech applications are typically
developed using small data sets with only tens of speakers. In this paper, we
propose a method for simulating training data for clinical applications by
transforming healthy speech to dysarthric speech using adversarial training. We
evaluate the efficacy of our approach using both objective and subjective
criteria. We present the transformed samples to five experienced
speech-language pathologists (SLPs) and ask them to identify the samples as
healthy or dysarthric. The results reveal that the SLPs identify the
transformed speech as dysarthric 65% of the time. In a pilot classification
experiment, we show that by using the simulated speech samples to balance an
existing dataset, the classification accuracy improves by about 10% after data
augmentation.Comment: Will appear in Proc. of ICASSP 201
I hear you eat and speak: automatic recognition of eating condition and food type, use-cases, and impact on ASR performance
We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i. e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6 k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start with demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i. e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i. e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient
Anticipatory Mobile Computing: A Survey of the State of the Art and Research Challenges
Today's mobile phones are far from mere communication devices they were ten
years ago. Equipped with sophisticated sensors and advanced computing hardware,
phones can be used to infer users' location, activity, social setting and more.
As devices become increasingly intelligent, their capabilities evolve beyond
inferring context to predicting it, and then reasoning and acting upon the
predicted context. This article provides an overview of the current state of
the art in mobile sensing and context prediction paving the way for
full-fledged anticipatory mobile computing. We present a survey of phenomena
that mobile phones can infer and predict, and offer a description of machine
learning techniques used for such predictions. We then discuss proactive
decision making and decision delivery via the user-device feedback loop.
Finally, we discuss the challenges and opportunities of anticipatory mobile
computing.Comment: 29 pages, 5 figure
Use of nonintrusive sensor-based information and communication technology for real-world evidence for clinical trials in dementia
Cognitive function is an important end point of treatments in dementia clinical trials. Measuring cognitive function by standardized tests, however, is biased toward highly constrained environments (such as hospitals) in selected samples. Patient-powered real-world evidence using information and communication technology devices, including environmental and wearable sensors, may help to overcome these limitations. This position paper describes current and novel information and communication technology devices and algorithms to monitor behavior and function in people with prodromal and manifest stages of dementia continuously, and discusses clinical, technological, ethical, regulatory, and user-centered requirements for collecting real-world evidence in future randomized controlled trials. Challenges of data safety, quality, and privacy and regulatory requirements need to be addressed by future smart sensor technologies. When these requirements are satisfied, these technologies will provide access to truly user relevant outcomes and broader cohorts of participants than currently sampled in clinical trials
Prerequisites for Affective Signal Processing (ASP) - Part V: A response to comments and suggestions
In four papers, a set of eleven prerequisites for affective signal processing (ASP) were identified (van den Broek et al., 2010): validation, triangulation, a physiology-driven approach, contributions of the signal processing community, identification of users, theoretical specification, integration of biosignals, physical characteristics, historical perspective, temporal construction, and real-world baselines. Additionally, a review (in two parts) of affective computing was provided. Initiated by the reactions on these four papers, we now present: i) an extension of the review, ii) a post-hoc analysis based on the eleven prerequisites of Picard et al.(2001), and iii) a more detailed discussion and illustrations of temporal aspects with ASP
Structural Material Property Tailoring Using Deep Neural Networks
Advances in robotics, artificial intelligence, and machine learning are
ushering in a new age of automation, as machines match or outperform human
performance. Machine intelligence can enable businesses to improve performance
by reducing errors, improving sensitivity, quality and speed, and in some cases
achieving outcomes that go beyond current resource capabilities. Relevant
applications include new product architecture design, rapid material
characterization, and life-cycle management tied with a digital strategy that
will enable efficient development of products from cradle to grave. In
addition, there are also challenges to overcome that must be addressed through
a major, sustained research effort that is based solidly on both inferential
and computational principles applied to design tailoring of functionally
optimized structures. Current applications of structural materials in the
aerospace industry demand the highest quality control of material
microstructure, especially for advanced rotational turbomachinery in aircraft
engines in order to have the best tailored material property. In this paper,
deep convolutional neural networks were developed to accurately predict
processing-structure-property relations from materials microstructures images,
surpassing current best practices and modeling efforts. The models
automatically learn critical features, without the need for manual
specification and/or subjective and expensive image analysis. Further, in
combination with generative deep learning models, a framework is proposed to
enable rapid material design space exploration and property identification and
optimization. The implementation must take account of real-time decision cycles
and the trade-offs between speed and accuracy
Recommended from our members
The role of HG in the analysis of temporal iteration and interaural correlation
- …