
    Computation of emotions

    When people talk to each other, they express their feelings through facial expressions, tone of voice, body postures and gestures. They even do this when they are interacting with machines. These hidden signals are an important part of human communication, but most computer systems ignore them. Emotions need to be considered as an important mode of communication between people and interactive systems. Affective computing has enjoyed considerable success over the past 20 years, but many challenges remain. This is the author's accepted manuscript; the final version is available from ACM in the ACM International Conference on Multimodal Interaction, published at http://dl.acm.org/citation.cfm?id=2669638

    The ISOGAL field FC--01863+00035: Mid-IR interstellar extinction and stellar populations

    A 0.35° × 0.29° field centered at l = -18.63°, b = 0.35° was observed during the ISOGAL survey by ISOCAM imaging at 7 μm and 15 μm. 648 objects were detected and their brightnesses measured. By combining with the DENIS data in the near-infrared J and K_S bands, the extinction at 7 μm is derived through the relation A_Ks - A_7 = 0.35 (A_J - A_Ks), which yields A_7/A_V ~ 0.03 from the near-IR extinction values of van de Hulst--Glass (Glass 1999). The extinction structure along the line of sight is then determined from the values of J - K_S or K_S - [7] of the ISOGAL sources identified as RGB or early AGB stars with mild mass loss. The distribution of A_V ranges from 0 to ~45 and reflects the concentration of the extinction in the spiral arms. Based on their locations in color-magnitude diagrams and a few cross-identifications with IRAS and MSX sources, the nature of the objects is discussed in comparison with the case of a low-extinction field in Baade's Window. Most of the objects are either AGB stars with moderate mass-loss rates or luminous RGB stars. Some of them may be AGB stars with high mass-loss rates. In addition, a few young stellar objects (YSOs) are present.
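
    A quick way to see how the quoted relation fixes the mid-infrared extinction is to insert near-IR extinction ratios into it; the short Python sketch below does exactly that. The specific A_J/A_V and A_Ks/A_V values are illustrative assumptions in the spirit of the van de Hulst curve tabulated by Glass (1999), not numbers taken from the paper.

        # Rough numerical check of the relation quoted in the abstract:
        # A_Ks - A_7 = 0.35 * (A_J - A_Ks). The near-IR ratios below are
        # illustrative assumptions (roughly a van de Hulst-type curve),
        # not values taken from the paper.

        A_J_OVER_AV = 0.25   # assumed A_J / A_V
        A_KS_OVER_AV = 0.09  # assumed A_Ks / A_V

        def a7_over_av(a_j: float, a_ks: float) -> float:
            """A_7/A_V implied by A_Ks - A_7 = 0.35 * (A_J - A_Ks)."""
            return a_ks - 0.35 * (a_j - a_ks)

        print(f"A_7/A_V ~ {a7_over_av(A_J_OVER_AV, A_KS_OVER_AV):.3f}")  # ~0.03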

    Multiple episodes of star formation in the CN15/16/17 molecular complex

    We have started a campaign to identify massive star clusters inside bright molecular bubbles towards the Galactic Center. The CN15/16/17 molecular complex is the first example of our study. The region is characterized by the presence of two young clusters, DB10 and DB11, visible in the NIR, an ultra-compact HII region identified in the radio, several young stellar objects visible in the MIR, a bright diffuse nebulosity at 8 μm coming from PAHs, and sub-mm continuum emission revealing the presence of cold dust. Given its position on the sky (l=0.58, b=-0.85) and its kinematic distance of ~7.5 kpc, the region was thought to be a very massive site of star formation in the proximity of the CMZ. The cluster DB11 was estimated to be as massive as 10^4 M_sun. However, the region's properties were known only through photometry and its kinematic distance was very uncertain given its location at the tangential point. We aimed at better characterizing the region and assessing whether it could be a site of massive star formation located close to the Galactic Center. We have obtained NTT/SofI JHKs photometry and long-slit K-band spectroscopy of the brightest members. We have additionally collected data in the radio, sub-mm and mid-infrared, resulting in a quite different picture of the region. We have confirmed the presence of massive early B-type stars and have derived a spectro-photometric distance of ~1.2 kpc, much smaller than the kinematic distance. Adopting this distance we obtain cluster masses of M(DB10) ~ 170 M_sun and M(DB11) ~ 275 M_sun. This is consistent with the absence of any O star, confirmed by the excitation/ionization status of the nebula. No HeI diffuse emission is detected in our spectroscopic observations at 2.113 μm, which would be expected if the region were hosting more massive stars. Radio continuum measurements are also consistent with the region hosting at most early B stars. Comment: Accepted for publication in Astronomy and Astrophysics. Figs. 1 and 3 presented in reduced resolution.
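
    For context, the spectro-photometric distance mentioned above follows the standard distance-modulus arithmetic: compare the apparent magnitude of a spectrally classified star with the absolute magnitude expected for its type, correct for extinction, and solve for the distance. The sketch below illustrates this with placeholder numbers; the magnitudes and extinction value are hypothetical, not the measurements used in the paper.

        # Minimal sketch of a spectro-photometric distance
        # (placeholder values, not the paper's measurements).

        def spectrophotometric_distance_pc(m_app: float, M_abs: float, A_band: float) -> float:
            """Distance in parsec from m - M = 5*log10(d / 10 pc) + A_band."""
            mu0 = m_app - M_abs - A_band  # extinction-corrected distance modulus
            return 10.0 ** (mu0 / 5.0 + 1.0)

        # Hypothetical early-B star: apparent Ks = 9.8, assumed M_Ks = -0.5, A_Ks = 0.4
        print(f"{spectrophotometric_distance_pc(9.8, -0.5, 0.4) / 1e3:.2f} kpc")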

    Gravitational dynamics for all tensorial spacetimes carrying predictive, interpretable and quantizable matter

    Only a severely restricted class of tensor fields can provide classical spacetime geometries, namely those that can carry matter field equations that are predictive, interpretable and quantizable. These three conditions on matter translate into three corresponding algebraic conditions on the underlying tensorial geometry, namely to be hyperbolic, time-orientable and energy-distinguishing. Lorentzian metrics, on which general relativity and the standard model of particle physics are built, present just the simplest tensorial spacetime geometry satisfying these conditions. The problem of finding gravitational dynamics for the general tensorial spacetime geometries satisfying the above minimum requirements is reformulated in this paper as a system of linear partial differential equations, in the sense that their solutions yield the actions governing the corresponding spacetime geometry. Thus the search for modified gravitational dynamics is reduced to a clear mathematical task. Comment: 47 pages, no figures, minor update

    Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema

    In this paper, a psychologically-inspired binary cascade classification schema is proposed for speech emotion recognition. Performance is enhanced because commonly confused pairs of emotions are distinguishable from one another. Extracted features are related to statistics of pitch, formants, and energy contours, as well as spectrum, cepstrum, perceptual and temporal features, autocorrelation, MPEG-7 descriptors, Fujisaki's model parameters, voice quality, jitter, and shimmer. Selected features are fed as input to a k-nearest neighbor classifier and to support vector machines. Two kernels are tested for the latter: linear and Gaussian radial basis function. The recently proposed speaker-independent experimental protocol is tested on the Berlin emotional speech database for each gender separately. The best emotion recognition accuracy, achieved by support vector machines with the linear kernel, equals 87.7%, outperforming state-of-the-art approaches. Statistical analysis is first carried out with respect to the classifiers' error rates and then to evaluate the information expressed by the classifiers' confusion matrices. © Springer Science+Business Media, LLC 2011
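
    As a rough, non-authoritative illustration of the classifier stage described above (not the authors' actual pipeline), the sketch below feeds a feature matrix to a k-nearest neighbor classifier and to SVMs with linear and Gaussian RBF kernels using scikit-learn; the feature matrix and labels are random placeholders standing in for the selected acoustic features and one binary node of the cascade.

        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score

        # Placeholder data: one row of selected acoustic features per utterance,
        # with binary labels for a single node of the cascade.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 40))
        y = rng.integers(0, 2, size=200)

        for name, clf in [
            ("k-NN", KNeighborsClassifier(n_neighbors=5)),
            ("SVM (linear)", SVC(kernel="linear", C=1.0)),
            ("SVM (RBF)", SVC(kernel="rbf", C=1.0, gamma="scale")),
        ]:
            acc = cross_val_score(clf, X, y, cv=5).mean()  # 5-fold accuracy
            print(f"{name}: {acc:.3f}")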

    Is speech the new blood? Recent progress in AI-based disease detection from audio in a nutshell

    In recent years, advancements in the field of artificial intelligence (AI) have impacted several areas of research and application. Besides more prominent examples like self-driving cars or media consumption algorithms, AI-based systems have also started to gain more and more popularity in the health care sector, although restrained by high requirements for accuracy, robustness, and explainability. Health-oriented AI research as a sub-field of digital health investigates a plethora of human-centered modalities. In this article, we address recent advances in the so far understudied but highly promising audio domain with a particular focus on speech data and present corresponding state-of-the-art technologies. Moreover, we give an excerpt of recent studies on the automatic audio-based detection of diseases ranging from acute and chronic respiratory diseases via psychiatric disorders to developmental disorders and neurodegenerative disorders. Our selection of presented literature shows that the recent success of deep learning methods in other fields of AI increasingly translates to the field of digital health as well, although expert-designed feature extractors and classical machine learning methodologies are still prominently used. Limiting factors, especially for speech-based disease detection systems, are related to the amount and diversity of available data, e.g., the number of patients and healthy controls as well as the underlying distribution of age, languages, and cultures. Finally, we contextualize and outline application scenarios of speech-based disease detection systems as supportive tools for health-care professionals under ethical consideration of privacy protection and faulty prediction.
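
    As a hedged sketch of the classical, expert-feature pipeline the article contrasts with deep learning, the snippet below extracts hand-crafted MFCC statistics from an audio recording with librosa and would feed them to a conventional classifier; the file paths and labels are assumed to exist and are not part of the article.

        import numpy as np
        import librosa
        from sklearn.linear_model import LogisticRegression

        def mfcc_features(path: str, sr: int = 16000) -> np.ndarray:
            """Mean and standard deviation of MFCCs as a fixed-length utterance vector."""
            audio, sr = librosa.load(path, sr=sr)
            mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
            return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

        # 'paths' and 'labels' (patient vs. healthy control) are assumed to exist:
        # X = np.stack([mfcc_features(p) for p in paths])
        # clf = LogisticRegression(max_iter=1000).fit(X, labels)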

    Dip coating process: Silicon sheet growth development for the large-area silicon sheet task of the low-cost silicon solar array project

    The technical and economic feasibility of producing solar cell quality sheet silicon by dip-coating one surface of carbonized ceramic substrates with a thin layer of large-grain polycrystalline silicon was investigated. The dip-coating methods studied were directed toward a minimum-cost process with the ultimate objective of producing solar cells with a conversion efficiency of 10% or greater. The technique shows excellent promise for low cost, labor savings, and scale-up potential, and would provide an end product of sheet silicon with a rigid and strong supportive backing. An experimental dip-coating facility was designed and constructed, and several substrates were successfully dip-coated with areas as large as 25 sq cm and thicknesses of 12 micron to 250 micron. There appears to be no serious limitation on the area of a substrate that could be coated. Of the various substrate materials dip-coated, mullite appears to best satisfy the requirements of the program. An inexpensive process was developed for producing mullite in the desired geometry.

    The acoustic dissection of cough: diving into machine listening-based COVID-19 analysis and detection

    OBJECTIVES: The coronavirus disease 2019 (COVID-19) has caused a crisis worldwide. Considerable efforts have been made to prevent and control COVID-19's transmission, from early screenings to vaccinations and treatments. Recently, owing to the emergence of many automatic disease recognition applications based on machine listening techniques, it has become fast and cheap to detect COVID-19 from recordings of cough, a key symptom of COVID-19. To date, knowledge of the acoustic characteristics of COVID-19 cough sounds is limited but would be essential for structuring effective and robust machine learning models. The present study aims to explore acoustic features for distinguishing COVID-19 positive individuals from COVID-19 negative ones based on their cough sounds. METHODS: By applying conventional inferential statistics, we analyze the acoustic correlates of COVID-19 cough sounds based on the ComParE feature set, i.e., a standardized set of 6,373 acoustic higher-level features. Furthermore, we train automatic COVID-19 detection models with machine learning methods and explore the latent features by evaluating the contribution of all features to the COVID-19 status predictions. RESULTS: The experimental results demonstrate that a set of acoustic parameters of cough sounds, e.g., statistical functionals of the root mean square energy and Mel-frequency cepstral coefficients, bear essential acoustic information in terms of effect sizes for the differentiation between COVID-19 positive and COVID-19 negative cough samples. Our general automatic COVID-19 detection model performs significantly above chance level, i.e., at an unweighted average recall (UAR) of 0.632, on a data set consisting of 1,411 cough samples (COVID-19 positive/negative: 210/1,201). CONCLUSIONS: Based on the acoustic correlates analysis on the ComParE feature set and the feature analysis in the effective COVID-19 detection approach, we find that several acoustic features that show higher effects in conventional group difference testing are also weighted more highly in the machine learning models.
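
    The reported metric, unweighted average recall (UAR), is the mean of the per-class recalls, which makes it robust to the strong class imbalance of the data set (210 positive vs. 1,201 negative samples). A minimal sketch with invented predictions, not the study's data:

        from sklearn.metrics import recall_score, balanced_accuracy_score

        y_true = [1, 1, 0, 0, 0, 0, 0, 1]   # placeholder COVID-19 labels
        y_pred = [1, 0, 0, 0, 1, 0, 0, 1]   # placeholder model predictions

        uar = recall_score(y_true, y_pred, average="macro")  # mean of per-class recalls
        assert abs(uar - balanced_accuracy_score(y_true, y_pred)) < 1e-12
        print(f"UAR = {uar:.3f}")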

    Vocalisation repertoire at the end of the first year of life: an exploratory comparison of Rett syndrome and typical development

    Rett syndrome (RTT) is a rare, late-detected developmental disorder associated with severe deficits in the speech-language domain. Despite a few reports about atypicalities in the speech-language development of infants and toddlers with RTT, a detailed analysis of the pre-linguistic vocalisation repertoire of infants with RTT is still missing. Based on home video recordings, we analysed the vocalisations between 9 and 11 months of age of three female infants with typical RTT and compared them to three age-matched typically developing (TD) female controls. The video material of the infants had a total duration of 424 min with 1655 infant vocalisations. For each month, we (1) calculated the infants' canonical babbling ratios with CBR(UTTER), i.e., the ratio of the number of utterances containing canonical syllables to the total number of utterances, and (2) classified their pre-linguistic vocalisations into three non-canonical and four canonical vocalisation subtypes. All infants achieved the milestone of canonical babbling at 9 months of age according to their canonical babbling ratios, i.e., CBR(UTTER) ≥ 0.15. We revealed overall lower CBRs(UTTER) and a lower proportion of canonical pre-linguistic vocalisations, consisting of well-formed sounds that could serve as parts of target-language words, for the RTT group compared to the TD group. Further studies with more data from individuals with RTT are needed to study the atypicalities in the pre-linguistic vocalisation repertoire which may portend the later deficits in spoken language that are characteristic features of RTT.
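
    The canonical babbling ratio defined above is straightforward to compute from utterance-level annotations; the sketch below uses invented annotations purely for illustration.

        def cbr_utter(has_canonical_syllable: list[bool]) -> float:
            """Utterances containing a canonical syllable divided by all utterances."""
            return sum(has_canonical_syllable) / len(has_canonical_syllable)

        # Invented per-utterance annotations (True = contains a canonical syllable).
        annotations = [True, False, False, True, False, False, False, True, False, False]
        ratio = cbr_utter(annotations)
        print(f"CBR(UTTER) = {ratio:.2f}, milestone reached: {ratio >= 0.15}")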