322 research outputs found

Improving the robustness of the usual FBE-based ASR front-end

Author: Hernando Pericás Francisco Javier
Macho D
Nadeu Camprubí Climent
Publication venue: Mergablum
Publication date: 01/01/2000

All speech recognition systems require some form of signal representation that parametrically models the temporal evolution of the spectral envelope. Current parameterizations involve, either explicitly or implicitly, a set of energies from frequency bands which are often distributed on a mel scale. The computation of those filterbank energies (FBE) always includes smoothing of basic spectral measurements and non-linear amplitude compression. A variety of linear transformations are typically applied to this time-frequency representation prior to the Hidden Markov Model (HMM) pattern-matching stage of recognition. In this paper, we discuss some robustness issues involved in both the computation of the FBEs and the subsequent linear transformations, presenting alternative techniques that can improve robustness in additive noise conditions. In particular, the root non-linearity, a voicing-dependent FBE computation technique and a time and frequency filtering (tiffing) technique are considered. Recognition results for the Aurora database are shown to illustrate the potential application of these alternative techniques for enhancing the robustness of speech recognition systems.
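The FBE computation described in this abstract (smoothing of spectral measurements over mel-spaced bands, followed by non-linear amplitude compression) can be illustrated with a minimal sketch. This is not the authors' implementation: the triangular filter shape, the mel formula, the function names, and the root exponent `gamma` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math

def hz_to_mel(f):
    # A common mel-scale convention (assumed here, not taken from the paper).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank_energies(power_spectrum, sample_rate, n_bands):
    """Smooth a one-sided power spectrum with triangular mel-spaced filters."""
    n_bins = len(power_spectrum)
    nyquist = sample_rate / 2.0
    # n_bands + 2 band edges, equally spaced on the mel scale from 0 to Nyquist.
    top = hz_to_mel(nyquist)
    hz_edges = [mel_to_hz(top * i / (n_bands + 1)) for i in range(n_bands + 2)]
    energies = []
    for b in range(1, n_bands + 1):
        lo, mid, hi = hz_edges[b - 1], hz_edges[b], hz_edges[b + 1]
        total = 0.0
        for k, p in enumerate(power_spectrum):
            f = nyquist * k / (n_bins - 1)  # centre frequency of bin k
            if lo < f <= mid:
                total += p * (f - lo) / (mid - lo)   # rising slope
            elif mid < f < hi:
                total += p * (hi - f) / (hi - mid)   # falling slope
        energies.append(total)
    return energies

def compress(energies, mode="log", gamma=0.1, floor=1e-10):
    """Non-linear amplitude compression: logarithm or root (e ** gamma)."""
    if mode == "log":
        return [math.log(max(e, floor)) for e in energies]
    if mode == "root":
        return [max(e, 0.0) ** gamma for e in energies]
    raise ValueError("mode must be 'log' or 'root'")

# Hypothetical input: a flat power spectrum with 129 bins at 8 kHz sampling.
fbe = mel_filterbank_energies([1.0] * 129, 8000, 8)
log_fbe = compress(fbe, mode="log")
root_fbe = compress(fbe, mode="root", gamma=0.1)
```

One practical difference between the two compressions is that the root function stays bounded as band energy approaches zero, whereas the logarithm does not, which is why a floor is applied in the log branch.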

UPCommons. Portal del coneixement obert de la UPC

Investigating Voice as a Biomarker for Leucine-Rich Repeat Kinase 2-Associated Parkinson's Disease

Author: AlDakheel Amaal
Arora Siddharth
Connolly Barbara S.
Faust-Socher Achinoam
Gasca-Salas Carmen
Jain Jennifer
Kern Drew S.
Lang Anthony E.
Little Max A.
Marras Connie
Mestre Tiago A.
Slow Elizabeth J.
Tsanas Athanasios
Visanji Naomi P.
Publication venue: 'IOS Press'
Publication date: 01/01/2018

We investigate the potential association between leucine-rich repeat kinase 2 (LRRK2) mutations and voice. Sustained phonations ('aaah' sounds) were recorded from 7 individuals with LRRK2-associated Parkinson's disease (PD), 17 participants with idiopathic PD (iPD), 20 non-manifesting LRRK2-mutation carriers, 25 related non-carriers, and 26 controls. In distinguishing LRRK2-associated PD and iPD, the mean sensitivity was 95.4% (SD 17.8%) and mean specificity was 89.6% (SD 26.5%). Voice features for non-manifesting carriers, related non-carriers, and controls were much less discriminatory. Vocal deficits in LRRK2-associated PD may be different from those in iPD. These preliminary results warrant longitudinal analyses and replication in larger cohorts.
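For readers unfamiliar with the two metrics reported in this abstract, the following is a minimal sketch of how sensitivity and specificity are computed from binary predictions. The labels below are hypothetical and are not data from the study. The choice of which class counts as "positive" is also an assumption, and the study's cross-validation procedure (which yields the reported means and standard deviations) is not reproduced here.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity: fraction of true positives that were detected.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true negatives that were correctly rejected.
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec

# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```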

arXiv.org e-Print Archive

Aston Publications Explorer

Edinburgh Research Explorer

Oxford University Research Archive

Only Words Count; the Rest Is Mere Chattering: A Cross-Disciplinary Approach to the Verbal Expression of Emotional Experience

Author: Cutuli Debora
Fabrizio Carlo
Greco Francesca
Laricchiuta Daniela
Mandolesi Laura
Marini Andrea
Passarello Noemi
Petrosini Laura
Picerni Eleonora
Piras Fabrizio
Spalletta Gianfranco
Termine Andrea
Publication venue: 'MDPI AG'
Publication date: 01/01/2022

The analysis of sequences of words and prosody, meter, and rhythm provided in an interview addressing the capacity to identify and describe emotions represents a powerful tool to reveal emotional processing. The ability to express and identify emotions was analyzed by means of the Toronto Structured Interview for Alexithymia (TSIA), and TSIA transcripts were analyzed by Natural Language Processing to shed light on verbal features. The brain correlates of the capacity to translate emotional experience into words were determined through cortical thickness measures. A machine learning methodology proved that individuals with deficits in identifying and describing emotions (n = 7) produced language distortions, frequently used the present tense of auxiliary verbs, used few possessive determiners, and produced poorly connected speech, in comparison to individuals without deficits (n = 7). Interestingly, they showed high cortical thickness at the left temporal pole and low cortical thickness at the isthmus of the right cingulate cortex. Overall, we identified the neuro-linguistic pattern of the expression of emotional experience.

Archivio istituzionale della ricerca - Università degli Studi di Udine

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

Archivio della ricerca - Università degli studi di Napoli Federico II

Learned versus Hand-Designed Feature Representations for 3d Agglomeration

Author: Bogovic John A.
Huang Gary B.
Jain Viren
Publication date: 20/12/2013

For image recognition and labeling tasks, recent results suggest that machine learning methods that rely on manually specified feature representations may be outperformed by methods that automatically derive feature representations based on the data. Yet for problems that involve analysis of 3d objects, such as mesh segmentation, shape retrieval, or neuron fragment agglomeration, there remains a strong reliance on hand-designed feature descriptors. In this paper, we evaluate a large set of hand-designed 3d feature descriptors alongside features learned from the raw data using both end-to-end and unsupervised learning techniques, in the context of agglomeration of 3d neuron fragments. By combining unsupervised learning techniques with a novel dynamic pooling scheme, we show how pure learning-based methods are for the first time competitive with hand-designed 3d shape descriptors. We investigate data augmentation strategies for dramatically increasing the size of the training set, and show how combining both learned and hand-designed features leads to the highest accuracy.

arXiv.org e-Print Archive

CiteSeerX

Robust correlated and individual component analysis

Author: Nicolaou M
Panagakis Y
Pantic M
Zafeiriou S
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/11/2015

Recovering correlated and individual components of two, possibly temporally misaligned, sets of data is a fundamental task in disciplines such as image, vision, and behavior computing, with application to problems such as multi-modal fusion (via correlated components), predictive analysis, and clustering (via the individual ones). Here, we study the extraction of correlated and individual components under real-world conditions, namely i) the presence of gross non-Gaussian noise and ii) temporally misaligned data. In this light, we propose a method for the Robust Correlated and Individual Component Analysis (RCICA) of two sets of data in the presence of gross, sparse errors. We furthermore extend RCICA in order to handle temporal incongruities arising in the data. To this end, two suitable optimization problems are solved. The generality of the proposed methods is demonstrated by applying them to 4 applications, namely i) heterogeneous face recognition, ii) multi-modal feature fusion for human behavior analysis (i.e., audio-visual prediction of interest and conflict), iii) face clustering, and iv) the temporal alignment of facial expressions. Experimental results on 2 synthetic and 7 real world datasets indicate the robustness and effectiveness of the proposed methods on these application domains, outperforming other state-of-the-art methods in the field.

Spiral - Imperial College Digital Repository

Acoustic signal processing with robust machine learning algorithm for improved monitoring of particulate solid materials in a gas flowline

Author: Andrew Cowell
Bello
Don McGlinchey
Droubi
El-Alej
El-Alej
Guido
Guo
Haugsdal
Hu
Isaacson
Kos
Kuda Tijjani Aminu
Le
Ludeña-Choez
Mackinnon
Mason
McCulloch
McKay
Mirjalili
Mirjalili
Mitrović
Mittal
Odigie
Ooi
Riedmiller
Shannon
Shuiping
Sun
Sun
Thiruvenkatanathan
Toh
Waibel
Wang
Wang
Xie
Yan
Publication venue: 'Elsevier BV'
Publication date: 01/03/2019

Crossref

ResearchOnline@GCU

Palatalization in Romanian — Acoustic properties and perception

Author: Bunnell H. Timothy
Spinu Laura
Vogel Irene
Publication venue: CUNY Academic Works
Publication date: 01/01/2012

This paper presents the results of an acoustic study of fricatives from four places of articulation produced by 31 native speakers of Romanian, as well as those of a perceptual study using the stimuli from the acoustic experiment, allowing for a direct comparison between acoustic properties and perception. It was found that there are greater acoustic differences between plain and palatalized labials and dorsals as compared to coronals. The acoustic results were paralleled by the perceptual findings. This pattern departs from cross-linguistic generalizations made with respect to the properties of secondary palatalization. A likely source of the differences is the fact that previous studies of secondary palatalization typically involved stops which tend to exhibit various enhancement phenomena at the coronal place of articulation. Since the enhancement generally involves additional frication, this is not a useful strategy for fricatives at the coronal, or any other place of articulation. These findings form the basis of a discussion highlighting the differences between enhanced and non-enhanced secondary palatalization.

City University of New York