2,483 research outputs found

Introducing Temporal Asymmetries in Feature Extraction for Automatic Speech Recognition

Author: Hermansky Hynek
Sivaram G. S. V. S.
Publication date: 11/02/2010

We propose a new auditory-inspired feature extraction technique for automatic speech recognition (ASR). Features are extracted by filtering the temporal trajectory of spectral energies in each critical band of speech by a bank of finite impulse response (FIR) filters. Impulse responses of these filters are derived from a modified Gabor envelope in order to emulate asymmetries of the temporal receptive field (TRF) profiles observed in higher level auditory neurons. We obtain an 11.4% relative improvement in word error rate on the OGI-Digits database and a 3.2% relative improvement in phoneme error rate on the TIMIT database over the MRASTA technique.
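
The filtering step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the envelope, so the one used here (a cosine carrier under a Gaussian whose width differs on each side of the peak) is an assumption, not the authors' modified Gabor envelope. The function names, tap count, widths, and synthetic band energies are all hypothetical.

```python
import numpy as np

def asymmetric_gabor_fir(n_taps=51, sigma_left=4.0, sigma_right=10.0, mod_freq=0.05):
    """Hypothetical asymmetric impulse response (an assumption, not the paper's filter).

    A cosine carrier is shaped by a Gaussian envelope that is narrower on one
    side of the peak than on the other, which makes the response asymmetric in time.
    """
    t = np.arange(n_taps) - n_taps // 2
    sigma = np.where(t < 0, sigma_left, sigma_right)
    envelope = np.exp(-0.5 * (t / sigma) ** 2)
    h = envelope * np.cos(2 * np.pi * mod_freq * t)
    h -= h.mean()  # zero-sum taps: a constant trajectory maps to zero away from the edges
    return h

def filter_band_trajectories(log_energies, h):
    """Filter each band's temporal trajectory (rows = bands, columns = frames)."""
    return np.stack([np.convolve(band, h, mode="same") for band in log_energies])

# Synthetic stand-in for critical-band log energies: 15 bands, 200 frames.
rng = np.random.default_rng(0)
log_energies = rng.normal(size=(15, 200))

h = asymmetric_gabor_fir()
features = filter_band_trajectories(log_energies, h)
```

A bank of such filters would vary `sigma_left`, `sigma_right`, and `mod_freq` to cover several temporal resolutions; the single filter above only shows the mechanics.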

Infoscience - École polytechnique fédérale de Lausanne

Support Vector Machine Classification of Vocal Fold Vibrations Based on Phonovibrogram Features

Author: Andrew McWhorter
Jan Svec
Jörg Lohscheller
Melda Kunduk
Michael Döllinger
Publication venue: 'IntechOpen'
Publication date: 04/04/2011

IntechOpen

The model of an anomaly detector for HiLumi LHC magnets based on Recurrent Neural Networks and adaptive quantization

Author: De Matteis Ernesto
Mertik Matej
Skoczeń Andrzej
Wielgosz Maciej
Publication venue: 'Elsevier BV'
Publication date: 28/09/2017

This paper focuses on examining the applicability of Recurrent Neural Network models for detecting anomalous behavior of the CERN superconducting magnets. In order to conduct the experiments, the authors designed and implemented an adaptive signal quantization algorithm and a custom GRU-based detector and developed a method for selecting the detector parameters. Three different datasets were used for testing the detector. Two artificially generated datasets were used to assess the raw performance of the system, whereas the 231 MB dataset composed of the signals acquired from HiLumi magnets was intended for real-life experiments and model training. Several different setups of the developed anomaly detection system were evaluated and compared with a state-of-the-art OC-SVM reference model operating on the same data. The OC-SVM model was equipped with a rich set of feature extractors accounting for a range of the input signal properties. It was determined in the course of the experiments that the detector, along with its supporting design methodology, reaches an F1 score equal or very close to 1 for almost all test sets. Due to the profile of the data, the best_length setup of the detector turned out to perform the best among all five tested configuration schemes of the detection system. The quantization parameters have the biggest impact on the overall performance of the detector, with the best values of input/output grid equal to 16 and 8, respectively. The proposed detection solution significantly outperformed the OC-SVM-based detector in most of the cases, with much more stable performance across all the datasets. Comment: Related to arXiv:1702.0083
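
As a rough illustration of what data-adaptive quantization of a signal can look like, the sketch below places bin edges at equally spaced quantiles of the signal. This is an assumed scheme chosen for simplicity; the abstract does not describe the authors' algorithm. The function names and the synthetic signal are hypothetical, and the grid size of 16 only echoes the input grid value reported above.

```python
import numpy as np

def adaptive_edges(signal, n_levels):
    """Bin edges at equally spaced quantiles (an assumed, simplified scheme).

    Each level then holds roughly the same number of samples, so the grid is
    finer where the signal spends most of its time.
    """
    qs = np.linspace(0.0, 1.0, n_levels + 1)[1:-1]  # interior quantiles only
    return np.quantile(signal, qs)

def quantize(signal, edges):
    """Map each sample to an integer level in the range 0..len(edges)."""
    return np.digitize(signal, edges)

# Synthetic stand-in for a magnet voltage signal.
rng = np.random.default_rng(0)
signal = rng.normal(size=2000)

edges = adaptive_edges(signal, n_levels=16)
levels = quantize(signal, edges)
```

The integer levels could then be fed to a sequence model such as a GRU as categorical inputs; that step is omitted here.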

arXiv.org e-Print Archive

CERN Document Server

Strength is in numbers: Can concordant artificial listeners improve prediction of emotion from speech?

Author: Daprati E
Di Natale C
Martinelli E
Mencattini A
Publication date: 01/01/2016

Humans can communicate their emotions by modulating facial expressions or the tone of their voice. Although numerous applications exist that enable machines to read facial emotions and recognize the content of verbal messages, methods for speech emotion recognition are still in their infancy. Yet, fast and reliable applications for emotion recognition are the obvious advancement of present 'intelligent personal assistants', and may have countless applications in diagnostics, rehabilitation and research. Taking inspiration from the dynamics of human group decision-making, we devised a novel speech emotion recognition system that applies, for the first time, a semi-supervised prediction model based on consensus. Three tests were carried out to compare this algorithm with traditional approaches. Labeling performances relative to a public database of spontaneous speech are reported. The novel system appears to be fast, robust and less computationally demanding than traditional methods, allowing for easier implementation in portable voice-analyzers (as used in rehabilitation, research, industry, etc.) and for applications in the research domain (such as real-time pairing of stimuli to participants' emotional state, selective/differential data collection based on emotional content, etc.).

Directory of Open Access Journals

PubMed Central

ART

FigShare

Building Portuguese Language Resources for Natural Language Processing Tasks

Author: Rúben Filipe Seabra de Almeida
Publication date: 20/07/2023

Repositório Aberto da Universidade do Porto

Semantic radical consistency and character transparency effects in Chinese: an ERP study

Author: Su IF
Weekes BS
Publication venue: 'United States Sports Academy'
Publication date: 01/01/2009

BACKGROUND: This event-related potential (ERP) study aims to investigate the representation and temporal dynamics of Chinese orthography-to-semantics mappings by simultaneously manipulating character transparency and semantic radical consistency. Character components, referred to as radicals, make up the building blocks used dur...postprin

HKU Scholars Hub

Models and Analysis of Vocal Emissions for Biomedical Applications

Publication venue: 'Firenze University Press'
Publication date: 31/05/2022

The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the particularly felt need of sharing know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread to other aspects of research such as occupational voice disorders, neurology, rehabilitation, image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.

Directory of Open Access Books (DOAB)

Can human association norm evaluate latent semantic analysis?

Author: Gatkowska Izabela
Korzycki Michał
Lubaszewski Wiesław
Publication venue: [s.n.]
Publication date: 01/01/2013

This paper presents a comparison of a word association norm created by a psycholinguistic experiment with association lists generated by algorithms operating on text corpora. We compare lists generated by the Church and Hanks algorithm and lists generated by the LSA algorithm. An argument is presented on how those automatically generated lists reflect real semantic relations.
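
The Church and Hanks association ratio is a pointwise mutual information score, log2(P(x, y) / (P(x) P(y))). The sketch below is a simplified illustration, not the setup used in this paper: it counts unordered co-occurrence within whole contexts, whereas Church and Hanks count ordered pairs inside a fixed word window. The function name and toy contexts are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def association_scores(contexts):
    """Pointwise mutual information for every word pair sharing a context.

    Simplification (an assumption): probabilities are estimated as the
    fraction of contexts containing a word or a pair of words.
    """
    n = len(contexts)
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in contexts:
        vocab = sorted(set(tokens))  # count each word once per context
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))  # pairs keyed in sorted order
    scores = {}
    for (x, y), joint in pair_counts.items():
        p_xy = joint / n
        p_x = word_counts[x] / n
        p_y = word_counts[y] / n
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Toy contexts, invented for illustration only.
contexts = [
    ["doctor", "nurse", "hospital"],
    ["doctor", "nurse"],
    ["bread", "butter"],
    ["doctor", "bread"],
]
scores = association_scores(contexts)
```

Sorting the pairs for a given stimulus word by score yields an association list that can be set against the responses collected from human participants.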

Jagiellonian University Repository