528 research outputs found

Exploiting Contextual Information for Prosodic Event Detection Using Auto-Context

Author: Johnson Michael T
Liu Jia
Xia Shanhong
Yang Hua
Zhang Wei-Qiang
Zhao Junhong
Publication venue: e-Publications@Marquette
Publication date: 01/12/2013

Prosody and prosodic boundaries carry significant linguistic and paralinguistic information and are important aspects of speech. In the field of prosodic event detection, many local acoustic features have been investigated; however, contextual information has not yet been thoroughly exploited. The main difficulty lies in learning long-distance contextual dependencies effectively and efficiently. To address this problem, we introduce the use of an algorithm called auto-context. In this algorithm, a classifier is first trained on a set of local acoustic features, after which the generated probabilities are used along with the local features as contextual information to train new classifiers. By iteratively using updated probabilities as the contextual information, the algorithm can accurately model contextual dependencies and improve classification ability. The advantages of this method include its flexible structure and its ability to capture contextual relationships. When using the auto-context algorithm with a support vector machine, we can improve the detection accuracy by about 3% and the F-score by more than 7% on both two-way and four-way pitch accent detection in combination with the acoustic context. For boundary detection, the accuracy improvement is about 1% and the F-score improvement reaches 12%. The new algorithm outperforms conditional random fields, especially on boundary detection in terms of F-score. It also outperforms an n-gram language model on the task of pitch accent detection.
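The iterative loop this abstract describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract reports results with a support vector machine, whereas the sketch swaps in a toy nearest-centroid scorer so that it runs with the standard library alone. The names `auto_context`, `context_features`, `train_centroid`, the neighbour offsets, and the padding value 0.5 are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Model = Callable[[Vector], float]            # maps features to P(event)
Trainer = Callable[[List[Vector], List[int]], Model]


def train_centroid(X: List[Vector], y: List[int]) -> Model:
    """Toy stand-in for the SVM: score by relative distance to class centroids."""
    def mean(rows: List[Vector]) -> Vector:
        return [sum(col) / len(rows) for col in zip(*rows)]

    c0 = mean([x for x, t in zip(X, y) if t == 0])
    c1 = mean([x for x, t in zip(X, y) if t == 1])

    def predict(x: Vector) -> float:
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        # Far from the class-0 centroid means a high event probability.
        return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5

    return predict


def context_features(probs: Sequence[float], i: int,
                     offsets: Sequence[int]) -> Vector:
    """Probabilities of neighbouring items; 0.5 (assumed padding) past the edges."""
    return [probs[i + d] if 0 <= i + d < len(probs) else 0.5 for d in offsets]


def auto_context(local: List[Vector], labels: List[int], train: Trainer,
                 offsets: Sequence[int] = (-2, -1, 1, 2),
                 iterations: int = 3) -> Tuple[List[Model], List[float]]:
    """Train a chain of models; each later stage also sees neighbour probabilities."""
    models = [train(local, labels)]          # stage 0: local features only
    probs = [models[0](x) for x in local]
    for _ in range(iterations):
        augmented = [x + context_features(probs, i, offsets)
                     for i, x in enumerate(local)]
        model = train(augmented, labels)
        models.append(model)
        probs = [model(x) for x in augmented]  # updated context for next round
    return models, probs


# Toy sequence: one local feature per word, label 1 marks a prosodic event.
local = [[0.1], [0.2], [0.9], [0.8], [0.15], [0.85]]
labels = [0, 0, 1, 1, 0, 1]
models, probs = auto_context(local, labels, train_centroid)
```

In a real setting the chain of models would be applied in the same order to held-out data, and the probabilities used as context would ideally come from cross-validated predictions rather than from the training set itself.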

epublications@Marquette

Springer - Publisher Connector

Prosodic Event Recognition using Convolutional Neural Networks with Context Information

Author: Stehwien Sabrina
Vu Ngoc Thang
Publication date: 02/06/2017

This paper demonstrates the potential of convolutional neural networks (CNN) for detecting and classifying prosodic events on words, specifically pitch accents and phrase boundary tones, from frame-based acoustic features. Typical approaches use not only feature representations of the word in question but also its surrounding context. We show that adding position features indicating the current word benefits the CNN. In addition, this paper discusses the generalization from a speaker-dependent modelling approach to a speaker-independent setup. The proposed method is simple and efficient and yields strong results not only in speaker-dependent but also speaker-independent cases.

Comment: Interspeech 2017, 4 pages, 1 figure

arXiv.org e-Print Archive

Crossref

The CALO meeting speech recognition and understanding system

Author: A Stolcke
B Favre
C Frederickson
D Hakkani-Tür
D Kintzing
D Vergyri
E Shriberg
F Yang
G Tur
J Dowding
J Niekrasz
J Tien
K Leveque
K Riedhammer
L Voss
M Frampton
M Frandsen
M Graciarena
M Purver
R Fernandez
S Mason
S Peters
Publication date: 01/01/2008

The CALO Meeting Assistant provides for distributed meeting capture, annotation, automatic transcription and semantic analysis of multi-party meetings, and is part of the larger CALO personal assistant system. This paper summarizes the CALO-MA architecture and its speech recognition and understanding components, which include realtime and offline speech transcription, dialog act segmentation and tagging, question-answer pair identification, action item recognition, and summarization.

CiteSeerX

Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures

Author: Asaei Afsaneh
Bourlard Hervé
Cernak Milos
Luyet Gil
Publication venue: Idiap
Publication date: 19/04/2016

This paper shows that exemplar-based speech processing using class-conditional posterior probabilities admits a highly effective search strategy relying on posteriors' intrinsic sparsity structures. The posterior probabilities are estimated for phonetic and phonological classes using a deep neural network (DNN) computational framework. Exploiting the class-specific sparsity leads to a simple quantized posterior hashing procedure to reduce the search space of posterior exemplars. To that end, a small number of quantized posteriors are regarded as representatives of the posterior space and used as hash keys to index subsets of neighboring exemplars. The k nearest neighbor (kNN) method is applied for posterior-based classification problems. The phonetic posterior probabilities are used as exemplars for phonetic classification whereas the phonological posteriors are used as exemplars for automatic prosodic event detection. Experimental results demonstrate that posterior hashing improves the efficiency of kNN classification drastically. This work encourages the use of posteriors as discriminative exemplars appropriate for large scale speech classification tasks.
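The general idea in this abstract, quantizing sparse posteriors into hash keys so that kNN only searches one bucket, can be illustrated with a minimal sketch. It makes assumptions the abstract does not state: binary quantization at a threshold of 0.5, Euclidean distance, and majority voting. The function names (`hash_key`, `build_index`, `knn_classify`) and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import dist
from typing import Dict, List, Optional, Sequence, Tuple

Exemplar = Tuple[Sequence[float], str]   # (posterior vector, class label)


def hash_key(posterior: Sequence[float], threshold: float = 0.5) -> Tuple[int, ...]:
    """Quantize a posterior to a binary pattern (assumed scheme: 1 if p >= threshold)."""
    return tuple(int(p >= threshold) for p in posterior)


def build_index(exemplars: List[Exemplar]) -> Dict[Tuple[int, ...], List[Exemplar]]:
    """Group exemplars by their quantized key so a query only visits one bucket."""
    index: Dict[Tuple[int, ...], List[Exemplar]] = defaultdict(list)
    for posterior, label in exemplars:
        index[hash_key(posterior)].append((posterior, label))
    return index


def knn_classify(index: Dict[Tuple[int, ...], List[Exemplar]],
                 query: Sequence[float], k: int = 3) -> Optional[str]:
    """Majority vote over the k nearest exemplars sharing the query's hash key."""
    bucket = index.get(hash_key(query), [])
    if not bucket:
        return None                      # no exemplar shares this sparsity pattern
    nearest = sorted(bucket, key=lambda e: dist(e[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy posteriors over three classes; each exemplar is dominated by one class.
exemplars = [
    ([0.90, 0.05, 0.05], "a"),
    ([0.80, 0.10, 0.10], "a"),
    ([0.10, 0.85, 0.05], "b"),
    ([0.05, 0.90, 0.05], "b"),
]
index = build_index(exemplars)
predicted = knn_classify(index, [0.85, 0.10, 0.05])
```

Because the distance computation is restricted to one bucket, the cost of a query scales with the bucket size rather than with the full exemplar set, at the price of returning nothing when a query's quantized pattern was never seen.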

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures

Author: Azzalini
Azzalini
Basellini
Bergeron-Boucher
Bohk-Ewald
Booth
Brouard
Canudas-Romo
Canudas-Romo
Canudas-Romo
Canudas-Romo
Colchero
Coles
Congdon
Davison
Edwards
Fries
Gage
Garg
Gillespie
Gompertz
Graunt
Guillot
Guillot
Heligman
Horiuchi
Horiuchi
Kaergaard
Kannisto
Lexis
Makeham
Mazzuco
Missov
Missov
Missov
Pearson
Perks
Preston
Riley
Rogers
Shkolnikov
Siler
Siler
Tabeau
van Raalte
van Raalte
Vaupel
Vaupel
Vaupel
Wilmoth
Zanotto
Publication venue: Idiap
Publication date: 01/01/2016

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

The Australian National University

Archivio istituzionale della ricerca - Università di Padova

First impressions: A survey on vision-based apparent personality trait analysis

Author: Andújar Gran Carlos Antonio
Baró Solé Xavier
Escalante Balderas Hugo Jair
Escalera Guerrero Sergio
Guyon Isabelle
Güçlü Umut
Güçlütürk Yagmur
Jacques Junior Julio
Pérez Quintana Marc
van Gerven Marcel A. J.
van Lier Rob
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. In the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have been by far the most frequently considered cues for analyzing personality. Recently, however, there has been increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and the potential impact that such methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field, are reviewed.

Peer Reviewed. Postprint (author's final draft)

UPCommons. Portal del coneixement obert de la UPC

VBN

Radboud Repository