4,716 research outputs found

    Investigating cross-language speech retrieval for a spontaneous conversational speech collection

    Cross-language retrieval of spontaneous speech combines the challenges of working with noisy automated transcription and language translation. The CLEF 2005 Cross-Language Speech Retrieval (CL-SR) task provides a standard test collection to investigate these challenges. We show that we can improve retrieval performance by careful selection of the term weighting scheme, by decomposing automated transcripts into phonetic substrings to help ameliorate transcription errors, and by combining automatic transcriptions with manually assigned metadata. We further show that topic translation with online machine translation resources yields effective CL-SR.
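
    The idea of decomposing transcripts into phonetic substrings can be illustrated with a minimal sketch. It assumes the transcript is already available as a phone sequence and uses overlapping trigrams; the phone symbols, the n-gram length, and the function name `phone_ngrams` are illustrative assumptions, not details taken from the paper.

```python
from typing import List


def phone_ngrams(phones: List[str], n: int = 3) -> List[str]:
    """Return overlapping n-grams over a phone sequence, joined with '_'."""
    if n <= 0:
        raise ValueError("n must be positive")
    if len(phones) < n:
        # Sequences shorter than n are kept whole so they stay indexable.
        return ["_".join(phones)] if phones else []
    return ["_".join(phones[i:i + n]) for i in range(len(phones) - n + 1)]


# Hypothetical phone strings: a reference word and a misrecognised variant
# that differs in a single vowel.
reference = ["K", "AE", "T", "AH", "L", "AO", "G"]
recognised = ["K", "AE", "T", "AH", "L", "AA", "G"]

# Substrings untouched by the error still match, so a retrieval system
# indexing n-grams can score a partial match where word matching fails.
shared = set(phone_ngrams(reference)) & set(phone_ngrams(recognised))
```

    In this toy case the two sequences do not match as whole words, yet they share the trigrams that do not span the substituted vowel, which is the intuition behind using substrings to soften transcription errors.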

    Temporal Parameters of Spontaneous Speech in Forensic Speaker Identification in Case of Language Mismatch: Serbian as L1 and English as L2

    The purpose of the research is to examine the possibility of forensic speaker identification when the questioned and suspect samples are in different languages, using temporal parameters (articulation rate, speaking rate, degree of hesitancy, percentage of pauses, average pause duration). The corpus includes 10 female native speakers of Serbian who are proficient in English. The parameters are tested using the Bayesian likelihood ratio formula in 40 same-speaker and 360 different-speaker pairs, including estimation of error rates, equal error rates and Overall Likelihood Ratio. One-way ANOVA is performed to determine whether inter-speaker variability is higher than intra-speaker variability across languages. The most successful discriminant is degree of hesitancy with ER of 42.5%/28% (EER: 33%), followed by average pause duration with ER of 35%/45.56% (EER: 40%). Although the research features a closed-set comparison, which is not very common in forensic reality, the results are still relevant for forensic phoneticians working on criminal cases or as expert witnesses. This study pioneers the forensic comparison of Serbian and English, as well as the forensic testing of temporal parameters on bilingual speakers. Further research should focus on comparing two stress-timed or two syllable-timed languages to test whether they are more comparable in terms of temporal aspects of speech.
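
    As a rough illustration of the temporal parameters named in this abstract, the sketch below computes four of them from a list of labelled intervals. The definitions used (syllables per second excluding or including pauses, pause share of total time, mean pause length) are common conventions and are an assumption here; the paper's exact definitions may differ. Degree of hesitancy is left out because its definition is not given in the abstract. The `Interval` type and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Interval:
    duration: float  # seconds
    syllables: int   # number of syllables; 0 for a pause
    is_pause: bool


def temporal_parameters(intervals: List[Interval]) -> Dict[str, float]:
    """Compute basic temporal parameters from speech and pause intervals."""
    total_time = sum(i.duration for i in intervals)
    pauses = [i for i in intervals if i.is_pause]
    pause_time = sum(p.duration for p in pauses)
    phonation_time = total_time - pause_time
    syllables = sum(i.syllables for i in intervals if not i.is_pause)
    if total_time <= 0 or phonation_time <= 0:
        raise ValueError("need a positive amount of speech")
    return {
        # Syllables per second of actual speech (pauses excluded).
        "articulation_rate": syllables / phonation_time,
        # Syllables per second of the whole sample (pauses included).
        "speaking_rate": syllables / total_time,
        # Share of the sample spent in pauses, in percent.
        "percentage_of_pauses": 100.0 * pause_time / total_time,
        # Mean pause length in seconds (0.0 if there are no pauses).
        "average_pause_duration": pause_time / len(pauses) if pauses else 0.0,
    }


# Hypothetical sample: two stretches of speech separated by pauses.
sample = [
    Interval(2.0, 10, False),
    Interval(0.5, 0, True),
    Interval(1.5, 6, False),
    Interval(1.0, 0, True),
]
params = temporal_parameters(sample)
```

    Per-speaker values of this kind, measured separately in each language, are what a likelihood-ratio comparison of same-speaker and different-speaker pairs would take as input.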

    BEA – A multifunctional Hungarian spoken language database

    In diverse areas of linguistics, the demand for studying actual language use is on the increase. The aim of developing a phonetically based multi-purpose database of Hungarian spontaneous speech, dubbed BEA, is to accumulate a large amount of spontaneous speech of various types together with sentence repetition and reading. Presently, the recorded material of BEA amounts to 260 hours produced by 280 present-day Budapest speakers (ages between 20 and 90, 168 females and 112 males), and it also provides annotated materials for various types of research and practical applications.

    Identification of Temporal Parameters of Spontaneous Speech of Forensic Speakers in Case of Language Mismatch: Serbian as L1 and English as L2

    The purpose of the research is to examine the possibility of forensic speaker identification when the questioned and suspect samples are in different languages, using temporal parameters (articulation rate, speaking rate, degree of hesitancy, percentage of pauses, average pause duration). The corpus includes 10 female native speakers of Serbian who are proficient in English. The parameters are tested using the Bayesian likelihood ratio formula in 40 same-speaker and 360 different-speaker pairs, including estimation of error rates, equal error rates and Overall Likelihood Ratio. One-way ANOVA is performed to determine whether inter-speaker variability is higher than intra-speaker variability across languages. The most successful discriminant is degree of hesitancy with ER of 42.5%/28% (EER: 33%), followed by average pause duration with ER of 35%/45.56% (EER: 40%). Although the research features a closed-set comparison, which is not very common in forensic reality, the results are still relevant for forensic phoneticians working on criminal cases or as expert witnesses. This study pioneers the forensic comparison of Serbian and English, as well as the forensic testing of temporal parameters on bilingual speakers. Further research should focus on comparing two stress-timed or two syllable-timed languages to test whether they are more comparable in terms of temporal aspects of speech.

    Machine Assisted Analysis of Vowel Length Contrasts in Wolof

    Growing digital archives and improving algorithms for automatic analysis of text and speech create new research opportunities for fundamental research in phonetics. Such empirical approaches allow statistical evaluation of a much larger set of hypotheses about phonetic variation and its conditioning factors (among them geographical/dialectal variants). This paper illustrates this vision and proposes to challenge automatic methods for the analysis of a not easily observable phenomenon: vowel length contrast. We focus on Wolof, an under-resourced language from Sub-Saharan Africa. In particular, we propose multiple features to make a fine evaluation of the degree of length contrast under different factors such as read vs. semi-spontaneous speech and standard vs. dialectal Wolof. Our measures, made fully automatically on more than 20k vowel tokens, show that our proposed features can highlight different degrees of contrast for each vowel considered. We notably show that contrast is weaker in semi-spontaneous speech and in a non-standard semi-spontaneous dialect. Comment: Accepted to Interspeech 201

    A Study of the Assimilative Behavior of the Voiced Labio-Dental Fricative in American English

    Gradation is one of the main features of colloquial speech. It implies the presence of certain phonological processes that ease the transition between phonemes with different articulatory features. For English, one of these processes is assimilation, in which the articulation of a segment is changed into that of another segment already existing in the system. Our study takes up Gimson's (1994) suggestion that /v/ assimilates into /m/ when it is followed by the bilabial nasal. After observing and describing different cases of assimilation, we suggest further possible explanations for this phenomenon and further assimilative behaviors of /v/. We therefore conduct an experiment in which six L1 speakers of American English evaluate sentences whose articulation includes our proposals. The results show Gimson's theory not to be as accurate as expected. Furthermore, we conclude that /v/ can assimilate into /b/, /ɂ/ and /d/ when it is followed by bilabial, velar and alveolar phonemes. Grado en Estudios Inglese

    English read by Japanese phonetic corpus: an interim report

    The primary purpose of this paper is to explain the procedure of developing the English Read by Japanese Phonetic Corpus. A series of preliminary studies (Makino 2007, 2008, 2009) made it clear that a phonetically transcribed computerized corpus of Japanese speakers’ English speech was worth making. Because corpus studies on L2 pronunciation have been very rare, we intend to fill this gap. For the corpus building, the 1,902 sentence files in the English Read by Japanese speech database scored for their individual sounds by American English teachers trained in phonetics in Minematsu et al. (2002b) have been chosen. The files were pre-processed with the Penn Phonetics Lab Forced Aligner to generate Praat TextGrids in which target English words and phonemes were force-aligned to the speech files. Two additional tiers (actual phones and substitutions) were added to those TextGrids; the actual phones were manually transcribed and the other tiers were aligned to that tier. Then the TextGrids were imported into ELAN, which has much better search functionality. So far, fewer than 10% of the files have been completed and the corpus building is still in its initial stage. The secondary purpose of this paper is to report on some findings from the small part of the corpus that has been completed. Although it is still premature to talk of any tendency in the corpus, it is worth noting that we have found evidence of phenomena which are not readily predicted from L1 phonological transfer, such as the spirantization of voiceless plosives, which is not considered normal in the pronunciation of Japanese.

    Fundamental frequency height as a resource for the management of overlap in talk-in-interaction.

    Overlapping talk is common in talk-in-interaction. Much of the previous research on this topic agrees that speaker overlaps can be either turn competitive or noncompetitive. An investigation of the differences in prosodic design between these two classes of overlaps can offer insight into how speakers use and orient to prosody as a resource for turn competition. In this paper, we investigate the role of fundamental frequency (F0) as a resource for turn competition in overlapping speech. Our methodological approach combines detailed conversation analysis of overlap instances with acoustic measurements of F0 in the overlapping sequence and in its local context. The analyses are based on a collection of overlap instances drawn from the ICSI Meeting corpus. We found that overlappers mark an overlapping incoming as competitive by raising F0 above their norm for turn beginnings, and retaining this higher F0 until the point of overlap resolution. Overlappees may respond to these competitive incomings by returning competition, in which case they raise their F0 too. Our results thus provide instrumental support for earlier claims made on impressionistic evidence, namely that participants in talk-in-interaction systematically manipulate F0 height when competing for the turn.