6 research outputs found

Lexical Access Model for Italian -- Modeling human speech processing: identification of words in running speech toward lexical access based on the detection of landmarks and other acoustic cues to features

Author: Arango Javier
Chan Ian
Choi Jeung-Yoon
De Nardis Luca
DeCaprio Alec
Di Benedetto Maria-Gabriella
Shattuck-Hufnagel Stefanie
Publication date: 01/01/2021

Modelling the process that a listener actuates in deriving the words intended by a speaker requires setting a hypothesis on how lexical items are stored in memory. This work aims at developing a system that imitates humans when identifying words in running speech and, in this way, at providing a framework to better understand human speech processing. We build a speech recognizer for Italian based on the principles of Stevens' model of Lexical Access, in which words are stored as hierarchical arrangements of distinctive features (Stevens, K. N. (2002). "Toward a model for lexical access based on acoustic landmarks and distinctive features," J. Acoust. Soc. Am., 111(4):1872-1891). Over the past few decades, the Speech Communication Group at the Massachusetts Institute of Technology (MIT) developed a speech recognition system for English based on this approach. Italian will be the first language beyond English to be explored; the extension to another language provides the opportunity to test the hypothesis that words are represented in memory as a set of hierarchically arranged distinctive features, and to reveal which of the underlying mechanisms may have a language-independent nature. This paper also introduces a new Lexical Access corpus, the LaMIT database, created and labeled specifically for this work, which will be provided freely to the speech research community. Future developments will test the hypothesis that specific acoustic discontinuities, called landmarks, that serve as cues to features are language-independent, while other cues may be language-dependent, with powerful implications for understanding how the human brain recognizes speech. Comment: Submitted to Language and Speech, 202

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

The LaMIT database: a read speech corpus for acoustic studies of the Italian language toward lexical access based on the detection of landmarks and other acoustic cues to features

Author: Arango Javier
Budoni Sara
Chan Ian
Choi Jeung-Yoon
De Nardis Luca
DeCaprio Alec
Di Benedetto Maria-Gabriella
Shattuck-Hufnagel Stefanie
Publication venue: place:New York
Publication date: 01/01/2022

The LaMIT database consists of recordings of 100 Italian sentences. The sentences in the database were designed so as to include all phonemes of the Italian language, and also to take into account the typical frequency of each phoneme in written Italian. Four native adult speakers of Standard Italian, raised and living in Rome, Italy, two female and two male, pronounced the sentences in two different recording sessions; two repetitions of each sentence per speaker were therefore collected, for a total of 800 recordings. The database was created specifically for the LaMIT project, which focuses on applying to the Italian language the Lexical Access model proposed by Ken Stevens for American English. The model relies on the detection of specific acoustic discontinuities called landmarks and of other acoustic cues to features that characterize each phoneme. Each recording was thus processed to generate a set of labeling files that identify both predicted landmarks and other cues, and actual landmarks/cues. The labeling files, compiled according to the labeling syntax used in the Praat speech processing software, are also made available as part of the LaMIT database.

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

Estimation of the frequency of occurrence of Italian phonemes in text

Author: Arango Javier
Baik Sunwoo
DeCaprio Alec
Di Benedetto Maria-Gabriella
Shattuck-Hufnagel Stefanie
Yao Stephanie
Publication venue: Acoustical Society of America (ASA)
Publication date: 01/01/2020

Meeting abstract. The purpose of this project was to derive a reliable estimate of the frequency of occurrence of the 30 phonemes of the Italian language, plus their geminated consonant counterparts, based on four selected written texts. Since no comparable dataset was found in previous literature, the present analysis may serve as a reference in future studies. Four textual sources were considered: Come si fa una tesi di laurea: le materie umanistiche by Umberto Eco, I promessi sposi by Alessandro Manzoni, a recent article in Corriere della Sera (a popular Italian daily newspaper), and In altre parole by Jhumpa Lahiri. The sources were chosen to represent varied genres, subject matter, time periods, and writing styles. Results of the analysis, which also included an analysis of variance, showed that, for all four sources, the frequencies of occurrence reached relatively stable values after about 6,000 phonemes (approximately 1,250 words), varying by <0.025%. Estimated frequencies are provided for each single source and as an average across sources.

Crossref

Archivio della ricerca- Università di Roma La Sapienza
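
The abstract above reports that phoneme frequencies settle after a few thousand phonemes. The following is a minimal sketch of how such a stability check could be set up, not the authors' procedure: the function names `running_frequencies` and `max_change` are hypothetical, the toy input uses letters as stand-ins for phonemes, and comparing snapshots by largest absolute difference is an assumption, since the abstract does not state how stability was measured.

```python
from collections import Counter

def running_frequencies(symbols, checkpoints):
    """Relative frequency of each symbol over prefixes of `symbols`.

    Returns a dict mapping each checkpoint (a prefix length) to a dict
    of {symbol: relative frequency within that prefix}.
    """
    counts = Counter()
    wanted = set(checkpoints)
    snapshots = {}
    for i, symbol in enumerate(symbols, start=1):
        counts[symbol] += 1
        if i in wanted:
            snapshots[i] = {k: v / i for k, v in counts.items()}
    return snapshots

def max_change(freq_a, freq_b):
    """Largest absolute difference in relative frequency between two snapshots."""
    keys = set(freq_a) | set(freq_b)
    return max(abs(freq_a.get(k, 0.0) - freq_b.get(k, 0.0)) for k in keys)

# Toy input (assumption): a repeating letter pattern standing in for a
# phonemic transcription, which the real study would derive from text.
toy = list("abac" * 500)  # 2,000 symbols
snaps = running_frequencies(toy, [1000, 2000])
drift = max_change(snaps[1000], snaps[2000])
print(snaps[1000], drift)
```

With a real transcription, one could take snapshots at increasing prefix lengths and look for the point after which `max_change` between consecutive snapshots stays below a chosen tolerance.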

Estimation of the Frequency of Occurrence of Italian Phonemes in Text

Author: Alec DeCaprio
Javi Arango
Luca De Nardis
Maria Gabriella Di Benedetto
Stefanie Shattuck-Hufnagel
Sunwoo Baik
Publication date: 01/01/2021

The purpose of this project was to derive a reliable estimate of the frequency of occurrence of the 30 phonemes of the Italian language, plus their geminated consonant counterparts, based on four selected written texts. Since no comparable dataset was found in previous literature, the present analysis may serve as a reference in future studies. Four textual sources were considered: Come si fa una tesi di laurea: le materie umanistiche by Umberto Eco, I promessi sposi by Alessandro Manzoni, a recent article in Corriere della Sera (a popular Italian daily newspaper), and In altre parole by Jhumpa Lahiri. The sources were chosen to represent varied genres, subject matter, time periods, and writing styles. Results of the analysis, which also included an analysis of variance, showed that, for all four sources, the frequencies of occurrence reached relatively stable values after about 6,000 phonemes (approx. 1,250 words), varying by <0.025%. Estimated frequencies are provided for each single source and as an average across sources.

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Speech recognition of spoken Italian based on detection of landmarks and other acoustic cues to distinctive features

Author: Arango Javier
Budoni Sara
Choi Jeung-Yoon
De Nardis Luca
DeCaprio Alec
Di Benedetto Maria-Gabriella
Shattuck-Hufnagel Stefanie
Vivaldi Jacopo
Yao Stephanie
Publication venue: Acoustical Society of America (ASA)
Publication date: 01/01/2020

Modeling the process that a listener actuates in deriving words intended by a speaker requires setting a hypothesis on how lexical items are stored in memory. Stevens' model (2002) postulates that lexical items are stored in memory according to distinctive features, and that these features are hierarchically organized. The model highlights the importance of abrupt acoustic events, named landmarks, in the perception process. In this model, the detection of landmarks is primary in human perception, corresponding to the first phase of recognition. The temporal area around the landmark is then further processed by the listener. Based on the above model, the Speech Communication Group of the Massachusetts Institute of Technology (MIT) developed a speech recognition system for spoken English over a span of more than 20 years. In the current work (LaMIT project, Lexical access Model for Italian) the above model is applied to Italian. Exploring a new language will provide insight into how far Stevens' approach applies universally across languages, with relevant implications for understanding how the human brain recognizes speech. K. N. Stevens, "Toward a model for lexical access based on acoustic landmarks and distinctive features," J. Acoust. Soc. Am., 111(4), 1872-1891 (2002).

Crossref

Archivio della ricerca- Università di Roma La Sapienza