Lexical Access Model for Italian -- Modeling human speech processing: identification of words in running speech toward lexical access based on the detection of landmarks and other acoustic cues to features
Modelling the process by which a listener derives the words intended
by a speaker requires a hypothesis on how lexical items are stored in
memory. This work aims to develop a system that imitates humans when
identifying words in running speech and, in this way, to provide a framework to
better understand human speech processing. We build a speech recognizer for
Italian based on the principles of Stevens' model of Lexical Access in which
words are stored as hierarchical arrangements of distinctive features (Stevens,
K. N. (2002). "Toward a model for lexical access based on acoustic landmarks
and distinctive features," J. Acoust. Soc. Am., 111(4):1872-1891). Over the
past few decades, the Speech Communication Group at the Massachusetts Institute
of Technology (MIT) developed a speech recognition system for English based on
this approach. Italian will be the first language beyond English to be
explored; the extension to another language provides the opportunity to test
the hypothesis that words are represented in memory as a set of
hierarchically arranged distinctive features, and to reveal which of the
underlying mechanisms may be language-independent. This paper also
introduces a new Lexical Access corpus, the LaMIT database, created and labeled
specifically for this work, which will be provided freely to the speech research
community. Future developments will test the hypothesis that specific acoustic
discontinuities, called landmarks, that serve as cues to features are
language-independent, while other cues may be language-dependent, with powerful
implications for understanding how the human brain recognizes speech.
Comment: Submitted to Language and Speech, 202
The LaMIT database: a read speech corpus for acoustic studies of the Italian language toward lexical access based on the detection of landmarks and other acoustic cues to features
The LaMIT database consists of recordings of 100 Italian sentences. The sentences were designed to include all phonemes of the Italian language and to reflect the typical frequency of each phoneme in written Italian. Four native adult speakers of Standard Italian, two female and two male, all raised and living in Rome, Italy, read the sentences in two separate recording sessions. Two repetitions of each sentence were therefore collected per speaker, for a total of 800 recordings.
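The recording design above (100 sentences, four speakers, two sessions) can be checked with a short enumeration. This is only a sketch: the speaker codes and the identifier pattern are hypothetical and are not the file naming used in the database.

```python
from itertools import product

N_SENTENCES = 100
SPEAKERS = ["F1", "F2", "M1", "M2"]  # hypothetical codes: two female, two male
SESSIONS = [1, 2]                    # one repetition of every sentence per session


def recording_ids():
    """Enumerate one hypothetical identifier per (speaker, session, sentence)."""
    return [
        f"{speaker}_session{session}_sentence{sentence:03d}"
        for speaker, session, sentence in product(
            SPEAKERS, SESSIONS, range(1, N_SENTENCES + 1)
        )
    ]


ids = recording_ids()
print(len(ids))  # 4 speakers x 2 sessions x 100 sentences
```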
The database was created specifically for the LaMIT project, which applies to the Italian language the Lexical Access model proposed by Ken Stevens for American English. The model relies on the detection of specific acoustic discontinuities, called landmarks, and of other acoustic cues to the features that characterize each phoneme. Each recording was therefore processed to generate a set of labeling files that identify both the predicted landmarks and cues and the actual landmarks and cues. The labeling files, compiled according to the labeling syntax used in the Praat speech processing software, are also made available as part of the LaMIT database.
Estimation of the frequency of occurrence of Italian phonemes in text
Meeting abstract.
The purpose of this project was to derive a reliable estimate of the frequency of occurrence of the 30 phonemes of the Italian language, plus the geminated counterparts of its consonants, based on four selected written texts. Since no comparable dataset was found in the previous literature, the present analysis may serve as a reference for future studies. Four textual sources were considered: Come si fa una tesi di laurea: le materie umanistiche by Umberto Eco, I promessi sposi by Alessandro Manzoni, a recent article in Corriere della Sera (a popular Italian daily newspaper), and In altre parole by Jhumpa Lahiri. The sources were chosen to represent varied genres, subject matter, time periods, and writing styles. Results of the analysis, which also included an analysis of variance, showed that, for all four sources, the frequencies of occurrence reached relatively stable values after about 6,000 phonemes (approximately 1,250 words), varying by <0.025%. Estimated frequencies are provided for each source individually and as an average across sources.
Estimation of the Frequency of Occurrence of Italian Phonemes in Text
The purpose of this project was to derive a reliable estimate of the frequency of occurrence of the 30 phonemes of the Italian language, plus the geminated counterparts of its consonants, based on four selected written texts. Since no comparable dataset was found in the previous literature, the present analysis may serve as a reference for future studies. Four textual sources were considered: Come si fa una tesi di laurea: le materie umanistiche by Umberto Eco, I promessi sposi by Alessandro Manzoni, a recent article in Corriere della Sera (a popular Italian daily newspaper), and In altre parole by Jhumpa Lahiri. The sources were chosen to represent varied genres, subject matter, time periods, and writing styles. Results of the analysis, which also included an analysis of variance, showed that, for all four sources, the frequencies of occurrence reached relatively stable values after about 6,000 phonemes (approximately 1,250 words), varying by <0.025%. Estimated frequencies are provided for each source individually and as an average across sources.
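The stabilization result can be illustrated with a running relative-frequency estimate. The sketch below is not the authors' procedure. It assumes the text has already been converted to a sequence of phoneme symbols, that stability is checked at checkpoints every `step` phonemes, and that "<0.025%" means an absolute change below 0.00025 in relative frequency. All three are assumptions, and the function names are hypothetical.

```python
from collections import Counter


def running_frequencies(phonemes, step):
    """Yield (n, {phoneme: relative frequency}) after every `step` phonemes."""
    counts = Counter()
    for n, phoneme in enumerate(phonemes, start=1):
        counts[phoneme] += 1
        if n % step == 0:
            yield n, {p: c / n for p, c in counts.items()}


def stabilization_point(phonemes, step=500, tol=0.00025):
    """Return the earliest checkpoint from which every later checkpoint-to-checkpoint
    change in any phoneme's relative frequency stays below `tol`, or None."""
    checkpoints = list(running_frequencies(phonemes, step))
    stable_from = None
    for (_, prev), (n_cur, cur) in zip(checkpoints, checkpoints[1:]):
        symbols = set(prev) | set(cur)
        change = max(abs(cur.get(s, 0.0) - prev.get(s, 0.0)) for s in symbols)
        if change < tol:
            if stable_from is None:
                stable_from = n_cur  # start of a candidate stable run
        else:
            stable_from = None       # run broken; start over
    return stable_from


# Toy input: a perfectly periodic symbol sequence, not real Italian data.
toy = ["a", "b", "a", "c"] * 1000
print(stabilization_point(toy, step=4))
```

Applied to a real transcription, the returned checkpoint depends on the chosen `step` and `tol`, so it is comparable to the figure of about 6,000 phonemes only if the stability criterion matches the one used in the study.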
Speech recognition of spoken Italian based on detection of landmarks and other acoustic cues to distinctive features
Modeling the process by which a listener derives the words intended by a speaker requires a hypothesis on how lexical items are stored in memory. Stevens' model (2002) postulates that lexical items are stored in memory according to distinctive features, and that these features are hierarchically organized. The model highlights the importance of abrupt acoustic events, named landmarks, in the perception process. In this model, the detection of landmarks is primary in human perception and corresponds to the first phase of recognition. The temporal region around each landmark is then further processed by the listener. Based on this model, the Speech Communication Group of the Massachusetts Institute of Technology (MIT) developed a speech recognition system for spoken English over a span of more than 20 years. In the current work (the LaMIT project, Lexical access Model for Italian), the model is applied to Italian. Exploring a new language will provide insight into the extent to which Stevens' approach applies across languages, with relevant implications for understanding how the human brain recognizes speech. Stevens, K. N. (2002). "Toward a model for lexical access based on acoustic landmarks and distinctive features," J. Acoust. Soc. Am., 111(4):1872-1891.
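The idea of a landmark as an abrupt acoustic event can be illustrated with a generic change detector over frame energies. This is a deliberately simplified sketch and not the landmark detector of Stevens' model or of the LaMIT project. The input is assumed to be one energy value in dB per analysis frame, and the 9 dB threshold is an arbitrary illustrative value.

```python
def detect_abrupt_changes(energy_db, threshold_db=9.0):
    """Return (frame_index, sign) for each frame whose energy differs from the
    previous frame by at least `threshold_db`; sign is "+" (rise) or "-" (fall)."""
    events = []
    for i in range(1, len(energy_db)):
        delta = energy_db[i] - energy_db[i - 1]
        if abs(delta) >= threshold_db:
            events.append((i, "+" if delta > 0 else "-"))
    return events


# Synthetic energy contour: low, then high, then low again (not real speech).
contour = [20.0] * 10 + [50.0] * 10 + [20.0] * 10
print(detect_abrupt_changes(contour))  # [(10, '+'), (20, '-')]
```

In the model described above, detection of such events is only the first phase; the region around each event is then analyzed for further cues to distinctive features, which this sketch does not attempt.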