
    Examining the contributions of automatic speech transcriptions and metadata sources for searching spontaneous conversational speech

    Searching spontaneous speech can be enhanced by combining automatic speech transcriptions with semantically related metadata. An important question is what can be expected from search of such transcriptions and different sources of related metadata in terms of retrieval effectiveness. The Cross-Language Speech Retrieval (CL-SR) track at recent CLEF workshops provides a spontaneous speech test collection with manual and automatically derived metadata fields. Using this collection we investigate the comparative search effectiveness of individual fields comprising automated transcriptions and the available metadata. A further important question is how transcriptions and metadata should be combined for the greatest benefit to search accuracy. We compare simple merging of individual fields with the extended BM25 model for weighted field combination (BM25F). Results indicate that BM25F can produce improved search accuracy, but that it is currently important to set its parameters using a suitable training set.
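The weighted field combination mentioned in this abstract can be illustrated with a minimal sketch of the simple BM25F formulation, in which per-field term frequencies are length-normalised, weighted, and summed before a single saturation step. This is not the authors' implementation: the field names (`asr`, `keywords`), the toy documents, the weights, the `b` values, and the non-negative IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25f_scores(query, docs, weights, b, k1=1.2):
    """Score each document for a query with a simple BM25F variant.

    docs    -- list of dicts mapping field name -> list of tokens
    weights -- dict mapping field name -> field weight (assumed values)
    b       -- dict mapping field name -> length-normalisation strength
    """
    fields = list(weights)
    n_docs = len(docs)

    # Average length of each field across the collection.
    avg_len = {f: sum(len(d.get(f, [])) for d in docs) / n_docs for f in fields}

    # Document frequency: number of documents containing the term in any field.
    df = Counter()
    for d in docs:
        terms = set()
        for f in fields:
            terms.update(d.get(f, []))
        df.update(terms)

    scores = []
    for d in docs:
        counts = {f: Counter(d.get(f, [])) for f in fields}
        score = 0.0
        for term in set(query):
            # Combine length-normalised, weighted term frequencies over fields.
            pseudo_tf = 0.0
            for f in fields:
                tf = counts[f][term]
                if tf == 0 or avg_len[f] == 0:
                    continue
                norm = 1.0 - b[f] + b[f] * len(d.get(f, [])) / avg_len[f]
                pseudo_tf += weights[f] * tf / norm
            if pseudo_tf == 0.0:
                continue
            # Non-negative IDF variant (an assumption, not taken from the paper).
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturate once, after the fields have been combined.
            score += idf * pseudo_tf / (k1 + pseudo_tf)
        scores.append(score)
    return scores


# Hypothetical toy collection: an ASR transcript field and a keyword field.
docs = [
    {"asr": ["we", "left", "the", "village"], "keywords": ["deportation"]},
    {"asr": ["deportation", "was", "sudden"], "keywords": ["family"]},
    {"asr": ["the", "school", "closed"], "keywords": ["education"]},
]
b = {"asr": 0.75, "keywords": 0.75}
favour_keywords = bm25f_scores(["deportation"], docs, {"asr": 1.0, "keywords": 3.0}, b)
favour_asr = bm25f_scores(["deportation"], docs, {"asr": 3.0, "keywords": 1.0}, b)
```

Swapping the field weights reverses the ranking of the first two toy documents, which is one way to see why the weights and `b` values need to be tuned on training data rather than guessed.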

    Machito and His Afro-Cubans: Selected Transcriptions

    Machito (Francisco Raúl Grillo, 1909–1984) was born into a musical family in Havana, Cuba, and was already an experienced vocalist when he arrived in New York City in 1937. In 1940 he teamed up with his brother-in-law, the Cuban trumpeter Mario Bauzá (1911–1993), who had already made a name for himself with top African American swing bands such as those of Chick Webb and Cab Calloway. Together, Machito and Bauzá formed Machito and his Afro-Cubans. With Bauzá as musical director, the band forged vital pan-African connections by fusing Afro-Cuban rhythms with modern jazz and by collaborating with major figures in the bebop movement. Highly successful with Latino as well as black and white audiences, Machito and his Afro-Cubans recorded extensively and performed in dance halls, nightclubs, and on the concert stage. In this volume, ethnomusicologist Paul Austerlitz and bandleader and professor Jere Laukkanen (both experienced Latin jazz performers) present transcriptions from Machito’s recordings which meticulously illustrate the improvised as well as scored vocal, reed, brass, and percussion parts of the music. Austerlitz’s introductory essay traces the history of Afro-Cuban jazz in New York, a style that exerted a profound impact on leaders of the bebop movement, including Dizzy Gillespie and Charlie Parker, who appears as a guest soloist with Machito on some of the music transcribed here. This is MUSA’s first volume to represent the significant Latino heritage in North American music.

    AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

    An open-source Mandarin speech corpus called AISHELL-1 is released. It is by far the largest corpus suitable for conducting speech recognition research and building speech recognition systems for Mandarin. The recording procedure, including audio capturing devices and environments, is presented in detail. The preparation of the related resources, including transcriptions and lexicon, is described. The corpus is released with a Kaldi recipe. Experimental results imply that the quality of the audio recordings and transcriptions is promising. Comment: Oriental COCOSDA 201

    Daniel Mowry Cemetery Condition Reports

    This cemetery contains 85 burials. Transcriptions here include all of the markings on each stone located within the cemetery. Additionally, if a stone was illegible, a rubbing of the stone was completed. Both headstones and footstones are included in the transcription report.

    Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop

    We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding the discovery of linguistic units (subwords and words) in a language without orthography. We study the replacement of orthographic transcriptions by images and/or translated text in a well-resourced language to help unsupervised discovery from raw speech. Comment: Accepted to ICASSP 201

    How speaker tongue and name source language affect the automatic recognition of spoken names

    In this paper the automatic recognition of person names and geographical names uttered by native and non-native speakers is examined in an experimental set-up. The major aim was to improve our understanding of how well and under which circumstances previously proposed methods of multilingual pronunciation modeling and multilingual acoustic modeling contribute to better name recognition in a cross-lingual context. To arrive at a meaningful interpretation of the results, we categorized each language according to the amount of exposure a native speaker is expected to have had to that language. Having interpreted our results, we also tried to answer the question of how much further improvement one might attain with a more advanced pronunciation modeling technique, which we plan to develop.