22 research outputs found

    Audio Indexing on the Web: a Preliminary Study of Some Audio Descriptors

    Peer-reviewed international conference with proceedings. The "Invisible Web" is composed of documents that cannot currently be accessed by Web search engines, because they have a dynamic URL or are not textual, such as video or audio documents. For audio documents, one solution is automatic indexing, which consists of finding good descriptors of audio documents that can be used as indexes for archiving and search. This paper presents an overview and recent results of the RAIVES project, a French research project on audio indexing. We present speech/music segmentation, speaker tracking, and keyword detection. We also give a few perspectives on the RAIVES project.

    Application of automatic speech recognition technologies to singing

    The research field of Music Information Retrieval is concerned with the automatic analysis of musical characteristics. One aspect that has not received much attention so far is the automatic analysis of sung lyrics. On the other hand, the field of Automatic Speech Recognition has produced many methods for the automatic analysis of speech, but those have rarely been employed for singing. This thesis analyzes the feasibility of applying various speech recognition methods to singing, and suggests adaptations. In addition, the routes to practical applications for these systems are described. Five tasks are considered: Phoneme recognition, language identification, keyword spotting, lyrics-to-audio alignment, and retrieval of lyrics from sung queries. The main bottleneck in almost all of these tasks lies in the recognition of phonemes from sung audio. Conventional models trained on speech do not perform well when applied to singing. Training models on singing is difficult due to a lack of annotated data. This thesis offers two approaches for generating such data sets. For the first one, speech recordings are made more “song-like”. In the second approach, textual lyrics are automatically aligned to an existing singing data set. In both cases, these new data sets are then used for training new acoustic models, offering considerable improvements over models trained on speech. Building on these improved acoustic models, speech recognition algorithms for the individual tasks were adapted to singing by either improving their robustness to the differing characteristics of singing, or by exploiting the specific features of singing performances. Examples of improving robustness include the use of keyword-filler HMMs for keyword spotting, an i-vector approach for language identification, and a method for alignment and lyrics retrieval that allows highly varying durations. 
Features of singing are utilized in various ways: in an approach for language identification that is well-suited for long recordings; in a method for keyword spotting based on phoneme durations in singing; and in an algorithm for alignment and retrieval that exploits known phoneme confusions in singing.

    Computational Pronunciation Analysis in Sung Utterances

    Recent automatic lyrics transcription (ALT) approaches focus on building stronger acoustic models or in-domain language models, while the pronunciation aspect is seldom touched upon. This paper applies a novel computational analysis to the pronunciation variances in sung utterances and further proposes a new pronunciation model adapted for singing. The singing-adapted model is tested on multiple public datasets via word recognition experiments. It performs better than the standard speech dictionary in all settings, reporting the best results on ALT in a cappella recordings using n-gram language models. For reproducibility, we share the sentence-level annotations used in testing, providing a new benchmark evaluation set for ALT.

    Proceedings of the 6th International Workshop on Folk Music Analysis, 15-17 June, 2016

    The Folk Music Analysis Workshop brings together computational music analysis and ethnomusicology. Both symbolic and audio representations of music are considered, with a broad range of scientific approaches being applied (signal processing, graph theory, deep learning). The workshop features a range of interesting talks from international researchers in areas such as Indian classical music, Iranian singing, Ottoman-Turkish Makam music scores, Flamenco singing, Irish traditional music, Georgian traditional music and Dutch folk songs. Invited guest speakers were Anja Volk, Utrecht University and Peter Browne, Technological University Dublin

    November 21, 2002

    The Breeze is the student newspaper of James Madison University in Harrisonburg, Virginia

    Conception of giftedness and talent by pre service and in service primary school teachers in Johor, Malaysia: An exploration using a multiphase mixed methods design

    The main purpose of this study is to investigate the conception of giftedness and talent as perceived by pre-service and in-service primary school teachers in Malaysia. In addition, this study aims to explore teachers’ perceptions of giftedness on specific issues: a) sources of information about giftedness, b) confidence in identifying gifted and talented students, c) awareness of identifying assessments, d) perception of the adequacy of teacher training to deal with gifted and talented students, e) relevance of labelling, and f) aspects considered important in the development of gifted education in Malaysia. These issues are primarily explored using qualitative approaches. To explore them, a mixed methods design was used, involving pre-service (n = 546) and in-service primary school teachers (n = 632). Structured questionnaires were administered to 1178 teachers at various education institutions (e.g. schools, institutes of teacher education and universities). Six female teachers were involved in the qualitative data collection using semi-structured questionnaires and interviews. Two main types of analyses were used. First, the patterns of teachers’ notions of giftedness and talent were examined using principal component analysis. In addition, descriptive analysis was used to provide preliminary findings, and independent t-tests were used to examine differences between groups. Second, thematic analysis was used to uncover thematic codes (variables) from relevant responses. The findings from quantitative and qualitative data were integrated to answer two research questions. Based on the integration of both sets of findings, teachers’ conception of giftedness and talent was found to be diverse, reflecting the current situation in which there is no consensus on the conception of giftedness and talent among theorists. 
In addition, Malaysian teachers reported that ‘gifted’ and ‘talented’ are separate groups of individuals with extraordinary abilities. Giftedness is perceived in relation to intellectual abilities and, to a certain extent, as domain-specific to mathematics and science. Talent is perceived in relation to non-intellective abilities such as psychomotor abilities. Even though gifted and talented are perceived as a non-unitary concept, both are perceived as sharing similar characteristics such as creativity and domain-specific ability. The qualitative findings suggested that teachers view giftedness and talent somewhat differently. The variations explored in this study could be attributed to the inadequacy of information from various sources and of teacher training and/or experience. The nuances of teachers’ understanding of the varied aspects of giftedness call for more exploration, as this study provides only preliminary evidence on the existing conception of giftedness and talent held by teachers in Malaysia.

    Rhyme, Rhythm, and Rhubarb: Using Probabilistic Methods to Analyze Hip Hop, Poetry, and Misheard Lyrics

    While text Information Retrieval applications often focus on extracting semantic features to identify the topic of a document, and Music Information Research tends to deal with melodic, timbral or meta-tagged data of songs, useful information can be gained from surface-level features of musical texts as well. This is especially true for texts such as song lyrics and poetry, in which the sound and structure of the words is important. These types of lyrical verse usually contain regular and repetitive patterns, like the rhymes in rap lyrics or the meter in metrical poetry. The existence of such patterns is not always categorical, as there may be a degree to which they appear or apply in any sample of text. For example, rhymes in hip hop are often imperfect and vary in the degree to which their constituent parts differ. Although a definitive decision as to the existence of any such feature cannot always be made, large corpora of known examples can be used to train probabilistic models enumerating the likelihood of their appearance. In this thesis, we apply likelihood-based methods to identify and characterize patterns in lyrical verse. We use a probabilistic model of mishearing in music to resolve misheard lyric search queries. We then apply a probabilistic model of rhyme to detect imperfect and internal rhymes in rap lyrics and quantitatively characterize rappers' styles in their use. Finally, we compute likelihoods of prosodic stress in words to perform automated scansion of poetry and compare poets' usage of and adherence to meter. In these applications, we find that likelihood-based methods outperform simpler, rule-based models at finding and quantifying lyrical features in text

    Maritime expressions: a corpus based exploration of maritime metaphors

    This study uses a purpose-built corpus to explore the linguistic legacy of Britain’s maritime history found in the form of hundreds of specialised ‘Maritime Expressions’ (MEs), such as TAKEN ABACK, ANCHOR and ALOOF, that permeate modern English. Selecting just those expressions commencing with ‘A’, it analyses 61 MEs in detail and describes the processes by which these technical expressions, from a highly specialised occupational discourse community, have made their way into modern English. The Maritime Text Corpus (MTC) comprises 8.8 million words, encompassing a range of text types and registers, selected to provide a cross-section of ‘maritime’ writing. It is analysed using WordSmith analytical software (Scott, 2010), with the 100 million-word British National Corpus (BNC) as a reference corpus. Using the MTC, a list of keywords of specific salience within the maritime discourse has been compiled and, using frequency data, concordances and collocations, these MEs are described in detail and their use and form in the MTC and the BNC are compared. The study examines the transformation from ME to figurative use in the general discourse, in terms of form and metaphoricity. MEs are classified according to their metaphorical strength and their transference from maritime usage into new registers and domains such as those of business, politics, sports and reportage. A revised model of metaphoricity is developed and a new category of figurative expression, the ‘resonator’, is proposed. Additionally, developing the work of Lakoff and Johnson, Kövecses and others on Conceptual Metaphor Theory (CMT), a number of Maritime Conceptual Metaphors are identified and their cultural significance is discussed.

    Bowdoin Orient v.134, no.1-24 (2004-2005)

    https://digitalcommons.bowdoin.edu/bowdoinorient-2000s/1005/thumbnail.jp