20 research outputs found

    Subword-based Indexing for a Minimal False Positive Rate

    Evaluation of noisy transcripts for spoken document retrieval

    Spoken Document Retrieval (SDR) is usually implemented by using an Information Retrieval (IR) engine on speech transcripts that are produced by an Automatic Speech Recognition (ASR) system. These transcripts generally contain a substantial amount of transcription errors (noise) and are mostly unstructured. This thesis addresses two challenges that arise when doing IR on this type of source material: (i) segmentation of speech transcripts into suitable retrieval units, and (ii) evaluation of the impact of transcript noise on the results of an IR task. It is shown that intrinsic evaluation results in different conclusions with regard to the quality of automatic story boundaries than when (extrinsic) Mean Average Precision (MAP) is used. This indicates that for automatic story segmentation for search applications, the traditionally used (intrinsic) segmentation cost may not be a good performance target. The best performance in an SDR context was achieved using lexical cohesion-based approaches, rather than the statistical approaches that were most popular in story segmentation benchmarks. For the evaluation of speech transcript noise in an SDR context a novel framework is introduced, in which evaluation is done in an extrinsic and query-dependent manner but without depending on relevance judgments. This is achieved by making a direct comparison between the ranked results lists of IR tasks on a reference and an ASR-derived transcription. The resulting measures are highly correlated with MAP, making it possible to do extrinsic evaluation of ASR transcripts for ad-hoc collections, while using a similar amount of reference material as the popular intrinsic metric Word Error Rate. The proposed evaluation methods are expected to be helpful for the task of optimizing the configuration of ASR systems for the transcription of (large) speech collections for use in Spoken Document Retrieval, rather than the more traditional dictation tasks.
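The abstract above uses Mean Average Precision (MAP) as its extrinsic yardstick. As a rough illustration only, the following sketch computes MAP from its textbook definition; the function names and the document identifiers are hypothetical and not taken from the thesis.

```python
def average_precision(ranked_docs, relevant):
    """Average precision of one ranked result list, given relevance judgments."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("at least one relevant document is required")
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at the rank of each relevant hit
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_docs, relevant) pairs, one per query."""
    return sum(average_precision(docs, rel) for docs, rel in runs) / len(runs)
```

Note that this measure needs relevance judgments for every query, which is the dependency the framework described in the abstract aims to avoid.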

    Radio Oranje: Enhanced Access to a Historical Spoken Word Collection

    Access to historical audio collections is typically very restricted: content is often only available on physical (analog) media and the metadata is usually limited to keywords, giving access at the level of relatively large fragments, e.g., an entire tape. Many spoken word heritage collections are now being digitized, which allows the introduction of more advanced search technology. This paper presents an approach that supports online access and search for recordings of historical speeches. A demonstrator has been built, based on the so-called Radio Oranje collection, which contains radio speeches by the Dutch Queen Wilhelmina that were broadcast during World War II. The audio has been aligned with its original 1940s manual transcriptions to create a time-stamped index that enables the speeches to be searched at the word level. Results are presented together with related photos from an external database.
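To make the idea of a time-stamped, word-level index concrete, here is a minimal sketch. It assumes the alignment step has already produced (word, start time in seconds) pairs; the function names and the input format are hypothetical and do not describe the demonstrator's actual implementation.

```python
from collections import defaultdict


def build_word_index(aligned_words):
    """Map each lower-cased word to the start times (seconds) at which it is spoken."""
    index = defaultdict(list)
    for word, start_seconds in aligned_words:
        index[word.lower()].append(start_seconds)
    return dict(index)


def search(index, query):
    """Return the start times of every occurrence of a single query word."""
    return index.get(query.lower(), [])
```

A player could then jump to each returned offset in the audio, which is what makes search at the word level possible rather than at the level of an entire tape.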

    Neo-Atlantis: Dutch Responses to Five Meter Sea Level Rise

    What would happen to the Netherlands if, in 2030, the sea level starts to rise and eventually, after 100 years, a sea level of five meters above the current level would be reached? Two socio-economic scenarios are developed from a literature review and from interviews with researchers and practitioners in the domains of social sciences, economics, civil engineering, and land use planning. One scenario describes what would happen in a future characterised by a trend towards further globalisation, marketisation and high economic growth, while the other scenario happens in a future under opposite trends. Under both scenarios, the Southwest and Northwest of the Netherlands – already now below sea level – would be abandoned because of sea level rise. Although most experts believe that geomorphology and current engineering skills would make it possible to largely maintain the territorial integrity of the Netherlands, there are some reasons to assume that this is not likely to happen. Social processes that precede important political decisions – such as the growth of the belief in the reality of sea level rise (SLR) and the framing of such decisions in a proper political context (policy window) – evolve slowly. Although a flood disaster would speed up decision-making, the general expectation is that decisions would come too late in view of the rate of SLR and the possible pace of construction of works. Keywords: extreme sea level rise, The Netherlands, flood defences

    Neo-Atlantis: The Netherlands under a 5-m sea level rise

    What could happen to the Netherlands if, in 2030, the sea level starts to rise and eventually, after 100 years, a sea level of 5 m above current level would be reached? This question is addressed by studying literature, by interviewing experts in widely differing fields, and by holding an expert workshop on this question. Although most experts believe that geomorphology and current engineering skills would enable the country to largely maintain its territorial integrity, there are reasons to assume that this is not likely to happen. Social processes that precede important political decisions - such as the growth of the belief in the reality of sea level rise and the framing of such decisions in a proper political context (policy window) - evolve slowly. A flood disaster would speed up the decision-making process. The shared opinion of the experts surveyed is that eventually part of the Netherlands would be abandoned. © 2008 The Author(s)

    Story Segmentation for Speech Transcripts in Sparse Data Conditions

    Information Retrieval systems determine relevance by comparing information needs with the content of potential retrieval units. Unlike most textual data, automatically generated speech transcripts cannot by default be easily divided into obvious retrieval units due to a lack of explicit structural markers. This problem can be addressed by automatically detecting topically cohesive segments, or stories. However, when the content collection consists of speech from less formal domains than broadcast news, most of the standard automatic boundary detection methods are potentially unsuitable due to their reliance on learned features. For conversational speech in particular, the lack of adequate training data can present a significant issue. In this paper, four methods for automatic segmentation of speech transcriptions are compared. These were selected because of their independence from collection-specific knowledge and were implemented without the use of training data. Two of the four methods are based on existing algorithms; the others are novel approaches based on a dynamic segmentation algorithm that incorporates information about the query (QDSA) and on WordNet. Experiments were done on a task similar to the TREC SDR unknown-boundaries condition. For the best performing system, QDSA, the retrieval scores for a tf-idf-type ranking function were equivalent to those for a reference segmentation, and improved through document length normalization using the BM25/Okapi method. For the task of automatically segmenting speech transcripts for use in information retrieval, we conclude that a training-poor processing paradigm, which can be crucial for handling surprise data, is feasible.
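The document length normalization mentioned above can be illustrated with the commonly cited Okapi BM25 term weight. This is a sketch under stated assumptions: the parameter values k1 = 1.2 and b = 0.75 are conventional defaults, the IDF form is one widely used non-negative variant, and neither is taken from the paper.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """BM25 contribution of one query term to one segment's score."""
    # Non-negative IDF variant (an assumption; several variants are in use).
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: segments longer than average are penalized;
    # b = 0 switches the normalization off entirely.
    norm = 1.0 - b + b * doc_len / avg_doc_len
    return idf * tf * (k1 + 1.0) / (tf + k1 * norm)
```

With the same term frequency, a short segment scores higher than a long one, which matters when automatically detected segments vary widely in length.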

    Evaluating ASR Output for Information Retrieval

    Within the context of international benchmarks and collection-specific projects, much work on spoken document retrieval has been done in recent years. In 2000 the issue of automatic speech recognition for spoken document retrieval was declared 'solved' for the broadcast news domain. Many collections, however, are not in this domain, and automatic speech recognition for these collections may pose specific new challenges. This requires a method to evaluate automatic speech recognition optimization schemes for these application areas. Traditional measures such as word error rate and story word error rate are not ideal for this. In this paper, three new metrics are proposed. Their behaviour is investigated on a cultural heritage collection and performance is compared to traditional measurements on TREC broadcast news data.
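For reference, the traditional word error rate mentioned above is the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The sketch below follows that standard definition with simple whitespace tokenization (an assumption); it does not implement the three new metrics, which the abstract does not specify.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Every error counts equally here, regardless of whether the affected word would ever matter to a search query, which is one reason such intrinsic measures may fit retrieval tasks poorly.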

    Speech Transcript Evaluation for Information Retrieval

    Speech recognition transcripts are being used in various fields of research and practical applications, putting various demands on their accuracy. Traditionally, ASR research has used intrinsic evaluation measures such as word error rate to determine transcript quality. In non-dictation-type applications such as speech retrieval, it is better to use extrinsic (or task-specific) measures. Indexing and the associated processing may eliminate certain errors, whereas the search query may reveal others. In this work, we argue that the standard extrinsic speech retrieval measure, average precision, is impractical for ASR evaluation. As an alternative we propose the use of rank correlation measures on the output of the speech retrieval task, with the goal of predicting relative mean average precision. The measures we used showed a reasonably high correlation with average precision, but require much less human effort to calculate and can be more easily deployed in a variety of real-life settings.
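The abstract does not name the rank correlation measures it uses. Purely as an illustration, the sketch below applies Kendall's tau, one well-known rank correlation coefficient, to two ranked result lists for the same query: one retrieved from a reference transcript and one from an ASR transcript. Restricting the comparison to documents present in both lists is a simplifying assumption, and the lists are assumed to contain no duplicate identifiers.

```python
from itertools import combinations


def kendall_tau(ranking_a, ranking_b):
    """Kendall's tau over the documents that appear in both ranked lists."""
    position_b = {doc: i for i, doc in enumerate(ranking_b)}
    shared = [doc for doc in ranking_a if doc in position_b]
    if len(shared) < 2:
        raise ValueError("need at least two shared documents")
    concordant = 0
    discordant = 0
    # Each pair (x, y) is generated with x ranked above y in ranking_a.
    for x, y in combinations(shared, 2):
        if position_b[x] < position_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)
```

A value near 1 means the ASR-based ranking preserves the order obtained from the reference transcript, and no relevance judgments are needed to compute it.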