Special Libraries, January 1962
Volume 53, Issue 1
Heat of Discussion: A New Approach to Understanding Parliamentary Discussion
This paper offers an overview of the video retrieval system we have developed for the Japanese Diet. With this system, one can directly retrieve the video segment of interest, gain a visual understanding of the flow of parliamentary debate, and check the facial expressions and body language of the speaker. We demonstrate how video streams can be retrieved on user terminals that do not support Japanese language input, and suggest a variety of ways in which the system can be used. We also report a first systematic analysis of the correspondence between the official minutes and the speech recognition results for recordings of parliamentary meetings. Departing from the tradition of focusing on written official minutes, we investigate the variation in the rate of correspondence to better understand the complex and multifaceted nature of parliamentary discussion. We believe that our system encourages research on the use of visual information in policymaking and marks a step toward the provision of universal access to policy information. This work is supported by JSPS Kakenhi Grant Number 15H05727 and is based on a paper prepared for presentation at the 25th IPSA World Congress of Political Science, Brisbane, Australia, July 21-26, 2018. http://www.grips.ac.jp/list/jp/facultyinfo/masuyama_mikitaka
Special Libraries, February 1962
Volume 53, Issue 2
Chapter 2: The Original ToBI System and the Evolution of the ToBI Framework
In this chapter, the authors try to identify the essential properties of a ToBI framework annotation system by describing the development and design of the original ToBI conventions. In this description, the authors give an overview of the general phonological theory and the specific theory of Mainstream American English intonation and prosody that they decided to incorporate in the original ToBI tags. The authors also state the practical principles that led them to make these decisions. The chapter is organised as follows. Section 2.2 briefly chronicles how the MAE_ToBI system came into being. Section 2.3 briefly describes the consensus account of English intonation and prosody on which the MAE_ToBI system is based. Section 2.4 catalogues the different components of a MAE_ToBI transcription and lists the salient rules that constrain the relationships between different components. This section also expands upon the theoretical foundations and practical consequences of adopting the general structure of multiple labelling tiers, and particularly the separation of the labels for tones from the labels for indexing prosodic boundary strength. Section 2.5 then describes some of the extensions of the basic ToBI tiers that have been adopted by some sites. This section also compares the authors' decisions about the number of tiers and about inter-tier constraints with the analogous decisions for some of the other ToBI systems described in this book. Section 2.6 discusses the status of the symbolic labels relative to the continuous phonetic records that are also an obligatory component of the MAE_ToBI transcription. Section 2.7 then closes by listing several open research questions that the authors would like to see addressed by MAE_ToBI users and the larger ToBI community.
Trigger-Based Language Model Adaptation for Automatic Transcription of Panel Discussions (Paper, Special Section on Statistical Modeling for Speech Processing)
Summary: We present a novel trigger-based language model adaptation method oriented to the transcription of meetings. In meetings, the topic is focused and consistent throughout the whole session, so keywords can be correlated over long distances. The trigger-based language model is designed to capture such long-distance dependencies, but it is typically constructed from a large corpus, which is usually too general to yield task-dependent trigger pairs. In the proposed method, we make use of the initial speech recognition results to extract task-dependent trigger pairs and to estimate their statistics. Moreover, we introduce a back-off scheme that also exploits the statistics estimated from a large corpus. The proposed model reduced the test-set perplexity considerably more than the typical trigger-based language model constructed from a large corpus, and achieved a perplexity reduction of 44% over the baseline when combined with an adapted trigram language model. In addition, a reduction in word error rate was obtained when using the proposed language model to rescore word graphs. Key words: speech recognition, language model, trigger-based language model, TF/ID
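As a rough illustration of what extracting trigger pairs from initial recognition results can look like, the sketch below scores word pairs by how often they co-occur in the same segment of a first-pass transcript. Everything in it is an assumption made for illustration: the function name `extract_trigger_pairs`, the segment-level co-occurrence window, and the pointwise mutual information score are not taken from the paper, and the back-off to large-corpus statistics mentioned in the summary is not modelled.

```python
import math
from collections import Counter
from itertools import combinations


def extract_trigger_pairs(segments, min_count=2, top_k=5):
    """Rank word pairs by pointwise mutual information (PMI).

    segments: list of token lists, e.g. first-pass ASR hypotheses
    (hypothetical input format). A pair counts once per segment in
    which both words appear.
    """
    n_segments = len(segments)
    word_df = Counter()  # number of segments containing each word
    pair_df = Counter()  # number of segments containing both words

    for tokens in segments:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    scored = []
    for (a, b), c_ab in pair_df.items():
        if c_ab < min_count:
            continue  # skip pairs seen too rarely to trust
        pmi = math.log((c_ab * n_segments) / (word_df[a] * word_df[b]))
        scored.append(((a, b), pmi))

    # Highest PMI first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


if __name__ == "__main__":
    hypotheses = [
        ["budget", "tax", "reform"],
        ["budget", "tax"],
        ["weather", "rain"],
        ["weather", "rain", "budget"],
    ]
    for pair, score in extract_trigger_pairs(hypotheses):
        print(pair, round(score, 3))
```

Because the statistics come from the session's own hypotheses, pairs that are specific to the session topic can surface even when a general corpus would not rank them highly; recognition errors in the first pass would also propagate into the counts, which is one reason a back-off to broader statistics is a natural complement.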
Spartan Daily, November 21, 1994
Volume 103, Issue 57
Spartan Daily, September 25, 1985
Volume 85, Issue 19
Clearing the Transcription Hurdle in Dialect Corpus Building: The Corpus of Southern Dutch Dialects as Case Study
This paper discusses how the transcription hurdle in dialect corpus building can be cleared. While corpus analysis has gained strongly in popularity in linguistic research, dialect corpora are still relatively scarce. This scarcity can be attributed to several factors, one of which is the challenging nature of transcribing dialects, given the lack of both orthographic norms for many dialects and speech technological tools trained on dialect data. This paper addresses the questions (i) how dialects can be transcribed efficiently and (ii) whether speech technological tools can lighten the transcription work. These questions are tackled using the Southern Dutch dialects (SDDs) as a case study, for which the usefulness of automatic speech recognition (ASR), respeaking, and forced alignment is considered. Tests with these tools indicate that dialects still constitute a major speech technological challenge. In the case of the SDDs, the decision was made to use speech technology only for the word-level segmentation of the audio files, as the transcription itself could not be sped up by ASR tools. The discussion does, however, indicate that the usefulness of ASR and related tools for a dialect corpus project is strongly determined by the sound quality of the dialect recordings, the availability of statistical dialect-specific models, the degree of linguistic differentiation between the dialects and the standard language, and the goals the transcripts have to serve.