1,348 research outputs found

    Heat of Discussion: A New Approach to Understanding Parliamentary Discussion

    This paper offers an overview of the video retrieval system we have developed for the Japanese Diet. With our video retrieval system, one can directly retrieve the video feed segment of interest, gain a visual understanding of the flow of parliamentary debate, and check the facial expressions and body language of the speaker. In this paper, we demonstrate how one can retrieve video streaming on user terminals that do not support Japanese language input, and suggest a variety of ways in which our video retrieval system can be utilized. Also, we report a first systematic analysis of the correspondence between the official minutes and the results of speech recognition of recordings of parliamentary meetings. Departing from the tradition of focusing on written official minutes, we investigate the variation in the rate of correspondence and seek to understand the complex and multifaceted nature of parliamentary discussion. We believe that our system encourages research on the utilization of visual information in policymaking and marks a step toward the provision of universal access to policy information. This work is supported by JSPS Kakenhi Grant Number 15H05727 and based on a paper prepared for presentation at the 25th IPSA World Congress of Political Science, Brisbane, Australia, July 21-26, 2018. http://www.grips.ac.jp/list/jp/facultyinfo/masuyama_mikitaka

    Linking Parliamentary Minutes and Videos in the Japanese Diet

    This paper offers an overview of the video retrieval system we have developed for the Japanese Diet. With our video retrieval system, one can directly retrieve the video feed segment of interest, gain a visual understanding of the flow of parliamentary debate, and check the facial expressions and body language of the speaker. In this paper, we demonstrate how one can retrieve video streaming on user terminals that do not support Japanese language input, and suggest a variety of ways in which our video retrieval system can be utilized. Also, we report a preliminary analysis of the correspondence between the official minutes and the results of speech recognition of recordings of parliamentary meetings. We believe that our system encourages research on the utilization of visual information in policy-making and marks a step toward the provision of universal access to policy information. This work is supported by JSPS Kakenhi Grant Number 15H05727 and the Diet Archives Project funded by the GRIPS Policy Research Center. An earlier version (Masuyama 2016b) was presented at the 2016 General Conference of the European Consortium for Political Research, Charles University, Prague, Czech Republic, September 7-10, 2016. http://www.grips.ac.jp/list/jp/facultyinfo/masuyama_mikitaka

    Clearing the transcription hurdle in dialect corpus building : the corpus of Southern Dutch dialects as case-study

    This paper discusses how the transcription hurdle in dialect corpus building can be cleared. While corpus analysis has strongly gained in popularity in linguistic research, dialect corpora are still relatively scarce. This scarcity can be attributed to several factors, one of which is the challenging nature of transcribing dialects, given a lack of both orthographic norms for many dialects and speech technological tools trained on dialect data. This paper addresses the questions (i) how dialects can be transcribed efficiently and (ii) whether speech technological tools can lighten the transcription work. These questions are tackled using the Southern Dutch dialects (SDDs) as a case study, for which the usefulness of automatic speech recognition (ASR), respeaking, and forced alignment is considered. Tests with these tools indicate that dialects still constitute a major speech technological challenge. In the case of the SDDs, the decision was made to use speech technology only for the word-level segmentation of the audio files, as the transcription itself could not be sped up by ASR tools. The discussion does, however, indicate that the usefulness of ASR and other related tools for a dialect corpus project is strongly determined by the sound quality of the dialect recordings, the availability of statistical dialect-specific models, the degree of linguistic differentiation between the dialects and the standard language, and the goals the transcripts have to serve.

    Semi-Supervised Acoustic Model Training by Discriminative Data Selection from Multiple ASR Systems' Hypotheses

    While the performance of ASR systems depends on the size of the training data, it is very costly to prepare accurate and faithful transcripts. In this paper, we investigate a semi-supervised training scheme that takes advantage of large quantities of unlabeled video lecture archives, particularly for the deep neural network (DNN) acoustic model. In the proposed method, we obtain ASR hypotheses from complementary GMM- and DNN-based ASR systems. Then, a set of CRF-based classifiers is trained to select the correct hypotheses and verify the selected data. The proposed hypothesis combination shows higher quality than the conventional system combination method (ROVER). Moreover, compared with conventional data selection based on confidence measure scores, our method is shown to be more effective at filtering usable data. A significant improvement in ASR accuracy is achieved over the baseline system and in comparison with models trained with the conventional system combination and data selection methods.

    Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition

    This article describes an efficient training method for online streaming attention-based encoder-decoder (AED) automatic speech recognition (ASR) systems. AED models have achieved competitive performance in offline scenarios by jointly optimizing all components. They have recently been extended to an online streaming framework via models such as monotonic chunkwise attention (MoChA). However, the elaborate attention calculation process is not robust against long-form speech utterances. Moreover, the sequence-level training objective and time-restricted streaming encoder cause a nonnegligible delay in token emission during inference. To address these problems, we propose CTC synchronous training (CTC-ST), in which CTC alignments are leveraged as a reference for token boundaries to enable a MoChA model to learn optimal monotonic input-output alignments. We formulate a purely end-to-end training objective to synchronize the boundaries of MoChA to those of CTC. The CTC model shares an encoder with the MoChA model to enhance the encoder representation. Moreover, the proposed method provides alignment information learned in the CTC branch to the attention-based decoder. Therefore, CTC-ST can be regarded as self-distillation of alignment knowledge from CTC to MoChA. Experimental evaluations on a variety of benchmark datasets show that the proposed method significantly reduces recognition errors and emission latency simultaneously. The robustness to long-form and noisy speech is also demonstrated. We compare CTC-ST with several methods that distill alignment knowledge from a hybrid ASR system and show that CTC-ST can achieve a comparable tradeoff of accuracy and latency without relying on external alignment information.

    Proceedings of the COLING 2004 Post Conference Workshop on Multilingual Linguistic Ressources MLR2004

    In an ever expanding information society, most information systems are now facing the "multilingual challenge". Multilingual language resources play an essential role in modern information systems. Such resources need to provide information on many languages in a common framework and should be (re)usable in many applications (for automatic or human use). Many centres have been involved in national and international projects dedicated to building harmonised language resources and creating expertise in the maintenance and further development of standardised linguistic data. These resources include dictionaries, lexicons, thesauri, word-nets, and annotated corpora developed along the lines of best practices and recommendations. However, since the late 90's, most efforts in scaling up these resources remain the responsibility of the local authorities, usually, with very low funding (if any) and few opportunities for academic recognition of this work. Hence, it is not surprising that many of the resource holders and developers have become reluctant to give free access to the latest versions of their resources, and their actual status is therefore currently rather unclear. The goal of this workshop is to study problems involved in the development, management and reuse of lexical resources in a multilingual context. Moreover, this workshop provides a forum for reviewing the present state of language resources. The workshop is meant to bring to the international community qualitative and quantitative information about the most recent developments in the area of linguistic resources and their use in applications. The impressive number of submissions (38) to this workshop and in other workshops and conferences dedicated to similar topics proves that dealing with multilingual linguistic resources has become a very hot problem in the Natural Language Processing community.
To cope with the number of submissions, the workshop organising committee decided to accept 16 papers from 10 countries based on the reviewers' recommendations. Six of these papers will be presented in a poster session. The papers constitute a representative selection of current trends in research on Multilingual Language Resources, such as multilingual aligned corpora, bilingual and multilingual lexicons, and multilingual speech resources. The papers also represent a characteristic set of approaches to the development of multilingual language resources, such as automatic extraction of information from corpora, combination and re-use of existing resources, online collaborative development of multilingual lexicons, and use of the Web as a multilingual language resource. The development and management of multilingual language resources is a long-term activity in which collaboration among researchers is essential. We hope that this workshop will gather many researchers involved in such developments and will give them the opportunity to discuss, exchange, compare their approaches and strengthen their collaborations in the field. The organisation of this workshop would have been impossible without the hard work of the program committee who managed to provide accurate reviews on time, on a rather tight schedule. We would also like to thank the Coling 2004 organising committee that made this workshop possible. Finally, we hope that this workshop will yield fruitful results for all participants

    Automatic Speech Recognition (ASR) and NMT for Interlingual and Intralingual Communication: Speech to Text Technology for Live Subtitling and Accessibility.

    Considering the increasing demand for institutional translation and the multilingualism of international organizations, the application of Artificial Intelligence (AI) technologies in multilingual communications and for the purposes of accessibility has become an important element in the production of translation and interpreting services (Zetzsche, 2019). In particular, the widespread use of Automatic Speech Recognition (ASR) and Neural Machine Translation (NMT) technology represents a recent development in the attempt to satisfy the increasing demand for interinstitutional, multilingual communications at inter-governmental level (Maslias, 2017). Recently, researchers have been calling for a universalistic view of media and conference accessibility (Greco, 2016). The application of ASR, combined with NMT, may allow for the breaking down of communication barriers at European institutional conferences where multilingualism represents a fundamental pillar (Jopek Bosiacka, 2013). In addition to representing a so-called disruptive technology (Accipio Consulting, 2006), ASR technology may facilitate communication with non-hearing users (Lewis, 2015). Thanks to ASR, it is possible to guarantee content accessibility for non-hearing audiences via subtitles at institutionally-held conferences or speeches. Hence the need for analysing and evaluating ASR output: a quantitative approach is adopted to evaluate subtitles, with the objective of assessing their accuracy (Romero-Fresco, 2011). A database of F.A.O.’s and other international institutions’ English-language speeches and conferences on climate change is taken into consideration. The statistical approach is based on WER and NER models (Romero-Fresco, 2016) and on an adapted version. The ASR software solutions used in the study are VoxSigma by Vocapia Research and the Google Speech Recognition engine.
After having defined a taxonomic scheme, Native and Non-Native subtitles are compared to gold standard transcriptions. The intralingual and interlingual output generated by NMT is specifically analysed and evaluated via the NTR model to evaluate accuracy and accessibility.
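
As a rough illustration of the WER side of the evaluation mentioned in this abstract, the sketch below computes word error rate as the word-level edit distance (substitutions, deletions, insertions) between a gold-standard transcript and an ASR hypothesis, divided by the number of reference words. The function name, the whitespace tokenization, and the sample sentences are assumptions made for illustration; this is not the study's tooling, and the weighted error categories of the NER and NTR models are not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Tokenization is a plain whitespace split (an assumption; real evaluations
    usually normalize case and punctuation first).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sample sentences, not taken from the study's data.
    gold = "climate change affects food security"
    asr = "climate change effects food security"
    print(f"WER: {word_error_rate(gold, asr):.2f}")
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference, which is one reason it is typically reported alongside other measures.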

    SHIP Project Review 2003-a

    Washington University Senior Undergraduate Research Digest (WUURD), Spring 2018

    From the Washington University Office of Undergraduate Research Digest (WUURD), Vol. 13, 05-01-2018. Published by the Office of Undergraduate Research. Joy Zalis Kiefer, Director of Undergraduate Research and Associate Dean in the College of Arts & Scienc

    Substantive Representation of Women in Asian Parliaments

    Combining data from nearly 100 interviews with national parliamentarians from ten Asian countries, the contributors to this book analyze and evaluate the advancement of gender equality in Asia. As of the year 2022, no country in Asia has gender parity in its parliament. Meanwhile, the proportion of national-level women parliamentarians in Asia averages a mere 20%. What is more important than simple descriptive representation, however, is whether outcomes for women are improving. Rather than focusing on numerical representation, the chapters in this book focus on the substantive representation of women. In other words, what do women and men parliamentarians do to advance women’s well-being and gender equality? Using semi-structured interviews, the author of each chapter examines these efforts in the context of a specific Asian country. The case studies include Bangladesh, Indonesia, Japan, Malaysia, Nepal, the Philippines, South Korea, Sri Lanka, Taiwan, and Timor-Leste. The book is an essential resource for scholars and students of Asian politics and the politics of gender