
    Investigating techniques for low resource conversational speech recognition

    In this paper we investigate various techniques for building effective speech-to-text (STT) and keyword search (KWS) systems for low-resource conversational speech. Sub-word decoding and graphemic mappings were assessed for detecting out-of-vocabulary keywords. To deal with the limited amount of transcribed data, semi-supervised training and data selection methods were investigated. Robust acoustic features produced via data augmentation were evaluated for acoustic modeling. For language modeling, automatically retrieved conversational-like Web data was used, as well as neural network based models. We report STT improvements with all the techniques, but interestingly only some improve KWS performance. Results are reported for the Swahili language in the context of the 2015 OpenKWS Evaluation.

    The Implementation of “Mastering Vocabulary before Teaching”: A Case Study in Intensive English Course of Language and Culture Development Center (PBB)

    SUUCIANALISMY: English has been taught as a foreign language in Indonesia for decades using the Grammar-Translation method. The objectives, content, and evaluation of the teaching were mostly dominated by structure and reading. Rote learning is one of the basic principles of the Grammar-Translation method, and its main characteristic is the memorization of bilingual vocabulary lists. According to some theorists, rote learning gives students only superficial, short-term knowledge of words, so it is considered less effective for improving the vocabulary acquisition students need to build communicative skill. In contrast, the Language and Culture Development Center of Syekh Nurjati State Institute has its students use rote learning in the intensive English course. This practice deserves scrutiny, since language teaching has developed significantly and produced new, modern methods, techniques, and strategies for teaching vocabulary. The research is conducted mainly with a qualitative methodology. It aims to investigate the implementation of the concept of ‘Mastering Vocabulary Before Teaching’ through rote vocabulary learning in the intensive English course program of the Language and Culture Development Center (PBB) of Syekh Nurjati State Institute for Islamic Studies Cirebon. The researcher found that the students find it difficult to memorize the vocabulary and are also not interested in doing the memorization. Moreover, because rote learning contributes only a superficial understanding of vocabulary, it does not significantly help the students reach the goal of “Mastering Vocabulary before Teaching”. Further, there are some problems in the vocabulary selection that strongly affect the effectiveness of the vocabulary learning and reduce the students’ motivation to memorize the vocabulary. Vocabulary enrichment and evaluation in the classroom are employed to help the students’ memorization, and the lecturers use various techniques for these activities. The research found that these activities help the students’ vocabulary learning. The researcher found that rote learning did not significantly help the students acquire vocabulary: the students only recognize words rather than master them. And because rote learning is imposed as an obligatory task, the students have low appreciation and motivation for memorizing the vocabulary.

    Thesaurus-assisted search term selection and query expansion: a review of user-centred studies

    This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies which adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search process. It summarises the ways in which domain-specific thesauri from different disciplines have been used by various types of users and how these tools aid users in the selection of search terms. The review consists of two main sections covering, firstly, studies on thesaurus-aided search term selection and, secondly, those dealing with query expansion using thesauri. Both sections are illustrated with case studies that have adopted a user-centred approach.

    VoG: Summarizing and Understanding Large Graphs

    How can we succinctly describe a million-node graph with a few simple sentences? How can we measure the "importance" of a set of discovered subgraphs in a large graph? These are exactly the problems we focus on. Our main ideas are to construct a "vocabulary" of subgraph-types that often occur in real graphs (e.g., stars, cliques, chains), and from a set of subgraphs, find the most succinct description of a graph in terms of this vocabulary. We measure success in a well-founded way by means of the Minimum Description Length (MDL) principle: a subgraph is included in the summary if it decreases the total description length of the graph. Our contributions are three-fold: (a) formulation: we provide a principled encoding scheme to choose vocabulary subgraphs; (b) algorithm: we develop VoG, an efficient method to minimize the description cost, and (c) applicability: we report experimental results on multi-million-edge real graphs, including Flickr and the Notre Dame web graph. Comment: SIAM International Conference on Data Mining (SDM) 201

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.
